Udoo Quad test drive
Here is a brief update regarding my experiences so far with the Udoo Quad board. I call this kicking the tires, but it simply amounts to tinkering with the board and getting a better understanding of its capabilities.
My choice of OS for this round of testing is Ubuntu Studio 12.04 armHF, which I obtained from the Udoo Community site downloads page.
As the Udoo Quad includes an on-board SATA connector, I followed the necessary steps to install the OS to an external disk and to boot from it by selecting the appropriate device from the U-Boot environment. I used the following page as a high-level guide.
The disk in this case was an older ~80GB Hitachi disk that I had among my spares, which was suitable for the intended purpose. With the system booted up, here is what we see:
root@udoo-studio-hfp:~# uname -a
Linux udoo-studio-hfp 3.0.35 #1 SMP PREEMPT Mon Dec 16 14:46:12 CET 2013 armv7l armv7l armv7l GNU/Linux
root@udoo-studio-hfp:~# cat /proc/cpuinfo
Processor : ARMv7 Processor rev 10 (v7l)
processor : 0
BogoMIPS : 1988.28
processor : 1
BogoMIPS : 1988.28
processor : 2
BogoMIPS : 1988.28
processor : 3
BogoMIPS : 1988.28
Features : swp half thumb fastmult vfp edsp neon vfpv3
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10
Hardware : SECO i.Mx6 UDOO Board
Revision : 63012
Serial : 0000000000000000
root@udoo-studio-hfp:~# lsscsi
[0:0:0:0] disk ATA Hitachi HTS54128 HP3O /dev/sda
Using the trusty gnome-disk-utility, the read benchmark returns the following results. If this all looks a bit Mac OS X ish - don’t be alarmed. I’m connecting to my Udoo from my MacBook and tunneling X over ssh. Again, keep in mind here that this is an old disk.
I was surprised to find that the cpufreq utilities all worked as expected on the system too. By default, the system booted in a conservative mode (~396 MHz), and with cpufreq-set I successfully enabled the performance governor.
root@udoo-studio-hfp:/usr/bin# ./cpufreq-info
cpufrequtils 007: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
driver: imx
CPUs which run at the same hardware frequency: 0 1 2 3
CPUs which need to have their frequency coordinated by software: 0 1 2 3
maximum transition latency: 61.0 us.
hardware limits: 396 MHz - 996 MHz
available frequency steps: 996 MHz, 792 MHz, 396 MHz
available cpufreq governors: interactive, conservative, ondemand, userspace, powersave, performance
current policy: frequency should be within 396 MHz and 996 MHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 996 MHz (asserted by call to hardware).
cpufreq stats: 996 MHz:8.10%, 792 MHz:0.63%, 396 MHz:91.27% (172036)
....
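The same governor information is also available through the kernel's cpufreq sysfs files, which is handy for scripting. Below is a minimal sketch, assuming the standard Linux layout under /sys/devices/system/cpu/cpuN/cpufreq; the helper names (get_governor, available_governors, set_governor) are made up for this illustration, and writing the governor requires root.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location (assumed to be present on this kernel).
CPUFREQ_ROOT = "/sys/devices/system/cpu"


def _cpufreq_dir(cpu, root):
    """Return the cpufreq directory for one CPU, e.g. .../cpu0/cpufreq."""
    return Path(root) / f"cpu{cpu}" / "cpufreq"


def get_governor(cpu=0, root=CPUFREQ_ROOT):
    """Read the currently active governor for the given CPU."""
    return (_cpufreq_dir(cpu, root) / "scaling_governor").read_text().strip()


def available_governors(cpu=0, root=CPUFREQ_ROOT):
    """List the governors the kernel offers for the given CPU."""
    text = (_cpufreq_dir(cpu, root) / "scaling_available_governors").read_text()
    return text.split()


def set_governor(name, cpu=0, root=CPUFREQ_ROOT):
    """Switch governor, refusing names the kernel does not advertise."""
    if name not in available_governors(cpu, root):
        raise ValueError(f"governor {name!r} is not available on cpu{cpu}")
    (_cpufreq_dir(cpu, root) / "scaling_governor").write_text(name)
```

On the board, calling set_governor("performance") as root would be the scripted counterpart of what I did with cpufreq-set. The root parameter exists only so the helpers can be pointed at a scratch directory for testing.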
As I indicated at the outset, the system has been installed with an ARM HF prepared Linux distribution. This implies that the distro has been compiled with the appropriate flags to enable hardware Floating Point Unit support, which should help us attain better performance for applications that make use of floating-point arithmetic.
The system readelf tool can be used to interrogate a binary for architecture information. In this case, I’ve installed the OS-supplied HPC Challenge package to give the board its baptism into the world of Technical Computing. In the output below, Tag_ABI_VFP_args: VFP registers is the attribute that marks a hard-float binary.
root@udoo-studio-hfp:/etc/apt# dpkg --get-selections |grep hpcc
hpcc install
root@udoo-studio-hfp:/etc/apt# readelf -A /usr/bin/hpcc
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3-D16
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_VFP_args: VFP registers
Tag_CPU_unaligned_access: v6
Tag_DIV_use: Not allowed
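To check more than one binary, the readelf -A output can be parsed rather than read by eye. This is a minimal sketch, assuming the output format shown above; parse_readelf_attributes and is_hard_float are hypothetical helper names, and the check relies on hard-float binaries carrying Tag_ABI_VFP_args: VFP registers.

```python
import subprocess


def parse_readelf_attributes(text):
    """Turn 'Tag_name: value' lines from `readelf -A` into a dict."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Tag_") and ":" in line:
            key, value = line.split(":", 1)
            attrs[key] = value.strip()
    return attrs


def is_hard_float(attrs):
    """True when FP arguments are passed in VFP registers (hard-float ABI)."""
    return attrs.get("Tag_ABI_VFP_args") == "VFP registers"


def readelf_attributes(path):
    """Run `readelf -A` on a binary and return its parsed attributes."""
    result = subprocess.run(
        ["readelf", "-A", path], capture_output=True, text=True, check=True
    )
    return parse_readelf_attributes(result.stdout)


# A few lines copied from the hpcc output above, used as sample input.
SAMPLE = """\
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_FP_arch: VFPv3-D16
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_VFP_args: VFP registers
"""

attrs = parse_readelf_attributes(SAMPLE)
print(attrs["Tag_FP_arch"], is_hard_float(attrs))
```

On the board itself, is_hard_float(readelf_attributes("/usr/bin/hpcc")) would run the same check against the installed binary.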
Now that we’re done kicking the tires, let’s take it for a drive!
The intent here was not for a Top 500 run. Rather, just to stress the Udoo Quad with a more intensive workload. For this purpose, I wrote a small Qt program to display the CPU temperature. I was curious to understand how the system would heat up given that it’s passively cooled (with a nice heatsink).
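For anyone who wants to watch the temperature without a GUI, a few lines of Python can poll the sensor. This is not the Qt program mentioned above, just a rough sketch. It assumes the sensor is exposed at /sys/class/thermal/thermal_zone0/temp and reports millidegrees Celsius, as the generic Linux thermal framework does; the 3.0.35 kernel on this board may use a different path or unit, so treat both as assumptions and adjust path and divisor accordingly.

```python
import time
from pathlib import Path

# ASSUMPTION: generic Linux thermal sysfs path; may differ on the i.MX6 3.0.35 kernel.
THERMAL_FILE = "/sys/class/thermal/thermal_zone0/temp"


def read_temp_c(path=THERMAL_FILE, divisor=1000.0):
    """Read one temperature sample and convert it to degrees Celsius.

    ASSUMPTION: the file holds millidegrees; pass divisor=1.0 if the
    kernel reports whole degrees instead.
    """
    raw = Path(path).read_text().strip()
    return float(raw) / divisor


def poll_temps(path=THERMAL_FILE, interval=1.0, samples=5, divisor=1000.0):
    """Collect `samples` readings, sleeping `interval` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(read_temp_c(path, divisor))
        if i < samples - 1:
            time.sleep(interval)
    return readings
```

Running poll_temps() in a second terminal during a benchmark gives a simple temperature trace to compare against the load.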
The output from my Linpack run is below:
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 7000
NB : 90 192 110
PMAP : Row-major process mapping
P : 2
Q : 2
PFACT : Right
NBMIN : 4
NDIV : 2
RFACT : Crout
BCAST : 1ringM
DEPTH : 1
SWAP : Mix (threshold = 64)
L1 : transposed form
U : transposed form
EQUIL : yes
ALIGN : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR11C2R4 7000 90 2 2 133.23 1.717e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0033466 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR11C2R4 7000 192 2 2 130.95 1.747e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0034782 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR11C2R4 7000 110 2 2 137.24 1.667e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0034961 ...... PASSED
================================================================================
Finished 3 tests with the following results:
3 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------
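The Gflops column can be cross-checked from N and the wall time. A minimal sketch, assuming the usual HPL operation count of (2/3)N^3 + (3/2)N^2 floating-point operations for solving an order-N system; the function name hpl_gflops is made up for this illustration.

```python
def hpl_gflops(n, seconds):
    """Rate in Gflop/s for an order-n solve.

    Assumes the HPL operation count of (2/3)*n**3 + (3/2)*n**2
    floating-point operations.
    """
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# (NB, wall time in seconds) for the three N=7000 runs reported above.
runs = [(90, 133.23), (192, 130.95), (110, 137.24)]
for nb, seconds in runs:
    print(f"NB={nb:3d}  {hpl_gflops(7000, seconds):.3f} Gflops")
```

This reproduces the reported 1.717, 1.747 and 1.667 Gflops, with NB=192 giving the best rate of the three block sizes tried.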
During the runs of HPCC (in particular the HPLinpack portion), I observed the CPU temperature climb to ~60 degrees Celsius.
I produced a short video showing a run of HPCC along with the Qt CPU temperature app that I created.
That wraps up a successful first test drive. What’s next? OpenCL seems like the next logical step.