This is a crosspost from Blogs on Technical Computing Goulash. See the original post here.

Standing up an IBM Spectrum LSF Community Edition cluster on Arm v8

So, you’ve got yourself a shiny new (or maybe not) system based upon a 64-bit Arm (Arm v8) processor that you want to put through its paces. For me, this happens to be a MACCHIATObin board powered by a Marvell ARMADA 8040 with Arm Cortex-A72 cores, installed with Ubuntu 16.04.3 LTS. You can read about my shenanigans running HPL on this system with a passively cooled CPU and running up against some overheating conditions - much like the head gasket failure in my car this past summer - but I digress!

While I wait for the Noctua cooling fan to arrive, it’s given me the opportunity to revisit installing a job scheduler on the system. I’ll be using this to manage access to the system resources, which will be necessary to arbitrate the various benchmark jobs that I expect to be running over time. A number of workload schedulers exist today, from open source to closed source proprietary. I’ve selected IBM Spectrum LSF Community Edition as it’s free to download and use (with restrictions) and supports Linux on Arm v8. Did you know that Spectrum LSF (previously known as Platform LSF) has been around for 25 years? That’s quite a pedigree, and because it’s shipped as binaries, I won’t have to muck about compiling it - which is an added bonus. IBM Spectrum LSF Community Edition is free to download from IBM :)

Below, I walk through the steps to install IBM Spectrum LSF Community Edition on my Arm v8 based system. The steps should be the same for the other platforms supported by IBM Spectrum LSF Community Edition, including Linux on POWER LE and Linux on x86-64. The procedure assumes that you have a supported OS installed and have configured networking and the necessary user accounts. This is not meant to be an exhaustive tutorial on IBM Spectrum LSF Community Edition. If you’re looking for help, check out the forum here.

1. Download and extract

We begin by downloading the armv8 IBM Spectrum LSF Community Edition package and quick start guide. Expanding the gzipped tarball yields two compressed tarballs: the lsfinstall installer and the “armv8” binaries. Next, we extract the lsfinstall tarball, which contains the installer for IBM Spectrum LSF Community Edition.

root@flotta:/tmp# ls
lsfce10.1-armv8.tar.gz  lsfce10.1_quick_start.pdf
root@flotta:/tmp# gunzip lsfce10.1-armv8.tar.gz 
root@flotta:/tmp# tar -xvf lsfce10.1-armv8.tar 
lsfce10.1-armv8/
lsfce10.1-armv8/lsf/
lsfce10.1-armv8/lsf/lsf10.1_lnx312-lib217-armv8.tar.Z
lsfce10.1-armv8/lsf/lsf10.1_no_jre_lsfinstall.tar.Z

root@flotta:/tmp/lsfce10.1-armv8/lsf# zcat lsf10.1_no_jre_lsfinstall.tar.Z | tar xvf -
lsf10.1_lsfinstall/
lsf10.1_lsfinstall/instlib/
lsf10.1_lsfinstall/instlib/lsflib.sh
lsf10.1_lsfinstall/instlib/lsferror.tbl
lsf10.1_lsfinstall/instlib/lsfprechkfuncs.sh
lsf10.1_lsfinstall/instlib/lsflicensefuncs.sh
lsf10.1_lsfinstall/instlib/lsfunpackfuncs.sh
lsf10.1_lsfinstall/instlib/lsfconfigfuncs.sh
lsf10.1_lsfinstall/instlib/resconnectorconfigfuncs.sh
lsf10.1_lsfinstall/instlib/lsf_getting_started.tmpl
....
....

2. Configure the installer

After extracting the lsfinstall tarball, you’ll find the installation configuration file, install.config. This file controls the installation location, the LSF administrator account, the cluster name, the master host (where the scheduler daemons run), and the location of the distribution packages, among other things. I’ve run a diff against the pristine copy to show my settings (my values appear on the "-" lines). In brief, I’ve configured the following:

root@flotta:/tmp/lsfce10.1-armv8/lsf/lsf10.1_lsfinstall# diff -u4 install.config install.config.org 
--- install.config 2017-08-30 20:17:13.148583971 -0400
+++ install.config.org 2017-08-30 20:15:30.283904454 -0400
@@ -40,9 +40,8 @@
 #     (During an upgrade, specify the existing value.)
 #**********************************************************
 # -----------------
 # LSF_TOP="/usr/share/lsf"
-LSF_TOP="/raktar/LSFCE"
 # -----------------
 # Full path to the top-level installation directory {REQUIRED}
 #
 # The path to LSF_TOP must be shared and accessible to all hosts
@@ -51,9 +50,8 @@
 # all host types (approximately 300 MB per host type).
 #
 # -----------------
 # LSF_ADMINS="lsfadmin user1 user2"
-LSF_ADMINS="gsamu"
 # -----------------
 # List of LSF administrators {REQUIRED}
 #
 # The first user account name in the list is the primary LSF
@@ -69,9 +67,8 @@
 # Secondary LSF administrators are optional.
 #
 # -----------------
 # LSF_CLUSTER_NAME="cluster1"
-LSF_CLUSTER_NAME="Klaszter"
 # -----------------
 # Name of the LSF cluster {REQUIRED}
 #
 # It must be 39 characters or less, and cannot contain any
@@ -85,9 +82,8 @@
 #**********************************************************
 #
 # -----------------
 # LSF_MASTER_LIST="hostm hosta hostc"
-LSF_MASTER_LIST="flotta"
 # -----------------
 # List of LSF server hosts to be master or master candidate in the
 # cluster {REQUIRED when you install for the first time or during
 # upgrade if the parameter does not already exist.}
@@ -96,9 +92,8 @@
 # cluster. The first host listed is the LSF master host.
 #
 # -----------------
 # LSF_TARDIR="/usr/share/lsf_distrib/"
-LSF_TARDIR="/tmp/lsfce10.1-armv8/lsf"
 # -----------------
 # Full path to the directory containing the LSF distribution tar files.
 #
 # Default: Parent directory of the current working directory.
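
For clarity, the effective customizations in my install.config boil down to the following (reconstructed from the diff above):

# Top-level (shared) installation directory
LSF_TOP="/raktar/LSFCE"
# LSF administrator account(s); the first listed is the primary administrator
LSF_ADMINS="gsamu"
# Name of the cluster
LSF_CLUSTER_NAME="Klaszter"
# Master host list; the first host listed is the LSF master
LSF_MASTER_LIST="flotta"
# Directory containing the LSF distribution tar files
LSF_TARDIR="/tmp/lsfce10.1-armv8/lsf"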

3. Install IBM Spectrum LSF Community Edition

With the installation configuration complete, we can now invoke the installer script. Note that I had to install a JRE on my system (in my case, apt-get install default-jre), as it’s a requirement for IBM Spectrum LSF Community Edition.
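
On Ubuntu, that one-time prerequisite is simply:

root@flotta:~# apt-get install default-jre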

root@flotta:/tmp/lsfce10.1-armv8/lsf/lsf10.1_lsfinstall# ./lsfinstall -f ./install.config

Logging installation sequence in /tmp/lsfce10.1-armv8/lsf/lsf10.1_lsfinstall/Install.log

International License Agreement for Non-Warranted Programs

Part 1 - General Terms

BY DOWNLOADING, INSTALLING, COPYING, ACCESSING, CLICKING ON 
AN "ACCEPT" BUTTON, OR OTHERWISE USING THE PROGRAM, 
LICENSEE AGREES TO THE TERMS OF THIS AGREEMENT. IF YOU ARE 
ACCEPTING THESE TERMS ON BEHALF OF LICENSEE, YOU REPRESENT 
AND WARRANT THAT YOU HAVE FULL AUTHORITY TO BIND LICENSEE 
TO THESE TERMS. IF YOU DO NOT AGREE TO THESE TERMS,

* DO NOT DOWNLOAD, INSTALL, COPY, ACCESS, CLICK ON AN 
"ACCEPT" BUTTON, OR USE THE PROGRAM; AND

* PROMPTLY RETURN THE UNUSED MEDIA AND DOCUMENTATION TO THE 

Press Enter to continue viewing the license agreement, or 
enter "1" to accept the agreement, "2" to decline it, "3" 
to print it, "4" to read non-IBM terms, or "99" to go back 
to the previous screen.
1
LSF pre-installation check ...

Checking the LSF TOP directory /raktar/LSFCE ...
... Done checking the LSF TOP directory /raktar/LSFCE ...
You are installing IBM Spectrum LSF - 10.1 Community Edition.

Checking LSF Administrators ...
   LSF administrator(s):       "gsamu"
   Primary LSF administrator:  "gsamu"
Checking the configuration template  ...
    Done checking configuration template ...
    Done checking ENABLE_STREAM ...

[Wed Aug 30 20:36:46 EDT 2017:lsfprechk:WARN_2007]
    Hosts defined in LSF_MASTER_LIST must be LSF server hosts. The
    following hosts will be added to server hosts automatically: flotta.

Checking the patch history directory  ...
Creating /raktar/LSFCE/patch ...
... Done checking the patch history directory /raktar/LSFCE/patch ...

Checking the patch backup directory ...
... Done checking the patch backup directory /raktar/LSFCE/patch/backup ...


Searching LSF 10.1 distribution tar files in /tmp/lsfce10.1-armv8/lsf Please wait ...

  1) linux3.12-glibc2.17-armv8

Press 1 or Enter to install this host type: 1

You have chosen the following tar file(s):
    lsf10.1_lnx312-lib217-armv8

Checking selected tar file(s) ...
... Done checking selected tar file(s).


Pre-installation check report saved as text file: 
/tmp/lsfce10.1-armv8/lsf/lsf10.1_lsfinstall/prechk.rpt.

... Done LSF pre-installation check.

Installing LSF binary files " lsf10.1_lnx312-lib217-armv8"...
Creating /raktar/LSFCE/10.1 ...

Copying lsfinstall files to /raktar/LSFCE/10.1/install
Creating /raktar/LSFCE/10.1/install ...

....

....

lsfinstall is done.

To complete your LSF installation and get your 
cluster "Klaszter" up and running, follow the steps in 
"/tmp/lsfce10.1-armv8/lsf/lsf10.1_lsfinstall/lsf_getting_started.html".

After setting up your LSF server hosts and verifying 
your cluster "Klaszter" is running correctly, 
see "/raktar/LSFCE/10.1/lsf_quick_admin.html" 
to learn more about your new LSF cluster.

After installation, remember to bring your cluster up to date 
by applying the latest updates and bug fixes. 

4. Siesta time!

Wow, that was easy. IBM Spectrum LSF Community Edition is now installed. Pat yourself on the back and grab your favourite BEvERage as a reward.

5. Fire it up!

Now that IBM Spectrum LSF Community Edition is installed, we can start it up so that it’s ready to accept and manage work! As the root user, we source the environment for IBM Spectrum LSF Community Edition, which sets PATH and the other needed environment variables. Next, we issue three commands to start up the IBM Spectrum LSF Community Edition daemons.

root@flotta:/raktar/LSFCE/conf# . ./profile.lsf

root@flotta:/raktar/LSFCE/conf# lsadmin limstartup
Starting up LIM on <flotta.localdomain> ...... done
root@flotta:/raktar/LSFCE/conf# lsadmin resstartup
Starting up RES on <flotta.localdomain> ...... done
root@flotta:/raktar/LSFCE/conf# badmin hstartup
Starting up slave batch daemon on <flotta.localdomain> ...... done

With IBM Spectrum LSF Community Edition now running, we should be able to query the cluster for status. Note that we’ve set up a single-node cluster; IBM Spectrum LSF Community Edition allows you to build clusters of up to 10 nodes. We run a series of commands to check that the cluster is alive and well.

root@flotta:/raktar/LSFCE/conf# lsid
IBM Spectrum LSF Community Edition 10.1.0.0, Jun 15 2016
Copyright IBM Corp. 1992, 2016. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

My cluster name is Klaszter
My master name is flotta.localdomain


root@flotta:/raktar/LSFCE/conf# lsload
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
flotta.localdom     ok   0.0   0.3   0.4  13%   0.0   1     0 1444M    0M  3.3


root@flotta:/raktar/LSFCE/conf# bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV 
flotta.localdomain ok              -      4      0      0      0      0      0

The above commands show that the IBM Spectrum LSF Community Edition cluster is up and running. The batch system is now ready to accept workload!
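
As a quick smoke test - an illustrative example rather than output from my session - a non-root user (gsamu, say) can submit a trivial job and check on it with bjobs; the %J in the output file name is expanded by LSF to the job ID:

gsamu@flotta:/$ bsub -o /tmp/sleep.%J.out sleep 60
gsamu@flotta:/$ bjobs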

6. Now what?

We are ready to rock ’n’ roll! To christen this environment, I decided to run some MPI tests. Coincidentally, MPI is also celebrating its silver anniversary this year.

And what better MPI tests to run on my Arm system than the Intel MPI Benchmarks :) Of course, the Intel MPI Benchmarks have to be compiled. To keep things simple, I only compiled the MPI1 benchmark set. This required me to change the CC designation in the make_ict makefile from mpiicc to mpicc (as I am obviously not using the Intel compilers).
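
Schematically, the change in make_ict is just a matter of swapping the compiler wrapper (the exact whitespace and layout of the file may differ):

# was: CC = mpiicc
CC = mpicc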

root@flotta:/raktar/imb/imb/src# gmake -f make_ict IMB-MPI1
sleep 1; touch exe_mpi1 *.c; rm -rf exe_io exe_ext  exe_nbc exe_rma
gmake -f Makefile.base MPI1 CPP=MPI1
gmake[1]: Entering directory '/raktar/imb/imb/src'
mpicc    -DMPI1  -c IMB.c
mpicc    -DMPI1  -c IMB_utils.c
mpicc    -DMPI1  -c IMB_declare.c
mpicc    -DMPI1  -c IMB_init.c
mpicc    -DMPI1  -c IMB_mem_manager.c
mpicc    -DMPI1  -c IMB_parse_name_mpi1.c
mpicc    -DMPI1  -c IMB_benchlist.c
mpicc    -DMPI1  -c IMB_strgs.c
mpicc    -DMPI1  -c IMB_err_handler.c
mpicc    -DMPI1  -c IMB_g_info.c
mpicc    -DMPI1  -c IMB_warm_up.c
mpicc    -DMPI1  -c IMB_output.c
mpicc    -DMPI1  -c IMB_pingpong.c
mpicc    -DMPI1  -c IMB_pingping.c
mpicc    -DMPI1  -c IMB_allreduce.c

....

....
mpicc    -o IMB-MPI1 IMB.o IMB_utils.o IMB_declare.o  IMB_init.o IMB_mem_manager.o IMB_parse_name_mpi1.o  IMB_benchlist.o IMB_strgs.o IMB_err_handler.o IMB_g_info.o  IMB_warm_up.o IMB_output.o IMB_pingpong.o IMB_pingping.o IMB_allreduce.o IMB_reduce_scatter.o IMB_reduce.o IMB_exchange.o IMB_bcast.o IMB_barrier.o IMB_allgather.o IMB_allgatherv.o IMB_gather.o IMB_gatherv.o IMB_scatter.o IMB_scatterv.o IMB_alltoall.o IMB_alltoallv.o IMB_sendrecv.o IMB_init_transfer.o  IMB_chk_diff.o IMB_cpu_exploit.o IMB_bandwidth.o   
gmake[1]: Leaving directory '/raktar/imb/imb/src'

So we have our MPI benchmark compiled and our workload scheduler up and running. Let’s get busy! As the user gsamu, we source the environment for IBM Spectrum LSF Community Edition and submit a 4-way instance of the Intel MPI Benchmarks MPI1 test suite for execution on our cluster.

gsamu@flotta:/$ . /raktar/LSFCE/conf/profile.lsf 
gsamu@flotta:/$ lsid
IBM Spectrum LSF Community Edition 10.1.0.0, Jun 15 2016
Copyright IBM Corp. 1992, 2016. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

My cluster name is Klaszter
My master name is flotta.localdomain

Drumroll please… The MPI benchmark runs through successfully. Note that we’ve submitted the job to IBM Spectrum LSF Community Edition interactively, with the -I parameter. Jobs can also be run non-interactively, and users can peek at their standard output during runtime using the bpeek command.
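
For reference, the non-interactive flavour would look something like this (an illustrative sketch; the job ID to pass to bpeek is whatever bsub reports at submission time):

gsamu@flotta:/$ bsub -n 4 -o imb.%J.out mpirun -np 4 /raktar/imb/imb/src/IMB-MPI1
gsamu@flotta:/$ bpeek <job_id>

And now for the interactive run itself: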

gsamu@flotta:/$ bsub -I -q interactive -n 4 mpirun -np 4 /raktar/imb/imb/src/IMB-MPI1
Job <1396> is submitted to queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on flotta.localdomain>>
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2017 update 2, MPI-1 part    
#------------------------------------------------------------
# Date                  : Thu Aug 30 21:46:28 2017
# Machine               : aarch64
# System                : Linux
# Release               : 4.4.8-armada-17.02.2-g4126e30
# Version               : #1 SMP PREEMPT Sat May 27 18:52:53 CDT 2017
# MPI Version           : 3.0
# MPI Thread Environment: 


# Calling sequence was: 

# /raktar/imb/imb/src/IMB-MPI1

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE 
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM  
#
#

# List of Benchmarks to run:

# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier

#---------------------------------------------------
# Benchmarking PingPong 
# #processes = 2 
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         1.03         0.00
            1         1000         1.17         0.85
            2         1000         1.18         1.69
            4         1000         1.19         3.36
            8         1000         0.70        11.43
           16         1000         0.67        23.89
           32         1000         0.68        47.37
           64         1000         0.70        90.84
          128         1000         0.72       176.78
          256         1000         0.80       319.61
          512         1000         1.12       455.51
         1024         1000         1.89       540.52
         2048         1000         2.20       932.37
         4096         1000         3.93      1042.37
         8192         1000         5.93      1380.77
        16384         1000         8.76      1869.69
        32768         1000        14.92      2195.65
        65536          640        24.37      2689.26
       131072          320        41.37      3168.39
       262144          160        81.48      3217.12
       524288           80       193.81      2705.22
      1048576           40       443.74      2363.05
      2097152           20       860.30      2437.71
      4194304           10      1692.45      2478.24

#---------------------------------------------------
# Benchmarking PingPing 
# #processes = 2 
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         0.81         0.00
            1         1000         0.85         1.17
            2         1000         0.85         2.34
            4         1000         0.85         4.68
            8         1000         0.86         9.34
           16         1000         0.89        17.96
           32         1000         0.89        35.99
           64         1000         0.92        69.71
          128         1000         0.97       132.63
          256         1000         1.01       254.50
          512         1000         1.31       391.16
         1024         1000         2.45       418.61
         2048         1000         3.06       670.15
         4096         1000         5.27       776.63
         8192         1000         8.09      1012.73
        16384         1000        11.56      1417.31
        32768         1000        18.10      1810.29
        65536          640        31.07      2108.97
       131072          320        67.01      1955.95
       262144          160       134.24      1952.73
       524288           80       358.88      1460.91
      1048576           40       895.85      1170.49
      2097152           20      1792.75      1169.79
      4194304           10      3373.79      1243.20

....

....

# All processes entering MPI_Finalize

There you have it. If you’re after more information about IBM Spectrum LSF, visit here.