1.4.3.4. OpenMPI with CVMFS and AVX-512 CPU instructions

This is similar to the previous example, but targets AVX-512 instructions instead. Refer to the previous example for detailed explanations; below we only show the files that change.
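If you want to check interactively whether a given machine supports the required instructions, one quick way on Linux (a minimal sketch; the flag names reported by the kernel match the ClassAd attributes used below) is:

# list the AVX-512 feature flags advertised by the CPU
grep -o 'avx512[a-z]*' /proc/cpuinfo | sort -u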

1.4.3.4.1. Deploying MPI applications using the provided wrapper

Create a file mpi.ini:

universe                = parallel
executable              = /opt/simonsobservatory/cbatch_openmpi
arguments               = env.sh mpi.sh
machine_count           = 2
should_transfer_files   = yes
when_to_transfer_output = ON_EXIT
transfer_input_files    = env.sh,mpi.sh
request_cpus            = 16
request_memory          = 32999
request_disk            = 32G

# constrain the CPU to match the environment used in env.sh
# Requirements          = (Arch == "INTEL") && (Microarch == "x86_64-v4")
# currently the only attributes exposed at Blackett are
Requirements            = has_avx512f && has_avx512dq

log                     = mpi.log
output                  = mpi-$(Node).out
error                   = mpi-$(Node).err
stream_error            = True
stream_output           = True

queue
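Before submitting, you can optionally check how many machines in the pool advertise these attributes (assuming has_avx512f and has_avx512dq are exposed as machine ClassAd attributes, as in the Requirements above):

# list slots matching the AVX-512 requirement
condor_status -compact -constraint 'has_avx512f && has_avx512dq'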

In the first input file, env.sh,

#!/bin/bash -l

# helpers ##############################################################

COLUMNS=72

# print a full-width line of = characters
print_double_line() {
	eval printf %.0s= '{1..'"${COLUMNS}"\}
	echo
}

# print a full-width line of - characters
print_line() {
	eval printf %.0s- '{1..'"${COLUMNS}"\}
	echo
}

########################################################################

CONDA_PREFIX=/cvmfs/northgrid.gridpp.ac.uk/simonsobservatory/pmpm/so-pmpm-py310-mkl-x86-64-v4-openmpi-latest

print_double_line
echo "$(date) activate environment..."
source "$CONDA_PREFIX/bin/activate"
print_line
echo "Python is available at:"
which python
echo "mpirun is available at:"
which mpirun
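If you want to sanity-check the environment interactively before submitting (a minimal sketch, assuming you are on a machine with CVMFS mounted and a matching CPU), source env.sh and inspect the toolchain:

# verify that the MPI stack and Python come from the CVMFS environment
source env.sh
mpirun --version
python -c 'import numpy; numpy.show_config()'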

Then in mpi.sh,

#!/usr/bin/env bash

# helpers ##############################################################

COLUMNS=72

# print a full-width line of = characters
print_double_line() {
	eval printf %.0s= '{1..'"${COLUMNS}"\}
	echo
}

# print a full-width line of - characters
print_line() {
	eval printf %.0s- '{1..'"${COLUMNS}"\}
	echo
}

########################################################################

print_double_line
set_OMPI_HOST_one_slot_per_condor_proc
echo "Running mpirun with host configuration: $OMPI_HOST" >&2

print_double_line
echo 'Running TOAST tests in /tmp...'
cd /tmp
mpirun -v -host "$OMPI_HOST" python -c 'import toast.tests; toast.tests.run()'
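To run your own application instead of the TOAST test suite, replace the last line with your own mpirun invocation, for example (your_script.py is a hypothetical placeholder; remember to add it to transfer_input_files in mpi.ini):

mpirun -v -host "$OMPI_HOST" python your_script.py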

Lastly, submit the job as usual by

condor_submit mpi.ini
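While the job is idle or running, you can check its place in the queue with the standard HTCondor command:

condor_q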

After the job has finished, you can see what happened by reading the log, output, and error files specified in the ClassAd.

See Monitor your jobs for how to monitor the status of your job. For advanced use, run this command instead,

condor_submit mpi.ini; tail -F mpi.log mpi-0.out mpi-0.err mpi-1.out mpi-1.err

and see Streaming stdout & stderr with tail for an explanation of what it does.