
Modules required

  1. lang/Perl/5.28.1-GCCcore-6.3.0
  2. data/netCDF-WRF/C-4.6.2_CXX-4.3.0_F-4.4.2_p-1.9.0-intel-2018.5.274
  3. toolchain/intel/2018.5.274
  4. devel/CMake/3.12.1-intel-2018.5.274
  5. data/XML-LibXML/2.0206-GCCcore-6.3.0
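
For reference, loading everything in one go looks like the following (a minimal sketch; a module purge first avoids leftover conflicting modules):

Code Block
themeMidnight
titleLoading the required modules (sketch)
module purge
module load lang/Perl/5.28.1-GCCcore-6.3.0
module load data/netCDF-WRF/C-4.6.2_CXX-4.3.0_F-4.4.2_p-1.9.0-intel-2018.5.274
module load toolchain/intel/2018.5.274
module load devel/CMake/3.12.1-intel-2018.5.274
module load data/XML-LibXML/2.0206-GCCcore-6.3.0
module list   # confirm everything loaded cleanly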

Preparation

The required modules take care of several setup steps that are unique to certain climate models such as CESM, which expect libraries to be placed in specific locations.  For example, the netCDF-WRF module handles the fact that CESM 1.x expects netCDF and its different variants to all live within the same directory tree.  Since our modules do not normally arrange things this way, and netCDF these days ships as distinct libraries, a module was created that does this for you.  As the module name suggests, it also works if you wish to build your own copy of WRF instead of using the provided WRF module.


As the dependencies are all available on Mana already, you just need to download and unpack the CESM source code into your home directory on Mana, following the CESM download instructions at https://www.cesm.ucar.edu/models/current.html .
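
For example, if you obtained a release tarball, unpacking it into your home directory looks roughly like this (the filename below is a placeholder; follow whichever acquisition method the CESM download page describes for your release, e.g. an SVN checkout for CESM 1.x):

Code Block
themeMidnight
titleUnpacking the CESM source (sketch)
cd ~
# placeholder filename - substitute the release you actually downloaded
tar -xzf cesm1_x_y.tar.gz
cd cesm1_x_y/scripts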


Configuration Files

Once the source is acquired, config_machines.xml needs to be modified, and a mkbatch and an env_mach_specific file need to be created for each new machine entry.   These files are added and modified in "scripts/ccsm_utils/Machines/".

Below you can find the changes and contents of the files that need to be created.
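
In other words, for the two machine entries described below, the Machines directory ends up containing something like the following (the env_mach_specific.* names follow the usual CESM 1.x per-machine convention and are shown here as an assumption):

Code Block
themeMidnight
titleFiles touched in scripts/ccsm_utils/Machines/ (sketch)
cd scripts/ccsm_utils/Machines/
# edited:  config_machines.xml  (add the two <machine> entries shown below)
# created: one batch-script template per machine entry
#          mkbatch.uhhpc_qdr
#          mkbatch.uhhpc_hdr
# created: one per-machine environment file per machine entry (assumed naming)
#          env_mach_specific.uhhpc_qdr
#          env_mach_specific.uhhpc_hdr
ls config_machines.xml mkbatch.uhhpc_* env_mach_specific.uhhpc_*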

config_machines.xml

 

Info

As Mana has two different Infiniband networks (QDR and HDR), two different machine entries are created.  This also means that the subsequent files come in duplicate, differing only in the settings that select the right network to use.
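
If you want to check which nodes carry which fabric, the Slurm node features used as constraints in the batch scripts below (ib_qdr and ib_hdr) can be listed, for example (a sketch assuming those feature names are defined on Mana):

Code Block
themeMidnight
titleListing Infiniband node features (sketch)
sinfo -o "%30N %10c %40f" | grep -E "ib_qdr|ib_hdr"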



Code Block
languagexml
themeMidnight
titleconfig_machines.xml
collapsetrue
<machine MACH="uhhpc_qdr">
<DESC>User Defined Machine</DESC> <!-- can be anything -->
<OS>LINUX</OS> <!-- LINUX,Darwin,CNL,AIX,BGL,BGP -->
<COMPILERS>intel</COMPILERS> <!-- intel,ibm,pgi,pathscale,gnu,cray,lahey -->
<MPILIBS>impi</MPILIBS> <!-- openmpi, mpich, ibm, mpi-serial -->
<CESMSCRATCHROOT>~/lus_scratch/cesm/case</CESMSCRATCHROOT> <!-- complete path to the 'scratch' directory -->
<RUNDIR>$CASEROOT/run</RUNDIR> <!-- complete path to the run directory -->
<EXEROOT>$CASEROOT/bld</EXEROOT> <!-- complete path to the build directory -->
<DIN_LOC_ROOT>~/cesm/input</DIN_LOC_ROOT> <!-- complete path to the inputdata directory -->
<DIN_LOC_ROOT_CLMFORC>USERDEFINED_optional_build</DIN_LOC_ROOT_CLMFORC> <!-- path to the optional forcing data for CLM (for CRUNCEP forcing) -->
<DOUT_S>TRUE</DOUT_S> <!-- logical for short term archiving -->
<DOUT_S_ROOT>$CASEROOT/output</DOUT_S_ROOT> <!-- complete path to a short term archiving directory -->
<DOUT_L_MSROOT>USERDEFINED_optional_run</DOUT_L_MSROOT> <!-- complete path to a long term archiving directory -->
<CCSM_BASELINE>USERDEFINED_optional_run</CCSM_BASELINE> <!-- where the cesm testing scripts write and read baseline results -->
<CCSM_CPRNC>USERDEFINED_optional_test</CCSM_CPRNC> <!-- path to the cprnc tool used to compare netcdf history files in testing -->
<BATCHQUERY>squeue -a</BATCHQUERY>
<BATCHSUBMIT>sbatch</BATCHSUBMIT>
<SUPPORTED_BY>uh</SUPPORTED_BY>
<GMAKE_J>8</GMAKE_J>
<MAX_TASKS_PER_NODE>19</MAX_TASKS_PER_NODE>
</machine>


<machine MACH="uhhpc_hdr"> 
<DESC>User Defined Machine</DESC> <!-- can be anything --> 
<OS>LINUX</OS> <!-- LINUX,Darwin,CNL,AIX,BGL,BGP --> 
<COMPILERS>intel</COMPILERS> <!-- intel,ibm,pgi,pathscale,gnu,cray,lahey --> 
<MPILIBS>impi</MPILIBS> <!-- openmpi, mpich, ibm, mpi-serial -->
<CESMSCRATCHROOT>~/lus_scratch/cesm/case</CESMSCRATCHROOT> <!-- complete path to the 'scratch' directory -->
<RUNDIR>$CASEROOT/run</RUNDIR> <!-- complete path to the run directory --> 
<EXEROOT>$CASEROOT/bld</EXEROOT> <!-- complete path to the build directory --> 
<DIN_LOC_ROOT>~/cesm/input</DIN_LOC_ROOT> <!-- complete path to the inputdata directory --> 
<DIN_LOC_ROOT_CLMFORC>USERDEFINED_optional_build</DIN_LOC_ROOT_CLMFORC> <!-- path to the optional forcing data for CLM (for CRUNCEP forcing) -->
<DOUT_S>TRUE</DOUT_S> <!-- logical for short term archiving --> 
<DOUT_S_ROOT>$CASEROOT/output</DOUT_S_ROOT> <!-- complete path to a short term archiving directory -->
<DOUT_L_MSROOT>USERDEFINED_optional_run</DOUT_L_MSROOT> <!-- complete path to a long term archiving directory -->
<CCSM_BASELINE>USERDEFINED_optional_run</CCSM_BASELINE> <!-- where the cesm testing scripts write and read baseline results -->
<CCSM_CPRNC>USERDEFINED_optional_test</CCSM_CPRNC> <!-- path to the cprnc tool used to compare netcdf history files in testing -->
<BATCHQUERY>squeue -a</BATCHQUERY>
<BATCHSUBMIT>sbatch</BATCHSUBMIT>
<SUPPORTED_BY>uh</SUPPORTED_BY>
<GMAKE_J>8</GMAKE_J>
<MAX_TASKS_PER_NODE>39</MAX_TASKS_PER_NODE> 
</machine>
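
With these two entries in place, a case can target either fabric simply by machine name.  Assuming the standard CESM 1.x create_newcase interface, creating a case on the QDR network would look roughly like this (the compset and resolution are illustrative only):

Code Block
themeMidnight
titleCreating a case against one of the new machine entries (sketch)
cd scripts
./create_newcase -case ~/lus_scratch/cesm/case/my_test_case \
                 -res f19_g16 \
                 -compset B_1850 \
                 -mach uhhpc_qdr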

...

Code Block
themeMidnight
titlemkbatch.uhhpc_qdr
collapsetrue
#! /bin/csh -f

source /etc/profile.d/lmod.csh
#################################################################################
if ($PHASE == set_batch) then
#################################################################################

source ./Tools/ccsm_getenv || exit -1

module load lang/Perl/5.28.1-GCCcore-6.3.0
set ntasks = `${CASEROOT}/Tools/taskmaker.pl -sumonly`
set maxthrds = `${CASEROOT}/Tools/taskmaker.pl -maxthrds`
module purge
@ nodes = $ntasks / ${MAX_TASKS_PER_NODE}
if ( $ntasks % ${MAX_TASKS_PER_NODE} > 0) then
@ nodes = $nodes + 1
@ ntasks = $nodes * ${MAX_TASKS_PER_NODE}
endif
@ taskpernode = ${MAX_TASKS_PER_NODE} / ${maxthrds}
set qname = batch
set tlimit = "3-00:00:00"

if ($?TESTMODE) then
set file = $CASEROOT/${CASE}.test
else
set file = $CASEROOT/${CASE}.run
endif

cat >! $file << EOF1
#!/bin/csh
#SBATCH --job-name=${CASE}
#SBATCH --constraint="ib_qdr"
#SBATCH --distribution="*:*:*"
#SBATCH --partition=exclusive
#SBATCH --time=$tlimit
#SBATCH --ntasks=$ntasks
#SBATCH --cpus-per-task=$maxthrds
#SBATCH --output=${CASE}.%A.out


# Configure the Intel MPI parameters
setenv I_MPI_FABRICS "shm:ofi"
setenv I_MPI_PMI_LIBRARY "/lib64/libpmi.so"
# ### FOR QDR NETWORK #####
setenv FI_PROVIDER "psm"
setenv FI_PSM_TAGGED_RMA 0
setenv FI_PSM_AM_MSG 1
setenv FI_PSM_UUID \`uuidgen\`
# # ###### ######## ###### ##
source /etc/profile.d/lmod.csh
module purge

EOF1

#################################################################################
else if ($PHASE == set_exe) then
#################################################################################
module load lang/Perl/5.28.1-GCCcore-6.3.0
set maxthrds = `${CASEROOT}/Tools/taskmaker.pl -maxthrds`
set maxtasks = `${CASEROOT}/Tools/taskmaker.pl -sumtasks`
module purge


cat >> ${CASEROOT}/${CASE}.run << EOF1
# -------------------------------------------------------------------------
# Run the model
# -------------------------------------------------------------------------

sleep 25
cd \$RUNDIR
echo "\`date\` -- CSM EXECUTION BEGINS HERE"
setenv OMP_NUM_THREADS ${maxthrds}
module load data/netCDF-Fortran/4.4.5-intel-2018.5.274
module load data/netCDF/4.6.2-intel-2018.5.274
module load toolchain/intel/2018.5.274
srun --ntasks=${maxtasks} --cpu_bind=sockets --cpu_bind=verbose --kill-on-bad-exit \$EXEROOT/cesm.exe >&! cesm.log.\$LID
wait
echo "\`date\` -- CSM EXECUTION HAS FINISHED"

EOF1


#################################################################################
else if ($PHASE == set_larch) then
#################################################################################

#This is a place holder for a long-term archiving script

#################################################################################
else
#################################################################################

echo " PHASE setting of $PHASE is not an accepted value"
echo " accepted values are set_batch, set_exe and set_larch"
exit 1

#################################################################################
endif
#################################################################################

...

Code Block
themeMidnight
titlemkbatch.uhhpc_hdr
collapsetrue
#! /bin/csh -f

source /etc/profile.d/lmod.csh
#################################################################################
if ($PHASE == set_batch) then
#################################################################################

source ./Tools/ccsm_getenv || exit -1
module load lang/Perl/5.28.1-GCCcore-6.3.0
set ntasks = `${CASEROOT}/Tools/taskmaker.pl -sumonly`
set maxthrds = `${CASEROOT}/Tools/taskmaker.pl -maxthrds`
set maxtasks = `${CASEROOT}/Tools/taskmaker.pl -sumtasks`
module purge
@ nodes = $ntasks / ${MAX_TASKS_PER_NODE}
if ( $ntasks % ${MAX_TASKS_PER_NODE} > 0) then
@ nodes = $nodes + 1
@ ntasks = $nodes * ${MAX_TASKS_PER_NODE}
endif
@ taskpernode = ${MAX_TASKS_PER_NODE} / ${maxthrds}
set qname = batch
set tlimit = "3-00:00:00"

if ($?TESTMODE) then
set file = $CASEROOT/${CASE}.test
else
set file = $CASEROOT/${CASE}.run
endif

cat >! $file << EOF1
#!/bin/csh
#SBATCH --job-name=${CASE}
#SBATCH --constraint="ib_hdr"
#SBATCH --distribution="*:*:*"
#SBATCH --partition=exclusive
#SBATCH --time=$tlimit
#SBATCH --ntasks=$ntasks
#SBATCH --cpus-per-task=$maxthrds
#SBATCH --output=${CASE}.%A.out


###### ######## ###### ##
# Libfabric method
###### ######## ###### ##
# Configure the Intel MPI parameters
setenv I_MPI_FABRICS "shm:ofi"
setenv I_MPI_PMI_LIBRARY "/lib64/libpmi.so"
setenv I_MPI_HYDRA_TOPOLIB "ipl" # may be required if a newer libfabric and Intel MPI are used
### FOR HDR NETWORK #####
# https://ofiwg.github.io/libfabric/master/man/
# https://ofiwg.github.io/libfabric/v1.9.1/man/
setenv FI_PROVIDER "shm,verbs;ofi_rxm"
setenv FI_MR_CACHE_MONITOR "disabled" # a bug currently exists that can cause a segfault
setenv FI_VERBS_MR_CACHE_ENABLE "0" # a bug currently exists that can cause a segfault
setenv FI_VERBS_INLINE_SIZE "256"
setenv FI_UNIVERSE_SIZE "${maxtasks}" # should equal at least the max number of tasks one task will communicate with
setenv FI_VERBS_IFACE "i"
###### ######## ###### ##

###### ######## ###### ##
# DAPL method (deprecated but not gone in Intel 2018)
###### ######## ###### ##
# Configure the Intel MPI parameters

#setenv I_MPI_FABRICS "shm:dapl"
#setenv I_MPI_PMI_LIBRARY "/lib64/libpmi.so"

###### ######## ###### ##
source /etc/profile.d/lmod.csh
module purge

EOF1

#################################################################################
else if ($PHASE == set_exe) then
#################################################################################
module load lang/Perl/5.28.1-GCCcore-6.3.0
set maxthrds = `${CASEROOT}/Tools/taskmaker.pl -maxthrds`
set maxtasks = `${CASEROOT}/Tools/taskmaker.pl -sumtasks`
module purge


cat >> ${CASEROOT}/${CASE}.run << EOF1
# -------------------------------------------------------------------------
# Run the model
# -------------------------------------------------------------------------

sleep 25
cd \$RUNDIR
setenv OMP_NUM_THREADS ${maxthrds}
module load data/netCDF-Fortran/4.4.5-intel-2018.5.274
module load data/netCDF/4.6.2-intel-2018.5.274
module load toolchain/intel/2018.5.274
echo "\`date\` -- CSM EXECUTION BEGINS HERE" 
srun --ntasks=${maxtasks} --cpu_bind=sockets --cpu_bind=verbose --kill-on-bad-exit \$EXEROOT/cesm.exe >&! cesm.log.\$LID
wait
echo "\`date\` -- CSM EXECUTION HAS FINISHED"

EOF1


#################################################################################
else if ($PHASE == set_larch) then
#################################################################################

#This is a place holder for a long-term archiving script

#################################################################################
else
#################################################################################

echo " PHASE setting of $PHASE is not an accepted value"
echo " accepted values are set_batch, set_exe and set_larch"
exit 1

#################################################################################
endif
#################################################################################

...