Installed Applications
The following is a list of some of the applications, from various domains of science and engineering, installed on the system.
| Domain | HPC Applications |
| --- | --- |
| Bio-informatics | MUMmer, HMMER, MEME, Schrodinger, PHYLIP, mpiBLAST, ClustalW |
| Molecular Dynamics | NAMD (for CPU and GPU), LAMMPS, GROMACS |
| Material Modeling, Quantum Chemistry | Quantum-Espresso, Abinit, CP2K, NWChem |
| CFD | OpenFOAM, SU2 |
| Weather, Ocean, Climate | WRF-ARW, WPS (WRF), ARWPost (WRF), RegCM, MOM, ROMS |
| Deep Learning Libraries | cuDNN, TensorFlow, TensorFlow with Intel Python, TensorFlow with GPU, Theano, Caffe, Keras, NumPy, SciPy, scikit-learn, PyTorch |
| Visualization Programs | GrADS, ParaView, VisIt, VMD |
| Dependency Libraries | NetCDF, PnetCDF, Jasper, HDF5, Tcl, Boost, FFTW |
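Most of these packages are made available through Spack, as the job scripts later in this section suggest. A minimal sketch for listing the installed packages and their hashes is given below; the setup-env.sh path is the one used in those scripts.
#Load the Spack environment (same path as used in the job scripts below)
source /home/apps/spack/share/spack/setup-env.sh
#List installed packages with their versions and hashes; a specific build can
#then be loaded as, for example, spack load gromacs /msmx6d6
spack find -lv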
Standard Application Programs on PARAM Rudra
The purpose of this section is to introduce users to the different application packages that have been installed on the PARAM Rudra system. Users interested in exploring these packages may kindly go through the scripts, typical input files, and typical output files. It is suggested that users first submit the scripts as provided, to get a feel for executing the codes, and later change the parameters and the scripts to meet their application requirements. A typical submission workflow is sketched below.
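As a quick reference, a minimal submission workflow might look like the following sketch; lammps_job.sh is only a placeholder for whichever of the scripts in this section you save locally, and <jobid> stands for the job ID reported by sbatch.
#Submit a job script to SLURM (the script name is a placeholder)
sbatch lammps_job.sh
#Check the status of your jobs in the queue
squeue -u $USER
#Cancel a job if required, using the job ID reported by sbatch
scancel <jobid>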
LAMMPS Applications
LAMMPS is an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. It is used extensively in Material Science, Physics, Chemistry, and many other fields. More information about LAMMPS is available at https://lammps.sandia.gov.
⦁ The LAMMPS input is the in.lj file, which contains the parameters below.
Input file = in.lj
# 3d Lennard-Jones melt
variable x index 1
variable y index 1
variable z index 1
variable xx equal 64*$x
variable yy equal 64*$y
variable zz equal 64*$z
units lj
atom_style atomic
lattice fcc 0.8442
region box block 0 ${xx} 0 ${yy} 0 ${zz}
create_box 1 box
create_atoms 1 box
mass 1 1.0
velocity all create 1.44 87287 loop geom
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0 2.5
neighbor 0.3 bin
neigh_modify delay 0 every 20 check no
fix 1 all nve
run 1000000
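The index variables x, y, and z above control the size of the simulation box. As a sketch, LAMMPS also allows them to be overridden on the command line with the -var option, for example to double the box in each direction; the executable placeholder is the same as in the script below.
#Override the index variables from the command line to scale the problem,
#here doubling the box in each direction
mpiexec.hydra -n $SLURM_NTASKS <path of lammps executable> -in in.lj -var x 2 -var y 2 -var z 2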
- The LAMMPS running script
#!/bin/sh
#SBATCH -N 8
#SBATCH --ntasks-per-node=40
#SBATCH --time=08:50:20
#SBATCH --job-name=lammps
#SBATCH --error=job.%J.err_8_node_40
#SBATCH --output=job.%J.out_8_node_40
#SBATCH --mem=178G
#SBATCH --partition=standard #This is the default case, kindly verify the partition list using 'sinfo'
#Make the spack command available in the batch shell
source /home/apps/spack/share/spack/setup-env.sh
spack load intel-oneapi-compilers /di5wvj5
spack load intel-oneapi-mpi/qu6bm3l
spack load gcc@13 /3wdpf6l
source /home/apps/spack/opt/spack/linux-almalinux8-cascadelake/oneapi-2025.0.1/intel-oneapi-mkl-2024.2.2-z2opgtwnptqz4za5nqrqngqp4pvibuyu/setvars.sh intel64
export I_MPI_FALLBACK=disable
export I_MPI_FABRICS=shm:ofa
#export I_MPI_FABRICS=shm:tmi
#export I_MPI_FABRICS=shm:dapl
export I_MPI_DEBUG=5
#Enter your working directory or use SLURM_SUBMIT_DIR
cd /home/testuser/NEW_LAMMPS/lammps-7Aug19/bench
export OMP_NUM_THREADS=1
time mpiexec.hydra -n $SLURM_NTASKS -genv OMP_NUM_THREADS 1 <path of lammps executable> -in in.lj
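Once the job completes, the timing summary can be extracted from the job output file; the sketch below assumes the --output naming pattern of the script above, with <jobid> standing for the actual job ID (LAMMPS also writes the same summary to log.lammps by default).
#Extract the timing and performance summary from the job output
grep -E "Loop time|Performance|Total wall time" job.<jobid>.out_8_node_40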
- The LAMMPS output file
LAMMPS (29 Aug 2024 - Update 1)
using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924)
8 by 4 by 6 MPI processor grid
Created 32000 atoms
using lattice units in orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924)
create_atoms CPU = 0.006 seconds
Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
update: every = 20 steps, delay = 0 steps, check = no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 2.8
ghost atom cutoff = 2.8
binsize = 1.4, bins = 24 24 24
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut, perpetual
attributes: half, newton on
pair build: half/bin/atomonly/newton
stencil: half/bin/3d
bin: standard
Setting up Verlet run ...
Unit style : lj
Current step : 0
Time step : 0.005
Per MPI rank memory allocation (min/avg/max) = 2.605 | 2.608 | 2.614 Mbytes
Step Temp E_pair E_mol TotEng Press
0 1.44 -6.7733681 0 -4.6134356 -5.0197073
100000 0.69368588 -5.6730434 0 -4.6325471 0.7209302
Loop time of 14.8034 on 192 procs for 100000 steps with 32000 atoms
Performance: 2918242.631 tau/day, 6755.191 timesteps/s, 216.166 Matom-step/s
99.6% CPU use with 192 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 6.3039 | 6.6525 | 7.2306 | 6.5 | 44.94
Neigh | 1.2263 | 1.268 | 1.3339 | 2.3 | 8.57
Comm | 6.0525 | 6.6639 | 7.0573 | 7.5 | 45.02
Output | 0.00012919 | 0.0001504 | 0.0001727 | 0.0 | 0.00
Modify | 0.13617 | 0.15052 | 0.19459 | 2.2 | 1.02
Other | | 0.06833 | | | 0.46
Nlocal: 166.667 ave 179 max 157 min
Histogram: 5 17 25 32 40 39 13 14 2 5
Nghost: 1130.05 ave 1153 max 1112 min
Histogram: 14 15 24 27 47 18 22 15 2 8
Neighs: 6251.41 ave 6849 max 5865 min
Histogram: 13 20 36 35 36 21 14 8 7 2
Total # of neighbors = 1200271
Ave neighs/atom = 37.508469
Neighbor list builds = 5000
Dangerous builds not checked
Total wall time: 0:00:16
GROMACS Application
GROningen MAchine for Chemical Simulations (GROMACS) is a molecular dynamics package designed mainly for simulations of proteins, lipids, and nucleic acids. It was originally developed in the Biophysical Chemistry department of the University of Groningen and is now maintained by contributors in universities and research centres worldwide. GROMACS is one of the fastest and most popular software packages available, and it can run on both central processing units (CPUs) and graphics processing units (GPUs).
Input description of GROMACS
The input file can be downloaded from ftp://ftp.gromacs.org/pub/benchmarks/water_GMX50_bare.tar.gz
The .mdp file used is pme.mdp, run for 50,000 steps.
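A minimal sketch for fetching and unpacking this benchmark is given below; the target directory matches the cd path used in the submission scripts that follow.
#Download and unpack the water benchmark, then change to the case directory
#used by the submission scripts below
wget ftp://ftp.gromacs.org/pub/benchmarks/water_GMX50_bare.tar.gz
tar -xzf water_GMX50_bare.tar.gz
cd water-cut1.0_GMX50_bare/3072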
Submission Script:
#!/bin/sh
#SBATCH -N 10
#SBATCH --ntasks-per-node=48
##SBATCH --time=03:05:30
#SBATCH --job-name=gromacs
#SBATCH --error=job.16.%J.err
#SBATCH --output=job.16.%J.out
#SBATCH --mem=178G
#SBATCH --partition=standard #This is the default case, kindly verify the partition list using 'sinfo'
source /home/apps/spack/share/spack/setup-env.sh
spack load intel-oneapi-compilers /di5wvj5
spack load gromacs /msmx6d6
#Enter your working directory or use SLURM_SUBMIT_DIR
cd /home/testuser/water-cut1.0_GMX50_bare/3072
export I_MPI_DEBUG=5
export OMP_NUM_THREADS=1
mpirun -np 4 gmx_mpi grompp -f pme.mdp -c conf.gro -p topol.top
(time mpirun -np $SLURM_NTASKS gmx_mpi mdrun -s topol.tpr) 2>&1 | tee log_gromacs_40_50k_mpirun
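Note that grompp is a serial preprocessing step that only needs to produce topol.tpr once. As a sketch, it can equally be run once beforehand (on a login or compute node), so that the batch script only contains the parallel mdrun step:
#Generate the run input file (topol.tpr) once, before submitting the mdrun job
gmx_mpi grompp -f pme.mdp -c conf.gro -p topol.top -o topol.tpr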
Sample SLURM Script using GPU:
#!/bin/sh
#SBATCH -N 10
#SBATCH --ntasks-per-node=48
##SBATCH --time=03:05:30
#SBATCH --job-name=gromacs
#SBATCH --error=job.16.%J.err
#SBATCH --output=job.16.%J.out
#SBATCH --partition=gpu
#SBATCH --gres gpu:2 #maximum GPUs available per node is 2
#SBATCH --mem=178G
source /home/apps/spack/share/spack/setup-env.sh
spack load gromacs /shfs5ge
#Enter your working directory or use SLURM_SUBMIT_DIR
cd /home/testuser/water-cut1.0_GMX50_bare/3072
export I_MPI_DEBUG=5
export OMP_NUM_THREADS=1
mpirun -np 4 gmx_mpi grompp -f pme.mdp -c conf.gro -p topol.top
(time mpirun -np $SLURM_NTASKS gmx_mpi mdrun -s topol.tpr) 2>&1 | tee log_gromacs_40_50k_mpirun
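GROMACS normally detects and uses the allocated GPUs automatically, but the non-bonded offload can also be requested explicitly. A sketch of such an mdrun invocation, under the assumption that the gromacs /shfs5ge build is GPU-enabled, is:
#Explicitly request non-bonded offload to the GPUs allocated via --gres
time mpirun -np $SLURM_NTASKS gmx_mpi mdrun -s topol.tpr -nb gpu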
Output Snippet:
:-) GROMACS - gmx mdrun, 2024.4-spack (-:
Executable: /home/apps/spack/opt/spack/linux-almalinux8-cascadelake/oneapi-2025.0.1/gromacs-2024.4-msmx6d6pm64ypy3qgotsicjmo3qlcsm5/bin/gmx_mpi
Data prefix: /home/apps/spack/opt/spack/linux-almalinux8-cascadelake/oneapi-2025.0.1/gromacs-2024.4-msmx6d6pm64ypy3qgotsicjmo3qlcsm5
Working dir: /home/cdacapp01/Tests/GROMACS/water-cut1.0_GMX50_bare/3072
Command line:
gmx_mpi mdrun -s topol.tpr
Reading file topol.tpr, VERSION 2024.4-spack (single precision)
Changing nstlist from 10 to 50, rlist from 1 to 1.121
Using 144 MPI processes
Non-default thread affinity set, disabling internal thread affinity
Using 1 OpenMP thread per MPI process
starting mdrun 'Water'
33 steps, 0.1 ps.
Writing final coordinates.
NOTE: 23 % of the run time was spent in domain decomposition,
0 % of the run time was spent in pair search,
you might want to increase nstlist (this has no effect on accuracy)
NOTE: 44 % of the run time was spent communicating energies,
you might want to increase some nst* mdp options
Core t (s) Wall t (s) (%)
Time: 1467.469 10.194 14395.2
(ns/day) (hour/ns)
Performance: 0.576 41.643
GROMACS reminds you: "Go back to the rock from under which you came" (Fiona Apple)
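The achieved throughput is given on the Performance line (ns/day). As a sketch, it can be pulled out of the log captured by tee in the scripts above:
#Pull the timing and performance lines out of the captured mdrun log
grep -E "Time:|Performance:" log_gromacs_40_50k_mpirun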