Running calculations
Introduction
Jobs (both batch and interactive sessions) on EDI should be run through the Slurm resource manager. For a quick overview of Slurm, you can refer to the video: link
Information
Slurm details:
- Two partitions are available: cpu and gpu.
- The GPU partition has higher priority.
- No limits are currently enforced on the time of execution.
- Constraints (rtx2080, gtx1080) can be used to select certain GPU architectures.
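As a rough sketch of how a constraint might be passed to Slurm, the helper below builds an interactive srun command line for one GPU of a chosen architecture. The function name build_gpu_cmd is hypothetical, and the example assumes the constraint names listed above map directly to Slurm's standard --constraint option; it only prints the command, so it can be inspected before running it on the cluster.

```shell
#!/bin/bash
# Hypothetical helper (not part of EDI): prints an srun command that requests
# one GPU of the given architecture on the gpu partition.
build_gpu_cmd() {
    arch="$1"
    case "$arch" in
        rtx2080|gtx1080) ;;   # constraint names listed in this document
        *) echo "unknown constraint: $arch" >&2; return 1 ;;
    esac
    # --constraint is the standard Slurm option for selecting node features.
    echo "srun -p gpu --gres=gpu:1 --constraint=$arch --pty bash"
}

build_gpu_cmd rtx2080
```

The same --constraint option can also be given as an #SBATCH line in a batch script.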
Example
To get information on the currently running jobs, run squeue:
~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
87719 gpu interact username R 11-18:07:21 1 edi08
To get information on the state of the partitions and nodes, run sinfo:
~$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
cpu up infinite 1 drain* edi03
cpu up infinite 1 mix edi08
cpu up infinite 7 idle edi[00-02,04-07]
gpu up infinite 1 drain* edi03
gpu up infinite 1 mix edi08
gpu up infinite 6 idle edi[01-02,04-07]
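To find out quickly how many nodes are free, the sinfo output can be summarized with awk. The sketch below is one possible way to do it: idle_nodes is a hypothetical helper, and it assumes the default column layout shown above (partition in column 1, node count in column 4, state in column 5). On clusters where sinfo marks the default partition with a trailing asterisk, the partition name would need to be matched accordingly.

```shell
#!/bin/bash
# Hypothetical helper: sums the NODES column over the idle rows of one
# partition. Expects sinfo's default output on standard input.
idle_nodes() {
    awk -v part="$1" '$1 == part && $5 == "idle" { n += $4 } END { print n + 0 }'
}

# Sample copied from the sinfo output above, so the snippet runs anywhere.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
cpu up infinite 1 drain* edi03
cpu up infinite 1 mix edi08
cpu up infinite 7 idle edi[00-02,04-07]
gpu up infinite 1 drain* edi03
gpu up infinite 1 mix edi08
gpu up infinite 6 idle edi[01-02,04-07]'

printf '%s\n' "$sample" | idle_nodes gpu   # prints 6
```

On the cluster itself the sample can be replaced by the live output, e.g. `sinfo | idle_nodes gpu`.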
Interactive sessions
EDI is commonly used for interactive work with data, e.g. performing ad-hoc analyses and visualizations
with Python and Jupyter notebooks.
To facilitate allocating resources for interactive sessions, a convenient wrapper (alloc) has been prepared.
You can tweak your allocation depending on your needs; see the following table and example for details.
The alloc options are as follows:
Argument | Description |
---|---|
-n | Number of cores allocated for the job (default = 1, max = 36) |
-g | Number of GPUs allocated for the job (default = 0, max = 2) |
-m | Amount of memory (in GB) per allocated core (default = 1, max = 60) |
-w | Host to start your session (default = host you are running alloc on) |
Example
To obtain an allocation on edi02 with 1 GPU, 6 cores, and a total of 12 GB of memory (6 cores × 2 GB per core):
alloc -n 6 -w edi02 -g 1 -m 2
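Because -m is specified per core, the total memory of an allocation is the number of cores multiplied by the -m value. The sketch below works backwards from a desired total: alloc_cmd is a hypothetical helper that divides the total by the core count, rounds up, and prints the matching alloc command. It does not check the limits from the table above.

```shell
#!/bin/bash
# Hypothetical helper: prints an alloc command for a desired TOTAL amount of
# memory. alloc's -m is per core, so the total is divided by the number of
# cores and rounded up to a whole number of GB.
alloc_cmd() {
    cores="$1"
    total_gb="$2"
    per_core=$(( (total_gb + cores - 1) / cores ))   # ceiling division
    echo "alloc -n $cores -m $per_core"
}

alloc_cmd 6 12   # the example above: 12 GB spread over 6 cores
alloc_cmd 4 10   # 10 GB over 4 cores rounds up to 3 GB per core (12 GB total)
```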
Batch jobs
Longer, resource-demanding jobs should typically be scheduled in Slurm batch mode. Below is an example of a Slurm batch script that you can use to schedule a job:
Example
Suppose you have the following job.sh batch file:
#!/bin/bash
#SBATCH -p gpu # GPU partition
#SBATCH -n 8 # 8 cores
#SBATCH --gres=gpu:1 # 1 GPU
#SBATCH --mem=30GB # 30 GB of RAM
#SBATCH -J job_name # name of your job
your_program -i input_file -o output_path
Submit it with the sbatch command:
~$ sbatch job.sh
Submitted batch job 1234
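The job ID reported by sbatch is what other Slurm commands use to refer to the job. As a minimal sketch, the hypothetical helper job_id_from below extracts the ID from the confirmation line shown above; it assumes the ID is the last word of that line. Alternatively, the standard `sbatch --parsable` option prints the job ID without the surrounding text.

```shell
#!/bin/bash
# Hypothetical helper: extracts the job ID from sbatch's confirmation line
# by keeping the last whitespace-separated field.
job_id_from() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

jobid=$(job_id_from "Submitted batch job 1234")
echo "$jobid"

# On the cluster, the ID can then be passed to standard Slurm commands, e.g.:
#   squeue -j "$jobid"    # show the state of this job only
#   scancel "$jobid"      # cancel the job
```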