Slurm
Marlowe uses Slurm, a job scheduling system, to run jobs. There are three main account types: Basic/Preempt, Medium/Batch, and Large/Hero.
The Slurm commands srun, salloc, and sbatch will take you far!
Marlowe accounts start with marlowe- followed by the project ID. For example, if your project ID is m223813, your account is marlowe-m223813. To use the batch or hero partitions, you need to add a suffix to your account, as in marlowe-m223813-pm01. Read more about it in the Accounts section below.
Optional: enter your information below and click the Generate button to generate copy-and-paste commands with your information pre-filled.
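The account naming rule and the three commands mentioned above can be sketched as follows. This is a minimal illustration, not an official template: the helper name marlowe_account, the script name job.sh, and the resource values (one node, one hour) are assumptions, and other options such as GPU requests are left out. The script only prints the commands so that you can review them before running anything.

```shell
#!/usr/bin/env bash
# Example values taken from the text above; substitute your own.
PROJECT_ID="m223813"
SUFFIX="pm01"   # only needed for the batch or hero partitions

# Hypothetical helper: builds marlowe-<project ID>[-<suffix>].
marlowe_account() {
  local project_id="$1" suffix="${2:-}"
  if [ -n "$suffix" ]; then
    printf 'marlowe-%s-%s\n' "$project_id" "$suffix"
  else
    printf 'marlowe-%s\n' "$project_id"
  fi
}

BASIC_ACCOUNT="$(marlowe_account "$PROJECT_ID")"
MEDIUM_ACCOUNT="$(marlowe_account "$PROJECT_ID" "$SUFFIX")"

# Print (do not run) one command of each kind.
# --nodes and --time are illustrative values; job.sh is a hypothetical script.
echo "srun --account=$BASIC_ACCOUNT --partition=preempt --nodes=1 --time=01:00:00 --pty bash"
echo "salloc --account=$MEDIUM_ACCOUNT --partition=batch --nodes=1 --time=01:00:00"
echo "sbatch --account=$MEDIUM_ACCOUNT --partition=batch --nodes=1 --time=01:00:00 job.sh"
```

Here srun starts an interactive shell on a compute node, salloc reserves resources and then waits for you to launch work inside the allocation, and sbatch queues a script to run unattended.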
Check GPU allocation usage
To see a medium project’s usage, use its pm suffix. To see a large project’s usage, use its pl suffix.
sreport cluster UserUtilizationByAccount -T gres/gpu Start=[start of billing cycle] End=now account=[your project account] -t hours
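As an illustration, the template above can be filled in with a small helper. The function name gpu_usage_cmd and the start date 2025-01-01 are hypothetical; use the start of your own billing cycle and your own account. The helper only prints the command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fills the sreport template with an account and a start date.
gpu_usage_cmd() {
  local account="$1" start="$2"
  printf 'sreport cluster UserUtilizationByAccount -T gres/gpu Start=%s End=now account=%s -t hours\n' \
    "$start" "$account"
}

# Medium project: use the pm suffix. The start date is a made-up example.
gpu_usage_cmd "marlowe-m223813-pm01" "2025-01-01"
```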
Accounts
Each allocation is given a project ID. This project ID corresponds to a job account on Marlowe.
For accounting purposes, every job must be credited to a job account. If you don’t specify a valid account, you will see the following error message when submitting jobs:
srun: error: ACCOUNT ERROR: Did you remember to set your account?
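One way to avoid forgetting the account is to set it once per shell session. The sketch below relies on the standard Slurm input environment variables SLURM_ACCOUNT (read by srun), SALLOC_ACCOUNT, and SBATCH_ACCOUNT; whether this fits your workflow on Marlowe is an assumption, and an explicit --account option on the command line still takes precedence.

```shell
#!/usr/bin/env bash
# Example account from the text above; substitute your own.
ACCOUNT="marlowe-m223813"

# Default account for srun, salloc, and sbatch in this shell session.
export SLURM_ACCOUNT="$ACCOUNT"
export SALLOC_ACCOUNT="$ACCOUNT"
export SBATCH_ACCOUNT="$ACCOUNT"

echo "default account: $SLURM_ACCOUNT"
```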
Medium and large projects are given a GPU hours allocation, tied to a suffix on the main project ID. The suffix is required when using the batch/medium and hero/large partitions. Users can keep track of their allocation using the sreport command.
The suffix will be something like pm01 for a medium project, or pl01 for a large project. Check your Marlowe welcome email if you need your project’s suffix.
Note: You will be charged against your GPU hours allocation if you submit a job with a medium/large project suffix to the preempt partition. The generator above assumes that you do not want to use your GPU hours allocation for your preempt partition job.
Available Partitions
1) If you have a medium project allocation, you should submit to the batch partition
2) If you have a large project allocation, you should submit to the hero partition
3) For basic access, you can only submit to the preempt partition
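The three rules above can be sketched as a lookup from account name to partition. The function name partition_for_account is hypothetical, and the patterns assume that suffixes always look like pm or pl followed by digits, which the text only gives as an example. A suffixed account can still be submitted to preempt, so treat the result as the intended partition rather than the only valid one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an account name to the partition it is meant for.
partition_for_account() {
  case "$1" in
    marlowe-*-pm[0-9]*) echo "batch" ;;    # medium project suffix
    marlowe-*-pl[0-9]*) echo "hero" ;;     # large project suffix
    marlowe-*)          echo "preempt" ;;  # basic access, no suffix
    *) echo "unrecognized account: $1" >&2; return 1 ;;
  esac
}

partition_for_account "marlowe-m223813-pm01"
partition_for_account "marlowe-m223813"
```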
Partition Limits
Hero: 25 nodes, 24 hours
Batch: 16 nodes, 48 hours (two days)
Preempt: 8 nodes, 12 hours
Note: A job in the preempt partition can be preempted within 15 minutes if a job in a higher-priority partition (batch or hero) requests the node that the preempt job is running on.
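Before submitting, a request can be checked against the partition limits listed above. This is a minimal sketch: the function name check_request is hypothetical, the limits are copied from the list in this section and may change, and the scheduler remains the authority on what is accepted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks a node count and a wall time (in whole hours)
# against the partition limits listed in this section.
check_request() {
  local partition="$1" nodes="$2" hours="$3" max_nodes max_hours
  case "$partition" in
    hero)    max_nodes=25; max_hours=24 ;;
    batch)   max_nodes=16; max_hours=48 ;;
    preempt) max_nodes=8;  max_hours=12 ;;
    *) echo "unknown partition: $partition" >&2; return 2 ;;
  esac
  if [ "$nodes" -gt "$max_nodes" ] || [ "$hours" -gt "$max_hours" ]; then
    echo "exceeds $partition limits ($max_nodes nodes, $max_hours hours)"
    return 1
  fi
  echo "ok"
}

check_request batch 16 48
```

The function returns status 0 when the request fits, 1 when it exceeds a limit, and 2 for an unknown partition name, so it can be used in a wrapper script to guard an sbatch call.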