...
The execution of CPU-bound programs is not permitted on this node. If a compilation needs more CPU time than allowed, it must be done through the batch queue system. It is not possible to connect directly to the compute nodes from the login node; all resource allocation is handled by the batch queue system.
Compute Node
As mentioned above, Altamira includes 158 main compute nodes where all executions must take place. For security reasons, it is not possible to connect directly to the worker nodes; all executions must be allocated on the nodes through the batch queue system (see below how to submit jobs).
Running Jobs
As defined above, SLURM is the utility used at Altamira for batch processing, so all jobs must be run through it. This section provides information for getting started with job execution at Altamira (see the official SLURM documentation for more details on how to create a job).
NOTE: In order to keep the login nodes at a reasonable load, a 10-minute CPU time limit is enforced for processes running interactively on these nodes. Any execution exceeding this limit must be carried out through the queue system.
...
- sbatch <job script>: submits a job script to the queue system (see below for job script directives).
- squeue [-u user]: shows all the jobs submitted by all users, or only those of a given user if you specify the -u option.
- scancel <job id>: removes your job from the queue system, cancelling its execution if it was already running.
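As a quick illustration, a typical submit, monitor and cancel cycle looks like this (the script name and job id below are placeholders, not real values):

```shell
# Submit a job script; SLURM answers with the assigned job id
sbatch myjob.sh          # prints something like: Submitted batch job 123456

# List only your own jobs to check their state (PD = pending, R = running)
squeue -u $USER

# Cancel the job using the id printed by sbatch
scancel 123456
```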
Alternatively, you can launch your jobs interactively to run a session on one of the compute nodes you have requested (useful, for instance, for graphical applications):
```
srun -N 2 --ntasks-per-node=8 --pty bash
```
This requests 2 nodes (-N 2) and declares that at most 8 tasks will be launched per node (--ntasks-per-node=8), running a login shell (bash) on the compute nodes. The --pty option is important: it gives a login prompt and a session that looks very much like a normal interactive session, but on one of the compute nodes.
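Once the interactive shell opens on a compute node, you can work within the allocation as usual. A short sketch, assuming the request above:

```shell
# Inside the interactive session obtained with srun --pty bash:
hostname          # prints the name of the compute node you landed on
srun hostname     # runs hostname once per allocated task, across both nodes
exit              # leave the session and release the allocation
```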
Job directives
A job script must contain a series of directives that inform the batch system about the characteristics of the job, and you can configure them to fit your needs. These directives appear as comments in the job script, usually at the top just after the shebang line, with the following syntax:
```
#SBATCH --option=value
```
Note that these directives:
- start with the #SBATCH prefix
- are always lowercase
- have no spaces in between.
- don't expand shell variables (they are just shell comments)
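Because #SBATCH directives are plain shell comments, bash itself ignores them completely; only SLURM parses them at submission time. This minimal script (runnable on any machine, no SLURM needed) illustrates that:

```shell
#!/bin/bash
# Write a tiny job script to a temp file; to bash, the #SBATCH lines
# are ordinary comments, so only the echo is executed
cat > /tmp/demo_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --output=demo.out
echo "bash skipped the directives above"
EOF

# Run it with plain bash (no SLURM involved)
bash /tmp/demo_job.sh
```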
This table describes the common directives you can define in your job (see below an example):
| Directive | Description | Default value |
|---|---|---|
| --job-name=value | The name of the job that appears in the batch queue | script_name |
Job examples
Example for a sequential job:
```
#!/bin/bash
#
#SBATCH --job-name=hello
#SBATCH --output=hello.out
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
srun hostname
srun sleep 60
```
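For comparison, a parallel job mainly needs a larger --ntasks count. The sketch below is not an official Altamira template; adjust the task count, time limit and memory to your needs:

```shell
#!/bin/bash
#
#SBATCH --job-name=hello_par
#SBATCH --output=hello_par.out
#
#SBATCH --ntasks=8
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100

# srun launches one instance of the command per task, so this prints
# the host name 8 times (possibly from several nodes)
srun hostname
```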
Software
Modules Environment
The Environment Modules package provides dynamic modification of a user's environment via modulefiles. Each modulefile contains the information needed to configure the shell for an application or a compilation. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh and tcsh, as well as some scripting languages such as Perl.
The most important commands of the module tool are: list, avail, load, unload, switch and purge.

- module list: shows all the modules you currently have loaded
- module avail: shows all the modules available to load
- module load: loads the environment variables required by the selected modulefile (PATH, MANPATH, LD_LIBRARY_PATH, etc.)
- module unload: removes all environment changes made by the module load command
- module switch: acts as module unload and module load at the same time
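A typical session might look like this (the module names below are illustrative; check module avail on Altamira for the real ones):

```shell
module list                 # what is loaded right now
module avail                # what can be loaded
module load gcc             # make a compiler available (name is an example)
module switch gcc intel     # swap one module for another in a single step
module unload intel         # undo the changes made by the load
module purge                # remove every loaded module at once
```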
Job submission with Modules
Any application your job needs must be loaded inside the job script, so that the corresponding modules are available on the worker nodes when the job runs.
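For example, a job script using an application provided through a module would load it before the srun line. A sketch, where both the module name and the binary are illustrative placeholders:

```shell
#!/bin/bash
#SBATCH --job-name=with_modules
#SBATCH --output=with_modules.out
#SBATCH --ntasks=1
#SBATCH --time=10:00

# Load the environment on the worker node itself, not on the login node
module load gcc            # example module; use the one your application needs

srun my_application        # hypothetical binary provided by the loaded module
```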
Getting Help
IFCA provides consulting assistance to users. User support consultants are available during normal business hours, Monday to Friday, 9:00 to 17:00 (CEST).
...