Matilda HPC Feature Updates

Overview

Significant feature additions and changes related to the Matilda HPC cluster are listed below by the date they were introduced (most recent first).

April 26-27, 2023

In addition to upgrades to the operating system, firmware, and SLURM resource manager, several other notable changes were implemented during the maintenance outage that users should be aware of.

Scavenger Queue

A new job queue ("scavenger") has been added to Matilda. This queue may be used for test jobs, researcher training, and in cases where PI account billing allotments are expended. Resources on the scavenger queue are limited on a per-user basis to a maximum of:

  • 2 concurrently running jobs

  • 8 cores per job

  • 2 queued jobs which accrue priority (i.e. any queued jobs over that number accrue no priority)

All nodes are accessible via the "scavenger" queue, except hpc-largemem-p01 (a buy-in node), and walltime can be up to the maximum of 168 hours (7 days).

To use "scavenger", simply include the following line in your job batch script or on the command line:

#SBATCH -q scavenger

OR

[someuser@hpc-login-p01]$ sbatch -q scavenger myjobscript.sh
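As an illustration, a minimal batch script that stays within the scavenger limits might look like the following sketch (the job name, core count, walltime, and program are placeholders; adjust them for your own work):

#!/bin/bash
#SBATCH --job-name=scavenger-test      # hypothetical job name
#SBATCH -q scavenger                   # submit to the scavenger queue
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8              # at most 8 cores per scavenger job
#SBATCH --time=24:00:00                # any value up to the 168-hour maximum

srun ./my_program                      # replace with your actual application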

TMP Space

The size of the /tmp directory has been increased from 2GB to 10GB on the login node and all other nodes. This should help alleviate the application run and installation issues previously experienced due to a shortage of available /tmp space.
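If you want to confirm the space available on a given node, a quick check from the shell is:

[someuser@hpc-login-p01]$ df -h /tmp

The "Size" and "Avail" columns report the total and remaining /tmp capacity on that node.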

Maximum Open File Limit

The maximum number of open files on the Matilda nodes has been increased to address problems some users were having with certain application runs that generate a large number of temporary file handles.
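You can verify the limits in effect for your own session with the bash "ulimit" builtin:

[someuser@hpc-login-p01]$ ulimit -Sn     # soft limit on open files
[someuser@hpc-login-p01]$ ulimit -Hn     # hard limit on open files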

August 24-25, 2022

Walltime

Previously, if a user did not specify "--time" in their job script or srun session, they automatically received the maximum run time allocation of 7 days. This has been changed so that if no "--time" is specified, only 1 minute of run time is allocated.

To specify walltime for your job in a job script (e.g. 1 day, 10 hours):

#SBATCH --time=1-10:00:00

To specify the same walltime with "srun" (as an interactive job run):

srun -N 1 -c 1 -t 1-10:00:00 --pty /bin/bash --login
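To confirm the time limit actually assigned to your queued or running jobs, one option is to add the time-limit and time-left fields to "squeue" output, for example:

[someuser@hpc-login-p01]$ squeue -u $USER -o "%.10i %.10M %.10l %.10L"

This prints the job ID, elapsed time, time limit, and remaining time for each of your jobs.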

GPU Specification

Before the cluster update, users could explicitly request a GPU node without providing a "gres" resource request for a GPU. This most commonly happened with "srun" interactive jobs. As a result, a user who submitted a regular job script (requesting GPU resources) might land on a node where another person was running interactively without having specified a GPU quantity. This sometimes caused the scripted job to fail, since SLURM had no way to know about and track the interactive user's GPU usage.

From now on, users must specify GPU resources when explicitly trying to run a job on one of the GPU nodes. For example, using "srun" interactively:

srun -N 1 -c 1 -t 10:00:00 --gres=gpu:1 --nodelist=hpc-gpu-p01 --pty /bin/bash --login

In a job script, simply add:

#SBATCH --gres=gpu:1
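Putting this together, a minimal GPU batch script might look like the sketch below (the job name, CPU count, module name, and program are placeholders for illustration, not a prescribed setup):

#!/bin/bash
#SBATCH --job-name=gpu-test            # hypothetical job name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4              # illustrative CPU request
#SBATCH --gres=gpu:1                   # required: request one GPU
#SBATCH --time=10:00:00

module load cuda                       # assumed module name; check "module avail" on Matilda
srun ./my_gpu_program                  # replace with your actual application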

Job Queues

The previous cluster configuration included a single default job queue. This upgrade introduced two new primary queues, as well as queues for PIs with buyin nodes. These job queues are:

  • general-short - jobs <= 4 hours; all nodes

  • general-long - jobs >= 4 hours; all nodes except buyin nodes

In most instances, users shouldn't have to do anything to have their job properly assigned: based on the "--time" you specify, SLURM will select the appropriate queue. When running the "squeue" command, you may notice the presence of the new partitions.
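For example, one way to see which partition each of your jobs landed in is to include the partition field in "squeue" output:

[someuser@hpc-login-p01]$ squeue -u $USER -o "%.10i %.14P %.8T %.10M %.10l"

This lists the job ID, partition, state, elapsed time, and time limit for your jobs.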

For users with buyin nodes, your job queue name will be related to your project account name, and your buyin partition should be selected automatically based on that account. Users who are not part of the buyin partition's account will not be able to access that queue. If you want to define your buyin partition manually, please use:

#SBATCH -p <name of buyin partition>

Or interactively:

srun -p <name of buyin partition> -N 1 -c 1 -t 10:00:00 --pty /bin/bash --login
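If you are unsure of your buyin partition's exact name, one way to find it is to list the partitions visible to you, along with their time limits and node counts:

[someuser@hpc-login-p01]$ sinfo -o "%15P %12l %6D"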

Node Features

In order to simplify how users request certain types of nodes, we have added "Features" to certain node groups. A Feature request may be used in lieu of a node list to open access to any node that possesses the desired characteristic. The following Features have been added:

  • bigmem - for high memory nodes, including "hpc-bigmem-p01->p04" and "hpc-largemem-p01" (short jobs only for non-buyin accounts)

  • gpu - for nodes containing GPUs; includes "hpc-gpu-p01->p03"

  • quick - for "fast" short running jobs, includes "hpc-throughput-p01->p10" (max 8 cores per node)

To request access to nodes with a particular Feature, you can use the following in your job script:

#SBATCH --constraint=bigmem

Or on the interactive command line:

srun -N 1 -c 1 -t 10:00:00 --constraint=bigmem --pty /bin/bash --login

Requesting a "Feature" causes SLURM to assign a node to your job from the pool of nodes designated with that particular feature.

Largemem Node

The Matilda node hpc-largemem-p01 is a buyin node containing 1.5TB of RAM. A "buyin" node is one purchased by a PI for inclusion in the cluster. Members of the PI's account group are granted up to 7 days of runtime on that node. Non-members are permitted to run on these nodes for a maximum of 4 hours. While this was not previously enforced, it is under the newest cluster configuration.

Users who need high memory for their jobs should make use of the new node "Feature" request when requesting resources (see the section above). Under that scenario, your job will be assigned to the first available "bigmem" node that can meet your job specification requirements. You may also run short jobs (<= 4 hours) on hpc-largemem-p01.
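As a sketch, a high-memory batch job using the Feature mechanism might look like the following (the job name, memory request, and program are placeholders; size the memory request to your actual job):

#!/bin/bash
#SBATCH --job-name=bigmem-test         # hypothetical job name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --constraint=bigmem            # any node carrying the "bigmem" Feature
#SBATCH --mem=500G                     # illustrative memory request
#SBATCH --time=04:00:00                # <= 4 hours keeps hpc-largemem-p01 eligible for non-buyin accounts

srun ./my_memory_intensive_program     # replace with your actual application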

For cases where the memory on the bigmem nodes may be inadequate for jobs of runtime > 4 hours, please contact us at: [email protected] for assistance.

MATLAB Changes

Because the default walltime for cluster jobs is now 1 minute (01:00), users will need to set a walltime for their jobs as described previously. For MATLAB jobs that spawn parallel workers on multiple nodes, the walltime is NOT currently passed from the job script to the worker nodes, so worker walltimes will default to 1:00. To avoid this problem, please refer to the following kb article on setting default worker walltimes in MATLAB.

We are currently consulting with Mathworks to determine if a more seamless method is available, perhaps by making alterations in the Mathworks-supplied integration scripts. Please watch this page for updates.

