
MSU Institute for Cyber-Enabled Research (ICER) High-Performance Computing Cluster (HPCC)

Oakland University has 4 buy-in nodes as part of the MSU Institute for Cyber-Enabled Research (ICER) High-Performance Computing Cluster (HPCC).
OU faculty get priority access on the buy-in nodes and can also access resources on the rest of the cluster.


General Information

For general information about ICER, see the ICER About page.
You may also find the other pages in the ICER documentation useful.

Getting User Accounts

Account Request

To utilize Oakland University's ICER nodes, you will need 2 accounts:

  1. MSU Community ID
  2. MSU ICER NetID

Instructions for obtaining these accounts can be found below.

MSU Community ID

An MSU Community ID will allow you to access various community web portals.

For users at Oakland University, an MSU Community ID is needed to request an MSU ICER HPCC NetID.
Your MSU Community ID will also allow you to initiate any follow-up requests (for example, when a Principal Investigator needs to request a shared research space for a group of researchers).
These requests can be made by visiting the ICER Contact Us page.

The MSU Community ID username will be your Oakland University e-mail address.

  1. The first step is to request an MSU Community ID here: MSU Guest - Community ID.

    1. Select the option to "Create a Guest Account (Community ID)".
    2. Make sure to use your real name and Oakland University e-mail address when submitting the request.

  2. You should receive an e-mail at your Oakland e-mail address from Community ID <help@msu.edu> with the subject MSU Community ID Account Created - Please Complete Setup.

    1. Follow the link in that e-mail to create a password for your MSU Community ID account.

MSU ICER HPCC NetID

An MSU ICER HPCC NetID will allow you to access the HPCC in two ways: through an SSH client or through your web browser.

  1. Once you have an MSU Community ID, you need to request an account for the ICER HPCC.
    1. Log in through MSU Authentication with your MSU Community ID.

    2. In the "Community ID" tab, fill out the Principal Investigator name (optional) and click "Submit".
      1. If you are a student please put the name of the faculty or staff member you are working with in the Principal Investigator Name field.

  2. ICER will create the account and you will receive confirmation of the creation in your OU e-mail account.

Note: Your user account will be delivered to you in a confirmation email.
The username will be found in the cc: email header field -- not in the body of the email -- and will be a randomized set of letters and numbers (e.g. fgiz3dm4).

Accessing the ICER HPCC

Nodes

Once you have your accounts, you will then be able to log into OU's buy-in nodes on the ICER cluster.

The cluster is primarily accessed by means of the Secure Shell (SSH) network protocol. An SSH connection can be established to hpcc.msu.edu from a terminal prompt (Mac/Linux) or a program like PuTTY (Windows).
Windows users may find MobaXterm more feature-rich than PuTTY, as it supports graphical applications.
See the Install SSH Client page for details.

Note: Please review OU Software Regulations Policy 870 and submit a Software & Hosted Solution Purchasing Checklist form prior to installing software on any Oakland University equipment.
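
For example, from a Mac or Linux terminal (or the command line of an installed SSH client), a connection might look like the following. Replace fgiz3dm4 with your own NetID; the optional -X flag enables X11 forwarding for graphical applications if your client supports it.

# Connect to the HPCC gateway (replace fgiz3dm4 with your NetID)
ssh fgiz3dm4@hpcc.msu.edu

# Or connect with X11 forwarding enabled for graphical applications
ssh -X fgiz3dm4@hpcc.msu.edu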

See the "HPC's entire layout at ICER" page for a brief description of the system.

gateway Node

When you make your initial connection to the ICER HPCC, you will find yourself on one of the gateway nodes.
(e.g. ssh fgiz3dm4@hpcc.msu.edu)

The gateway nodes are simply login nodes for users to enter the ICER HPCC. Once a user has connected to a gateway node, they can continue on to one of the development nodes.
The gateway nodes are not meant for running software and cannot reach the scratch space or the compute nodes.
Gateway nodes do have access to the Internet.

rsync gateway Node

You are also able to connect to an rsync gateway node from your station using an SSH client.
(e.g. ssh fgiz3dm4@rsync.hpcc.msu.edu)

The rsync gateway node is meant for transferring files and is able to connect to scratch. Users are not able to access compute nodes from an rsync node.
Rsync nodes do have access to the Internet.
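
As a rough sketch of a transfer from your own machine, using the standard rsync tool (the fgiz3dm4 username and the directory names are placeholders only):

# Copy a local directory into your HPCC home space through the rsync gateway
rsync -av ./my_data/ fgiz3dm4@rsync.hpcc.msu.edu:~/my_data/

# Copy results back from the HPCC to your local machine
rsync -av fgiz3dm4@rsync.hpcc.msu.edu:~/results/ ./results/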

Note: Users cannot connect directly to an rsync gateway node using the Web Browser connection method.

Development Node

From a gateway or rsync gateway node, you can further SSH into one of the development nodes listed when you log in.
(e.g. ssh dev-intel16)

Development nodes are available for users to compile their code and do short tests to estimate run-time and memory usage.
These short tests should not take longer than 2 hours.
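
For example, one simple way to estimate run time and peak memory for a program on a development node is GNU time (the ./my_program name below is just a placeholder for your own executable):

# Report elapsed time and maximum resident set size (peak memory)
/usr/bin/time -v ./my_program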

File Systems

See the page on File Systems for a list of various file systems on the cluster.

To determine which file system to use, see the Guidelines for Choosing File Storage and I/O document.

Home Space

The first file system that you will access on the HPCC is the Home Space.

This is typically referred to as your home directory and is the initial working directory after you log in to any node in the cluster. The home directory has a 50 GB storage quota and cannot contain more than 1 million files.

Note: Users can request to increase their Home Space by completing the Quota Increase Request form.
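
Before requesting an increase, you can check how close you are to these limits with standard commands (a quick sketch; ICER may also provide its own quota reporting tools):

# Show the total size of your home directory
du -sh $HOME

# Count the number of files in your home directory
find $HOME -type f | wc -l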

Research Space

Research space can be created upon request from a principal investigator.

For more information about this space, see Research Space.

Scratch Space

Another set of important file systems are the Scratch File Systems.
The scratch file systems are spaces designated for temporary data storage. Files saved in these locations are not backed up and may be deleted if they have not been modified in 45 days. The scratch spaces are also not available from the gateway nodes.
You should save your results or data back into the Home or Research file systems after your job has finished running.

When on an rsync or development node:

  • You can reference the ls15 scratch space with variable $SC15.

  • You can reference the gs18 scratch space with variable $SCRATCH.
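
A minimal sketch of the usual workflow, assuming you are on a development or rsync gateway node (the my_job directory name is just a placeholder):

# Create a working directory in the gs18 scratch space
mkdir -p $SCRATCH/my_job

# Copy input data from your home space into scratch
cp -r $HOME/my_job/input $SCRATCH/my_job/

# ... run your job against the data in $SCRATCH/my_job ...

# Copy results back to your home or research space when the job is done
cp -r $SCRATCH/my_job/results $HOME/my_job/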

Local File Systems

Local File Systems are available on each cluster compute node and development node.

While these spaces are good for fast temporary storage while a job is running, there are some things to be aware of: files more than 2 weeks old are deleted, and if the local space becomes more than 90% full, unused files may be deleted without notice.
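
As an illustration of the usual pattern inside a job script (the $TMPDIR variable is an assumption here; see the File Systems page for the actual local path on ICER nodes):

# Stage input data onto the node-local disk for faster I/O
cp -r $HOME/my_job/input $TMPDIR/

# ... run the job against the local copy ...

# Copy anything you need to keep back off the local disk before the job ends
cp -r $TMPDIR/output $HOME/my_job/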

Running Jobs

Viewing HPCC Commands and Examples

One of the first things you may wish to do if you are new to the cluster is to load the powertools module and view the available examples and commands.

Assuming you are already on a gateway node (lines beginning with # are comments and should not be executed):

# Load the powertools module
module load powertools

# SSH to a random dev node
dev

# Run the powertools command to print a list of available commands to your terminal
powertools

# Run the getexample command for a list of possible examples to download
getexample

Job Submission Examples

Submitting a job to the cluster is done via the sbatch command.

It may be helpful to view some of the examples to get an understanding of job submission.

To get the helloworld example, try the following on a development node:

# Load the powertools module
module load powertools

# Get the helloworld example
getexample helloworld

# change to the helloworld directory
cd helloworld

# Compile the hello.c source code
gcc hello.c -o hello

# Run hello.sb using the sbatch command
sbatch hello.sb

# Check the status of your submission using squeue
squeue -u $USER
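
The actual hello.sb comes from getexample, but if you are curious what a batch script of this kind contains, a minimal SLURM script generally looks something like the sketch below (the resource values and job name are illustrative, not the contents of the real file):

#!/bin/bash
# Request 10 minutes of wall time, one task, and 1 GB of memory
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --job-name=hello

# Run the compiled program
srun ./hello

# Print information about this job, including the resources it was allocated
scontrol show job $SLURM_JOB_ID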

You may find it helpful to get a few other examples and go through the READMEs, as well as the other files that each README references.

Buy-in Nodes

The 4 buy-in nodes that we have are part of the Laconia (intel16) cluster and are considered "Intel16 - Large Memory" nodes, as referenced in Compute Resources.

You should be able to check the status of the nodes and the jobs running on them with the following:

# Load the powertools module
module load powertools

# Check jobs running on all buyin nodes associated with your user
prs

Laconia (intel16), Intel16 — Large Memory Nodes

  1. lac-311
  2. lac-312
  3. lac-313
  4. lac-314
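
If you want to inspect these nodes directly, standard SLURM commands also work. The exact account or partition used to request buy-in priority depends on your buy-in configuration, so check with ICER if you are unsure; as a sketch:

# List jobs currently running on the OU buy-in nodes
squeue -w lac-311,lac-312,lac-313,lac-314

# Show the state and available resources of one of the buy-in nodes
scontrol show node lac-311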

