Scratch Space Monitoring

Overview

Files on Matilda HPC scratch spaces are purged if they have not been accessed within the last 45 days. You can create a script that monitors your scratch spaces and gives advance warning of potential deletions. The script can be run daily from a personal "cron" job. This tutorial provides one example of how to accomplish this.

Example Monitoring Script

In the example below, we change to the user scratch space and check for files that have not been accessed in more than 40 days. We then change to a project scratch space and repeat the same procedure (if applicable). Let us call this script "scratchCheck.sh".

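#!/bin/bash
### Example Scratch Monitoring Script

# List regular files in the user scratch space not accessed in over 40 days
cd /scratch/users/<username> || exit 1
lfs find . -atime +40 -type f

# Repeat for the project scratch space (remove these lines if not applicable)
cd /scratch/projects/<projectname> || exit 1
lfs find . -atime +40 -type f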

Replace "<username>" and "<projectname>" with your actual username and the name of the project space (if applicable), respectively. The value "+40" may also be changed to adjust how many days of warning you receive (e.g. 44 days would be 1 day before deletion). The script produces a list of all files in the searched scratch spaces that were last accessed more than 40 days ago.
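The same check can also be written with the directory and the day threshold as parameters, so that one function covers both scratch spaces. The following is only a minimal sketch: the function name "list_stale_files" is hypothetical, and the fallback to plain "find" when the Lustre "lfs" tool is not installed is an assumption added so the sketch can be tried on other machines.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): list regular files
# under DIR whose last access was more than DAYS days ago.
list_stale_files() {
    local dir="$1" days="$2"

    # Skip quietly (message on stderr only) if the directory does not exist.
    if [ ! -d "$dir" ]; then
        echo "skipping $dir (not a directory)" >&2
        return 0
    fi

    # Assumption: use "lfs find" when the Lustre client tools are available,
    # otherwise fall back to the standard "find" command.
    if command -v lfs >/dev/null 2>&1; then
        lfs find "$dir" -atime +"$days" -type f
    else
        find "$dir" -atime +"$days" -type f
    fi
}
```

For example, calling list_stale_files /scratch/users/<username> 44 would report files roughly one day before they become eligible for purging. Because the directory is passed as an argument, the output contains full paths rather than paths relative to the current directory.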

After saving the script, make sure it is executable:

chmod 755 scratchCheck.sh

The Cron Job

Cron jobs run at intervals specified by the user. On Matilda, you can create account-level cron jobs that run on the login node at the desired time. In this example, we want to run our "scratchCheck.sh" monitoring script and email the output to a valid email address. To start, let's open the cron job editor:

crontab -e

The command above opens your personal "crontab" file in the "vi" editor on the login node, creating the file if it does not already exist. Below is an example of a cron job that runs every day at 4:00am and sends us an email listing any files that have gone 40+ days since they were last accessed:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=<email address>

# For details see man 4 crontabs

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * command to be executed
0 4 * * * /home/u/username/scratchCheck.sh

Replace each "username" entry with your actual username. Also note the structure of the home directory (in this example, /home/u/username). To obtain your precise home directory path, enter the following commands:

cd
pwd

Use the resulting home directory path where indicated.
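To avoid typing the path by hand, the crontab line can be generated from the shell. This is only an illustrative sketch: the function name "cron_entry" is hypothetical, and it assumes the script was saved as "scratchCheck.sh" directly in your home directory.

```shell
#!/bin/bash
# Hypothetical helper: print a daily crontab entry for the given
# minute, hour, and script path.
cron_entry() {
    local minute="$1" hour="$2" script="$3"
    printf '%s %s * * * %s\n' "$minute" "$hour" "$script"
}

# Assumption: the monitoring script lives in the home directory.
# $HOME expands to the same path that "cd" followed by "pwd" reports.
cron_entry 0 4 "$HOME/scratchCheck.sh"
```

The printed line can then be pasted into the editor opened by "crontab -e".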

When you are finished editing the personal "crontab" file, save the file and close the editor. To make changes at any time, simply enter:

crontab -e

Make sure you place your scratch monitoring script in a non-volatile location (e.g. your home directory or project space), and adjust the path in the crontab accordingly (in the example above, we have simply placed the script in the home directory).
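Cron normally sends mail only when a job produces output, and it mails everything the job prints, so a scratch space with many old files can lead to a very long message. One possible way to keep the email short is to pass the file list through a small filter that prints a count and only the first few entries. The sketch below is an illustration, not part of the original script; the function name "summarize_list" and the default limit of 20 are assumptions.

```shell
#!/bin/bash
# Hypothetical filter: read a file list on standard input, then print the
# total count followed by at most MAX entries (default 20).
# For empty input nothing is printed, so cron has no output to mail.
summarize_list() {
    local max="${1:-20}" count=0 line
    local -a shown=()

    while IFS= read -r line; do
        count=$((count + 1))
        # Keep only the first MAX entries for display.
        if [ "$count" -le "$max" ]; then
            shown+=("$line")
        fi
    done

    if [ "$count" -eq 0 ]; then
        return 0
    fi

    echo "$count file(s) not accessed recently; showing up to $max:"
    printf '%s\n' "${shown[@]}"
}
```

Inside the monitoring script, the function would be defined near the top and each search piped through it, for example: lfs find . -atime +40 -type f | summarize_list 50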

Conclusion and More Information

This tutorial does not provide a comprehensive explanation of cron jobs or crontab formatting. For a more complete treatment of this subject, see the crontab manual pages ("man crontab" and "man 5 crontab").

