


Foreword 

We are happy to announce the release of Pawsey's Technical Newsletter, which will be published every month, usually after the day of the Centre's Scheduled Maintenance.

This newsletter is designed for Pawsey Users, and will include:

  • Information and announcements about recent changes to Pawsey systems and services,
  • Information about recent changes to user environments - new tools, new versions of software, and changes applied during maintenance,
  • Descriptions of and links to the most recent best-practice, how-to and troubleshooting articles published in the Knowledge Base, and
  • Descriptions of and links to the most recent changes to the Pawsey User Support Documentation.

All Researchers and Pawsey Users are very welcome to provide contributions and suggestions about the Technical Newsletter. Please email us at help@pawsey.org.au. 



Topaz in production


Earlier this year, Pawsey announced the arrival of Topaz, the Centre's new GPU-based cluster and an extension to our GPU service, which is now in production.

Topaz provides users with enhanced GPU capabilities, in particular for AI, computational work, machine learning workflows and data analytics.

Topaz is a 42-node cluster composed of 22 compute nodes and 20 visualisation nodes.

Topaz GPU Compute Nodes

Each compute node in the Topaz cluster has the following configuration:

  • 2x Intel Xeon Silver 4215 CPUs (8 cores in each CPU, 16 cores in total),
  • 2x NVIDIA V100 GPU cards (16GB HBM2 memory each),
  • 192 GB RAM (four of the nodes have 384 GB),
  • 100 Gb/s InfiniBand between compute nodes.

Each GPU can be allocated to jobs individually, in contrast with Zeus where whole nodes are allocated to jobs.
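As a minimal jobscript sketch of requesting a single GPU on Topaz (the partition name gpuq, the cuda module name and the project code are assumptions for illustration; check the Topaz documentation for the values that apply to your project):

#!/bin/bash -l
#SBATCH --partition=gpuq       # Topaz GPU partition (name assumed)
#SBATCH --nodes=1
#SBATCH --gres=gpu:1           # request one of the node's two V100 GPUs
#SBATCH --time=01:00:00
#SBATCH --account=project123   # placeholder project code

module load cuda               # CUDA toolkit module (name assumed)
srun ./my_gpu_program          # placeholder GPU executable

Because GPUs are allocated individually, a second job requesting --gres=gpu:1 can share the same node, which is not possible with the whole-node allocation used on Zeus.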

Access to the Topaz compute nodes is provided to projects that have an active allocation at Pawsey and that either:

  • used Zeus GPU-based resources in the past (you will receive an allocation invitation soon), or
  • would like to migrate their compute jobs to GPU-based resources (you can request access by contacting the Pawsey Service Desk).

We thank the researchers who took part in the Topaz Early Adopters program for their effort and feedback in developing the software stack and environment.

Topaz GPU compute nodes are currently being used by 30 projects.

Topaz GPU Visualisation Nodes

Each remote visualisation node in the Topaz cluster has the following configuration:

  • 2x Intel Xeon Silver 4215 CPUs (8 cores in each CPU, 16 cores in total),
  • NVIDIA Quadro RTX 5000 GPU card (16 GB GDDR6 memory),
  • 192 GB RAM, and
  • 100 Gb/s InfiniBand between nodes.

Access to the Topaz remote visualisation nodes is provided to projects that have an active allocation at Pawsey. Remote visualisation users are already successfully using the new Topaz resources. If you want to access this remote visualisation service, visit https://remotevis.pawsey.org.au/. For detailed instructions on accessing the service, refer to the Remote Visualisation Documentation.

Users who are still using the older, Zeus-based remote visualisation service can continue to access it at https://zeus.pawsey.org.au:3443/

Zeus-based remote visualisation nodes will be decommissioned on 7 April 2020.



Take advantage of the new High Priority Mode

If your project is below 100% usage for the quarter, it can use up to 5% of its quarterly allocation in high priority mode. Full details are available at Queue Policies and Limits#HighPriorityMode. This mode is more flexible than using the debug queue, as the normal limits on job size and length still apply to high priority jobs. High priority mode is ideal for users wanting to test a few iterations of a simulation before running the full job; we remind users that the debug queue is intended for debugging code and jobscripts.
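As a minimal sketch, assuming high priority mode is selected through a Slurm quality of service and that the QoS is named high (the exact flag to use is described at Queue Policies and Limits#HighPriorityMode):

$> sbatch --qos=high myjob.sh      # submit an existing jobscript in high priority mode (QoS name assumed)

The same effect can be obtained by adding an #SBATCH --qos=high directive to the jobscript itself.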



Discover the SLURM environment changes implemented during the January maintenance

During the January maintenance we introduced the SBATCH_EXPORT=NONE and SRUN_EXPORT_ENV=ALL defaults across all Pawsey supercomputers.  This was implemented to improve job provenance and the reproducibility of science, to assist Pawsey staff in reproducing issues, to make it easier to share jobscripts with colleagues, and to save typing.  If you have not come across these parameters yet, refer to Jobscript Reproducibility and Shell Initialisation Scripts, which also assist job provenance.

The above defaults work for the majority of users.  However, if you are chaining multiple jobs together using sbatch (running sbatch from within a jobscript), then some environment variables may not be passed to child jobscripts.  Workflow engines such as Nextflow rely on passing environment variables this way.  The solution in these cases is to "unset SBATCH_EXPORT" in the jobscript before running Nextflow or similar tools.  An example workaround is at Nextflow sbatch Job Error.
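A minimal sketch of that workaround, assuming a Nextflow pipeline named my_pipeline.nf and a nextflow module (both placeholders; see Nextflow sbatch Job Error for the full example):

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --time=02:00:00

# Allow the child jobs submitted by the workflow engine to inherit this environment
unset SBATCH_EXPORT

module load nextflow               # module name assumed
nextflow run my_pipeline.nf        # placeholder pipeline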



Interactive computing with JupyterHub and RStudio examples are now available

The Pawsey User Support Documentation has recently been updated with template scripts for using JupyterHub and RStudio interactively on Zeus and Topaz. All scripts involve:

  • submitting a job to the queue that launches a Jupyter notebook or RStudio server with the use of Singularity,
  • creating an SSH tunnel from your local machine to Zeus or Topaz (a minimal sketch of this step follows the list), and
  • connecting to the Jupyter or RStudio server via a web browser on your local machine.
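A minimal sketch of the tunnelling step (the compute node name z123, the port 8888 and the username are placeholders; the template scripts report the actual values to use):

$> ssh -N -f -L 8888:z123:8888 username@zeus.pawsey.org.au   # forward local port 8888 to the server running on the compute node

With the tunnel in place, pointing a web browser on your local machine at http://localhost:8888 connects to the Jupyter or RStudio session.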

We provide a few examples, one of which shows how to use GPUs through Jupyter notebook running NumPy and Numba Python code.

Refer to the JupyterHub and RStudio pages of the Pawsey User Support Documentation for more details.


Python 2 end of life

As announced in December 2019, we have started migrating all Pawsey systems from Python 2, which is no longer supported by the Python Software Foundation, to Python 3 (see Python 2 End of Life). More specifically:

  • Starting from December 2019, we have stopped supporting Python 2 and all Python 2 packages.   
  • Starting from the last maintenance (March 2020), we have changed the default Python module on all Pawsey systems to Python 3.

These changes have a few consequences for Pawsey users. One of them concerns loading the default Python module:

$> module load python

This command now loads the Python 3 module. However, we encourage all users to specify the exact version when loading any module available on Pawsey systems.
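For example (the version number below is illustrative only; use module avail to see the versions installed on a given system):

$> module avail python           # list the available Python modules
$> module load python/3.6.3      # load an explicit version (number illustrative)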

Note that the Python 3 binary should be invoked as:

$> python3

After receiving feedback from users following the maintenance and reviewing https://www.python.org/dev/peps/pep-0394/, we have decided to provide a python symlink. After loading the default Python module, this symlink will point to the python3 binary. We believe that this will make it easier for some users to adapt to the recent change.

Good practice: we want to stress that it is considered good practice to call a particular version of Python from within scripts, i.e., python3 instead of python.
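As a minimal illustration (my_script.py is a placeholder):

$> python3 my_script.py          # explicitly requests the Python 3 interpreter

Equivalently, scripts can begin with a #!/usr/bin/env python3 shebang line rather than relying on what python resolves to.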



New Cray Developer Toolkit (CDT) on Magnus and Galaxy

Cray CDT Background info

Cray usually releases a new CDT every month.

New CDTs typically have one or more updated components that Cray developers release to improve the performance and functionality of user codes that are compiled against them. Compiling user codes against newer libraries, and/or with newer or multiple compilers, is an important part of code development and testing.

New CDT 20.02

A new Cray Developer Toolkit (CDT), 20.02, was made available to Pawsey users on Magnus and Galaxy during the last maintenance. The notes for this version of the CDT are available on the Cray Documentation Portal: https://pubs.cray.com/content/00775747-DD/DD00775746

Most importantly, CDT 20.02:

  • contains the new Cray Compiling Environment (CCE) 9.1.2 as well as the Cray Environment Setup (craype) 2.6.4 and Compiling support (cdt-prgenv) 6.0.6,
  • no longer supports Python 2 (https://www.python.org/doc/sunset-python-2/),
  • includes the GCC v.8.3.0 compiler, and
  • includes new Python 3 packages within cray-python, which use libsci libraries and their OpenMP multithreaded versions; it is recommended to set the number of desired threads with the OMP_NUM_THREADS environment variable.

Also note that beginning with release 9.x.x (9.1.1 and 9.1.2 on Magnus and Galaxy), CCE is divided into two modules:

  • cce/9.x.x, which includes CrayLibs, the Cray Fortran Compiler, and the new Clang (LLVM) C, C++, and UPC compiler, and
  • cce/9.x.x-classic, which includes CrayLibs, the Cray Fortran Compiler, and the legacy Cray Classic C, C++, and UPC Compiler.

This is a significant change as Cray C/C++ and UPC compilers are shifting towards a new compiler base - Clang (LLVM). Users developing C/C++ codes are encouraged to use the new CCE to start the migration of their codes to the new Cray compiler suite.  
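As a minimal sketch, assuming the default Cray programming environment is loaded so that a cce module is already present (the thread count below is illustrative):

$> module swap cce cce/9.1.2            # switch to the new Clang (LLVM) based CCE
$> # or, to stay on the legacy Cray Classic compiler:
$> module swap cce cce/9.1.2-classic
$> export OMP_NUM_THREADS=4             # threads for the multithreaded libsci routines used by cray-python (value illustrative)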



Feedback

As this is our first newsletter, we would like to improve the information we provide to you in future issues. We have created a short survey and would appreciate it if you could take a few minutes to complete it: Feedback on Pawsey technical newsletter. Thank you.


