
This page outlines the main changes in Pawsey's next-generation supercomputing services.

Background

In 2018 the Australian Government awarded $70 million to upgrade Pawsey’s supercomputing infrastructure. As part of the Pawsey Capital Refresh project, Pawsey will offer, over the next year, the following new resources to researchers.

Figure 1. Supercomputing, storage and network infrastructure after the Pawsey Capital Refresh project

What will change in practice?

The Pawsey Capital Refresh, with the introduction of the new computing system Setonix and the new storage system Acacia, translates into five practical changes that researchers must be aware of when applying for a new allocation for 2022.

Change 1: The supercomputer

Setonix will accommodate all new allocations starting from 2022. Topaz will continue to support GPU workflows, with the primary purpose of preparing them for migration to the Setonix GPU partition.

Magnus and Zeus will be gradually shut down over the course of Q1 2022, when user migration to the new system will take place. Existing computational workflows running on Magnus and Zeus will be migrated to, and supported on, Setonix.

There are two major changes researchers will encounter with the support of Pawsey staff:

  • The switch from an Intel to an AMD processor architecture
  • The replacement of the filesystem used for project-length storage, /group, with an object storage system (more on this later on this page).

Although software currently available on Magnus is expected to be compatible with Setonix's new architecture, the list of software officially supported by Pawsey and the level of support will change. Changes to the software stack include:

  • Only the latest version of each supported package will be provided; older versions still present on Magnus will not be carried over (exceptions may apply).
  • Software not supported by Pawsey can still be installed by users. Notably, the quantum chemistry package Siesta won't be installed system-wide.
  • New applications and libraries have entered the list of supported software.

Visit the Setonix section for more information about the new supercomputer.

Change 2: The allocation schemes

Compute-time merit allocations on Setonix may be obtained through the following schemes:

  • The National Computational Merit Allocation Scheme (NCMAS) – This scheme operates annual allocation calls open to the whole Australian research community and provides substantial amounts of compute time for meritorious computational research projects.
  • The Pawsey Partner Merit Allocation Scheme – This scheme operates annual calls open to researchers in Pawsey Partner institutions and provides significant amounts of compute time for meritorious computational research projects. The Partner institutions are CSIRO, Curtin University, Edith Cowan University, Murdoch University and The University of Western Australia. There is an out-of-session application process for newly eligible project leaders.

The Pawsey Energy & Resources Merit Allocation Scheme will be discontinued. Researchers from the Australian energy and resources research community are encouraged to apply through the NCMAS and Pawsey Partner schemes. 

Change 3: The accounting model

With Setonix, Pawsey is moving from an exclusive node usage accounting model to a proportional node usage accounting model. The Service Unit (SU) still corresponds to one hour of usage of one CPU core, but users are no longer charged for whole nodes irrespective of whether they are fully utilised. With the proportional node usage accounting model, users are charged only for the portion of a node they request.

Each compute node of Setonix can run multiple jobs in parallel, submitted by a single user or many users, from any project. Sometimes this configuration is called shared access.

A project that has entirely consumed its service units (SUs) for a given quarter of the year will run its jobs in low priority mode, called extra, for that time period. Furthermore, if its service unit consumption for that same quarter hits the 150% usage mark, users of that project will not be able to run any more jobs for that quarter.
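
As a worked illustration of the proportional accounting model, the sketch below uses the figures quoted elsewhere on this page (128 cores per Setonix CPU node, 1 SU per core hour) and considers cores only; other requested resources, such as memory, may also factor into the actual charge, and the authoritative charging rules are defined by Pawsey.

    # Illustrative sketch only: proportional node usage accounting on a
    # Setonix CPU node, assuming 128 cores per node and 1 SU per core hour.
    CORES_PER_NODE = 128
    SU_PER_CORE_HOUR = 1

    def service_units(cores_requested: int, hours: float) -> float:
        """Charge only for the portion of the node that was requested."""
        return cores_requested * SU_PER_CORE_HOUR * hours

    # A 32-core job running for 10 hours is charged for a quarter of a node:
    print(service_units(32, 10))              # 320 SUs
    # Under an exclusive node usage model the same job would consume the whole node:
    print(service_units(CORES_PER_NODE, 10))  # 1280 SUs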

Change 4: The project storage

Research projects with allocations on Pawsey systems have been using the /group filesystem to store both software and data for the lifetime of the project. On the new system, there will be no /group.

With Setonix, user-private and project-wide software will be installed on a new dedicated filesystem, /software.

Project data to be stored in the long term will have to be transferred to and from Acacia, the new object storage system procured by Pawsey. For more information about Acacia and object storage, see Acacia.

Data will still be read and written by jobs on /scratch, the fast, distributed Lustre filesystem. Once data is no longer actively needed by jobs, it must be moved to Acacia.

Change 5: The filesystems

The new supercomputer comes with new filesystems. Some of them serve the same purpose as their predecessors, only bigger and faster. These are:

  • /software : new Lustre parallel filesystem to store both Pawsey-maintained software and software installed by research projects and users. The total capacity is 307 TB.
  • /scratch : similar to the existing system. Lustre parallel filesystem to temporarily store data needed or produced by jobs. It will have a total capacity of 14 PB, of which 2.7 PB is NVMe SSD and 11.3 PB is SAS HDD.
  • /home : same as the existing system. NFS storage for user configuration files.

Detailed changes

Setonix supercomputer

The Setonix supercomputer is a heterogeneous system consisting of CPUs and GPUs, with AMD providing both types of hardware, and it is based on the HPE Cray EX architecture. After its complete delivery, Setonix will have more than 200,000 CPU cores and 750 GPUs, with a peak computational power of 50 petaflops. Nodes will be interconnected using the Slingshot interconnect, providing 100 Gb/s of bandwidth, later to be upgraded to 200 Gb/s. The AMD Infinity Fabric interconnect provides a direct channel of communication among GPUs, as well as between CPUs and GPUs.

The system will be delivered to the Pawsey Supercomputing Centre by HPE in two phases, Phase 1 and Phase 2.

Figure 2. Setonix and Acacia timeline


Phase 1

Phase 1 will provide all the filesystems, one-third of the CPU-only compute nodes, and half of the visualisation and high-memory nodes. The Phase 1 system has a peak capacity of 2.4 petaflops.

The system is predominantly CPU-only, with each node equipped with two AMD Milan CPUs for a total of 128 cores and 256 GB of RAM. A small number of GPU nodes are provided to users for training and technical experimentation purposes.


Table 1. Phase 1 of Setonix

Purpose        | Nodes | CPU                                    | Cores per node | RAM per node | GPU
Log in         | 4     | AMD Rome (later upgraded to AMD Milan) | 2x 64          | 256 GB       | -
CPU computing  | 504   | AMD Milan (2.45 GHz, 280 W)            | 2x 64          | 256 GB       | -
CPU high mem   | 8     | AMD Milan (2.45 GHz, 280 W)            | 2x 64          | 1 TB         | -
Data movement  | 8     | AMD 7502P                              | 1x 32          | 128 GB       | -
Visualisation  | 16    | AMD Rome (later upgraded to AMD Milan) | 2x 64          | 512 GB       | 2x NVIDIA Quadro RTX A6000 Graphics Accelerator

Phase 2

Phase 2 deployment will upgrade Setonix to its full computational capacity by adding over 1,000 CPU nodes and more than 750 AMD Instinct™ GPUs, as well as additional login, visualisation and data mover nodes.

Software environment

Setonix will leverage the Linux-based operating system CrayOS and the HPE Cray Programming Environment, which is fully supported by the vendor. Here are some of its key characteristics:

  • Job scheduler: Setonix adopts the Slurm job scheduler to manage resources and to grant users fair access to them.
  • Software stack: Pawsey installs and maintains a predefined set of applications and libraries optimised for Setonix, using Spack.
  • Programming environments: There are three available programming environments, PrgEnv-cray, PrgEnv-gnu and PrgEnv-aocc, which respectively give access to the Cray (loaded by default), GNU and AMD compilers, along with a consistent set of libraries.
  • Module system: Software is organised in modules. LMOD is the module system of choice.
  • Vendor-provided libraries: HPE Cray provides hardware-optimised libraries such as MPICH, FFTW, CBLAS and HDF5.

Allocation schemes

NCMAS and Pawsey Partner schemes

Researchers will experience a substantial increase in the computational resources available through both the NCMAS and Pawsey Partner schemes, thanks to the delivery of Setonix Phase 2 in the second half of 2022.

In 2022, researchers applying through the NCMAS and Pawsey Partner schemes will apply for allocations on Setonix Phase 1 (1st request) and Setonix Phase 2 (2nd request) separately.

Applications will be open only for allocations on the Setonix CPU partition.

Resources available for first and second requests are presented in Table 2.


Table 2. Resources available for first and second requests

Scheme                                         |                       | 1st Request (full year) | 2nd Request (2H 2022, pro rata)
National Computational Merit Allocation Scheme | Scheme total capacity | 100M Service Units      | max. 155M Service Units
National Computational Merit Allocation Scheme | Minimum request size  | 250k Service Units      | 1M Service Units
Pawsey Partner Merit Allocation Scheme         | Scheme total capacity | 110M Service Units      | max. 190M Service Units
Pawsey Partner Merit Allocation Scheme         | Minimum request size  | 100k Service Units      | 1M Service Units

At the time of writing, the exact date of Setonix Phase 2 availability for researchers is not known. The maximum capacity in each scheme was calculated on the assumption that Setonix Phase 2 will become available at the beginning of the second half of 2022. The actual allocation sizes for second requests will be scaled pro rata based on Setonix Phase 2 availability. Here are two example scenarios:

Example 1

Research group A applied and was awarded 2M SUs on Setonix Phase 1 and 10M SUs on Setonix Phase 2. Due to delays, Setonix Phase 2 became available for researchers on the first day of Q4 2022.

The real allocation of research group A is:

  • 2M SUs on Setonix Phase 1 available throughout the year,
  • 5M SUs on Setonix Phase 2 available in Q4 2022.

Example 2

Research group B applied and was awarded 1M SUs on Setonix Phase 1 and 5M SUs on Setonix Phase 2. Setonix Phase 2 became available for researchers on the first day of 2H 2022. The real allocation of research group B is:

  • 1M SUs on Setonix Phase 1 available throughout the year,
  • 5M SUs on Setonix Phase 2 available in 2H 2022.

Setonix and Gadi Service Unit models

The Pawsey and NCI centres use slightly different accounting models. Researchers applying for allocations on both Setonix and Gadi should refer to Table 3 when calculating their allocation requests.


Table 3. Setonix and Gadi service unit models

Resources used    | Service Units on Gadi (48 Intel Cascade Lake cores per node) | Service Units on Setonix (128 AMD Milan cores per node)
1 CPU core / hour | 2  | 1
1 CPU / hour      | 48 | 64
1 CPU node / hour | 96 | 128
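
For example, the table above implies the following charges for one fully used node over 10 hours on each system (a sketch of the arithmetic only; it ignores per-core performance differences between the two architectures and any other charging rules of either centre):

    # Illustrative sketch only: service unit charges implied by Table 3.
    GADI_SU_PER_CORE_HOUR = 2      # 48-core Intel Cascade Lake nodes
    SETONIX_SU_PER_CORE_HOUR = 1   # 128-core AMD Milan nodes

    def su_charged(cores: int, hours: float, su_per_core_hour: float) -> float:
        return cores * hours * su_per_core_hour

    print(su_charged(48, 10, GADI_SU_PER_CORE_HOUR))      # 960 SUs on Gadi
    print(su_charged(128, 10, SETONIX_SU_PER_CORE_HOUR))  # 1280 SUs on Setonix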

Energy & Resources Scheme 

The Pawsey Energy & Resources Merit Allocation Scheme will be discontinued. There will be no Energy & Resources Scheme call for applications from the 2022 allocation round onwards.

Researchers from the Australian energy and resources research community are encouraged to apply through NCMAS and Pawsey Partner schemes. 

Acacia storage

The upgrade of Pawsey's computing infrastructure also includes the deployment of Acacia, a large-scale object storage system for scientific data. Each supercomputing project will be allocated project storage of 1 terabyte by default, and up to 10 terabytes can normally be accommodated. Project storage allocations will be limited to the duration of the compute allocation. In addition, researchers can apply for managed storage allocations, separately from the merit allocation processes. Managed storage access is intended for storing larger data collections with demonstrable research value according to a curated lifecycle plan.

What is object storage?

Object storage is a storage architecture where each unit of storage is an object. An object is made of:

  • Data, representing the information to be stored. Most of the time it is a file, but it might also be a portion of a file or simply a sequence of bytes.
  • Metadata, containing information about what the data is.
  • A globally unique identifier, by which the object can be located.

This is in contrast with the block storage that underlies more common filesystems, such as /home. In block storage, the unit of storage is the byte; filesystems add structure on top, organising bytes into files and files into a hierarchy of folders.

In object storage, there is no hierarchical organisation. An object is an atomic unit, and objects are grouped in a flat structure.
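
The following is a minimal, purely conceptual sketch of these ideas in Python. It is not Acacia's interface; it only models an object (data, metadata, globally unique identifier) kept in a flat namespace.

    # Conceptual model of an object store: a flat collection of objects,
    # each made of data, metadata and a globally unique identifier.
    import uuid

    class FlatObjectStore:
        def __init__(self):
            self._objects = {}                     # flat namespace: no folders

        def put(self, data: bytes, metadata: dict) -> str:
            object_id = str(uuid.uuid4())          # globally unique identifier
            self._objects[object_id] = (data, metadata)
            return object_id

        def get(self, object_id: str) -> tuple:
            return self._objects[object_id]

    store = FlatObjectStore()
    oid = store.put(b"simulation output", {"project": "example", "format": "raw"})
    data, metadata = store.get(oid)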

Interacting with Acacia

Dedicated clients will be provided to transfer objects (files, most of the time) between Acacia and /scratch, or between Acacia and third-party systems (for instance, a user's laptop).
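
The exact clients and interface are not specified on this page. As a purely hypothetical sketch, if Acacia exposes an S3-compatible API (a common choice for object stores), a transfer between /scratch and the object store could look like the example below; the endpoint URL, bucket name, object key and credentials are placeholders, not actual Acacia values.

    # Hypothetical sketch: moving a file between /scratch and an S3-compatible
    # object store with boto3. All names and credentials below are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://objectstore.example.org",
        aws_access_key_id="PLACEHOLDER_KEY",
        aws_secret_access_key="PLACEHOLDER_SECRET",
    )

    # Stage a results archive out of /scratch into the object store ...
    s3.upload_file("/scratch/myproject/results.tar", "myproject-bucket", "runs/2022/results.tar")

    # ... and stage it back to /scratch when a later job needs it.
    s3.download_file("myproject-bucket", "runs/2022/results.tar", "/scratch/myproject/results.tar")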

Frequently Asked Questions

How can I apply for Setonix GPU resources?

In 2022 researchers applying through NCMAS and Pawsey Partner allocation schemes can only apply for time on a Setonix CPU partition. There will be a separate Early Science call for access to Setonix GPU partitions. Details of the application process will be announced in 1H 2022.
