
Pawsey User Forum (Melbourne), Monash, 23rd November 2017


How should I determine the best number of cores to use for my parallel code?

The first and most important step is to benchmark the code for your research problem. This will give an understanding of how the performance of the code scales as the number of cores increases. For a code that scales linearly, using more cores can result in a faster run time. For codes that lose efficiency as they scale, there is a trade-off between faster run times and the amount of allocation consumed. An informed decision should then be made balancing run time against allocation consumption. One important consideration is whether the code will finish, or reach a checkpoint, within the 24-hour maximum wall time. It is also worth noting that once the run time is shorter than the time it takes the job to be scheduled and start running, efficient use of allocation should be the main focus.
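As a rough illustration of how such a benchmark might be read (the core counts and timings below are hypothetical, not measurements of any particular code), speedup and parallel efficiency relative to a baseline core count $n_0$ are

$$ S(n) = \frac{T(n_0)}{T(n)}, \qquad E(n) = \frac{S(n)}{n / n_0}. $$

For example, if a run takes 10 hours on 48 cores and 6 hours on 96 cores, doubling the cores gives a speedup of about 1.67 and a parallel efficiency of roughly 83%: the job finishes four hours sooner, but consumes around 20% more core-hours of allocation.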

The mid-year break is not a good time for attending events. The mid-semester breaks and the week before first semester starts would be best. Ideally, at least three months' notice would be preferred.

The training schedule for 2018 is still being developed and should be available soon. We have endeavoured to ensure events are scheduled around these times. However, with the large number of training courses, roadshows, clinics, and user forum events that Pawsey operates, it is unfortunately necessary to hold some of them during semester.

It would be useful for training to be available at least twice a year at each location. It is not feasible to have students travel interstate for training. Alternatively, it would be good to have more frequent webcasting of courses.

Pawsey currently aims to hold training events at least once a year at various locations around Australia. It is very useful feedback that there is a desire for more frequent training. We are also investigating online training options to supplement our standard training courses.

What is Pawsey's relationship with NCI?

Pawsey and NCI are the two peak HPC facilities in Australia. While the governance of Pawsey and NCI are independent of each other, we regularly collaborate in a number of areas. Time on Magnus (at Pawsey) and Raijin (at NCI) is made available along with other systems from around Australia through the annual National Computational Merit Allocation Scheme (NCMAS). We also co-host booths at various international supercomputing conferences such as ISC and SC.

What conferences does Pawsey aim to have a presence at?

There are a number of technical conferences that Pawsey staff attend to maintain our skill sets and industry contacts, and to ensure we have the knowledge and capability to procure, operate, and assist our users with cutting-edge technology and computational techniques. Internationally this includes the ISC and SC conferences, and nationally the eResearch Australasia conference. While we also attend a number of domain-specific computational conferences in Australia, we would like to have better coverage of these events. The main obstacle to our attendance is often simply knowing about them. If you are organising such an event, and believe it would be helpful for Pawsey staff to attend or even provide some training, please don't hesitate to get in touch.

As a user that has served on a merit allocation review panel, it seems that new users do not have a good understanding of the architecture of the supercomputers and how their codes are suited to them.

A strong merit allocation application should include scaling and performance benchmarks of the project's codes, run on the system that is being applied for, either using time from a previous project or via a Director Share project obtained well in advance of the allocation call. Providing these benchmarks demonstrates a number of things to the review panel:

  • the project group's codes can run successfully on the system
  • the codes are suitable for the particular system
  • the time requested is based on calculated estimates rather than guessing
  • and that the group is ultimately capable of setting up its workflow

Applications that do not include this information tend to receive smaller allocations, or are rejected entirely.

We have previously run merit allocation workshops to help users, and are happy to assist users prepare their applications if they contact us well ahead of the application deadline.

Could Pawsey staff provide benchmarks for common codes for use in merit applications?

For the majority of the scientific codes that use Pawsey supercomputers, application performance is very dependent on the particular dataset or problem being computed, so generic benchmarks are not necessarily representative of actual workflows. While Pawsey staff are highly skilled in their field, they may not have the specific science domain knowledge needed to set up representative benchmarks, such as choosing correct parameters and options for the codes.

Sometimes the modules I want are not available for a particular compiler.

We provide a number of compiler options on Pawsey supercomputers. Some of these are common to most of the systems, such as the Intel and GNU compiler suites, while others may be system specific, such as the Cray compilers. For the selection of commonly used applications and libraries provided through the module system, our staff endeavour to provide versions compiled with all of the compilers available on the system. Unfortunately, many codes have been written to conform to a single compiler, rather than an actual language standard, and will only compile with particular compilers and in some cases only particular versions of those compilers. For these codes it is simply not possible to provide versions for all compilers, and their module files should produce conflict errors if the wrong compiler environment module is loaded. We strongly recommend that any users developing code choose and adhere to a language standard, and regularly test compilation of their code with a number of different compilers and versions.
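As a minimal sketch of how this looks in practice on a Cray system such as Magnus (the module and package names below are illustrative; check module avail on the system for what is actually installed):

    # See which compiler environment and modules are currently loaded
    module list

    # Switch from the Cray to the GNU compiler environment
    module swap PrgEnv-cray PrgEnv-gnu

    # List the available builds of an application; a build that exists for only one
    # compiler will produce a conflict error if the wrong PrgEnv module is loaded
    module avail python
    module load python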

I am encountering a memory error on Magnus that I am having trouble debugging with DDT, and the code runs fine on my workstation.

If you encounter issues such as these, don't hesitate to get in touch via the Pawsey User Support Portal with full details of the errors you are encountering, so our staff can investigate further and provide assistance.

The Magnus login nodes keep logging me out after 20 minutes of inactivity; this seems quite short and is disruptive to my work.

It is important to clean up sessions that may not have disconnected cleanly, such as after a loss of wireless connectivity, to ensure the login node resources are available for those who are actively using them. SSH clients can be configured at the user end to periodically send empty packets over the connection, often referred to as keepalive packets, to maintain the connection, as in the sketch below.
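A minimal sketch using OpenSSH (the username and hostname are placeholders, and 60 seconds is an arbitrary interval):

    # Send a keepalive packet every 60 seconds for this session only
    ssh -o ServerAliveInterval=60 username@magnus.pawsey.org.au

    # Or make it persistent by adding an entry to ~/.ssh/config:
    #   Host magnus.pawsey.org.au
    #       ServerAliveInterval 60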

Who is responsible for removing data off /scratch?

As part of the workflow for using the supercomputer, a user should manage moving data and results off the /scratch file system to an appropriate place for longer-term storage, either at Pawsey or on institutional resources. On previous systems we have had per-project quotas on the /scratch file system, but this meant that much of the space remained unused. Removing quotas from /scratch for Magnus has largely been a positive change, as there is much higher utilisation of the file system and significantly reduced overhead for both users and staff around managing quotas on a per-project basis. However, we do occasionally have issues when the entire file system gets too close to full. This can occur either through the total amount of data becoming too large or, less intuitively, through there being too many files regardless of size. While we do run servers that purge files older than 30 days, this is a very intensive process, and it would be better for the performance of the file system if users removed files as soon as they are no longer needed.
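A minimal sketch for checking your own footprint from a login node (the project and username in the path are placeholders):

    # List your files on /scratch that have not been modified for more than 30 days
    find /scratch/projectcode/username -type f -mtime +30

    # Count the total number of files, since very large file counts also hurt the file system
    find /scratch/projectcode/username -type f | wc -l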

How do I manage cleaning up my data on /scratch?

As with most aspects of HPC workflows, the best approach is to automate it. Note that it would be a waste of allocation to include the scripting to clean up and move data as part of a large multi-node job. It is more efficient to use job dependencies to schedule a second, smaller job that is dependent on the main job, so it will not start too early. This dependent job can remove temporary files, compress many small files into a single tar or zip file, and move data off /scratch, as in the sketch below.
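A minimal sketch using Slurm job dependencies (main_job.slurm and cleanup.slurm are hypothetical placeholders for your own job scripts):

    # Submit the main multi-node job; --parsable makes sbatch print just the job ID
    MAIN_ID=$(sbatch --parsable main_job.slurm)

    # Submit a small cleanup job that starts only if the main job completes successfully
    sbatch --dependency=afterok:${MAIN_ID} cleanup.slurm

The cleanup job can request far fewer resources than the main job, and simply runs the tar, copy, and delete commands once the results are available.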

How do I attend training?

Our training dates and registration details are provided on the Pawsey website, usually several months ahead of the sessions.

It would be good to include more detail on scripting in the training courses.

This is useful feedback, as we are currently updating our training material for next year.

There is a need for good citizenship on HPC systems to ensure good usage of the system for all.

A supercomputer is a shared resource, and Magnus has several hundred projects with over thirteen hundred users. There are processes and safeguards to ensure fair use of the system and to prevent misuse. These are generally lightweight, as more draconian measures start to negatively affect the genuine scientific research activity occurring on the system. As a result, some user activity occasionally affects other users, often unintentionally, and these cases are dealt with through user education. You can help by engaging with Pawsey training courses and documentation, sharing information with colleagues, and encouraging others to do the same.

What information is needed to get a Pawsey cloud computing instance on Nimbus, and how much is available?

The two main pieces of information required are the size of the instance and the duration for which it is needed. Cloud instances are available to Australian researchers from all scientific domains, on a case-by-case basis depending on availability.

How does Pawsey's cloud link to the storage infrastructure?

Nimbus currently has its own file system, approximately 288 terabytes in size. Investigations are currently under way into integration with Pawsey's other file systems, including the supercomputing spaces and long term storage. The intent of these investigations is to avoid unnecessary movement of data in the centre. This is something that will be carefully considered in any future capital refresh.

Why is there no Python 3 system module on Magnus?

We have been waiting for the upcoming operating system upgrade to CLE6 on Magnus. We plan to have a system module for Python 3 available early next year.


Pawsey User Forum (Sydney), UNSW, 30th October 2017


It is easier to attend Pawsey events such as roadshows, training, and user forums during non-teaching weeks.

Pawsey is currently preparing the schedule for 2018, and the teaching schedule of the Australian universities is a consideration. We do try to make sure our events coincide with non-teaching weeks. However, it can be difficult, as we typically hold more events than there are non-teaching weeks, and the breaks are not always aligned between institutions in an area.

As someone new to the Australian computational research community, what HPC resources are available?

Contacting the relevant staff at your local institution is the best first step to find out what is available at your institution. For larger HPC allocations, the National Computational Merit Allocation Scheme (NCMAS) provides time on various resources including Raijin at NCI and Magnus at the Pawsey Supercomputing Centre. Calls for time through this scheme occur in September and October for allocations that run for the subsequent calendar year.

I'd like to run a particular licensed software on Magnus, why is it not available?

Pawsey licenses a number of software products to provide to users. However, these are typically compilers, profilers, numerical libraries and debuggers. Currently, the policy is for application specific software licenses to be provided by the researcher, and any necessary license servers to be operated by the researcher's institution. We are happy to coordinate with the organisation granting the license and institutional staff to help enable such software to operate. However, for some software the terms of the license prevent it being run at Pawsey, or may provide additional limitations. Examples include licenses that only allow for software to be run on systems owned by the researcher's institution, or by the person that installed the software, or even within a set physical distance between the user and the system.

The /group file system is really useful, especially for sharing software installs with other members of our project.

It's great to hear that it has been useful. It's also a good place for job scripts and other files for sharing with other project members.

It is difficult to copy and paste information into the Pawsey application portal.

Historically, a lot of applicants left critical information out of their applications. Examples include estimates of the number and size of jobs they intended to run, details of the code and its dependencies, sizes of data products, memory footprints, and data transfer details. Fields have been added over time to encourage applicants to provide sufficient information for the application to be assessed.

It may be possible in the future to provide fields at a coarser level to make it easier to insert information.

There are so many schemes to apply for it can be confusing to a new user.

The various schemes are due to the nature of the funding for the infrastructure. Through co-investment from various parties, a larger system can be obtained. Schemes provide the ability to have parts of the allocated time with different eligibility criteria based on the funding secured.

If you are having difficulty deciding which scheme to apply for, our staff are happy to guide you through the process.

We were really happy with the responsiveness of our Director Share application when we needed access quickly.

Access to Magnus via the Pawsey Director Share Scheme is open throughout the year. Typically, it takes a couple of weeks from application to a project having access. If there is a reason why access is needed sooner, such as preparing for a conference paper deadline, you can let us know and we can try to expedite the process. This is helped by providing sufficient information in the application.

Why is there a 24 hour limit to jobs on Magnus? I would like to run jobs for longer.

There are a number of factors that led to the 24 hour limit. From an operational point of view, nodes on Magnus are serviced in groups of four which means that if one node is faulty, the jobs running on the other three will complete within 24 hours to allow the node to be swiftly returned to service.

From a scheduling point of view, it also means that all nodes will have a job finish at least once a day, providing an opportunity for other jobs to start running. A longer wall time would lead to less node turnover, and increase the time for jobs to start running.

From a resource management point of view, when running jobs on hundreds or thousands of cores the chance of a failure is much higher. By having codes checkpoint progress at least once every 24 hours, the loss of large amounts of processing time due to component failure is mitigated.

Pawsey can provide reservations with longer wall times for exceptional cases. We are also evaluating the creation of a long queue on Zeus next year to allow jobs to run for several days.

We would appreciate any further feedback regarding a preference for the duration of the long queue. However, workflows that run for weeks or months without checkpointing cannot be supported, and may be good candidates for parallelisation.

This was also raised at the previous user forum, so we recognise this is an important issue for our users.
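One common pattern for working within the 24-hour limit is a job script that resubmits itself until the work is finished. A minimal sketch, assuming the application can periodically write, and restart from, its own checkpoint files (the executable name, restart flag, and finished.flag marker are hypothetical):

    #!/bin/bash
    #SBATCH --nodes=64
    #SBATCH --time=24:00:00

    # Run the application, restarting from the most recent checkpoint it has written
    srun ./my_code --restart-from latest_checkpoint.dat

    # If the application has not yet written its completion marker, resubmit this
    # script (saved here as my_job.slurm) to continue from the latest checkpoint
    if [ ! -f finished.flag ]; then
        sbatch my_job.slurm
    fi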

My group is interested in machine learning, what resources are available at Pawsey?

Our advanced technology cluster, Athena, is currently being tested by early adopters. It has 11 nodes, each with 4 NVIDIA Pascal P100 GPUs, and would be an excellent test bed for a machine learning workload. We expect to open the system to our users over the next couple of months.

We are also investigating adding some GPU nodes to Nimbus, Pawsey's cloud service, in the near future. This will be a prime development platform for machine learning. 

It seems that there is a high frequency of incidents on Magnus.

Magnus did have some initial issues as it was brought into production, particularly with the /scratch file system. However, it has since been very stable. Our operations team have decided to make available information on all incidents that may impact users, even if they are quite minor.

How do I get access to GPU systems at Pawsey?

The project leader of allocations via merit processes such as the NCMAS, Energy and Resources, and Pawsey Partner schemes can get in touch via the help desk with a short justification of why access is needed. This is to ensure that the GPU-enabled nodes are used for GPU jobs rather than general processing.

Will a similar system be in place for the long queue?

The method of access to the long queue is still under consideration. It is possible a similar request-for-access mechanism will be used. Other centres take the approach of charging the allocation at a higher rate for cores used in the long queue. We would appreciate further feedback from users if there is a strong preference.


Pawsey User Forum (Perth), Murdoch University, 6th October 2017


What can I do to increase the chance my jobs on Magnus will run sooner?

The scheduler is effectively playing Tetris with the jobs a couple of times a minute. Providing the scheduler with jobs that fit the available gaps may allow your job to start sooner. For this reason it is important to set the wall time of a job close to its expected run time, with a modest margin in case it takes longer than expected, rather than simply requesting the 24-hour queue limit, as in the sketch below.
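A minimal sketch of the relevant job script header (the node count, times, and executable name are illustrative):

    #!/bin/bash
    #SBATCH --nodes=4
    # The run is expected to take about two hours, so request that plus a modest
    # margin rather than the full 24:00:00 queue limit
    #SBATCH --time=02:30:00

    srun ./my_code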

Sometimes our jobs fail because we do not have enough licenses available on our license server.

It may be possible to configure the scheduler to check your license server for available licenses before commencing your job. The feasibility depends on the type of licenses and availability of the license server, and implementation may take time depending on staff availability. Users should contact Pawsey via the help desk to discuss in more detail for particular cases.

Will other jobs running on the system affect my run times?

The most common points of contention on the system are the shared file systems (/scratch and /group). There is 70 GB/s of bandwidth, but the slowdown is often caused by a heavy load on the metadata server rather than the object storage servers. Due to the version of Lustre supported on the /scratch file system, it is limited to one metadata server. Pawsey is currently evaluating the acquisition of a second metadata server for /group, and will ensure future file systems have additional metadata servers as well.

I received an email about the file system being full last week and removed some files; is there enough space now?

The file system consists of a number of storage targets that are used in a round-robin scheme. The scheme should avoid using storage targets that are approaching capacity, but it had not been configured to do so; this was addressed in the recent maintenance on Tuesday. It is always helpful for users to remove files from the /scratch file system once they are no longer needed, as this frees capacity for other users and reduces the load on the purge daemon.

How is the schedule for training determined? It would be helpful if it coincided with non-teaching weeks for universities.

We are currently looking at the training schedule for 2018. Historically we have held training every two months locally, in addition to several times a year nationally. We try to align training with non-teaching weeks where possible, however it is complicated by the breaks for different universities not aligning.

Why is there a chipset difference between the Magnus login and compute nodes?

The first phase of Magnus was two cabinets of Intel Sandy Bridge processors, with matching login nodes. When Magnus underwent its petascale expansion, these were replaced with the current 8 cabinets of Haswell processors, but the login nodes were not upgraded. Pawsey has since acquired Haswell login nodes, which are currently undergoing configuration. These login nodes are part of the investigation Pawsey is conducting on the feasibility of moving Magnus from CLE 5 to CLE 6. In the longer term, as the nodes of supercomputers become more heterogeneous, the login nodes can only match a subset of the nodes and some compilations may have to occur using compute nodes in the debug queue.

Why is there a 24 hour limit to jobs on Magnus? I would like to run jobs for longer.

There are a number of factors that led to the 24 hour limit. From an operational point of view, nodes on Magnus are serviced in groups of four which means that if one node is faulty, the jobs running on the other three will complete within 24 hours to allow the node to be swiftly returned to service.

From a scheduling point of view, it also means that all nodes will have a job finish at least once a day, providing an opportunity for other jobs to start running. A longer wall time would lead to less node turnover, and increase the time for jobs to start running.

From a resource management point of view, when running jobs on hundreds or thousands of cores the chance of a failure is much higher. By having codes checkpoint progress at least once every 24 hours, the loss of large amounts of processing time due to component failure is mitigated.

Pawsey can provide reservations with longer wall times for exceptional cases. We are also evaluating the creation of a long queue on Zeus next year to allow jobs to run for several days.

We would appreciate any further feedback regarding a preference for the duration of the long queue. However, workflows that run for weeks or months without checkpointing cannot be supported, and may be good candidates for parallelisation.

How are jobs prioritised on Magnus?

The scheduler weights a number of factors to determine the priority of a job, including:

  • The fraction of the allocation the project has used for that quarter. Projects that have used less of their allocation will receive a higher priority than those that have used more.
  • The size of the job. Larger jobs receive a higher priority. This is both to encourage users to scale their workflows, and to counteract the fact that the scheduler can more easily place smaller jobs.
  • The time in the queue. Jobs that have been waiting in the queue longer receive a modest increase to their priority.
  • The quality of service. Projects that have used all of their allocation receive a much lower priority, but may still run if there is time available.

Exceptional increases in priority can be requested, for example if results are needed in time for a paper submission deadline.

Generally, groups that keep jobs in the queue make good use of their allocations.
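For those who want to see how these factors combine for their own jobs, a minimal sketch using Slurm's priority tools (assuming the multifactor priority plugin, which matches the factors described above; the job ID is a placeholder):

    # Show the weighted priority components (age, fair-share, job size, QOS) for a queued job
    sprio -j 123456

    # List your own queued jobs together with their current priority values
    squeue -u $USER -o "%.10i %.10Q %.20j %.10T"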

It would be helpful if we could divide our allocation between the members of our group.

This is something that Pawsey is looking into, however it is non-trivial and may take some time to develop an interface to facilitate per-person allocations.

It would be useful for institutions to have an easy way to find out the Pawsey resources used by its staff and students.

Currently, such requests are processed manually. We are looking into the development of an automated solution for reporting to institutions, however such interfaces take time to develop.

Is a quarterly reset of allocations the right time frame?

We previously used an annual reset, which caused issues towards the end of the year, with groups that had not managed their allocations being unable to run jobs for a long period of time. The quarterly reset has been a significant improvement in this regard. A monthly reset was discussed at the user forum, but was seen as too frequent, as individual months of allocation could be lost to leave or teaching duties.

