
Maintenance of all Pawsey systems happens on an ongoing basis. Maintenance, whether it requires an outage or not, allows Pawsey to take preventative action to mitigate hazards or risks that might affect the functionality of those systems, and/or to upgrade the capabilities of its systems. Maintenance typically includes software/hardware updates, routine performance checks and replacement of faulty components.

Pawsey schedules "systems at risk times" so that our user community can plan for outages that may be required but which have not yet been communicated, and aims to carry out any maintenance activity which may affect our user community within those times.

The current scheduled "systems at risk time" is the first Tuesday of each month. Maintenance outages will normally be confirmed by email the week before, along with an estimated timeframe.
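
As an illustration of the schedule described above, the following Python sketch computes the first Tuesday of a given month, which may help when planning work around a "systems at risk time". The function name `first_tuesday` is hypothetical and is not part of any Pawsey tooling; the dates that actually apply are those confirmed by email.

```python
import datetime


def first_tuesday(year: int, month: int) -> datetime.date:
    """Return the date of the first Tuesday in the given month."""
    first_day = datetime.date(year, month, 1)
    # date.weekday() numbers Monday as 0, so Tuesday is 1.
    days_until_tuesday = (1 - first_day.weekday()) % 7
    return first_day + datetime.timedelta(days=days_until_tuesday)


if __name__ == "__main__":
    # Print the nominal "systems at risk" dates for one calendar year.
    for month in range(1, 13):
        print(first_tuesday(2024, month).isoformat())
```

The result is only the nominal date; whether an outage takes place on that day, and for how long, depends on the confirmation email.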

Incidents are, by their nature, unscheduled. Furthermore, service outages can arise from incidents affecting infrastructure beyond the Pawsey systems themselves (e.g. power, cooling).

Maintenance and Incident wiki pages

Pawsey maintains Maintenance and Incident pages within its public-facing wiki to keep its user community informed.

Pages will contain the date of the Maintenance or Incident in the ISO 8601 format, YYYY-MM-DD.

Pages are typically prefixed with either 'M-' (Maintenance) or 'I-' (Incident).

Pages may be suffixed to indicate the systems affected: '-All', '-SC' (Supercomputing), '-Data', '-Nimbus' (Cloud), or '-Vis' (Visualisation).
An un-suffixed Maintenance page will usually contain generic details of a wider maintenance outage.
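
The naming convention above can be sketched as a small Python parser. This is a minimal sketch under one assumption that the page does not confirm: that titles take the exact shape prefix, date, optional suffix, for example `M-2024-03-05-SC`. The names `parse_title`, `PageInfo`, `KINDS` and `SUFFIXES` are hypothetical, and since pages are only "typically" prefixed, real titles may not match.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Prefixes and suffixes as listed on this page.
KINDS = {"M": "Maintenance", "I": "Incident"}
SUFFIXES = {"All", "SC", "Data", "Nimbus", "Vis"}

# ASSUMPTION: titles look like "<prefix>-<YYYY-MM-DD>[-<suffix>]".
TITLE_PATTERN = re.compile(
    r"^(?P<kind>[MI])-(?P<date>\d{4}-\d{2}-\d{2})(?:-(?P<suffix>[A-Za-z]+))?$"
)


class PageInfo(NamedTuple):
    kind: str               # "Maintenance" or "Incident"
    day: date               # date taken from the title
    system: Optional[str]   # affected system, or None if un-suffixed


def parse_title(title: str) -> Optional[PageInfo]:
    """Parse a page title, returning None if it does not fit the assumed shape."""
    match = TITLE_PATTERN.match(title.strip())
    if match is None:
        return None
    suffix = match.group("suffix")
    if suffix is not None and suffix not in SUFFIXES:
        return None
    try:
        day = date.fromisoformat(match.group("date"))
    except ValueError:
        # Digits in the right positions but not a real calendar date.
        return None
    return PageInfo(KINDS[match.group("kind")], day, suffix)


if __name__ == "__main__":
    print(parse_title("M-2024-03-05-SC"))
    print(parse_title("I-2024-03-05"))
```

Because ISO 8601 dates sort chronologically as plain text, titles that share a prefix can also be ordered by date with an ordinary string sort.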

Progress updates

Pawsey staff will try to provide progress updates on these pages as workloads allow. During incident outages, however, we ask our user community for patience and understanding with regard to updates, as staff will be focused on returning the systems to service while preserving all the jobs in the queues.

For a list of Scheduled Maintenances, and of recent/ongoing Maintenances or Incidents, please see the automatically generated index below.
Note that older pages can be accessed from within the tree view visible from any individual Maintenance or Incident page.
