Points of Contact:
- Paul Newman
- Ugo Varetto
firstname.lastname@example.org <email@example.com>; Randall Wayth <R.Wayth@curtin.edu.au>; Andrew Williams <Andrew.Williams@curtin.edu.au>; firstname.lastname@example.org; pawsey_users <email@example.com>; James.Dempsey@csiro.au
- 07:30 Cluster under investigation; nodes being checked and restarted.
- 08:30 Most services up; checks in progress.
- 09:30 All nodes powered off. Full cluster restart.
- 11:40 All services checked. Outage resolved.
- 11:41 Email sent to all users.
- The MDS (Metadata Server) became unresponsive, but not to the degree required to trigger a high-availability (HA) failover.
- The root cause is yet to be determined.
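
As a general illustration of how a server can be unresponsive to users without triggering failover, the sketch below models a simple heartbeat monitor. It is not a description of this cluster's HA configuration, and it says nothing about the root cause, which is still undetermined. The function name `should_fail_over`, the timeout, and the miss threshold are all hypothetical values chosen for the example.

```python
from typing import Optional, Sequence


def should_fail_over(probe_latencies_s: Sequence[Optional[float]],
                     timeout_s: float = 10.0,
                     max_consecutive_misses: int = 3) -> bool:
    """Return True if this (hypothetical) monitor would trigger a failover.

    Each entry is the response time of one heartbeat probe in seconds,
    or None if the probe received no reply at all.
    """
    misses = 0
    for latency in probe_latencies_s:
        if latency is None or latency > timeout_s:
            misses += 1
            if misses >= max_consecutive_misses:
                return True
        else:
            misses = 0  # any timely reply resets the counter
    return False


# A dead server: every probe goes unanswered, so failover triggers.
dead = [None, None, None]

# A degraded server: very slow, with occasional missed probes, but never
# three misses in a row, so the monitor keeps treating it as alive.
degraded = [8.5, None, 9.0, None, None, 7.9, 9.5]

print(should_fail_over(dead))      # True
print(should_fail_over(degraded))  # False
```

In this model a server that answers slowly or intermittently stays below the failover threshold even though clients would experience it as hung, which is the kind of partial failure described above.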