TensorFlow is an open-source software library for machine learning. By taking advantage of multiple GPU nodes, it is possible to run machine learning and deep learning tasks at scale. The pages in this section of the Pawsey documentation explain how to do so.
The Pawsey staff would like to thank Joel Geoffrey Dunstan, a UWA student who worked with TensorFlow on Topaz during an internship, for sharing his expertise on performing distributed computations with the tool. His feedback was the starting point for this tutorial.
TensorFlow on Setonix
To learn how to run TensorFlow on Setonix, please visit Running TensorFlow on Setonix.
Prerequisites and Remarks for using TensorFlow on Topaz
- The proposed solution uses Python >= 3.6, TensorFlow 2.1, and Horovod 0.19.0, all of which are already present in the NVIDIA TensorFlow container.
- The use of Jupyter Notebooks is still under investigation. All the Python scripts are submitted as jobs using