
...

  • Step 1: In the File menu, select Site Manager. The "Site Manager" settings window will pop up.
  • Step 2: In the lower left corner of the window, select "New Site". Name the site as you want (hpc-data is the name used here).
  • Step 3: In the Protocol entry, select SFTP as the protocol. In the Host entry, type hpc-data.pawsey.org.au
  • Step 4: Set the Logon Type to "Ask for password".


  • Step 5: Click "Connect". Use your username and your password for accessing Pawsey systems.

...

  • Step 1: Open a New Login Setup window. If it is not opened by default, then from the Session menu, choose "New Session".
  • Step 2: In the left panel, choose "New Site". Then, from the "Manage" menu at the bottom (towards the left), choose "Save as" and give the connection a name (hpc-data is the name used here).
  • Step 3: Enter hpc-data.pawsey.org.au as the hostname, then your username and your password. (Do not save the password in WinSCP.)
  • Step 4: Click "Save" and then "Login".


Basic usage

After a connection has been established, the current directories in the local and remote filesystems are listed. You can navigate within the file systems by clicking into subdirectories or by going up one level using the "Parent directory" icon. Several other navigation icons are also available.

However, this basic navigation may not be optimal for Pawsey filesystems. A faster and more practical way is to go directly to the directory you want to work with. To do so, double-click the current directory path on the grey-shaded line (in this case, the line that shows "/home/espinosa/"). A pop-up window will appear in which you can type the desired path directly (for example /scratch/[project]/[user]) and even save it as a bookmark for later use.


When finished, choose "Disconnect" within the Session Menu.

...

You should set default permissions for transferred files. See the WinSCP UI permissions documentation for how to access the settings. The image below shows the recommended settings, which give read/write access to others in your project and no access to anyone outside your project.

[Image: WinSCP settings for Magnus]

Warning: Use preservation of times with care

WinSCP has a setting to preserve timestamps (i.e. modification times) of the files in the destination system.

This setting should not be used when transferring files to the /scratch filesystem, where a 30-day purge policy is in place (see Scratch Purge Policy). Files that have not been accessed for more than 30 days and are transferred to /scratch with their timestamps preserved (for instance, via the -t option) will be deleted by the purge policy, resulting in data loss.
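Before a transfer with preserved timestamps, it can help to see which local files already look older than the purge window. The following is a minimal sketch, assuming a POSIX shell and find; the function name list_stale_files is hypothetical, and the test on access time (-atime) follows the wording of the warning above. If your client preserves only modification times, -mtime may be the more relevant test.

```shell
#!/bin/sh
# List regular files under a directory whose last access time is more than
# N whole days in the past (default 30, the purge window quoted in the text).
# Hypothetical helper for illustration; it only reads metadata and does not
# modify or transfer anything.
list_stale_files() {
    dir=$1
    days=${2:-30}
    find "$dir" -type f -atime +"$days" -print
}
```

For example, list_stale_files ./results prints the paths that would arrive on /scratch already looking stale if their timestamps were kept.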

...

  • Step 1: Click the Open Connection button (at the top left of the main GUI window).
  • Step 2: Fill out the Server, Username and Password fields. Do not tick the Save Password box.
    [Image: Cyberduck settings for Magnus]

  • Step 3: Accept the SSH key fingerprint by selecting "Allow", and remember the setting by ticking the "Always" box.
    [Image: Cyberduck settings for Magnus]

In Cyberduck it is useful to add a bookmark once you have navigated to the desired folder on the server. Add it via the Bookmark menu.

...

It is available as a module on the data mover nodes. Users may run it directly from the command line on a data mover node, or submit a job to copyq.

 

Code Block
module load mpifileutils
mpirun -np 4 dcp -p SOURCE DESTINATION

SOURCE can be either a file or a directory. The -p option preserves file attributes, e.g. the permissions, group association and ownership of the files.
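One way to check that attributes survived a copy is to compare them on the source and the destination. The sketch below is not part of mpifileutils: the helper names file_attrs and same_attrs are hypothetical, and it assumes GNU stat (the -c format flag), as found on typical Linux systems.

```shell
#!/bin/sh
# Print "mode owner group" for a path. Assumes GNU stat; BSD/macOS stat
# uses different flags.
file_attrs() {
    stat -c '%a %U %G' "$1"
}

# Return 0 when both paths have identical permissions, owner and group.
# Return 1 (with a message on stderr) when they differ, 2 if stat fails.
same_attrs() {
    src_attrs=$(file_attrs "$1") || return 2
    dst_attrs=$(file_attrs "$2") || return 2
    if [ "$src_attrs" = "$dst_attrs" ]; then
        return 0
    fi
    echo "attributes differ: $1 [$src_attrs] vs $2 [$dst_attrs]" >&2
    return 1
}
```

After a copy, same_attrs SOURCE_FILE DESTINATION_FILE exits with status 0 only if the three attributes match.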

 

Code Block
mpirun -np 4 dcp -f SOURCE DESTINATION

The -f option deletes any files in the DESTINATION directory if an error occurs during the operation.
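The commands above could also be wrapped in a batch job for copyq. The following is only a sketch under stated assumptions: the helper write_dcp_job is hypothetical, the job name and time limit are placeholders rather than site recommendations, and the task count of 4 simply mirrors the mpirun example. Check the current Pawsey documentation for the recommended launcher and limits.

```shell
#!/bin/sh
# Write a SLURM batch script that runs dcp on copyq.
# Arguments: SOURCE DESTINATION [OUTPUT_SCRIPT]
# Hypothetical helper; job name and time limit are placeholders.
write_dcp_job() {
    src=$1
    dst=$2
    out=${3:-dcp_job.sh}
    # Unquoted heredoc delimiter: $src and $dst are expanded now,
    # so the generated script contains the literal paths.
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH --job-name=dcp-copy
#SBATCH --partition=copyq
#SBATCH --ntasks=4
#SBATCH --time=01:00:00

module load mpifileutils
mpirun -np 4 dcp -p "$src" "$dst"
EOF
}
```

Calling write_dcp_job SOURCE DESTINATION creates dcp_job.sh in the current directory, which can then be inspected and submitted with sbatch.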

 

Here is sample output from copying a 100GB file from group space to scratch (the file was striped over 4 OSTs).

Code Block
 mshaikh@hpc-data1:/scratch/pawsey0001/mshaikh/dcp2-test> mpirun -np 4 dcp $MYGROUP/dcp2-test/100gb.bin $MYSCRATCH/dcp-test/
[2017-01-27T12:34:41] [0] [handle_args.c:315] Walking /group/pawsey0001/mshaikh/dcp2-test/str-c4/100gb.bin
[2017-01-27T12:34:42] [0] [dcp.c:222] Creating directories.
  level=6 min=0 max=0 sum=0 rate=0.000000/sec secs=0.000006
[2017-01-27T12:34:42] [0] [dcp.c:430] Creating files.
  level=6 min=0 max=1 sum=1 rate=1.587000 secs=0.630120
[2017-01-27T12:34:43] [0] [dcp.c:922] Copying data.
[2017-01-27T12:48:00] [0] [dcp.c:967] Fixing permissions.
[2017-01-27T12:48:00] [0] [dcp.c:1362] Syncing updates to disk.
[2017-01-27T12:48:00] [0] [dcp.c:146] Started: Jan-27-2017,12:34:41
[2017-01-27T12:48:00] [0] [dcp.c:147] Completed: Jan-27-2017,12:48:00
[2017-01-27T12:48:00] [0] [dcp.c:148] Seconds: 798.996
[2017-01-27T12:48:00] [0] [dcp.c:149] Items: 1
[2017-01-27T12:48:00] [0] [dcp.c:150]   Directories: 0
[2017-01-27T12:48:00] [0] [dcp.c:151]   Files: 1
[2017-01-27T12:48:00] [0] [dcp.c:152]   Links: 0
[2017-01-27T12:48:00] [0] [dcp.c:154] Data: 100.000 GB (107374182400 bytes)
[2017-01-27T12:48:00] [0] [dcp.c:158] Rate: 128.161 MB/s (107374182400 bytes in 798.996 seconds)
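The rate in the last line can be reproduced from the byte count and the elapsed time reported in the log. The small sketch below (the function name rate_mib_per_s is ours, not part of dcp) divides by 1024*1024; the result matches the log, which suggests that the "MB/s" and "GB" figures in this output are in binary units.

```shell
#!/bin/sh
# Convert a byte count and an elapsed time in seconds into a rate in
# units of 1024*1024 bytes per second, printed with three decimals.
rate_mib_per_s() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.3f\n", bytes / secs / (1024 * 1024) }'
}

# Values taken from the sample output above.
rate_mib_per_s 107374182400 798.996   # prints 128.161
```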

Passphrase-less secure transfers

Transferring data using a protocol based on SSH protects the information and ensures its integrity. However, setting up the environment correctly can be tricky, and mistakes introduce security risks. This is especially true when automating copy operations, for example through a SLURM job on the data mover nodes. In that scenario, public key-based authentication is recommended: the ssh client, running on a Pawsey supercomputer node, needs only the private key to connect to the third-party system's ssh server, which holds the corresponding public key and uses it to perform a secure handshake. The private key, however, must not be protected by a passphrase, because otherwise human input would be required. This raises several issues, which are addressed below.

A user must generate a key-pair specifically for this purpose, i.e. data transfers to and from Pawsey systems. Let's call it COPYPAIR. Do not reuse an existing key-pair that is used to log in to Pawsey or other systems (which, by the way, should be protected by a passphrase). Keeping the key-pairs separate confines the damage from a compromised key-pair to the transfers it was created for.

The ssh server on the third-party system should be configured to accept COPYPAIR's public key only for connections that originate from Pawsey's data mover machines. This protects the third-party server from unauthorised use of COPYPAIR from outside the Pawsey network. To enable this restriction, prepend the following string to COPYPAIR's public key:

from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty

followed by a space; then, append the result to the ssh server's authorized_keys file. Here is an example.

Code Block
from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDhGk1QdMVDVao1j9eclHPPhniU5x6rHYBhJp88DJZrEiDM3Kt70+gHvo/fCGaHmOMWQX0hjqLs5uin42VGUW7w3y0FrIBB/hZJro+JKXJzhUJFpTE/wR08CK8DI4c3GrxjrCqNRkd3ff4AOUIgS7VFGcmagg9aAj6iSas1ibvAMLMZuXkVyPcNcKhB+J38atc3u5/zuRqU9QgKGQvTQgLL7lx4CrsHGKd8bPzjdEVDaCoeD1KBdRq/S+am2wvaPwN5wqqgs6hVU83VvZggIBkGRLBbGEeMmnzu8dkG1osqE4S3RCmFVQ8MG9tiOiP0MN/jx/DpckP++NnuamJWcD/Z comment
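The key generation and the construction of the restricted entry can be sketched as follows. This is an illustration, not a prescribed procedure: the function names make_copypair and restricted_entry, the default key file name and the ed25519 key type are all assumptions. Note that sshd reads the options in authorized_keys as a comma-separated list, with no spaces outside the double quotes, and the sketch writes them that way.

```shell
#!/bin/sh
# Generate a passphrase-less key-pair dedicated to transfers (COPYPAIR).
# The default file name and the ed25519 key type are illustrative choices.
make_copypair() {
    keyfile=${1:-$HOME/.ssh/copypair}
    ssh-keygen -t ed25519 -N '' -C 'copypair' -f "$keyfile"
}

# Print an authorized_keys entry for a public key file: the restriction
# options, one space, then the unmodified public key line.
restricted_entry() {
    pubfile=$1
    opts='from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty'
    printf '%s %s\n' "$opts" "$(cat "$pubfile")"
}
```

On the third-party server, the output of restricted_entry copypair.pub would then be appended to the authorized_keys file of the account that receives the transfers.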