...
- Step 1: In the File menu, select Site Manager. The "Site Manager" settings window will pop up.
- Step 2: In the lower left corner of the window, select "New Site". Name the site as you want (hpc-data is the name used here).
- Step 3: In the Protocol entry, select SFTP as the protocol. In the Host entry, type hpc-data.pawsey.org.au.
- Step 4: Set the Logon Type to "Ask for password".
- Step 5: Click "Connect". Use your username and your password for accessing Pawsey systems.
...
- Step 1: Open a New Login Setup window. If it is not opened by default, then from the Session menu, choose "New Session".
- Step 2: In the left panel choose "New Site". Then, from the "Manage" menu at the bottom (a bit to the left), choose "save as" and define a name for the connection (hpc-data is the name used here).
- Step 3: Enter hpc-data.pawsey.org.au as the hostname, followed by your username and your password. Do not save the password in WinSCP.
- Step 4: Click "Save" and then "Login".
Basic usage
After a connection has been established, the current directories in the local and remote filesystems are listed. You can navigate within the file systems by clicking into subdirectories or by going up one level using the "Parent directory" icon. Several other navigation icons exist.
This basic navigation may not be optimal for Pawsey filesystems, however. A more practical and faster way is to go directly to the directory you want to work with. To do that, double-click on the current directory path on the line shaded in grey (in this case, the line that has "/home/espinosa/"). A pop-up window will appear in which you can type the desired path directly (for example /scratch/[project]/[user]) and even save it as a bookmark for later use.
When finished, choose "Disconnect" within the Session Menu.
...
You should set default permissions for transferred files. See the WinSCP UI permissions documentation for how to access the settings. The image below shows the recommended settings, which give read/write access to other members of your project and no access to anyone outside your project.
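For files that have already been transferred, the same policy can be applied from a shell on the remote system. The following is a minimal sketch, not a Pawsey tool: the function name `share_with_project` and the example path are hypothetical, and the permission bits are an interpretation of the description above (read/write for the owner and the project group, nothing for others).

```shell
# Sketch: give the owner and the project group read/write access and remove
# all access for everyone else, recursively.
# share_with_project is a hypothetical helper name.
share_with_project() {
    target=$1
    # X adds execute (search) permission only to directories and to files
    # that are already executable, so directories stay traversable.
    chmod -R u+rwX,g+rwX,o-rwx "$target"
}

# Example (hypothetical path):
# share_with_project "/scratch/[project]/[user]/results"
```

This assumes that the files already belong to the project's group; `chmod` changes only the permission bits, not the group ownership.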
Warning |
---|
WinSCP has a setting to preserve the timestamps (i.e. modification times) of files in the destination system. This should not be used when transferring files to the /scratch filesystem, where a 30-day purge policy is in place (see Scratch Purge Policy). Transferring files that have not been accessed for more than 30 days to /scratch with this setting enabled will result in the deletion of those files by the purge policy, and therefore in data loss. |
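Before a transfer that preserves timestamps, it can help to check which source files would be at risk. The sketch below assumes a Linux or similar system with `find`, and assumes that the purge policy keys on access time as the warning states. The function name `list_purge_candidates` is hypothetical.

```shell
# Sketch: list files under a directory whose last access time is 30 or more
# days old, i.e. files that could be purged if their timestamps are preserved.
# list_purge_candidates is a hypothetical helper name.
list_purge_candidates() {
    src=$1
    days=${2:-30}   # purge window in days, 30 per the policy quoted above
    # find counts age in whole 24-hour periods and "+N" means "more than N",
    # so "+29" selects files last accessed 30 or more days ago.
    find "$src" -type f -atime "+$((days - 1))" -print
}

# Example (hypothetical path):
# list_purge_candidates "$HOME/data-to-upload"
```

If the command prints nothing, no file in the directory is older than the purge window by this measure.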
...
- Step 1: Click the Open Connection button (at the top left of the main GUI window).
- Step 2: Fill out the Server, Username and Password fields. Do not tick the Save Password box.
- Step 3: Accept the SSH key fingerprint by selecting "Allow", and remember the setting by ticking the "Always" box.
In Cyberduck it is useful to add a bookmark once you have navigated to the desired folder on the server. Add it via the Bookmark menu.
...
It is available as a module on data mover nodes. Users may use it directly from the command line when on a data mover node, or submit a job on copyq.
```
module load mpifileutils
mpirun -np 4 dcp -p SOURCE DESTINATION
```
SOURCE can be either a file or a directory. The option -p preserves the file attributes, e.g. permissions, group association and ownership of the files.
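For the batch route mentioned above, a job script might look like the following sketch. It writes a SLURM script that runs the same two commands on the copyq partition. The file name, job name, time limit and the `[project]` account placeholder are illustrative assumptions, as is the use of 4 tasks to match `-np 4`. SOURCE and DESTINATION remain placeholders to be replaced.

```shell
# Sketch: write a SLURM batch script that runs the dcp copy on the copyq
# partition. File name, job name, time limit and the [project] placeholder
# are assumptions; adjust them before submitting.
cat > dcp_copy.slurm <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=dcp-copy
#SBATCH --partition=copyq
#SBATCH --ntasks=4
#SBATCH --time=01:00:00
#SBATCH --account=[project]

module load mpifileutils
# -p preserves permissions, group association and ownership
mpirun -np 4 dcp -p SOURCE DESTINATION
EOF

# Submit with: sbatch dcp_copy.slurm
```

Whether copyq accepts a four-task job, and whether `mpirun` or another launcher is preferred inside a job, is not stated here and should be checked against the system's own documentation.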
```
mpirun -np 4 dcp -f SOURCE DESTINATION
```
Option -f deletes any files in the DESTINATION directory if an error occurs during the operation.
Here is a sample output for a 100 GB file copied from group space to scratch. (The file was striped over 4 OSTs.)
```
mshaikh@hpc-data1:/scratch/pawsey0001/mshaikh/dcp2-test> mpirun -np 4 dcp $MYGROUP/dcp2-test/100gb.bin $MYSCRATCH/dcp-test/
[2017-01-27T12:34:41] [0] [handle_args.c:315] Walking /group/pawsey0001/mshaikh/dcp2-test/str-c4/100gb.bin
[2017-01-27T12:34:42] [0] [dcp.c:222] Creating directories. level=6 min=0 max=0 sum=0 rate=0.000000/sec secs=0.000006
[2017-01-27T12:34:42] [0] [dcp.c:430] Creating files. level=6 min=0 max=1 sum=1 rate=1.587000 secs=0.630120
[2017-01-27T12:34:43] [0] [dcp.c:922] Copying data.
[2017-01-27T12:48:00] [0] [dcp.c:967] Fixing permissions.
[2017-01-27T12:48:00] [0] [dcp.c:1362] Syncing updates to disk.
[2017-01-27T12:48:00] [0] [dcp.c:146] Started: Jan-27-2017,12:34:41
[2017-01-27T12:48:00] [0] [dcp.c:147] Completed: Jan-27-2017,12:48:00
[2017-01-27T12:48:00] [0] [dcp.c:148] Seconds: 798.996
[2017-01-27T12:48:00] [0] [dcp.c:149] Items: 1
[2017-01-27T12:48:00] [0] [dcp.c:150] Directories: 0
[2017-01-27T12:48:00] [0] [dcp.c:151] Files: 1
[2017-01-27T12:48:00] [0] [dcp.c:152] Links: 0
[2017-01-27T12:48:00] [0] [dcp.c:154] Data: 100.000 GB (107374182400 bytes)
[2017-01-27T12:48:00] [0] [dcp.c:158] Rate: 128.161 MB/s (107374182400 bytes in 798.996 seconds)
```
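The reported rate can be checked from the byte count and elapsed time in the last log line. The small sketch below (the function name `transfer_rate` is hypothetical) shows that the "MB/s" figure in the log corresponds to dividing by 1024 × 1024, i.e. mebibytes per second. The same transfer expressed in decimal megabytes is printed on a second line for comparison.

```shell
# Sketch: recompute a transfer rate from a byte count and elapsed seconds.
# transfer_rate is a hypothetical helper name.
transfer_rate() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        # Binary units (1 MiB = 1024 * 1024 bytes)
        printf "%.3f MiB/s\n", b / s / (1024 * 1024)
        # Decimal units (1 MB = 1000000 bytes)
        printf "%.3f MB/s\n", b / s / 1000000
    }'
}

# Values taken from the sample output above
transfer_rate 107374182400 798.996
```

The first line printed is 128.161 MiB/s, which matches the rate in the log.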
Passphrase-less secure transfers
Transferring data using a protocol based on SSH allows us to protect information and ensure its integrity. However, setting up a proper environment configuration can be tricky; if not done right, security risks arise. This is especially true when one wants to automate copy operations, for example through a SLURM job on data mover nodes. In such a scenario, public key-based authentication is recommended because the ssh client, running on a Pawsey supercomputer node, only needs the private key to connect to a third-party system's ssh server, which in turn holds the corresponding public key used to perform a secure handshake. The private key, however, must not be protected by a passphrase, otherwise human input is required. There are several issues to address in this situation.
A user must generate a key-pair specifically for this purpose, i.e. data transfers to and from Pawsey systems. Let's call it COPYPAIR. Do not reuse an existing key-pair used to log in to Pawsey or other systems (which, by the way, should use a passphrase). This isolates any unauthorised access that results from a compromised key-pair.
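Generating such a dedicated key-pair could look like the following sketch, which uses the OpenSSH `ssh-keygen` tool. The key type (ed25519), the file name `copypair` and the comment string are illustrative assumptions; the function name `make_copypair` is hypothetical.

```shell
# Sketch: create a dedicated, passphrase-less key-pair for automated copies.
# Key type, file name and comment are illustrative assumptions.
make_copypair() {
    keyfile=$1   # e.g. "$HOME/.ssh/copypair"
    # -N "" sets an empty passphrase, -C adds a comment that identifies the
    # key, -q suppresses the progress output.
    ssh-keygen -q -t ed25519 -N "" -C "pawsey-data-transfer" -f "$keyfile"
}

# Example:
# make_copypair "$HOME/.ssh/copypair"
# The private key is written to copypair, the public key to copypair.pub.
```

The private key can then be passed explicitly to an ssh-based client, for example with `scp -i` together with `-o BatchMode=yes` so that the command fails instead of waiting for input inside a job.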
The ssh server on the third-party system should be configured so that COPYPAIR's public key cannot be used to authorise connections that do not originate from Pawsey's data mover machines. This is a powerful capability that protects the third-party server from unauthorised use of COPYPAIR from outside the Pawsey network. To enable this feature, prepend to COPYPAIR's public key the string
from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty
followed by a space; then, append the result to the authorized_keys file on the ssh server. The options must be separated by commas, with no spaces between them. Here is an example.
```
from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDhGk1QdMVDVao1j9eclHPPhniU5x6rHYBhJp88DJZrEiDM3Kt70+gHvo/fCGaHmOMWQX0hjqLs5uin42VGUW7w3y0FrIBB/hZJro+JKXJzhUJFpTE/wR08CK8DI4c3GrxjrCqNRkd3ff4AOUIgS7VFGcmagg9aAj6iSas1ibvAMLMZuXkVyPcNcKhB+J38atc3u5/zuRqU9QgKGQvTQgLL7lx4CrsHGKd8bPzjdEVDaCoeD1KBdRq/S+am2wvaPwN5wqqgs6hVU83VvZggIBkGRLBbGEeMmnzu8dkG1osqE4S3RCmFVQ8MG9tiOiP0MN/jx/DpckP++NnuamJWcD/Z comment
```
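The prepend step can be scripted so that the entry is built the same way every time. The sketch below prints a restricted entry for a given public key file; the function name `restrict_copypair` and the file names in the example are hypothetical. Note that sshd expects the options in an authorized_keys entry to be separated by commas, with no spaces outside the double quotes, so the sketch builds the option string in that form.

```shell
# Sketch: print an authorized_keys entry that restricts a public key to
# connections from the Pawsey data mover nodes.
# restrict_copypair is a hypothetical helper name.
restrict_copypair() {
    pubkey_file=$1
    options='from="hpc-data*.pawsey.org.au",no-port-forwarding,no-pty'
    # The option string and the key are separated by a single space.
    printf '%s %s\n' "$options" "$(cat "$pubkey_file")"
}

# Example, run on the third-party system (hypothetical file names):
# restrict_copypair copypair.pub >> "$HOME/.ssh/authorized_keys"
```

Printing the entry rather than editing authorized_keys directly lets the result be inspected before it is appended.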