
Rclone backups to AWS

Set up Rclone 1.62.2 for AWS backups from HPC

Set up the IAM user permissions in AWS

Perform the following before configuring rclone.

Log into the AWS console

Navigate to IAM

Click Add User

name: rclone-backup-from-hpc

Do not check "Provide user access to the AWS Management Console", as this account will be used only for CLI-based access via rclone.

Click Next > Select "Attach policies directly"

Search for S3 and select "AmazonS3FullAccess". If this AWS account already contains other buckets, it is up to the end user to define a more granular S3 access policy for this new IAM user.

Click Next > Create User

Now that the user is created, click on the user in the IAM console.

Navigate to Security Credentials and locate the Access Keys section.

Click "Create access key" and select "Applications running outside of AWS"

Click Next

For Description enter "rclone-backups-resnick-hpc" > Create Access Key

Leave this window open and access keys visible during the configuration of rclone on the cluster side.
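For accounts that already hold other buckets, a narrower policy can take the place of "AmazonS3FullAccess" in the policy step above. The sketch below writes one candidate policy to a local file so it can be pasted into the IAM console's JSON policy editor. The file name rclone-backup-policy.json is hypothetical, the bucket name is the one created later on this page, and the list of actions is an assumption about what the rclone mkdir, ls, and copy commands used here typically need; adjust it to your own requirements.

```shell
# Bucket name used later on this page; change it to match your own bucket.
BUCKET="resnick-hpc-backups"
# Hypothetical output file name.
POLICY_FILE="rclone-backup-policy.json"

# Write an identity policy limited to the backup bucket and its objects.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET}",
        "arn:aws:s3:::${BUCKET}/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "arn:aws:s3:::*"
    }
  ]
}
EOF
```

The second statement only allows listing bucket names across the account; it grants no access to the contents of other buckets.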

Configure Rclone

module load rclone/1.62.2

rclone config

Enter name for new remote.

name> my-aws-account

Select the S3-compatible object store. (The same backend can also be used for Backblaze B2, Wasabi, etc.)

Storage > 5

Now select the actual provider of this S3 compatible object storage

Provider > 1

Option env_auth. Enter 1 to supply your credentials in the next steps. (The other choice has rclone read the credentials from your shell environment instead.)

env_auth > 1 

Enter the new access key ID shown in the AWS console during IAM creation.

access_key_id > AKIAX4EICHL66XXXXXXX

Now enter the secret access key provided by the AWS console.

secret_access_key > (paste the secret access key)

Select a region for the storage bucket. We recommend us-west-2 (Oregon): it is one of the newer regions, is seismically separate from California, and provides fast network access via CENIC/Internet2.

region > 4

Option Endpoint (leave default by hitting enter)

endpoint > (enter)

Location constraint (used when creating buckets via rclone commands). Set to option 4 (Oregon us-west-2)

location_constraint > 4

Optional ACL (just hit enter to leave blank)

acl > (enter)

Optional server-side encryption. It is left disabled here; set it according to your requirements.

server_side_encryption > (enter)

Option sse_kms_key_id

sse_kms_key_id > (enter)

Option Storage Class

Standard-IA is a good fit for fast recovery of data you do not expect to access often. Choose a class based on your requirements and a full understanding of the implications.

storage_class > 4

Edit advanced config? No

y/n > n


Configuration complete.


- type: s3

- provider: AWS

- access_key_id: AKIAX4EICHL66XXXXXXX

- secret_access_key: aa8jXXXXXXXXXXXXXXXXXXXXXXXXX

- region: us-west-2

- location_constraint: us-west-2

- storage_class: STANDARD_IA

Keep this "my-aws-account" remote?

y) Yes this is OK (default)

e) Edit this remote

d) Delete this remote



Enter yes to save config.

y/e/d> y

Quit the config by entering q

e/n/d/r/c/s/q> q

Now create the new bucket for backups inside of the AWS account. Here we are creating the resnick-hpc-backups bucket.

rclone mkdir my-aws-account:/resnick-hpc-backups/
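Before moving on, it can help to confirm that the remote is configured and the new bucket is visible through it. The following is a minimal sketch using the rclone listremotes and rclone lsd subcommands; the function name verify_backup_target is hypothetical, and the parsing assumes the bucket name is the last field of each lsd output line.

```shell
# Names taken from this page.
REMOTE="my-aws-account"
BUCKET="resnick-hpc-backups"

# Return 0 only if the remote is configured and the bucket is visible through it.
verify_backup_target() {
    # listremotes prints one remote per line, each with a trailing colon.
    if ! rclone listremotes | grep -qxF "${REMOTE}:"; then
        echo "remote ${REMOTE}: is not configured" >&2
        return 1
    fi
    # lsd prints one line per bucket; the bucket name is the last field.
    if ! rclone lsd "${REMOTE}:" | awk '{print $NF}' | grep -qxF "${BUCKET}"; then
        echo "bucket ${BUCKET} not found on ${REMOTE}:" >&2
        return 1
    fi
    echo "ok: ${REMOTE}:${BUCKET}"
}
```

Run verify_backup_target after module load rclone/1.62.2; a non-zero exit status means either the remote or the bucket is missing.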


* Rclone stores configs and AWS keys in ~/.config/rclone/rclone.conf.

* If using one of the colder storage classes such as GLACIER, DEEP_ARCHIVE, or GLACIER_IR, familiarize yourself with its limitations, retention periods, and retrieval fees.

* To specify a remote, be sure to include the colon, e.g. my-aws-account:


* Backups are self-managed: you will need to check on the backup process now and then to ensure everything is backing up as intended.
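One way to check on a backup, as the last note suggests, is to compare the source directory against the bucket with rclone check. This is a minimal sketch rather than a required procedure: the function name check_backup is hypothetical, and the source path is the example directory used elsewhere on this page.

```shell
# Example source and destination from this page; adjust to your own paths.
SRC="${HOME}/my-source-directory"
DEST="my-aws-account:/resnick-hpc-backups/"

# Compare the source directory against the bucket and report the result.
check_backup() {
    # --one-way only verifies that every source file is present and matching
    # at the destination; extra files in the bucket are ignored.
    if rclone check --one-way "$SRC" "$DEST"; then
        echo "backup matches source"
    else
        echo "backup differs from source; inspect the rclone output above" >&2
        return 1
    fi
}
```

Objects that have moved to an archival storage class may not be directly readable, so treat the result with care if lifecycle rules are in use.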

Useful Links

Amazon S3 Glacier Storage Classes | AWS

AWS Pricing Calculator

Common commands

Basic ls

rclone ls my-aws-account:/resnick-hpc-backups/

Basic copy command

rclone copy ~/my-source-directory my-aws-account:/resnick-hpc-backups/

Add the -P (--progress) flag to view real-time transfer statistics.
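The copy command and the progress flag can be combined with a preview step. The sketch below first runs the copy with --dry-run, which reports what would be transferred without sending anything, and then performs the real copy with live statistics. The function name backup_with_preview is hypothetical.

```shell
# Example source and destination from this page; adjust to your own paths.
SRC="${HOME}/my-source-directory"
DEST="my-aws-account:/resnick-hpc-backups/"

# Preview the transfer, then run it with live statistics.
backup_with_preview() {
    # --dry-run lists what would be copied without transferring anything.
    rclone copy --dry-run "$SRC" "$DEST" || return 1
    # -P is the short form of --progress.
    rclone copy -P "$SRC" "$DEST"
}
```

Note that rclone copy transfers the contents of the source directory, not the directory itself, so the files here land at the top level of the bucket.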

Scheduling recurring backups with crontab

To schedule a job that runs repeatedly, add it to your crontab. We suggest first running the job at 5-minute intervals as a test to ensure the backups are capturing new data on the cluster side (i.e. let it run every 5 minutes while adding data to the cluster/source directories, then check S3 to verify the files are showing up).

crontab -e

*/5 * * * * /central/software/rclone/1.62.2/rclone copy ~/Google-GCP my-aws-account:/resnick-hpc-backups/

After verifying that the data is being backed up successfully, you can switch to a daily or less frequent schedule. Example for once a day at 4 AM:

0 4 * * * /central/software/rclone/1.62.2/rclone copy ~/Google-GCP my-aws-account:/resnick-hpc-backups/
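With short intervals, a new cron run can start while the previous copy is still in progress, and cron output is easy to lose. The following is a sketch of a wrapper script that addresses both, assuming the flock utility is available on the cluster. The script location, the log and lock file paths, the function name run_backup, and the --run argument are all hypothetical; the rclone path, source directory, and destination are the ones used in the crontab entries above.

```shell
#!/bin/bash
# Hypothetical wrapper, e.g. saved as ~/bin/rclone-backup.sh.
RCLONE="${RCLONE:-/central/software/rclone/1.62.2/rclone}"
SRC="${SRC:-$HOME/Google-GCP}"
DEST="${DEST:-my-aws-account:/resnick-hpc-backups/}"
LOG="${LOG:-$HOME/rclone-backup.log}"      # assumed log location
LOCK="${LOCK:-$HOME/.rclone-backup.lock}"  # assumed lock file

run_backup() {
    (
        # Skip this run if a previous one still holds the lock.
        flock -n 9 || { echo "previous backup still running" >> "$LOG"; exit 0; }
        # Send rclone's messages to the log file instead of cron mail.
        "$RCLONE" copy "$SRC" "$DEST" --log-file "$LOG" --log-level INFO
    ) 9> "$LOCK"
}

# Run only when asked to, so the file can also be sourced for testing.
if [ "${1:-}" = "--run" ]; then
    run_backup
fi
```

After making the script executable with chmod +x, the crontab entry would call it in place of rclone directly, for example: 0 4 * * * $HOME/bin/rclone-backup.sh --run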