Managing versions in object storage
Enabling versioning on an object storage bucket is a great way to make sure you never lose important work. But how exactly do you get a file version back if you want to revert your changes, or if you accidentally delete a file? And how do you manage and delete old versions when you no longer need them? This blog post will cover all of this and more!
Note: We recommend using Cyberduck for all version management, since it greatly simplifies the process. However, if you want to work with a large number of versions at once, you may be better off doing it programmatically with the AWS CLI. We will provide both options throughout this blog post.
How can I view versions in my object storage bucket?
First things first, it's important to know that, with regard to versioning, an object storage bucket can be in one of three states at any one time:
- Unversioned: versioning is off and has never been on
- Versioning-enabled: versioning is currently on
- Versioning-suspended: versioning is currently off, but has been on in the past
If you have previously enabled versioning on an object storage bucket and then later turn it off, any previous file versions will still exist within that bucket. In other words, turning versioning off does NOT remove existing file versions; it only prevents new versions from being created.
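If you are not sure which of the three states your bucket is in, the AWS CLI can usually tell you. The following is a minimal sketch, assuming your credentials are permitted to call get-bucket-versioning on the bucket. The function names and the AWS_CMD variable are hypothetical helpers of our own (AWS_CMD simply lets you swap in another command for a dry run); they are not part of the AWS CLI.

```shell
# AWS_CMD is a hypothetical override so the sketch can be dry-run; it defaults to the real CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Map the Status value reported by get-bucket-versioning to one of the three states.
# An unversioned bucket reports no Status at all, which the CLI prints as "None".
describe_versioning_state() {
  case "$1" in
    Enabled)   echo "versioning-enabled" ;;
    Suspended) echo "versioning-suspended" ;;
    *)         echo "unversioned" ;;
  esac
}

# Ask the bucket for its versioning status and print the matching state.
# Returns non-zero if the CLI call itself fails (e.g. missing permissions).
bucket_versioning_state() {
  local bucket="$1" status
  status="$("$AWS_CMD" s3api get-bucket-versioning --bucket "$bucket" \
    --query 'Status' --output text)" || return 1
  describe_versioning_state "$status"
}
```

For example, running bucket_versioning_state mybucket.store.ronin.cloud should print one of versioning-enabled, versioning-suspended or unversioned.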
You can view file versions in your object storage bucket via Cyberduck or via the terminal using AWS CLI.
In Cyberduck, once you have connected to your object storage bucket, you can view object versions by navigating to View in the top menubar and selecting Show hidden files.
Using the AWS CLI, once you have configured your object storage bucket, you can list all versions with the following command (replacing mybucket.store.ronin.cloud with your bucket name):
aws s3api list-object-versions --bucket mybucket.store.ronin.cloud --no-paginate
If your object storage bucket was created prior to 2023, you may not have the necessary permissions to be able to manage versions. Please contact your RONIN administrator or send us a message with your institution and bucket name so that we can enable them for you.
The output will be divided into two sections (press the Return key to scroll through the output):
- Versions: All versions of each file in your bucket
- DeleteMarkers: Delete markers for files that have been deleted
Note: When versioning is enabled on an object storage bucket and you delete a file, either from within Cyberduck or using the AWS CLI, the file isn't actually deleted. Instead, a delete marker is added and becomes the latest "version" of that file.
For each file version within these two sections, the following key information will be given:
- Size: size of the file in bytes (not listed for delete markers)
- Key: name of the file
- VersionId: the unique Id given to that specific version (this is important for restoring versions later on)
- IsLatest: whether that version is the latest or current version of that file
- LastModified: date that version was last modified
If you would like to list only the versions of a particular file, or of files in a particular folder (which is recommended to reduce the number of results when your bucket contains a lot of versioned files), you can add the --prefix option. For example, to see version information for a file called "myfile.txt" you would run:
aws s3api list-object-versions --bucket mybucket.store.ronin.cloud --prefix myfile.txt --no-paginate
Or, to see version information for files within a folder called "results", you would run:
aws s3api list-object-versions --bucket mybucket.store.ronin.cloud --prefix results/ --no-paginate
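If the full JSON output is more than you need, the AWS CLI's global --query and --output options can trim it down to one line per version. Here is a rough sketch of that idea; the function names and the AWS_CMD variable are hypothetical helpers of our own, and the chosen columns simply mirror the key fields described above.

```shell
# AWS_CMD is a hypothetical override so the sketch can be dry-run; it defaults to the real CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Print one tab-separated line per file version:
# Key, VersionId, IsLatest, LastModified, Size.
# The second argument (a --prefix value) is optional.
list_versions() {
  local bucket="$1" prefix="${2:-}"
  set -- s3api list-object-versions --bucket "$bucket"
  if [ -n "$prefix" ]; then
    set -- "$@" --prefix "$prefix"
  fi
  "$AWS_CMD" "$@" \
    --query 'Versions[].[Key, VersionId, IsLatest, LastModified, Size]' \
    --output text
}

# Same idea for delete markers, which have no Size.
list_delete_markers() {
  local bucket="$1" prefix="${2:-}"
  set -- s3api list-object-versions --bucket "$bucket"
  if [ -n "$prefix" ]; then
    set -- "$@" --prefix "$prefix"
  fi
  "$AWS_CMD" "$@" \
    --query 'DeleteMarkers[].[Key, VersionId, IsLatest, LastModified]' \
    --output text
}
```

For example, list_versions mybucket.store.ronin.cloud results/ lists every version under the "results" folder. If there is nothing to list, the CLI prints None.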
How do I revert back to a particular version of a file?
Again, you can restore a previous file version (or recover a file that was deleted) via Cyberduck or the AWS CLI.
In Cyberduck, you can right-click on the version you want to revert to and click "Revert". Note: if the file has been deleted, its latest version is the delete marker, which you cannot revert to; revert to the second latest version instead. It's easy to tell which versions are delete markers, as they don't have a file size listed.
If using the AWS CLI, to revert back to a previous version, you actually have to perform two steps:
- Retrieve the version you want to restore using get-object
- Put this version back as the latest version using put-object
First, use the command in the section above to find the VersionId for the version you want to restore. Then run the following command (replacing the bucket name, the name of the file you wish to restore and the version Id):
aws s3api get-object --bucket mybucket.store.ronin.cloud --key myfile.txt --version-id CLS52Wapa4g6l2fmcR483rJbQLNof9TB myfile.txt
Note: You need to include the full path to your file when specifying the --key in this command and any other commands in this blog post. For example, if your file is a couple of folders deep, this part of the command may look more like: --key results/run1/myfile.txt
This will download that particular version of the file to your current directory. To then put this version back as the latest version, run the following command:
aws s3api put-object --bucket mybucket.store.ronin.cloud --key myfile.txt --body myfile.txt
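If you find yourself restoring versions often, the two steps above can be wrapped in a small function. This is only a sketch: restore_version and AWS_CMD are hypothetical names of our own, and the function downloads to a temporary file rather than your current directory so that it does not overwrite a local copy.

```shell
# AWS_CMD is a hypothetical override so the sketch can be dry-run; it defaults to the real CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Restore one version of a file by downloading it and uploading it again
# as the new latest version. Arguments: bucket, key (full path), version Id.
restore_version() {
  local bucket="$1" key="$2" version_id="$3" tmpfile rc
  tmpfile="$(mktemp)" || return 1

  # Step 1: get-object fetches the chosen version into the temporary file.
  # Step 2: put-object uploads it again, but only if the download succeeded.
  "$AWS_CMD" s3api get-object --bucket "$bucket" --key "$key" \
      --version-id "$version_id" "$tmpfile" >/dev/null \
    && "$AWS_CMD" s3api put-object --bucket "$bucket" --key "$key" \
      --body "$tmpfile" >/dev/null
  rc=$?

  rm -f "$tmpfile"
  return "$rc"
}
```

For example: restore_version mybucket.store.ronin.cloud results/run1/myfile.txt CLS52Wapa4g6l2fmcR483rJbQLNof9TB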
What if I accidentally deleted a file?
Fortunately, as mentioned previously, deleting a file on a versioning-enabled object storage bucket doesn't actually delete the file, so you can just restore a previous version as described in the section above. Alternatively, you can delete the delete marker itself, so that the version that was current before the deletion becomes the current version again.
To do this, grab the VersionId for the delete marker of the file as outlined in the first section of this blog post (make sure you select the delete marker where "IsLatest" is true) and then run (replacing the bucket name, file name and version Id):
aws s3api delete-object --bucket mybucket.store.ronin.cloud --key myfile.txt --version-id '.yHmSIci1glzBno58LbNHr2InMNUrlR0'
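The lookup and the delete can also be combined. The sketch below uses the CLI's --query option to pick out the delete marker where IsLatest is true and then removes it. The undelete function and the AWS_CMD variable are hypothetical names of our own, and the sketch assumes the file name contains no single quotes, since the name is placed inside the query string.

```shell
# AWS_CMD is a hypothetical override so the sketch can be dry-run; it defaults to the real CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Remove the current delete marker for a file so the previous version
# becomes the latest again. Arguments: bucket, key (full path).
undelete() {
  local bucket="$1" key="$2" marker_id

  # Find the VersionId of the delete marker that is currently the latest
  # "version" of exactly this key (the prefix alone could match other keys).
  marker_id="$("$AWS_CMD" s3api list-object-versions \
      --bucket "$bucket" --prefix "$key" \
      --query "DeleteMarkers[?Key=='${key}' && IsLatest].VersionId | [0]" \
      --output text)" || return 1

  # The CLI prints "None" when the query matches nothing.
  if [ -z "$marker_id" ] || [ "$marker_id" = "None" ]; then
    echo "No current delete marker found for $key" >&2
    return 1
  fi

  "$AWS_CMD" s3api delete-object --bucket "$bucket" --key "$key" \
    --version-id "$marker_id"
}
```

For example: undelete mybucket.store.ronin.cloud myfile.txt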
So how do I actually delete a file or file version when I no longer need it?
Much like the previous command, to delete a version of a file, you need to run the delete-object command and specify the VersionId explicitly:
aws s3api delete-object --bucket mybucket.store.ronin.cloud --key myfile.txt --version-id 'lGCXPg2HVPAPPoRQr9vsrAne78DBWxAT'
In Cyberduck, you can delete a version by right-clicking on the version and then selecting "Delete".
Alternatively, you can tell RONIN to delete old (i.e. non-current) versions of files after a certain number of days in the "versioning" settings of your object storage bucket - more information is available in this blog post.
To delete a whole file forever, you must delete EVERY version of that file so that it no longer exists.
If you have a lot of specific versions you want to delete, Cyberduck will likely be the fastest solution. Alternatively, you could create a list of the files and their VersionIds and script a for loop, or use something like GNU Parallel, to run the delete-object command multiple times.
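As a rough illustration of the scripted approach, the sketch below reads a list file with one file name and VersionId per line, separated by a tab, and runs delete-object for each. The function names, the list file format and the AWS_CMD variable are all assumptions of our own. Deleting versions is permanent, so check the list carefully before running anything like this.

```shell
# AWS_CMD is a hypothetical override so the sketch can be dry-run; it defaults to the real CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Print "Key<TAB>VersionId" for every non-current version, as a starting
# point for building the list file. The prefix argument is optional.
list_noncurrent_versions() {
  local bucket="$1" prefix="${2:-}"
  set -- s3api list-object-versions --bucket "$bucket"
  if [ -n "$prefix" ]; then
    set -- "$@" --prefix "$prefix"
  fi
  "$AWS_CMD" "$@" \
    --query 'Versions[?IsLatest==`false`].[Key, VersionId]' \
    --output text
}

# Delete every version named in a tab-separated list file.
# Arguments: bucket, path to the list file.
delete_versions_from_list() {
  local bucket="$1" list_file="$2" key version_id
  while IFS="$(printf '\t')" read -r key version_id || [ -n "$key" ]; do
    [ -n "$key" ] || continue   # skip blank lines
    echo "Deleting $key ($version_id)"
    "$AWS_CMD" s3api delete-object --bucket "$bucket" --key "$key" \
        --version-id "$version_id" >/dev/null </dev/null \
      || echo "Failed to delete $key ($version_id)" >&2
  done < "$list_file"
}
```

One possible workflow is to save the output of list_noncurrent_versions to a file, remove the lines for any versions you want to keep, and then pass that file to delete_versions_from_list.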
Note: Sometimes, if you want to do a big clean-up at the end of your project, it may be easier to download just the files/versions you want to keep or store, and then move these to a fresh object storage bucket, rather than trying to clean up what you currently have.
We hope this blog post helps you go on to be the best version of yourself... or at least helps you stop worrying about that file you accidentally deleted!