Amazon S3 Glacier

Amazon S3 Glacier is an archiving storage solution for infrequently accessed data.

There are two Amazon S3 storage tiers for using S3 Glacier:

S3 Glacier

  • Same low latency and high throughput performance of S3 Standard.
  • Designed for durability of 99.999999999% of objects in a single Availability Zone†.
  • Designed for 99.99% availability.
  • Backed with the Amazon S3 Service Level Agreement for availability.
  • Supports SSL for data in transit and encryption of data at rest.
  • S3 Lifecycle management for automatic migration of objects to other S3 Storage Classes.

S3 Glacier Deep Archive

  • Designed for durability of 99.999999999% of objects across multiple Availability Zones.
  • Data is resilient in the event of one entire Availability Zone destruction.
  • Supports SSL for data in transit and encryption of data at rest.
  • Low-cost design is ideal for long-term archive.
  • Configurable retrieval times, from minutes to hours.
  • S3 PUT API for direct uploads to S3 Glacier, and S3 Lifecycle management for automatic migration of objects.

The key difference between the tiers is that Deep Archive is lower cost but retrieval times are much longer (12 hours).

The S3 Glacier tier has configurable retrieval times from minutes to hours (you pay accordingly).

Archived objects are not available for real time access and you need to submit a retrieval request.

Glacier must complete a job before you can get its output.

Requested archival data is copied to S3 One Zone-IA.

Following retrieval you have 24 hours to download your data.

You cannot specify Glacier as the storage class at the time you create an object.

S3 Glacier is designed to sustain the loss of two facilities.

S3 Glacier automatically encrypts data at rest using AES 256 symmetric keys and supports secure transfer of data over SSL.

S3 Glacier may not be available in all AWS regions.

S3 Glacier objects are visible through S3 only (not Glacier directly).

S3 Glacier does not archive object metadata, you need to maintain a client-side database to maintain this information.

Archives can be 1 bytes up to 40TB.

S3 Glacier file archives of 1 byte – 4 GB can be performed in a single operation.

S3 Glacier file archives from 100MB up to 40TB can be uploaded to Glacier using the multipart upload API.

Uploading archives is synchronous.

Downloading archives is asynchronous.

The contents of an archive that has been uploaded cannot be modified.

You can upload data to Glacier using the CLI, SDKs or APIs – you cannot use the AWS Console.

S3 Glacier adds 32-40KB (indexing and archive metadata) to each object when transitioning from other classes using lifecycle policies.

AWS recommends that if you have lots of small objects they are combined in an archive (e.g. zip file) before uploading.

A description can be added to archives, no other metadata can be added.

S3 Glacier archive IDs are added upon upload and are unique for each upload.

Archive retrieval:

  • Expedited is 1-5 minutes retrieval (most expensive).
  • Standard is 3.5 hours retrieval (cheaper, 10GB data retrieval free per month).
  • Bulk retrieval is 5-12 hours (cheapest, use for large quantities of data).

You can retrieve parts of an archive.

When data is retrieved it is copied to S3 and the archive remains in Glacier and the storage class therefore does not change.

AWS SNS can send notifications when retrieval jobs are complete.

Retrieved data is available for 24 hours by default (can be changed).

To retrieve specific objects within an archive you can specify the byte range (Range) in the HTTP GET request (need to maintain a DB of byte ranges).

S3 Glacier Vault

A vault is a collection of archives.

Vault has:

  • One vault access policy
  • One vault lock policy

Vault policies are written in JSON.

A vault access policy is similar to a bucket policy.

A vault lock policy is a policy that is locked for regulatory and compliance requirements. This type of policy is immutable – it can never be changed.

Scroll to Top