Amazon S3 (Amazon Simple Storage Service) is a cloud-based storage service offered by Amazon Web Services (AWS). It allows users to store and retrieve any amount of data from anywhere on the internet. S3 is highly scalable and can handle extremely large amounts of data, making it ideal for big data projects and disaster recovery solutions.
S3 supports various data types, including text, images, videos, and audio files. S3 offers several storage classes, such as Standard, Intelligent-Tiering, and the Glacier archive tiers, which are designed to meet different storage and retrieval needs.
S3 provides strong data durability through redundant data storage across multiple facilities and devices. S3 provides a secure and flexible data management solution with options for data encryption and access control. S3 integrates with other AWS services such as Amazon EC2, Amazon RDS, and Amazon EBS, making it easy to use and manage data in the cloud.
S3 offers various tools for data management and analysis, such as Amazon S3 Transfer Acceleration and Amazon S3 Inventory. S3 has a pay-per-use pricing model, which makes it cost-effective for businesses of all sizes, as they only pay for the data they store and transfer.
Let us discuss the most interesting features of Amazon S3.
- Amazon S3 has objects and buckets: S3 is one of the services offered by Amazon Web Services that allows us to store files, called objects, in containers called buckets.
- Objects have a key. The key is the full path of the object within the bucket, and in an Amazon S3 bucket an object is identified solely by its key. You can specify object metadata when you upload an object to Amazon S3. Metadata is data about data: the metadata of an object is a set of name-value pairs. After an object is uploaded, its metadata cannot be changed in place; the only way to modify an object's metadata is to create a copy of the object and set the new metadata on the copy.
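As a small illustration (the key and metadata values below are made up), the "folders" you see in the S3 console are not real directories, just shared key prefixes:

```python
# Hypothetical key: the full path of the object within its bucket.
key = "reports/2023/q1/sales.csv"

# The "folders" shown in the S3 console are just key prefixes.
prefix, _, name = key.rpartition("/")
print(prefix)  # reports/2023/q1
print(name)    # sales.csv

# Metadata is a set of name-value pairs fixed at upload time; changing it
# later means copying the object with the new metadata set on the copy.
metadata = {"owner": "analytics-team", "format": "csv"}
print(metadata)
```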
- Buckets must have a globally unique name. Bucket names live in a single namespace shared by all S3 users. If you attempt to create a bucket with a name that is already used by an existing bucket, S3 will return an error.
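Beyond uniqueness, bucket names must also follow S3's naming rules (3-63 characters; lowercase letters, digits, dots, and hyphens; start and end with a letter or digit; not shaped like an IP address). A simplified sketch of those core rules:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check the core S3 bucket-naming rules (a simplified sketch;
    the full rules have a few more edge cases)."""
    if not (3 <= len(name) <= 63):
        return False
    # Only lowercase letters, digits, dots, and hyphens;
    # must start and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    # Must not be formatted like an IP address.
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True

print(is_valid_bucket_name("my-unique-data-bucket"))  # True
print(is_valid_bucket_name("MyBucket"))               # False (uppercase)
print(is_valid_bucket_name("192.168.1.1"))            # False (IP-like)
```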
- The max object size is 5 TB. A single Amazon S3 object can range from 0 bytes to 5 TB. 5 GB is the maximum size for a single PUT upload; larger objects must be uploaded with multipart upload. If you have an extremely large data set (>5 TB), you need to split it into more than one object.
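These limits can be turned into simple arithmetic. The sketch below (using binary units as an approximation, and a hypothetical 100 MiB part size) decides between a single PUT and a multipart upload, which is also capped at 10,000 parts:

```python
import math

MAX_OBJECT_SIZE = 5 * 1024**4   # ~5 TB per object
MAX_SINGLE_PUT  = 5 * 1024**3   # ~5 GB per single PUT
MAX_PARTS       = 10_000        # multipart upload part limit

def plan_upload(size_bytes: int, part_size: int = 100 * 1024**2):
    """Return (method, part_count) for an upload of the given size."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("too large for one object: split the data")
    if size_bytes <= MAX_SINGLE_PUT:
        return ("single PUT", 1)
    parts = math.ceil(size_bytes / part_size)
    if parts > MAX_PARTS:
        raise ValueError("too many parts: increase the part size")
    return ("multipart upload", parts)

print(plan_upload(1 * 1024**3))   # ('single PUT', 1)
print(plan_upload(50 * 1024**3))  # ('multipart upload', 512)
```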
- S3 supports object tags. Object tagging is a way to categorize storage. Tags are key-value pairs, up to 10 per object, useful for security and for object lifecycle management. Note that in S3, tags and object metadata are two different things: tags can be used to filter and select objects, while metadata cannot be searched, because it only relates to that specific object.
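A quick sketch of the per-object tag limits (10 tags, with keys up to 128 and values up to 256 characters); the tag names here are invented for illustration:

```python
MAX_TAGS = 10

def validate_tags(tags: dict) -> dict:
    """Validate an S3 object tag set (simplified: count and length limits)."""
    if len(tags) > MAX_TAGS:
        raise ValueError("an object can carry at most 10 tags")
    for key, value in tags.items():
        if len(key) > 128 or len(value) > 256:
            raise ValueError(f"tag {key!r} exceeds the key/value length limits")
    return tags

tags = validate_tags({"project": "analytics", "classification": "internal"})
print(tags)
```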
- S3 has infinite size, and it eliminates over-provisioning: With cloud storage, you can make changes instantly and use the storage you need right now without being shackled by a hardware upgrade. By removing over-provisioning, moving to Amazon S3 keeps you flexible, lowers expenses, offers infinite growth, and helps you break down data silos to derive insights from data.
- S3 has 99.999999999% durability: Amazon S3 provides highly durable storage. It is designed for 99.999999999% (eleven nines) durability and 99.99% availability of objects over a given year.
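A back-of-the-envelope reading of that design target (the object count below is an arbitrary example): eleven nines implies an average annual loss rate of about 1e-11 per object.

```python
# Design target: 99.999999999% durability -> ~1e-11 annual loss per object.
ANNUAL_LOSS_RATE = 1e-11

objects_stored = 10_000_000
expected_losses_per_year = objects_stored * ANNUAL_LOSS_RATE
print(expected_losses_per_year)      # ~0.0001 objects lost per year

# Equivalently: storing 10,000,000 objects, you can on average expect
# to lose a single object roughly once every 10,000 years.
print(1 / expected_losses_per_year)  # ~10000 years
```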
- S3 object storage supports any file format. Some examples of file types are CSV, JSON, Protobuf, Parquet, Avro, and ORC. Any sort of file can be uploaded to an S3 bucket, including photos, backups, data, and movies. Using the Amazon S3 console, you can upload files up to 160 GB in size. Use the AWS CLI, an AWS SDK, or the Amazon S3 REST API to upload files larger than 160 GB.
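Because S3 stores raw bytes, it is the object's Content-Type metadata that tells clients how to interpret a file. A small sketch, using Python's standard `mimetypes` module to guess a Content-Type from the key (S3's own default for unknown types is `binary/octet-stream`):

```python
import mimetypes

# Guess a Content-Type from the object key's extension; fall back to
# binary/octet-stream, the default S3 applies to unknown file types.
for key in ["photo.png", "backup.tar.gz", "events.json", "table.parquet"]:
    content_type, _ = mimetypes.guess_type(key)
    print(key, "->", content_type or "binary/octet-stream")
```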
- S3 is decoupled from compute: By decoupling storage from computation, you can control the costs of each separately and apply various cost-optimization techniques to reduce them. For example, S3 storage scales and is billed independently of EC2 compute.
- S3 has centralized data architecture: It is simple to create a multi-tenant environment using Amazon S3 so that numerous users can utilize various analytical tools on the same copy of the data. Compared to conventional methods, which call for the distribution of multiple copies of data across several processing platforms, this improves cost and data governance.
- S3 is the backbone of many AWS ML services. We can build reproducible and scalable machine learning systems with Amazon S3 as the central data and model repository. For example, SageMaker generally uses S3 as storage for data and model artifacts.
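ML services typically point at S3 locations with `s3://bucket/key` URIs. A minimal sketch of splitting such a URI (the bucket and key below are hypothetical):

```python
from urllib.parse import urlparse

def parse_s3_uri(uri: str):
    """Split an s3:// URI into (bucket, key) — the form SageMaker and
    other AWS ML services use to point at data and model artifacts."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError("expected an s3:// URI")
    return parsed.netloc, parsed.path.lstrip("/")

bucket, key = parse_s3_uri("s3://ml-experiments/models/model.tar.gz")
print(bucket)  # ml-experiments
print(key)     # models/model.tar.gz
```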