What is a data store?

A data store is a digital repository that stores and safeguards the information in computer systems. A data store can be network-connected storage, distributed cloud storage, a physical hard drive, or virtual storage. It can store both structured data like information tables and unstructured data like emails, images, and videos. Organizations use data stores to retain, share, and manage information across business units.

Why is a data store important?

You can use a data store to reliably save information in computer systems and prevent data loss. Computer systems store information on persistent storage devices. Persistent storage is nonvolatile, which means the storage retains the data even after a device’s power is turned off. This ensures that the computer system has access to the same data after it is powered on again.
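
As a rough illustration of persistence, the sketch below writes a record to a file, flushes it to the storage device, and reads it back the way a restarted program would. The function names `save_record` and `load_record` and the sample record are hypothetical and exist only for this example.

```python
import json
import os
import tempfile
from pathlib import Path


def save_record(path: Path, record: dict) -> None:
    """Write a record to a file and ask the OS to flush it to the device."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())  # push buffered bytes down to persistent storage


def load_record(path: Path) -> dict:
    """Read the record back, as a program would after a restart."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # A temporary directory keeps the demo self-cleaning; a real system
    # would write to a durable location instead.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "order.json"
        save_record(path, {"order_id": 1001, "status": "paid"})
        print(load_record(path))
```

Data held only in program variables would be lost when the process or the machine stops, whereas the file contents remain available to the next run.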

Businesses use data stores to manage, categorize, and streamline data for operations, analysis, reporting, and data retention, which is important for regulatory compliance. Common use cases include storing data that applications create and consume, data archiving, data analytics, and disaster recovery.

Due to the complexities in data requirements, companies use different types of data storage infrastructure to provide accessibility, redundancy, governance, and transparency. For example, organizations use Amazon Elastic File System (Amazon EFS) for a serverless file system and Amazon Simple Storage Service (Amazon S3) for object storage. 

In the context of data storage, several terms are often used interchangeably but have slightly different meanings. We give some examples below.

Database

A database is an organized storage system. Many databases are based on the relational database architecture. A relational database management system (RDBMS) allows users to store data in tables of rows and columns, where related tables are linked through shared key values. Organizations use databases to store transactional data, such as accounting, sales, and administrative logs.
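
To make the idea of related tables concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `sales` tables, their columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3


def build_sales_db() -> sqlite3.Connection:
    """Create two related tables and load a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE sales ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ana"), (2, "Ben")],
    )
    conn.executemany(
        "INSERT INTO sales (customer_id, amount) VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (2, 75.5)],
    )
    conn.commit()
    return conn


def total_sales_by_customer(conn: sqlite3.Connection) -> list:
    """Join the tables on the shared key and total the sales per customer."""
    return conn.execute(
        "SELECT c.name, SUM(s.amount)"
        " FROM customers c JOIN sales s ON s.customer_id = c.id"
        " GROUP BY c.id, c.name ORDER BY c.name"
    ).fetchall()


if __name__ == "__main__":
    print(total_sales_by_customer(build_sales_db()))
```

The `customer_id` column in `sales` refers to `id` in `customers`, which is what lets a single query combine rows from both tables.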

Data stores compared to databases

Discussions on data stores involve different methods to store and retrieve information. A database is one method that allows applications to store, share, and retrieve data easily. Unlike file systems, a database adheres to specific rules of how data is organized, formatted, and stored in the database. 

Data warehouse

A data warehouse is an extensive collection of business-related information acquired from various sources. Companies use data warehouses to support business intelligence and analytics. Business analysts and data scientists derive actionable insights from a data warehouse.

Data stores compared to data warehouses

Data store is an umbrella term that includes the different hardware, technologies, formats, and architectures for storing and retrieving information. A data warehouse is a specific type of data store for consolidating analytical data for businesses. For example, GE Renewable Energy uses Amazon Redshift to gain new insights into its collected data.

How does a data store work?

A physical data storage device is the underlying technology behind a data store. You can read and write information to the device in specific formats such as files, tables, or blocks. The device can be local, remote, or in the cloud. Large data stores are typically distributed across multiple physical devices in different geographic locations. Software systems and services abstract the underlying operations of the data store.

We give some examples of physical devices below. Different types of data storage devices provide varying degrees of security and redundancy.

Flash and SSD drives

A solid state drive (SSD) is a semiconductor-based storage device that reads and writes data in flash memory chips. Flash storage technology was commercially available in pen drives before becoming an alternative to hard disk drives (HDD). Compared to an HDD, an SSD has no moving parts, which generally gives it faster performance and makes it less prone to mechanical failure.

Hybrid storage array

A hybrid storage array is a physical storage setup that combines SSDs and HDDs. While an SSD offers low-latency operation, it costs much more per unit of storage than an HDD. Therefore, organizations use a hybrid storage array to balance performance, capacity, and cost.

RAID

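RAID stands for redundant array of independent disks. It is a technology that combines multiple drives, which can be HDDs or SSDs, into one logical unit. RAID typically keeps copies of the data, or parity information derived from it, on more than one drive so that the data remains available if a drive fails.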
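
The following sketch illustrates two common redundancy ideas behind RAID, mirroring and XOR parity, using in-memory dictionaries as stand-ins for drives. It is a simplified model under those assumptions, not how a real RAID controller is implemented, and every function name is hypothetical.

```python
from typing import Dict, List, Optional

# Each simulated drive maps a block ID to bytes; None marks a failed drive.
Disk = Optional[Dict[int, bytes]]


def mirror_write(disks: List[Disk], block_id: int, data: bytes) -> None:
    """Mirroring: write the same block to every healthy drive."""
    for disk in disks:
        if disk is not None:
            disk[block_id] = data


def mirror_read(disks: List[Disk], block_id: int) -> bytes:
    """Read the block from the first healthy drive that holds it."""
    for disk in disks:
        if disk is not None and block_id in disk:
            return disk[block_id]
    raise IOError(f"block {block_id} is unavailable on all drives")


def xor_parity(blocks: List[bytes]) -> bytes:
    """Parity: byte-wise XOR of equally sized blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            parity[i] ^= value
    return bytes(parity)


def rebuild_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """XOR of the surviving blocks and the parity yields the lost block."""
    return xor_parity(surviving + [parity])


if __name__ == "__main__":
    disks: List[Disk] = [{}, {}]
    mirror_write(disks, 0, b"data")
    disks[0] = None  # simulate a drive failure
    print(mirror_read(disks, 0))

    blocks = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(blocks)
    print(rebuild_missing([blocks[0], blocks[2]], parity))
```

Mirroring trades capacity for simplicity because every block is stored in full on each drive, while parity needs only one extra block per group but requires computation to rebuild lost data.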

What are the different data store formats?

Data stores are designed to process and organize data in different formats.

File storage

File storage organizes stored information in a top-to-bottom hierarchy of files and folders. Computers use file storage to make storing, searching, and retrieving information easy for users. You can use the file storage system to store and organize almost any type of data. While file storage is easy to use, it is hard to scale horizontally due to its tightly connected architecture.
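
As a small illustration of the hierarchy, this sketch builds a folder tree with Python's `pathlib` and then searches it by file name pattern. The folder and file names are invented for the example.

```python
import tempfile
from pathlib import Path
from typing import List


def build_tree(root: Path) -> None:
    """Create a small hierarchy of folders and files under root."""
    (root / "reports" / "2024").mkdir(parents=True)
    (root / "images").mkdir()
    (root / "reports" / "2024" / "q1.txt").write_text("Q1 summary")
    (root / "reports" / "2024" / "q2.txt").write_text("Q2 summary")
    (root / "images" / "logo.png").write_bytes(b"\x89PNG")


def find_files(root: Path, pattern: str) -> List[str]:
    """Walk the hierarchy and return matching paths relative to root."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_tree(root)
        print(find_files(root, "*.txt"))
```

Each file is located by its full path through the folder hierarchy, which is what makes file storage intuitive to browse.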

Block storage

Block storage divides data into evenly sized segments called blocks. The block storage system can store different blocks on different physical devices, and it retrieves and reassembles them when users request specific data. It uses a mapping system to locate the requested data based on block metadata. Metadata is additional information that helps users or applications find specific information in the storage.
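
The split, map, and reassemble steps can be sketched as follows. The dictionaries stand in for physical devices, the round-robin placement and the tiny block size are assumptions chosen for readability, and the names `write_blocks` and `read_blocks` are hypothetical.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def write_blocks(
    data: bytes,
    devices: List[Dict[int, bytes]],
    block_size: int = BLOCK_SIZE,
) -> List[Tuple[int, int]]:
    """Split data into fixed-size blocks and spread them across devices.

    Returns a block map of (device index, block ID) pairs in data order.
    The final block may be shorter when the data length is not a multiple
    of the block size.
    """
    block_map = []
    for offset in range(0, len(data), block_size):
        block_id = offset // block_size
        device_index = block_id % len(devices)  # simple round-robin placement
        devices[device_index][block_id] = data[offset:offset + block_size]
        block_map.append((device_index, block_id))
    return block_map


def read_blocks(
    block_map: List[Tuple[int, int]],
    devices: List[Dict[int, bytes]],
) -> bytes:
    """Use the block map to fetch every block and reassemble the data."""
    return b"".join(devices[device][block] for device, block in block_map)


if __name__ == "__main__":
    devices: List[Dict[int, bytes]] = [{}, {}, {}]
    block_map = write_blocks(b"hello block storage", devices)
    print(block_map)
    print(read_blocks(block_map, devices))
```

Here the block map plays the role of the metadata: without it, the blocks scattered across devices could not be put back in the right order.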

Object storage

Object storage stores unstructured data in a scalable, self-contained repository that can be hosted on different servers. Each object bundles the data itself with a unique identifier and metadata that describes it. For example, an object can store social media content, videos, emails, and audio files. Applications search for information in the object storage by using specific metadata attributes such as video resolution, duration, and location.
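
A minimal in-memory sketch of this model is shown below. The `ObjectStore` class, its methods, and the sample keys and metadata attributes are hypothetical and are not the API of any particular object storage service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredObject:
    """One object: a unique key, the raw data, and descriptive metadata."""
    key: str
    data: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class ObjectStore:
    """A flat, in-memory stand-in for an object storage service."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        """Store or overwrite an object under its key."""
        self._objects[key] = StoredObject(key, data, dict(metadata))

    def get(self, key: str) -> bytes:
        """Return the data of the object stored under key."""
        return self._objects[key].data

    def find(self, **criteria: str) -> List[str]:
        """Return the keys of objects whose metadata matches every criterion."""
        return sorted(
            obj.key
            for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


if __name__ == "__main__":
    store = ObjectStore()
    store.put("videos/intro.mp4", b"...", resolution="1080p", location="Berlin")
    store.put("videos/demo.mp4", b"...", resolution="720p", location="Berlin")
    print(store.find(resolution="1080p"))
    print(store.find(location="Berlin"))
```

Unlike file storage, the namespace is flat: the slash in a key is only part of the name, and lookups rely on the key or on metadata rather than on a folder hierarchy.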

What are the different types of data stores?

There are several types of data stores, each with its own setup and characteristics.

Direct-attached storage

Direct-attached storage (DAS) consists of storage devices that connect physically to a computer. For example, a DAS setup connects a hard drive, optical disc, or flash drive to a computer. Creating backup copies on DAS is fairly straightforward, but data sharing with other computers is difficult.

Network-attached storage

Network-attached storage (NAS) is a dedicated file storage device that makes data continuously available to applications and users over a network so that they can collaborate effectively. NAS devices are specialized servers that handle only data storage and file sharing requests. They provide fast, secure, and reliable storage services to private networks.

Storage area network

A storage area network (SAN) is a high-speed data storage infrastructure that uses different types of storage media and protocols. Businesses use a SAN to scale block storage with ease and affordability. A SAN uses storage virtualization to hide the complexity of the infrastructure from the devices that connect to it.

Cloud storage

Cloud storage is distributed storage infrastructure hosted and managed by cloud providers. It is more scalable, flexible, and remotely accessible compared to on-premises storage. For example, users can connect to AWS cloud storage services as long as they have an internet connection and are authorized to access the data. Cloud storage is also cost-efficient as users pay only for the capacity used.

Hybrid cloud storage

Hybrid cloud storage allows companies to segregate data between on-premises and cloud storage services. Hybrid cloud storage helps companies migrate from legacy architecture to a lower-cost, more secure cloud environment.

How can AWS help with your data store requirements?

AWS provides several dozen cloud storage services to meet your data store requirements. Additionally, you have the option to host a data store of your choice on your Amazon Elastic Compute Cloud (Amazon EC2) instances. To choose the best AWS cloud storage service for your requirements, you need to:

  • Segment your system into workloads.
  • Identify a data storage mechanism that is most suitable for a particular workload, not a single data store for the entire system.
  • Further optimize by cost and performance to find the data store service that is most suited for you.

For example, Amazon Relational Database Service (Amazon RDS) is a popular choice for organizations that wish to set up and scale relational databases. It provides applications with a high-availability cloud data store for storing persistent operational data. Amazon RDS offers a managed database provisioning solution that frees developers from the tedious setup of storage infrastructure.

Get started with data stores on AWS by signing up for an AWS account today.
