Learn all about AWS Databases and Amazon RDS
A study guide for acing your AWS Certified Solutions Architect exam
Regardless of what type of work you do, whether front-end development, mobile application development, or pretty much any type of programming in general, you have most likely had to work with a database at some point in time.
There are two main categories of databases: Relational Database Management Systems (RDBMS) and NoSQL (non-relational) databases. Amazon RDS greatly reduces the operational cost and time needed to run most major relational database systems.
When most people think of databases, they are thinking of relational databases. They are by far the most common type of database, and are used in nearly every industry. A few examples of relational databases are MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.
As several of those names imply, relational databases share a common language called Structured Query Language, or simply SQL. SQL lets you perform nearly any task on a database, from checking the value of a specific column and row, to running deep analytics across millions of rows, to creating the database itself. For exam purposes, it's not necessary to know SQL, but it is important to know what it's used for, the differences between the types of databases, and when you would want to use each.
Relational databases contain anywhere from one to hundreds of tables. Each table has columns, which describe the type of data, and rows, which contain the data itself. You can think of a table as a basic Excel spreadsheet. The columns are type restricted (Integers, Strings, Booleans, and so on) and will reject data that does not match the type the column allows.
One table can reference data from another table, and this is where the beauty of relational databases comes into play. For example, say you have a "users" table in which each user has a unique "id" column. This column is called a primary key. You also have another table called "posts", which has its own primary key column called "id" and an additional column called "user_id" that references the "id" field from the users table. That column is called a foreign key.
Using foreign keys and primary keys in this manner allows you to link together different columns from different tables, and allows you to more easily and efficiently perform operations such as finding all the posts from a specific user, or all types of cars from a particular make, and so on.
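The users and posts example above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for an RDS engine, and the sample rows and the helper names posts_by_user and insert_orphan_post are made up for the example.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,          -- primary key
        name TEXT NOT NULL
    );
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,       -- primary key
        user_id INTEGER NOT NULL REFERENCES users(id),  -- foreign key
        title   TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [(1, "Hello RDS"), (1, "Multi-AZ notes"), (2, "Read replicas")],
)


def posts_by_user(name):
    """Return the titles of all posts written by the named user."""
    rows = conn.execute(
        "SELECT posts.title FROM posts "
        "JOIN users ON posts.user_id = users.id "
        "WHERE users.name = ? ORDER BY posts.id",
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


def insert_orphan_post():
    """Try to insert a post for a user that does not exist; True if rejected."""
    try:
        conn.execute(
            "INSERT INTO posts (user_id, title) VALUES (?, ?)", (99, "orphan")
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling posts_by_user("alice") follows the foreign key back to the users table, while insert_orphan_post() shows the database refusing a post whose user_id points at no user.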
Amazon RDS significantly simplifies the setup, maintenance, and operation of relational databases. RDS supports six commonly used database systems: MySQL, Oracle, PostgreSQL, Microsoft SQL Server, MariaDB, and Amazon Aurora.
NoSQL databases are not part of Amazon RDS, but it's still important to understand what they are, and the other services AWS offers that you can use to implement and manage them.
NoSQL databases are not relational, and do not use tables the way a relational database would. Instead, they use models such as key/value stores or document stores, which allow for very flexible schemas that can vary from one document to another. One of the most common NoSQL databases is MongoDB. AWS supports MongoDB-compatible workloads with Amazon DocumentDB, and it also offers its own NoSQL database service, DynamoDB.
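To make the idea of a flexible schema concrete, here is a minimal sketch that uses a plain Python dict as a stand-in for a document collection. It is not the API of MongoDB, DocumentDB, or DynamoDB; the put and find helpers and the sample documents are hypothetical.

```python
# A dict stands in for a document collection: key -> document.
documents = {}


def put(key, document):
    """Store a document under a key; no schema is checked."""
    documents[key] = document


def find(field, value):
    """Return the sorted keys of documents that have `field` equal to `value`.

    Documents that lack the field entirely are skipped.
    """
    return sorted(
        key
        for key, doc in documents.items()
        if field in doc and doc[field] == value
    )


# Each document carries a different set of fields, which a relational
# table with fixed columns would not allow.
put("user:1", {"name": "alice", "email": "alice@example.com"})
put("user:2", {"name": "bob", "tags": ["admin"], "address": {"city": "Seattle"}})
put("user:3", {"name": "alice"})
```

The three documents share a "name" field but otherwise differ, and find("name", "alice") still works across all of them.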
It's possible to run any other type of NoSQL database on AWS using your own installation on an EC2 instance. Some examples of other NoSQL databases are:
Setting up, maintaining, and supporting a database can be a very involved task, and is usually handled by a dedicated database administrator (DBA). Amazon RDS simplifies all of this, letting you focus on developing your applications while saving both time and money.
You can typically launch a new database instance within a few minutes instead of hours, and each instance is consistent, secure, and backed up with minimal intervention.
RDS allows you to replicate your databases across multiple availability zones to improve their availability and durability, and read replicas in other Regions can serve distant audiences more efficiently. RDS also makes it easy to scale up your databases as your workloads increase.
Each database has a single exposed endpoint, and can be accessed from all the tools you would typically use to manage your database.
Amazon RDS provides an API that lets you manage and deploy multiple instances of your database. A database instance is a self-contained, isolated installation of your database. You could think of each instance as a mini EC2 instance that you can't modify, apart from the database configuration itself. If you need more fine-tuned control of a database, you can also set up an EC2 instance with whatever database system you require.
The instance size, computational power, and memory are chosen when the database is created, and can be changed later as your needs and workloads change. Scaling the instance up or down in this manner is not seamless: expect a brief interruption while the change is applied, which a Multi-AZ deployment can shorten by failing over to the standby.
There are many different configuration and feature settings that you can use when setting up your database. RDS calls these parameter groups and option groups: a parameter group contains the database engine configuration, and an option group enables specific engine features. Both can be changed at any time, though some changes only take effect after the database instance is rebooted.
The benefits of using RDS
Amazon RDS provides you a consistent and reliable workflow to manage and maintain your relational databases by automating many repetitive tasks and configurations for you. While you can access the databases with all the common tools you already use, you cannot use SSH to log in to a database instance. If you require this ability, you will need to set up your database on an EC2 instance.
RDS supports a wide range of database engine types and versions, and the list changes constantly as newer versions are released. The engines and versions below are those supported at the time of writing. If you require a specific version, check the official documentation beforehand, as it may not be available yet or might require additional configuration steps, such as a separate EC2 instance.
MySQL
Supports MySQL 8.x, 5.7, and 5.6, running the open source Community Edition with InnoDB as the default and recommended engine. RDS MySQL supports Multi-AZ deployments and additional read replicas for horizontal scaling.
PostgreSQL
Supports versions 13.x, 11.x, 10.x, and 9.x. RDS for PostgreSQL also supports Multi-AZ deployments and additional read replicas for horizontal scaling.
MariaDB
Supports versions 10.x, as well as Multi-AZ deployments and read replicas.
Oracle
Supports Standard Edition Two with either a bring-your-own-license (BYOL) model or a license included model, as well as Oracle Enterprise Edition, using versions 19.x and 12.x. Oracle databases can have Multi-AZ deployments. Read replicas are more restricted than for the open source engines and depend on the edition and licensing, so check the current documentation.
Microsoft SQL Server
Supports SQL Server Enterprise Edition, Standard Edition, Web Edition, and Express Edition using versions 2012, 2014, 2016, 2017, and 2019. It comes with a license included model, but not a BYOL model. Standard and Enterprise are the only editions that support Multi-AZ deployments.
Another type of database that RDS supports is Amazon Aurora. Aurora offers enterprise-grade commercial database technology while offering the simplicity and cost-effectiveness of many open source engines.
Aurora is a fully managed service that is MySQL compatible out of the box, and it is more reliable and performant than a typical MySQL deployment.
Aurora instances are built inside of clusters, which can contain one or more instances and can span multiple availability zones.
There are two types of Aurora instances: the primary instance and replica instances. There can be only one primary instance, and many replicas. Replicas are read-only; changes are made on the primary instance and then replicated to each replica.
Database instances are stored using Amazon Elastic Block Store, or Amazon EBS. You can choose from many different storage capacities depending on your needs, and the capacity can be modified later.
There are three different types of storage available:
- Traditional Magnetic
- General Purpose SSD
- Provisioned IOPS SSD
Magnetic is best used for light I/O databases and is the most cost-effective. General Purpose SSD is generally the recommended type: it is faster, and capable of burst speeds for usage spikes. Provisioned IOPS is recommended for I/O intensive workloads that require a more consistent and performant environment.
Amazon RDS provides two main mechanisms for backup and restoration: automated backups, for which you specify the retention period and backup window, and manual snapshots.
Automated backups can be retained for a minimum of 1 day and a maximum of 35 days. By default, automated backups are also permanently lost when a database instance is terminated.
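As a rough illustration of the retention period described above, the sketch below computes how far back a restore could reach for a given setting. The function name earliest_restorable is hypothetical, and the 1 to 35 day bounds are taken from the text rather than from a call to the RDS API.

```python
from datetime import datetime, timedelta

# Retention bounds as stated in this guide.
MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 35


def earliest_restorable(now, retention_days):
    """Return the oldest point in time still covered by automated backups."""
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be between {MIN_RETENTION_DAYS} and "
            f"{MAX_RETENTION_DAYS} days, got {retention_days}"
        )
    return now - timedelta(days=retention_days)
```

With a 7 day retention period, anything older than a week before `now` falls outside the automated backups and would need a manual snapshot instead.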
Manual snapshots can be taken at any time, and as frequently as you want. They are stored in Amazon S3 and are not deleted when the instance is terminated.
Databases can be recovered quickly in the event of failure or data loss. Restorations are not done on an existing database instance. Instead, a new instance is created from the backup or snapshot you require.
When a restoration is made, only the default option and parameter groups are used, and it is important to change these to your requirements after the restoration is completed.
Multi-AZ (multiple availability zone) deployments can also be used to fail over to a healthy instance in the event of a failure or a required recovery. Data from the master database is replicated to a standby database in a different availability zone. In the event of a failure, the CNAME of your database endpoint is changed to point to the healthy instance, so you do not need to manually change the endpoints in your application.
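The CNAME switch described above can be modelled in a few lines. This is a toy simulation, not how RDS is implemented: the MultiAZDeployment class, the endpoint name, and the host names are all made up to show that the application keeps using one endpoint while the target behind it changes.

```python
class MultiAZDeployment:
    """Toy model of a Multi-AZ pair behind a single endpoint name."""

    def __init__(self, endpoint, primary, standby):
        self.endpoint = endpoint
        self.primary = primary
        self.standby = standby
        # The CNAME record maps the stable endpoint to the current primary.
        self.cname = {endpoint: primary}

    def resolve(self):
        """Return the host the endpoint currently points at."""
        return self.cname[self.endpoint]

    def fail_over(self):
        """Promote the standby and repoint the endpoint at it."""
        self.primary, self.standby = self.standby, self.primary
        self.cname[self.endpoint] = self.primary


deployment = MultiAZDeployment(
    endpoint="mydb.example.internal",   # hypothetical endpoint name
    primary="db-host-az-a",             # hypothetical host in one zone
    standby="db-host-az-b",             # hypothetical host in another zone
)
```

The application only ever connects to deployment.endpoint. After deployment.fail_over(), the same name resolves to the former standby, so no connection string has to change.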
Database instances can be scaled vertically by using more computational power, memory, or storage space, which can be changed at any time after the database is created. Changes can be scheduled to occur on the next maintenance window, or immediately at the cost of some downtime.
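Databases can also be scaled horizontally through partitioning, also known as sharding, in which the data is split across several database instances. This requires more setup in your application, which has to know which shard holds which data, but it allows you to handle more requests than a single database instance class can support.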
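One common way for an application to decide which shard holds a record is to hash the record's key. The sketch below shows that approach under stated assumptions: the shard names are hypothetical, and hash-based routing is only one of several partitioning schemes, not something RDS provides for you.

```python
import hashlib

# Hypothetical endpoints, one per shard.
SHARDS = ["shard-0.example.internal", "shard-1.example.internal", "shard-2.example.internal"]


def shard_for(key, shards=SHARDS):
    """Pick the shard responsible for a key.

    A cryptographic hash is used instead of Python's built-in hash() so
    the mapping stays the same across processes and restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Every read and write for a given key, for example shard_for("user:42"), is sent to the same shard. Note that changing the number of shards with this simple modulo scheme remaps most keys, which is one reason sharding needs careful planning.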
Horizontal scaling can also be done using read replicas, which are supported for most database engines. This can be very useful for read-heavy websites, such as a very busy blog that has a lot of readers but only a small number of people writing articles. Read replicas also allow the data to be read during scheduled downtime, such as a maintenance window or recovery process.
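In the blog scenario above, the application sends writes to the primary and spreads reads over the replicas. A minimal sketch of that routing follows. The ReadWriteRouter class is hypothetical, and treating every statement that starts with SELECT as a read is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and writes to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields the replicas in order, forever; None means no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        """Return the endpoint that should run the given SQL statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads when there are no replicas, go to the primary.
        return self.primary
```

A router built as ReadWriteRouter("primary", ["replica-1", "replica-2"]) alternates SELECT statements between the two replicas and sends INSERT, UPDATE, and DELETE statements to the primary. Because replicas can lag slightly behind the primary, a read that must see a write made a moment ago is usually better sent to the primary.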
AWS Identity and Access Management (IAM) can be used to control administrative tasks such as creating and deleting database instances. To control what users can do inside the database itself, you use the database engine's own user accounts, starting with the user that was set up when the database was deployed. These accounts are controlled independently of IAM.
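An IAM policy for the administrative tasks mentioned above is a JSON document. The sketch below builds one in Python. The helper name rds_admin_policy is hypothetical, and using "*" as the resource is an assumption made to keep the example short; a real policy would normally be scoped to specific instances.

```python
import json


def rds_admin_policy(allowed_actions):
    """Build an IAM policy document that allows the given RDS actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(allowed_actions),
                # Assumption for brevity: applies to all RDS resources.
                "Resource": "*",
            }
        ],
    }


policy = rds_admin_policy(
    {"rds:CreateDBInstance", "rds:DeleteDBInstance", "rds:DescribeDBInstances"}
)
policy_json = json.dumps(policy, indent=2)
```

Note what the policy does not cover: it says nothing about which tables a person can read or write once connected. That is governed by the database's own user accounts.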
A best practice for securing your database is to run the instance in an Amazon Virtual Private Cloud (VPC) with a DB-specific subnet group, restricted by an Access Control List that allows only specific IP addresses.
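The effect of an allow list like the one described above can be sketched with Python's ipaddress module. The CIDR ranges and the is_allowed helper are hypothetical. In practice the check is enforced by the VPC's network configuration, not by application code.

```python
import ipaddress

# Hypothetical allow list: one application subnet and one single admin address.
ALLOWED_CIDRS = [
    ipaddress.ip_network("10.0.1.0/24"),
    ipaddress.ip_network("203.0.113.10/32"),
]


def is_allowed(source_ip, allowed=ALLOWED_CIDRS):
    """Return True if the source address falls inside any allowed range."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowed)
```

A connection from 10.0.1.25 falls inside the application subnet and would be admitted, while one from 10.0.2.25 matches no range and would be rejected.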
There are many benefits to using Amazon RDS that can save you time, money, and many of the headaches of managing an on-premises solution. Databases using a large variety of engines can be deployed quickly and replicated across multiple availability zones, so they stay available to your audience when something fails.