AWS virtual machine sizes: What does m5d.2xlarge mean?
Anyone who has used AWS for a few minutes has probably created an EC2 instance – a remote virtual machine that can be created with a few mouse clicks or from a script. You choose a machine image (containing operating system plus perhaps some commercial application), an instance type (representing CPU and memory and other hardware), some further settings (including networking, disk size and a firewall), and it's ready for you to use within a few minutes.
In this blog we will look at:
- What is EC2? And what is an instance?
- The logic behind AWS instance type names
- The cost of running an EC2 instance
- How to choose the right EC2 instance type
De-mystifier 1: What is EC2?
EC2 stands for "Elastic Compute Cloud", which is a fancy title for a virtual machine and the infrastructure that supports it. They didn't want to call it a VM, because that's only two letters and it wouldn't be distinctive enough. They couldn't call it an ECC because that's a type of memory and also a type of cryptography. But there are two Cs in Elastic Compute Cloud, hence the 2. Maybe it should be superscripted, like EC².
De-mystifier 2: What is an instance?
An EC2 instance is a single virtual machine running (or stopped) within the EC2 infrastructure, within the AWS cloud environment. It might be running Linux or Windows, and you can use it just like a real Linux or Windows server computer.
The EC2 infrastructure supports everything that is required to create running EC2 instances, including networking, storage, firewalls and the base Amazon Machine Images (AMIs) from which an EC2 instance is created.
Why is this important?
The big deal about EC2 and the reason that cloud services have become so popular is that you pay for what you use. You rent an EC2 instance by the hour or by the second. This different business approach (OpEx instead of CapEx) fundamentally changes the way in which IT infrastructure can be used.
The logic behind AWS instance type names
The biggest challenge with understanding AWS is their naming of things. Everything is a TLA (three-letter acronym), or it is "elastic" or "simple". Once you get over the naming, AWS is reasonably logical. EC2 instance sizes are a good example.
Instance types have names such as:
t2.micro
r5ad.large
x1e.16xlarge
m5.metal
At first glance, the "sizes" seem to make no sense, but there is a method in the naming of AWS EC2 instance types.
There are several parts to this:
- The first letter is a "family". More on that in a moment
- The number
- Other letters before the dot
- The wording after the dot
Number:
The number is the generation within that family: M5 is newer than M4, and in a reflection of Moore's Law, newer generation instances with the same spec are usually cheaper than the previous generation. For example:
- an M5.large (2 vCPU and 8 GiB RAM on a 3.1 GHz Intel Xeon Platinum 8175M) costs $0.107/hr, whereas
- an M4.large (2 vCPU and 8 GiB RAM on a 2.4 GHz Intel Xeon E5-2676 or E5-2686) costs $0.222/hr.
(Ireland prices, February 2020)
The first EC2 instance to be introduced was m1.small, back in 2006. Jeff Barr produced a fascinating timeline of EC2 history.
Other letters:
Other letters before the dot are also (mostly) logical:
- A lower-case letter a (e.g. m5a.large) indicates an AMD processor instead of an Intel processor. Both Linux and Windows are supported.
- A lower-case letter g (e.g. m6g.large) indicates an ARM Graviton Processor instead of an Intel processor. At the time of writing (September 2020), only Linux is supported.
- A lower-case letter d (e.g. m5d.large) indicates the presence of instance-store disks (see footnote).
- A lower-case letter n (e.g. m5n.large) is for faster networking. For example, m5 supports 25 Gbps; m5n supports 100 Gbps.
- Sometimes these may be combined, e.g. m5ad.large has an AMD processor and an instance-store disk.
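To make the naming scheme concrete, here is a minimal sketch that splits an instance type name into the four parts described above. The function name `parse_instance_type` and the `ATTRIBUTE_MEANINGS` table are my own illustration, not anything AWS provides, and the table only covers the letters mentioned in this post. Names that do not follow the simple pattern (such as the u*.metal high-memory types) are rejected rather than guessed at.

```python
import re
from typing import List, NamedTuple

# Only the letters discussed in this post; AWS uses other letters too,
# and those are reported as "unrecognised" rather than guessed.
ATTRIBUTE_MEANINGS = {
    "a": "AMD processor",
    "g": "ARM Graviton processor",
    "d": "instance-store disks",
    "n": "faster networking",
}


class InstanceType(NamedTuple):
    family: str            # first letter(s), e.g. "m"
    generation: int        # the number, e.g. 5
    attributes: List[str]  # decoded letters before the dot
    size: str              # the wording after the dot, e.g. "2xlarge"


# family letters, generation digits, optional attribute letters, dot, size
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z]*)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as 'm5d.2xlarge' into its component parts."""
    match = _PATTERN.match(name.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised instance type name: {name!r}")
    family, generation, letters, size = match.groups()
    attributes = [
        ATTRIBUTE_MEANINGS.get(letter, f"unrecognised ({letter})")
        for letter in letters
    ]
    return InstanceType(family, int(generation), attributes, size)


if __name__ == "__main__":
    for example in ("t2.micro", "r5ad.large", "m5d.2xlarge", "m5.metal"):
        print(example, "->", parse_instance_type(example))
```

For `m5d.2xlarge` this gives family `m`, generation `5`, one attribute (instance-store disks) and size `2xlarge`.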
After the dot:
The bit after the dot is also logical. There is a correlation between the name and the number of vCPUs, regardless of the instance type. An xlarge has four vCPUs, and any multiple of xlarge has the same multiple of vCPUs: a 2xlarge has eight, a 4xlarge has sixteen, and so on.
Name | # vCPU |
---|---|
.medium | 1 (but see below) |
.large | 2 |
.xlarge | 4 |
.2xlarge | 8 |
.4xlarge | 16 |
.9xlarge | 36, etc. |
The system breaks down a bit at the bottom end: A1, M1 and M3.medium instances have 1 vCPU, but some medium instances have 2 vCPU instead (C3, T2, T3, T3a). Instances smaller than medium mostly have one vCPU (including t2.nano, t2.micro and t2.small), except for the newer T3 and T3a, which have a minimum of two vCPU.
There are also some exceptions to this pattern among the older generation instances, notably C1 and M2.
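The size-to-vCPU rule in the table can be written as a few lines of Python. This is only a sketch of the general pattern: the helper name `vcpus_for_size` is made up for illustration, and it deliberately ignores the exceptions just described (the 2-vCPU medium instances, sizes below medium, and older generations such as C1 and M2).

```python
import re


def vcpus_for_size(size: str) -> int:
    """Return the usual vCPU count for a size suffix such as '2xlarge'.

    Follows the general pattern only; known exceptions are not handled.
    """
    size = size.lstrip(".").lower()
    if size == "medium":
        return 1  # some medium instances (C3, T2, T3, T3a) have 2 instead
    if size == "large":
        return 2
    match = re.fullmatch(r"(\d*)xlarge", size)
    if match is None:
        raise ValueError(f"no simple vCPU rule for size {size!r}")
    multiplier = int(match.group(1) or 1)  # plain 'xlarge' means 1 x xlarge
    return 4 * multiplier


if __name__ == "__main__":
    for suffix in ("large", "xlarge", "2xlarge", "4xlarge", "9xlarge"):
        print(suffix, vcpus_for_size(suffix))
```

Sizes with no simple rule, such as `micro` or `metal`, raise a `ValueError` instead of returning a guess.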
Families:
Let's get back to the "families". There are many families in the instance type tribe now, and they are helpfully grouped into five categories:
- General purpose (start here)
- Compute optimised (more CPU, less memory)
- Memory optimised (less CPU, more memory)
- Storage optimised (large quantities of local instance-store volumes)
- Accelerated computing (GPUs, FPGAs and other specialised processing)
As a general rule, the families give you an indication of the ratio of compute (vCPU) to memory (RAM). From .large instances onwards:
Compute-optimised instances have 2 GiB RAM for every vCPU (C instances)
General-purpose instances have 4 GiB RAM for every vCPU (T and M instances)
Memory-optimised instances have 8 GiB RAM for every vCPU (R5 and Z1d instances)
Storage-optimised instances have 8 GiB RAM for every vCPU (approximately) (I3, D2, H1)
There are some instances with even more memory:
x1 instances: 15.25 GiB per vCPU (64 vCPU with 976 GiB RAM, or 128 vCPU with 1952 GiB RAM)
x1e instances: 30.5 GiB per vCPU (4 to 128 vCPU, 122 to 3904 GiB RAM)
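As a quick sanity check of these ratios, the sketch below multiplies the vCPU count by the GiB-per-vCPU figure for each group. The dictionary keys and the function name `estimated_ram_gib` are my own labels for illustration. The result is an estimate for current-generation instances from .large upwards, not a lookup of the published specifications.

```python
# GiB of RAM per vCPU, as listed above (from .large instances onwards).
GIB_PER_VCPU = {
    "c": 2.0,     # compute optimised
    "t": 4.0,     # general purpose
    "m": 4.0,     # general purpose
    "r5": 8.0,    # memory optimised
    "z1d": 8.0,   # memory optimised
    "x1": 15.25,  # extra memory
    "x1e": 30.5,  # even more memory
}


def estimated_ram_gib(group: str, vcpus: int) -> float:
    """Estimate RAM in GiB from the group's ratio and the vCPU count."""
    key = group.lower()
    if key not in GIB_PER_VCPU:
        raise ValueError(f"no ratio recorded for {group!r}")
    return GIB_PER_VCPU[key] * vcpus


if __name__ == "__main__":
    print(estimated_ram_gib("m", 8))      # an m5.2xlarge-sized machine
    print(estimated_ram_gib("x1", 64))    # smallest x1 listed above
    print(estimated_ram_gib("x1e", 128))  # largest x1e listed above
```

The x1 and x1e results reproduce the figures quoted above: 64 vCPU at 15.25 GiB each is 976 GiB, and 128 vCPU at 30.5 GiB each is 3904 GiB.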
The correlation of vCPU to RAM isn't quite so logical among older generation instances. Many have slightly less than the RAM specified above, which I am guessing is due to less-efficient virtualisation.
The correlation of vCPU to RAM is different in the accelerated-computing instances, where a GPU is likely to be the dominant performance characteristic. For completeness, however:
G4: 4.000 GiB per vCPU
G3: 8.125 GiB per vCPU
P3: 8.125 GiB per vCPU
P2: 15.250 GiB per vCPU
F1: 15.250 GiB per vCPU
Exceptions:
Other exceptions to the pattern described above are as follows:
- Smaller instances (smaller than .large) don't adhere to the rules, mainly because you can't have less than one vCPU.
- The A instances (first-generation Graviton ARM machines) have a vCPU-to-RAM ratio which should put them in with other Compute-optimised instances. Apparently they compare well against C instances on both price and performance.
- The correlation on older-generation instances isn't so easy to see. For example, R4 instances have 15.25 GiB for 2 vCPU, which is *nearly* an 8:1 ratio.
- Accelerated-computing instances (where the focus is on the GPU not the CPU).
- High-memory instances (u*.metal) - all have 448 logical processors, but different amounts of RAM, up to 24 TiB.
The cost of running an EC2 instance
The cost of running an EC2 instance is broken down into a number of factors:
- the cost of the EC2 instance itself
- the cost of the EBS volume (a virtual "disk" which has a separate lifecycle to the EC2 instance)
- costs associated with some parts of networking, such as having a static IP address
- bandwidth costs for data out of AWS
The cost of an EC2 instance is based on the hours it is running: when you stop the instance, it costs nothing. When you start it again, you start paying again. The pricing page shows the cost per hour for different instance sizes. For most operating systems (RedHat, Windows, SuSE), you pay the whole billing hour even if the machine only runs for a few minutes. For Amazon Linux and Ubuntu, billing is calculated on a per-second basis (minimum of one minute).
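The difference between the two billing models is easiest to see with a short calculation. The sketch below is my own simplification of the rules just described: per-hour billing rounds the running time up to whole hours, and per-second billing charges for at least 60 seconds. The function name `billed_cost` is hypothetical, and the hourly rate in the example is just an illustrative figure.

```python
import math


def billed_cost(hourly_rate: float, seconds_running: int,
                per_second_billing: bool) -> float:
    """Return the instance charge for one run under either billing model."""
    if seconds_running <= 0:
        return 0.0
    if per_second_billing:
        # Per-second billing, with a minimum charge of one minute.
        billable_seconds = max(seconds_running, 60)
        return hourly_rate * billable_seconds / 3600
    # Per-hour billing: any part of an hour is charged as a full hour.
    return hourly_rate * math.ceil(seconds_running / 3600)


if __name__ == "__main__":
    rate = 0.111  # illustrative hourly rate in US$
    five_minutes = 5 * 60
    print("hourly billing:    ", billed_cost(rate, five_minutes, False))
    print("per-second billing:", billed_cost(rate, five_minutes, True))
```

A five-minute run is charged as a full hour under hourly billing, but as one twelfth of an hour under per-second billing.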
Within a family, a machine with twice the vCPU costs twice as much. Looking at m5 instances with Ubuntu in the London region, the prices per hour (as published in September 2020) are:
Instance type | Price per hour | vCPU | RAM |
---|---|---|---|
m5.large | $0.111 | 2 | 8 GiB |
m5.xlarge | $0.222 | 4 | 16 GiB |
m5.2xlarge | $0.444 | 8 | 32 GiB |
m5.4xlarge | $0.888 | 16 | 64 GiB |
m5.8xlarge | $1.776 | 32 | 128 GiB |
This means it is equally cost-effective to choose a smaller number of larger machines, or a larger number of smaller machines, depending on the use case.
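The linear pricing is easy to verify from the m5 figures quoted above. The sketch below uses only those London prices. The names `M5_PRICES`, `price_per_vcpu` and `fleet_cost` are made up for this illustration.

```python
import math

# (price in US$ per hour, vCPU count) for the m5 prices quoted above.
M5_PRICES = {
    "m5.large": (0.111, 2),
    "m5.xlarge": (0.222, 4),
    "m5.2xlarge": (0.444, 8),
    "m5.4xlarge": (0.888, 16),
    "m5.8xlarge": (1.776, 32),
}


def price_per_vcpu(instance_type: str) -> float:
    """Hourly price divided by the number of vCPUs."""
    price, vcpus = M5_PRICES[instance_type]
    return price / vcpus


def fleet_cost(instance_type: str, count: int) -> float:
    """Hourly cost of running `count` identical instances."""
    price, _ = M5_PRICES[instance_type]
    return price * count


if __name__ == "__main__":
    for name in M5_PRICES:
        print(f"{name:<12} ${price_per_vcpu(name):.4f} per vCPU-hour")
    # Four small machines cost the same per hour as one machine four times the size.
    same = math.isclose(fleet_cost("m5.large", 4), fleet_cost("m5.2xlarge", 1))
    print("4 x m5.large costs the same as 1 x m5.2xlarge:", same)
```

Every size works out at the same price per vCPU-hour, which is why the choice between fewer large machines and more small ones can be made on architecture rather than on price.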
The chart below shows the correlation of machine cost (US$ per hour) versus machine size (GiB RAM) for different M instances in the Ireland region. AMD instances (M5a) are cheaper than Intel instances (M5). The M5ad and M5d instances additionally have local instance-store volumes, and the M5n and M5dn instances have enhanced networking: extra features with an associated extra cost.
Side note: like everything else in AWS, the prices are also available via an API, so you can write your own application to calculate costs if you want to.
In addition, EC2 instances will have one or more attached EBS volumes. These are used for the EC2's boot disk (the C:\ drive on a Windows machine). EBS volumes have their own lifecycle and are costed separately.
In addition, there might be networking costs related to the running of EC2 instances, including the use of load balancers, static IP addresses (called "Elastic" IP addresses) and NAT gateways.
Additionally, any data transferred out of AWS to the internet, or across regions, also incurs a cost per GB.
How to choose an instance type
With all this information, choosing an instance type might seem like rocket science. The following points might help:
- Start with an M instance type, and use CloudWatch monitoring to understand utilisation.
- Changing instance type involves stopping the instance, changing to the new instance type and restarting the instance. This can be achieved in a few clicks. A few moments of downtime are necessary.
- For memory-heavy databases, the R instances may not have enough memory. If so, look at the X1 and X1e memory-optimised instances.
- For Big Data clusters, or anything which requires vast amounts of local storage, consider the H or D storage optimised instance type.
- If your workloads have bursty CPU usage, consider a T2 or T3 instance, which I blogged about previously.
- For bedtime reading, find the AWS Well-Architected Framework whitepaper and read up on best practices for building applications in the cloud.
Further help
In order to get the most out of AWS, consider attending a training course. QA offers the full range of AWS training courses, as well as many other areas of technical instruction.
Footnotes:
EBS and instance store:
EBS volumes are the preferred storage for EC2: they behave like a physically attached disk, but are network-attached near the hosts on which the EC2 instance is running. EBS volumes are perfect for the boot disk (operating system disk) and for most applications, including high-performance databases.
Instance storage volumes are disks (magnetic HDD or solid-state SSD) attached locally to the physical host on which the EC2 is running. They are significantly faster than an EBS volume, but they are ephemeral: when the machine stops, the data is lost (in fact the physical devices are actively zeroed out to prevent data leakage). For a boot disk, you have to use EBS. Instance store volumes are great for buffers, caches, scratch data and other temporary content, or for data that is replicated across a fleet of instances, such as in Big Data clusters.