

Using the EC2 Price List API

The CloudBanshee EC2 comparison page uses data from the AWS EC2 Pricing API. In this blog post we will explain how to query and parse this data. We will look at the available filters, the data structure, and how to retrieve the pricing options you're looking for.

First things first: what is the AWS Price List API? There are actually two variants:

  • The Query API, using specific requests through an SDK
  • The Bulk API, downloading very large data sets over HTTP

In this blog post, we will focus on the first type. We will write a Python script that uses boto3 (the Python SDK for AWS) to query the API, process the results, and output the data as JSON. If you want to see the results, head over to the CloudBanshee EC2 section, which is powered by the EC2 Pricing API.

Getting Started

Let's set up some basic scaffolding:

import json
import boto3

pricing_client = boto3.client('pricing', region_name='us-east-1')

def get_products(region):
    paginator = pricing_client.get_paginator('get_products')

    response_iterator = paginator.paginate(
        ServiceCode="AmazonEC2",
        Filters=[
            {
                'Type': 'TERM_MATCH',
                'Field': 'location',
                'Value': region
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'instanceType',
                'Value': 'm5.large'
            }
        ],
        PaginationConfig={
            'PageSize': 100
        }
    )

    products = []
    for response in response_iterator:
        for priceItem in response["PriceList"]:
            priceItemJson = json.loads(priceItem)
            products.append(priceItemJson)

    # Dump as indented JSON so the output is a readable JSON document
    print(json.dumps(products, indent=4))

if __name__ == '__main__':
    get_products('EU (Ireland)')

For fewer than 40 lines of code, there is already a lot to digest here. Let's dive into it!

Pricing API Region

The pricing_client is defined as pricing_client = boto3.client('pricing', region_name='us-east-1'). The important part here is specifying the region as us-east-1. The Pricing API is only served from a small number of regional endpoints, and us-east-1 is one of them, so you should hardcode it. If you don't, and you run your script with another AWS Region configured, it might fail because there is no Pricing API endpoint in that region.

Using a paginator

The Pricing API will return a lot of data - we'll get to that in a second. This data cannot be returned in one request, which is why we use a paginator. The paginator will return an iterator, which we can loop over. This will give us pages of up to 100 results (as specified by PageSize). We then merge those results into a list called products.
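
To make the paginator a little less of a black box, here is a rough sketch of the loop it performs for us. It assumes the response contains a NextToken key whenever more pages are available; the function name iter_price_items is made up for this illustration and is not part of boto3.

```python
import json


def iter_price_items(client, **request):
    """Yield parsed price items page by page (roughly what the paginator does).

    `client` is expected to be a boto3 'pricing' client; `request` holds the
    get_products arguments such as ServiceCode and Filters.
    """
    next_token = None
    while True:
        kwargs = dict(request)
        if next_token:
            # Ask for the page that follows the one we just processed
            kwargs["NextToken"] = next_token
        response = client.get_products(**kwargs)
        for price_item in response["PriceList"]:
            # Each entry in PriceList is a JSON-encoded string
            yield json.loads(price_item)
        next_token = response.get("NextToken")
        if not next_token:
            break
```

In practice the built-in paginator is the better choice; the sketch is only meant to show why a single request is not enough.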

Querying

We will query the Pricing API so it will only return prices for EC2 instances located in the 'EU (Ireland)' region, with the m5.large instance type. If you expect that this will only return one result, however, you might be in for a surprise.

Region name

Unfortunately, the location filter of the Pricing API requires the Region Name (e.g. 'EU (Ireland)') instead of the Region Code (e.g. 'eu-west-1'). You can find all region names on the AWS Regions and Endpoints page.
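
If you would rather ask the API itself which location values it accepts, the get_attribute_values operation can list them. The following is a minimal sketch under that assumption; list_locations is a hypothetical helper name, and the manual NextToken loop is just one way to collect all pages.

```python
def list_locations(client, service_code="AmazonEC2"):
    """Return every value the Pricing API reports for the 'location' attribute."""
    locations = []
    next_token = None
    while True:
        kwargs = {"ServiceCode": service_code, "AttributeName": "location"}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_attribute_values(**kwargs)
        # Every attribute value is wrapped in a small dict: {"Value": "..."}
        locations.extend(item["Value"] for item in response["AttributeValues"])
        next_token = response.get("NextToken")
        if not next_token:
            return sorted(locations)
```

Calling list_locations(pricing_client) with the client from the script above should return the exact strings to use as the Value of the location filter.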

Results of running the code

Executing this script will return a JSON document of almost 10,000 lines! In the document you will find 60 products, all of them considered an m5.large in Ireland. So what gives? We'll look at that in the next section.

So many products

If you look at a single product, you'd find that it has the following structure:

[
    {
        "product": {
            ...
        },
        "serviceCode": "AmazonEC2",
        "terms": {
            ...
        },
        "version": "20190520201514",
        "publicationDate": "2019-05-20T20:15:14Z"
    }
]

We will look at the terms section later, but for now, let's focus on the product section. The product section of one of the 60 products might look like this:

"product": {
    "productFamily": "Compute Instance",
    "attributes": {
        "enhancedNetworkingSupported": "Yes",
        "memory": "8 GiB",
        "dedicatedEbsThroughput": "Upto 2120 Mbps",
        "vcpu": "2",
        "capacitystatus": "Used",
        "locationType": "AWS Region",
        "storage": "EBS only",
        "instanceFamily": "General purpose",
        "operatingSystem": "Linux",
        "physicalProcessor": "Intel Xeon Platinum 8175",
        "clockSpeed": "2.5 GHz",
        "ecu": "8",
        "networkPerformance": "Up to 10 Gigabit",
        "servicename": "Amazon Elastic Compute Cloud",
        "instanceType": "m5.large",
        "tenancy": "Shared",
        "usagetype": "EU-BoxUsage:m5.large",
        "normalizationSizeFactor": "4",
        "processorFeatures": "Intel AVX, Intel AVX2, Intel AVX512, Intel Turbo",
        "servicecode": "AmazonEC2",
        "licenseModel": "No License required",
        "currentGeneration": "Yes",
        "preInstalledSw": "NA",
        "location": "EU (Ireland)",
        "processorArchitecture": "64-bit",
        "operation": "RunInstances"
    },
    "sku": "FP7Z96TTU3VFSX2H"
}

The reason there are 60 products is hidden in these keys:

  • tenancy
  • preInstalledSw
  • operatingSystem
  • licenseModel
  • capacitystatus

All the other keys are, in general, the same for every product. So what are the variants we're seeing?
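
One way to discover these keys yourself is to collect the values of every attribute across all returned products and keep only the attributes that have more than one value. The sketch below assumes products is the list built by the script above; varying_attributes is a name invented for this example.

```python
from collections import defaultdict


def varying_attributes(products):
    """Map each attribute name to its set of values, keeping only those that differ."""
    values = defaultdict(set)
    for product in products:
        for key, value in product["product"]["attributes"].items():
            values[key].add(value)
    # An attribute with a single value is identical for every product
    return {key: vals for key, vals in values.items() if len(vals) > 1}
```

Expect a few derived keys, such as usagetype, to show up next to the five listed above.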

Tenancy

There are three different values for tenancy:

  • Shared
  • Dedicated
  • Host

Shared is the cost for running a 'normal' EC2 instance without using dedicated instances or dedicated hosts. Dedicated is the cost for running an instance as a dedicated instance. Host is the cost of a dedicated host. This one is a bit tricky, because it only returns a valid result for instance families like m5, g3, and so on, instead of instance types like m5.large or t3.medium used in the other tenancy variants.

Pre-Installed Software

There are four different values for preInstalledSw:

  • NA
  • SQL Web
  • SQL Std
  • SQL Ent

NA means no additional software is installed. SQL Web, SQL Std and SQL Ent reference the cost for instances with pre-installed Microsoft SQL Server Web, Microsoft SQL Server Standard and Microsoft SQL Server Enterprise, respectively.

Operating System

There are five different values for operatingSystem:

  • NA
  • Linux
  • RHEL
  • SUSE
  • Windows

NA is only used for the EBS Optimized Surcharge for very old instance families (c1, c3, g2, i2, m1, m2, m3 and r3). The other values reference the cost for running generic Linux (e.g. Ubuntu or Amazon Linux), Red Hat Enterprise Linux, SUSE Linux Enterprise Server and Windows.

License Model

There are three different values for licenseModel:

  • NA
  • Bring your own license
  • No License required

NA is only used for the EBS Optimized Surcharge for very old instance families (c1, c3, g2, i2, m1, m2, m3 and r3).

Bring your own license specifies the cost for instances with your own Windows licenses. The cost for these instances is lower than for instances with a Windows license provided by Amazon.

No License required specifies all instances that don't have a licensing system (e.g. generic Linux), or have an Amazon-provided license (e.g. RHEL, SUSE, Windows).

Capacity status

There are five different values for capacitystatus:

  • AllocatedHost
  • NA
  • AllocatedCapacityReservation
  • UnusedCapacityReservation
  • Used

AllocatedHost contains the price for Dedicated Hosts and will only occur for products with a tenancy of Host.

NA is only used for the EBS Optimized Surcharge for very old instance families (c1, c3, g2, i2, m1, m2, m3 and r3).

AllocatedCapacityReservation and UnusedCapacityReservation relate to EC2 Capacity Reservations. You can reserve capacity to ensure that you can always spin up a certain type of EC2 instance. Of course there is a cost for the reservation when you're not using it. This is called the UnusedCapacityReservation. Then there is the AllocatedCapacityReservation, which is the price you pay when you're actually using the reservation.

These relate to the Used capacity status, which is the price when you're actually running an instance.

A reserved capacity example

The costs for m5.large are:

  • Used: $0.107
  • UnusedCapacityReservation: $0.107
  • AllocatedCapacityReservation: $0.000

You reserve capacity for two m5.large instances. This will cost two times the UnusedCapacityReservation per hour as long as you don't actually run an instance. That's $0.214 per hour. Then you spin up one m5.large. Now you're paying the UnusedCapacityReservation price for one instance and the Used price plus the AllocatedCapacityReservation price for another. The cost is still $0.214 per hour. In other words, you pay the same for a reservation as running an actual instance. From the EC2 Pricing Page:

On-Demand Capacity Reservations are priced exactly the same as their equivalent (On-Demand) instance usage. If a Capacity Reservation is fully utilized, you only pay for instance usage and nothing towards the Capacity Reservation. If a Capacity Reservation is partially utilized, you pay for the instance usage and for the unused portion of the Capacity Reservation.
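
The arithmetic of this example can be written down in a few lines. This is only a sketch using the three m5.large prices listed above; hourly_cost is a made-up helper, and it deliberately ignores instances running outside of a reservation.

```python
# Hourly prices for the m5.large, taken from the example above
USED = 0.107
UNUSED_RESERVATION = 0.107
ALLOCATED_RESERVATION = 0.000


def hourly_cost(reserved, running):
    """Hourly cost for `reserved` reserved slots of which `running` are in use."""
    if running > reserved:
        raise ValueError("this sketch only covers instances running inside the reservation")
    unused = reserved - running
    # Unused slots pay the reservation price, used slots pay Used + Allocated
    return unused * UNUSED_RESERVATION + running * (USED + ALLOCATED_RESERVATION)


print(round(hourly_cost(reserved=2, running=0), 3))  # two idle reservations
print(round(hourly_cost(reserved=2, running=1), 3))  # one idle, one in use
```

Both calls come out at the same $0.214 per hour, which matches the quote from the pricing page.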

Why so many products?

The sections above explain the 60 variants of the m5.large type. There are different products for every OS, every type of pre-installed software, dedicated instances, dedicated hosts, and so on. So let us change the script to just show us the price for the m5.large running Linux, no additional software, on a shared host:

import json
import boto3

pricing_client = boto3.client('pricing', region_name='us-east-1')

def get_products(region):
    paginator = pricing_client.get_paginator('get_products')

    response_iterator = paginator.paginate(
        ServiceCode="AmazonEC2",
        Filters=[
            {
                'Type': 'TERM_MATCH',
                'Field': 'location',
                'Value': region
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'instanceType',
                'Value': 'm5.large'
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'capacitystatus',
                'Value': 'Used'
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'tenancy',
                'Value': 'Shared'
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'preInstalledSw',
                'Value': 'NA'
            },
            {
                'Type': 'TERM_MATCH',
                'Field': 'operatingSystem',
                'Value': 'Linux'
            }
        ],
        PaginationConfig={
            'PageSize': 100
        }
    )

    products = []
    for response in response_iterator:
        for priceItem in response["PriceList"]:
            priceItemJson = json.loads(priceItem)
            products.append(priceItemJson)

    # Dump as indented JSON so the output is a readable JSON document
    print(json.dumps(products, indent=4))

if __name__ == '__main__':
    get_products('EU (Ireland)')

This will give us only one product, but still we see 412 lines of JSON. The bulk of these are in the terms section, which we will investigate in the next section.
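
The Filters list in this script is rather repetitive. If you add more filters, a small helper can build it from keyword arguments. This is just a sketch; term_match_filters is a hypothetical name and not something boto3 provides.

```python
def term_match_filters(**attributes):
    """Build the Filters list for get_products from attribute=value pairs."""
    return [
        {"Type": "TERM_MATCH", "Field": field, "Value": value}
        for field, value in attributes.items()
    ]


filters = term_match_filters(
    location="EU (Ireland)",
    instanceType="m5.large",
    capacitystatus="Used",
    tenancy="Shared",
    preInstalledSw="NA",
    operatingSystem="Linux",
)
```

The resulting list can be passed as Filters=filters to paginator.paginate() in place of the hand-written list.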

Terms

The result of the script in the previous section looks like this:

[
    {
        "product": {
            "productFamily": "Compute Instance",
            "attributes": {
                "enhancedNetworkingSupported": "Yes",
                "memory": "8 GiB",
                "dedicatedEbsThroughput": "Upto 2120 Mbps",
                "vcpu": "2",
                "capacitystatus": "Used",
                "locationType": "AWS Region",
                "storage": "EBS only",
                "instanceFamily": "General purpose",
                "operatingSystem": "Linux",
                "physicalProcessor": "Intel Xeon Platinum 8175",
                "clockSpeed": "2.5 GHz",
                "ecu": "8",
                "networkPerformance": "Up to 10 Gigabit",
                "servicename": "Amazon Elastic Compute Cloud",
                "instanceType": "m5.large",
                "tenancy": "Shared",
                "usagetype": "EU-BoxUsage:m5.large",
                "normalizationSizeFactor": "4",
                "processorFeatures": "Intel AVX, Intel AVX2, Intel AVX512, Intel Turbo",
                "servicecode": "AmazonEC2",
                "licenseModel": "No License required",
                "currentGeneration": "Yes",
                "preInstalledSw": "NA",
                "location": "EU (Ireland)",
                "processorArchitecture": "64-bit",
                "operation": "RunInstances"
            },
            "sku": "FP7Z96TTU3VFSX2H"
        },
        "serviceCode": "AmazonEC2",
        "terms": {
            "OnDemand": {
                "FP7Z96TTU3VFSX2H.JRTCKXETXF": {
                    "priceDimensions": {
                        "FP7Z96TTU3VFSX2H.JRTCKXETXF.6YS6EN2CT7": {
                            "unit": "Hrs",
                            "endRange": "Inf",
                            "description": "$0.107 per On Demand Linux m5.large Instance Hour",
                            "appliesTo": [],
                            "rateCode": "FP7Z96TTU3VFSX2H.JRTCKXETXF.6YS6EN2CT7",
                            "beginRange": "0",
                            "pricePerUnit": {
                                "USD": "0.1070000000"
                            }
                        }
                    },
                    "sku": "FP7Z96TTU3VFSX2H",
                    "effectiveDate": "2019-05-01T00:00:00Z",
                    "offerTermCode": "JRTCKXETXF",
                    "termAttributes": {}
                }
            },
            "Reserved": {
                "FP7Z96TTU3VFSX2H.6QCMYABX3D": {
                    "priceDimensions": {
                        "FP7Z96TTU3VFSX2H.6QCMYABX3D.2TG2D8R56U": {
                            "unit": "Quantity",
                            "description": "Upfront Fee",
                            "appliesTo": [],
                            "rateCode": "FP7Z96TTU3VFSX2H.6QCMYABX3D.2TG2D8R56U",
                            "pricePerUnit": {
                                "USD": "614"
                            }
                        },
                        "FP7Z96TTU3VFSX2H.6QCMYABX3D.6YS6EN2CT7": {
                            "unit": "Hrs",
                            "endRange": "Inf",
                            "description": "Linux/UNIX (Amazon VPC), m5.large reserved instance applied",
                            "appliesTo": [],
                            "rateCode": "FP7Z96TTU3VFSX2H.6QCMYABX3D.6YS6EN2CT7",
                            "beginRange": "0",
                            "pricePerUnit": {
                                "USD": "0.0000000000"
                            }
                        }
                    },
                    "sku": "FP7Z96TTU3VFSX2H",
                    "effectiveDate": "2017-10-31T23:59:59Z",
                    "offerTermCode": "6QCMYABX3D",
                    "termAttributes": {
                        "LeaseContractLength": "1yr",
                        "OfferingClass": "standard",
                        "PurchaseOption": "All Upfront"
                    }
                },
                "FP7Z96TTU3VFSX2H.R5XV2EPZQZ": {
                    "priceDimensions": {
                        "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.2TG2D8R56U": {
                            "unit": "Quantity",
                            "description": "Upfront Fee",
                            "appliesTo": [],
                            "rateCode": "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.2TG2D8R56U",
                            "pricePerUnit": {
                                "USD": "743"
                            }
                        },
                        "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.6YS6EN2CT7": {
                            "unit": "Hrs",
                            "endRange": "Inf",
                            "description": "Linux/UNIX (Amazon VPC), m5.large reserved instance applied",
                            "appliesTo": [],
                            "rateCode": "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.6YS6EN2CT7",
                            "beginRange": "0",
                            "pricePerUnit": {
                                "USD": "0.0280000000"
                            }
                        }
                    },
                    "sku": "FP7Z96TTU3VFSX2H",
                    "effectiveDate": "2017-10-31T23:59:59Z",
                    "offerTermCode": "R5XV2EPZQZ",
                    "termAttributes": {
                        "LeaseContractLength": "3yr",
                        "OfferingClass": "convertible",
                        "PurchaseOption": "Partial Upfront"
                    }
                },
                ...
            }
        },
        "version": "20190520201514",
        "publicationDate": "2019-05-20T20:15:14Z"
    }
]

I cut the bottom part of the JSON for readability, because there are so many terms! Let's have a look at what they mean.

On Demand

You'll find that the top section contains the On Demand price. This is the price you pay when you don't apply Reserved Instances (RIs). It's the price most people are familiar with, and matches the prices mentioned on the EC2 On Demand Pricing Page.
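
Getting this price out of the nested structure takes two loops, because both the term key and the price dimension key are generated codes. The sketch below assumes a parsed product shaped like the JSON shown earlier; on_demand_hourly_usd is a name made up for this example.

```python
from decimal import Decimal


def on_demand_hourly_usd(product):
    """Return the On Demand hourly price in USD for one parsed price item."""
    for term in product["terms"]["OnDemand"].values():
        for dimension in term["priceDimensions"].values():
            if dimension["unit"] == "Hrs":
                # Prices arrive as strings; Decimal avoids float rounding noise
                return Decimal(dimension["pricePerUnit"]["USD"])
    raise ValueError("no hourly On Demand price found")
```

For the m5.large product above this would return the 0.107 from the pricePerUnit field.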

Reserved Instances

RIs are what makes pricing interesting. By reserving instances for longer periods of time, you achieve large savings on your EC2 instances. There are a few variants of Reserved Instances:

  • Term: 1 year or 3 year
  • Upfront: none, partial or all
  • Offering class: standard or convertible

Every attribute can be combined with every other attribute, so all in all there are 12 (2 * 3 * 2) reserved pricing models. Each of these models has its own entry in the results of the Pricing API, and that's where the majority of the content comes from.
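
The twelve combinations can be enumerated with itertools.product. The strings below follow the termAttributes values visible in the JSON above; "No Upfront" does not appear in the shortened excerpt, so treat that exact spelling as an assumption.

```python
from itertools import product

LEASE_LENGTHS = ["1yr", "3yr"]
PURCHASE_OPTIONS = ["No Upfront", "Partial Upfront", "All Upfront"]
OFFERING_CLASSES = ["standard", "convertible"]

# Every lease length combined with every purchase option and offering class
combinations = list(product(LEASE_LENGTHS, PURCHASE_OPTIONS, OFFERING_CLASSES))

for lease, purchase_option, offering_class in combinations:
    print(lease, purchase_option, offering_class)
```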

For reserved instances, there is an Upfront component (which might be zero) and a Price per Hour component. These are mentioned separately for each reserved instance variant:

{
    "unit": "Quantity",
    "description": "Upfront Fee",
    "appliesTo": [],
    "rateCode": "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.2TG2D8R56U",
    "pricePerUnit": {
        "USD": "743"
    }
}
{
    "unit": "Hrs",
    "endRange": "Inf",
    "description": "Linux/UNIX (Amazon VPC), m5.large reserved instance applied",
    "appliesTo": [],
    "rateCode": "FP7Z96TTU3VFSX2H.R5XV2EPZQZ.6YS6EN2CT7",
    "beginRange": "0",
    "pricePerUnit": {
        "USD": "0.0280000000"
    }
}
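
To compare reserved variants with each other, it can help to fold the upfront fee into an effective hourly price. The following sketch assumes the structure shown above ("Quantity" for the upfront fee, "Hrs" for the hourly price); summarize_reserved and HOURS_PER_YEAR are names introduced for this illustration, and spreading the fee evenly over 8760 hours per year is a simplification.

```python
from decimal import Decimal

HOURS_PER_YEAR = 8760  # 365 days; leap days are ignored in this sketch


def summarize_reserved(product):
    """Return one dict per Reserved term with upfront, hourly and effective hourly cost."""
    rows = []
    for term in product["terms"].get("Reserved", {}).values():
        upfront = Decimal("0")
        hourly = Decimal("0")
        for dimension in term["priceDimensions"].values():
            price = Decimal(dimension["pricePerUnit"]["USD"])
            if dimension["unit"] == "Quantity":
                upfront = price
            elif dimension["unit"] == "Hrs":
                hourly = price
        attrs = term["termAttributes"]
        years = int(attrs["LeaseContractLength"].replace("yr", ""))
        rows.append({
            "lease": attrs["LeaseContractLength"],
            "offering_class": attrs["OfferingClass"],
            "purchase_option": attrs["PurchaseOption"],
            "upfront": upfront,
            "hourly": hourly,
            # Spread the upfront fee evenly over the whole lease
            "effective_hourly": hourly + upfront / (years * HOURS_PER_YEAR),
        })
    return rows
```

Sorting the returned rows by effective_hourly gives a quick overview of the cheapest commitment for a product.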

Codes

Every product has its own code, called the sku. The m5.large with generic Linux and no pre-installed software used in the previous section has the SKU FP7Z96TTU3VFSX2H, for example. The variant with SQL Web has the SKU PPN47BE9AY3KPKYM.

The pricing terms have their own code, called offerTermCode. On Demand pricing will always have the JRTCKXETXF offer term code. As such, the On Demand pricing for the two m5.large variants is FP7Z96TTU3VFSX2H.JRTCKXETXF and PPN47BE9AY3KPKYM.JRTCKXETXF, respectively. The offerTermCode for 1 year, standard, all upfront reserved pricing is 6QCMYABX3D.

Lastly, the price dimensions like price per hour and upfront fee also have their own code. Price per hour is 6YS6EN2CT7, and the upfront fee is 2TG2D8R56U.

As such, the code for the upfront component of the 1 year, standard, all upfront reserved pricing for the m5.large running Linux with no additional software is FP7Z96TTU3VFSX2H.6QCMYABX3D.2TG2D8R56U.
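
Because the three parts are joined with dots, a rate code can be taken apart again with a simple split. A minimal sketch, assuming the codes themselves never contain a dot; RateCode and parse_rate_code are hypothetical names.

```python
from typing import NamedTuple


class RateCode(NamedTuple):
    sku: str
    offer_term_code: str
    price_dimension_code: str


def parse_rate_code(rate_code):
    """Split 'SKU.OFFERTERM.DIMENSION' into its three parts."""
    sku, offer_term_code, price_dimension_code = rate_code.split(".")
    return RateCode(sku, offer_term_code, price_dimension_code)


parsed = parse_rate_code("FP7Z96TTU3VFSX2H.6QCMYABX3D.2TG2D8R56U")
print(parsed.sku, parsed.offer_term_code, parsed.price_dimension_code)
```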

Conclusion

I hope this article shed some light on the complexity that is the AWS EC2 Pricing API. Using the Filters and their possible values, you should be able to drill down to the instance you're searching for. Using the explanation of the terms, you should also be able to find the pricing that applies to your situation.

If you have any questions or remarks, please reach out to me on Twitter.

