In the second installment of the Architecture Series, we will look at how the AWS pricing data for EC2 and RDS is periodically retrieved and stored.
Like the rest of CloudBanshee, the data displayed in the EC2 Pricing and RDS Pricing tools is generated and hosted entirely with Serverless technology. This keeps the website low-maintenance and cost-effective. For more info, see the first post in the Architecture Series.
The AWS pricing data is quite unwieldy. If we requested the data dynamically from the EC2 and RDS tools, this would result in an enormous number of API calls, complex Lambdas and API Gateways, and very high cost.
Instead, we have chosen to periodically query all relevant data from the AWS Pricing API and store it in JSON files on S3. These files are then retrieved by the user every time they visit the CloudBanshee pricing tools. To further optimize cost and speed, we use CloudFront to cache and compress the files.
The scheduled event runs every hour. When it does, it triggers the Lambda Function, which uses SSM to fetch all the AWS Regions, as described in the blog post Retrieving all Region Codes and Names with Boto3.
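As a rough illustration of the region lookup, the sketch below pages through the public SSM parameter path for regions and takes the region code from the last segment of each parameter name. The function name `list_region_codes` is hypothetical, and the client is passed in so the function can be exercised without AWS credentials; it is a minimal sketch, not necessarily how the CloudBanshee Lambda is written.

```python
# Public SSM path under which AWS publishes one parameter per region.
REGIONS_PATH = "/aws/service/global-infrastructure/regions"


def list_region_codes(ssm_client):
    """Return the sorted region codes found under REGIONS_PATH.

    `ssm_client` is expected to behave like boto3.client("ssm").
    """
    codes = []
    kwargs = {"Path": REGIONS_PATH}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for parameter in page["Parameters"]:
            # Names look like ".../regions/eu-west-1"; keep the last segment.
            codes.append(parameter["Name"].rsplit("/", 1)[-1])
        token = page.get("NextToken")
        if not token:
            return sorted(codes)
        # More results are available; request the next page.
        kwargs["NextToken"] = token
```

With real credentials this would be called as `list_region_codes(boto3.client("ssm"))`.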
For every region, it fetches the EC2 and RDS prices, as described in the blog post Using the EC2 Price List API.
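The per-region price lookup could look roughly like the sketch below. It assumes the products are filtered on the `regionCode` attribute and that the client behaves like `boto3.client("pricing")`; the helper name `fetch_products` is hypothetical, and the real Lambda may use additional filters to limit the result set.

```python
import json


def fetch_products(pricing_client, service_code, region_code):
    """Collect every product record for one service in one region."""
    products = []
    kwargs = {
        "ServiceCode": service_code,  # e.g. "AmazonEC2" or "AmazonRDS"
        "Filters": [
            # Assumption: products are selected by their regionCode attribute.
            {"Type": "TERM_MATCH", "Field": "regionCode", "Value": region_code}
        ],
    }
    while True:
        page = pricing_client.get_products(**kwargs)
        # Each PriceList entry is a JSON-encoded string describing one product.
        products.extend(json.loads(item) for item in page["PriceList"])
        token = page.get("NextToken")
        if not token:
            return products
        # Results are paginated, so keep following NextToken.
        kwargs["NextToken"] = token
```

Because results are paginated, a single region can take many calls, which is part of why the data is fetched on a schedule rather than on every page view.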
When the pricing data has been retrieved, it is stored in an S3 bucket for generated files. The actual website is stored in a different bucket for static files, as described in Architecture Series: Static Website. The CloudFront distribution for CloudBanshee uses both the static and the generated files bucket as origins.
This setup allows us to have different TTLs for different types of files. The generated files change every hour, and as such have a TTL of 3600 seconds (1 hour). The website files change a lot less often, so they have a TTL of 86400 seconds (1 day).
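The storage step and the one-hour TTL can be sketched together. The example below assumes the TTL is communicated through a `Cache-Control` header on the S3 object; the setup described here could equally set it in the CloudFront cache behavior for the generated files origin. The function name, bucket, and key are hypothetical.

```python
import json

GENERATED_TTL_SECONDS = 3600  # generated pricing files change every hour


def store_pricing_file(s3_client, bucket, key, data, ttl=GENERATED_TTL_SECONDS):
    """Serialize `data` to compact JSON and upload it with a cache lifetime.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns the number of bytes uploaded.
    """
    body = json.dumps(data, separators=(",", ":")).encode("utf-8")
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType="application/json",
        # Assumption: the TTL is expressed as a Cache-Control header.
        CacheControl=f"max-age={ttl}",
    )
    return len(body)
```

Static website files would follow the same pattern with `ttl=86400`.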
Putting a scheduled Lambda in CloudFormation requires an AWS::Lambda::Permission resource, which allows the scheduled event to invoke the function. You can find a blog post on how to do that here: Defining a Scheduled Lambda in CloudFormation.
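To show why the permission is needed, here are the same three pieces (rule, permission, target) written as Boto3 calls instead of CloudFormation resources. This is an imperative illustration rather than the template used for CloudBanshee; the function name, rule name, and statement ID are hypothetical.

```python
def schedule_lambda(events_client, lambda_client, rule_name, function_name, function_arn):
    """Create an hourly rule, allow it to invoke the function, and attach the target.

    The clients are expected to behave like boto3.client("events")
    and boto3.client("lambda").
    """
    # 1. The scheduled rule that fires every hour.
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 hour)",
        State="ENABLED",
    )["RuleArn"]

    # 2. The counterpart of AWS::Lambda::Permission: without it, the rule
    #    fires but is not allowed to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # 3. Point the rule at the function.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn
```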