One of the best ways to save money on AWS is turning resources off when you’re not using them. This is pretty easy to automate if you have consistent usage patterns (like an application that’s only used during business hours), but can be harder if the usage is very irregular (for example, an application that’s only used a few times per quarter).
We recently worked with a customer whose applications could go unused for months. To be more cost efficient, they were looking for a solution where:
- They could turn off as many instances and services as possible
- The users could start the application with one button click if they needed to use it
- The users didn’t have AWS credentials
We came up with the following solution to satisfy these requirements, and if you’re running the same kind of applications, maybe you can also reduce costs by implementing this.
This solution works by taking advantage of Route53 health checks. We’ve split our infrastructure into two parts: an always-on part that uses low-cost or usage-based services to give the user a way to start the real application, and a part that can be started and stopped on demand.
We configure the on-demand part as the primary resource in Route53 and the always-on part as the failover. This way, traffic is routed to the real application when it’s online; when it’s not, the user gets a static webpage that gives them the option to start it.
If the application is offline, a request goes through the following steps:
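As a rough illustration of that primary/failover setup, the sketch below builds the two failover alias records as a Route53 change batch. The distribution domain names and the health check ID are hypothetical placeholders, and the helper names are ours, not part of any AWS library.

```python
# CloudFront alias targets always use this fixed hosted zone ID.
CLOUDFRONT_HOSTED_ZONE_ID = "Z2FDTNDATAQYW2"


def failover_record(name, role, cloudfront_domain, health_check_id=None):
    """Build one failover alias record pointing at a CloudFront distribution."""
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-{name}",
        "Failover": role,  # "PRIMARY" or "SECONDARY"
        "AliasTarget": {
            "HostedZoneId": CLOUDFRONT_HOSTED_ZONE_ID,
            "DNSName": cloudfront_domain,
            # Route53 cannot evaluate the health of a CloudFront alias target,
            # so the primary record relies on an explicit health check instead.
            "EvaluateTargetHealth": False,
        },
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return record


def build_change_batch(name, app_domain, fallback_domain, health_check_id):
    """Return a change batch that upserts the primary and secondary records."""
    records = [
        failover_record(name, "PRIMARY", app_domain, health_check_id),
        failover_record(name, "SECONDARY", fallback_domain),
    ]
    return {"Changes": [{"Action": "UPSERT", "ResourceRecordSet": r} for r in records]}


change_batch = build_change_batch(
    "application.example.com.",
    "d111111abcdef8.cloudfront.net.",  # hypothetical application distribution
    "d222222abcdef8.cloudfront.net.",  # hypothetical always-on distribution
    "11111111-2222-3333-4444-555555555555",  # hypothetical health check ID
)
```

With boto3, a batch like this could be applied with `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=change_batch)`, using the ID of your own hosted zone.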
- The user requests the DNS record for application.example.com from Route53. Because the real application is offline, Route53 responds with the record set of the fallback CloudFront distribution.
- The user requests a page from CloudFront. CloudFront gets it from S3 and serves it to the user. This page explains why the application is not available and contains a button to start it.
- When the user clicks the button, the page uses JavaScript to call API Gateway, which invokes a Lambda function.
- The Lambda function calls Service Catalog or CloudFormation (depending on your environment) to start the real application.
- When the application has started, the health check passes, and Route53 starts returning the record set for the CloudFront distribution that is linked to the application.
- Once the user’s browser picks up the new DNS records, its requests go through the second CloudFront distribution to the real application.
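The Lambda step above could look roughly like the following minimal sketch, assuming the CloudFormation variant and an API Gateway proxy integration. The stack name, template URL and environment variable names are hypothetical, and the optional `cfn` parameter is only there so the handler can be exercised without AWS access.

```python
import json
import os

# Hypothetical settings; in a real deployment these would be set on the function.
STACK_NAME = os.environ.get("APP_STACK_NAME", "on-demand-application")
TEMPLATE_URL = os.environ.get(
    "APP_TEMPLATE_URL", "https://example-bucket.s3.amazonaws.com/application.yaml"
)


def start_application(cfn):
    """Ask CloudFormation to create the stack that contains the real application."""
    response = cfn.create_stack(StackName=STACK_NAME, TemplateURL=TEMPLATE_URL)
    return response["StackId"]


def lambda_handler(event, context, cfn=None):
    """Entry point for an API Gateway proxy integration."""
    if cfn is None:
        import boto3  # provided by the Lambda Python runtime

        cfn = boto3.client("cloudformation")
    stack_id = start_application(cfn)
    return {
        "statusCode": 202,
        # The static page lives on another origin than the API, so allow it via CORS.
        "headers": {"Access-Control-Allow-Origin": "https://application.example.com"},
        "body": json.dumps({"status": "starting", "stackId": stack_id}),
    }
```

This sketch does not handle a second click while the stack is already being created; a real implementation would probably want to catch that case and return the same "starting" response.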
Some things to keep in mind.
This is only a high-level overview of a possible solution. To implement this, you would also have to consider the following:
- After starting the application, the static webpage should refresh itself to force the browser to do a new DNS lookup.
- CloudFront will cache errors for 5 minutes by default. Decreasing this will make the failover go faster.
- The TTL of a CloudFront DNS record is 60 seconds, so a client can keep using the old record for up to a minute after Route53 switches over.
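To get a feel for how these delays add up, here is a back-of-the-envelope sketch. It assumes the health check reaches the application through CloudFront (so a cached error can delay the first passing check), a 30-second check interval, and 3 consecutive passes before Route53 marks the endpoint healthy. Those health check values are assumptions for illustration, not settings taken from the setup above.

```python
def worst_case_switch_seconds(error_cache_ttl, check_interval, healthy_threshold, dns_ttl):
    """Rough upper bound on the time between the application becoming ready
    and users actually being routed to it."""
    # 1. A cached error may be served until the error cache expires.
    # 2. The health check must then pass `healthy_threshold` times in a row.
    # 3. Clients may keep the old DNS answer until its TTL runs out.
    return error_cache_ttl + check_interval * healthy_threshold + dns_ttl


# CloudFront error caching left at 5 minutes.
default_delay = worst_case_switch_seconds(300, 30, 3, 60)  # 450 seconds

# Error caching reduced to 10 seconds.
tuned_delay = worst_case_switch_seconds(10, 30, 3, 60)  # 160 seconds
```

Under these assumptions, lowering the error caching time is by far the biggest win, which is why it is worth tuning first.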