AWS Autoscaling Based On Database Query Custom Metrics

Did you ever run into a situation where the generic autoscaling triggers available like Avg CPU, Network IO, custom metrics memory etc are not enough to rightly decide whether you should scale up or scale down? We ran into this problem for a client with very specific workload and we had to find a way to scale up and down based on the result of a SQL query on target database?

So how did we go about achieving this? Custom metrics again, of course. We created a lambda function to poll the database at regular intervals and the results were shipped to CloudWatch as custom metrics.This custom metric was used to trigger scale up and scale down events. What follows is a detailed step by step for configure this with a generic SELECT COUNT(*) SQL Query on a MYSQL database. Depending on your needs, you can make changes to what gets polled for custom metrics.

Prerequisites

DB Instance

The Instance for which you want to put count value to Cloudwatch.
We have a RDS Instance available in private subnet with publicly accessible set to No
The RDS instance has a database named powerdb which has an employees table. The employees table has two columns of type INT - id and count. The count value is being added to the cloudwatch metric through Lambda Function. Like I said already, this is a simplistic example. Your database query in real life might be more complex than a count(*)

AutoScaling Group

The autoscaling group which will scale with the Cloudwatch Metric Data. We have one available with initially one server in running state and no scaling policy assigned.

Lambda Execution IAM Role

Create a lambda_basic_execution role with the following inline policy attached

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "ec2:CreateNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
        }
    ]
}

This policy will allow you to create a lambda function within a VPC.

Creating The Lambda Function

We have posted the lambda function code on this public repo. Few snippets from the code are explained below:

We have used mysql nodejs module to connect the function to the Mysql RDS Instance. If you are using a different database, there is nothing stopping you from connecting as long as node has a connector. Provide the database credentials in the code as shown below.

var connection = mysql.createConnection({  
  host     : 'xxxxxx.us-east-1.rds.amazonaws.com',
  user     : 'puc',
  password : 'xxxxxx',
  database : 'powerdb'
});

connection.connect();  

Next, we are querying the database and getting the value that needs to be put in the Cloudwatch.This is the place to change the query to your liking. If your database supports NOLOCK hints etc, you might want to add them too to avoid locking.

connection.query('SELECT count from employees where id=1', function(err, rows, fields) {  
  if (err) throw err;

  console.log('The count is: ', rows[0].count);
  value = rows[0].count
  putCloudWatchMetric('DBCount', value);

});

Once we get the count value in a variable ‘value’, we are calling a function putCloudWatchMetric() that we defined as

function putCloudWatchMetric(metricName, count){  
    cw.putMetricData({
        Namespace: 'DBData',
        MetricData: [{
            'MetricName': metricName,
            'Unit': 'Count',
            'Value': count
        }]}, function(err, data) {
                 if (err) console.log(err, err.stack); // an error occurred
                else     console.log(data);           // successful response
        }
     )
}

Create this lambda function in the same VPC as RDS. Ensure to use private subnets for this function.
Also, for lambda function to access the RDS, open the DB instance port in its security group to the security group used by lambda function.
We have scheduled this lambda function to trigger every 15 mins. You may change this frequency as well, if you need faster scaleup and down events.

Creating CloudWatch Alarms From Polled Query Results

Once the lambda function triggered successfully, go to Cloudwatch console and see the graph for the DB Count.
Create an alarm for this count graph.
We have created an alarm for count value greater than 10, this is for scale up. Create one more Cloudwatch alarm for Scale down policy when the count value is less than 10.

Creating Scaling Policies

Go to Autoscaling Console and create scale up policy with alarm count_alarm. So whenever the count value will be greater than 10 for 300 seconds, it will set the instance count to 2.
Create a scale down policy too with other alarm count_alarm01. So, whenever the count value will be less than 10 for 300 seconds, it will set the instance count to 1.
Once we create both policies, we will have scale up and scale down policies as shown below in the screenshot

Verify Autoscaling

Connect to the Mysql Database and update the count value to 14 (greater than 10).
We have scheduled the lambda function to trigger every 15 mins. So, the count value will get updated in Cloudwatch Metric every 15 mins. We can see the count graph in Cloudwatch.
Due to the Scale Up policy, the auto scaling will happen and we can see the updated instance count to 2
Next, we can verify the scale down policy too by updating the count value again to less than 10.
Once the lambda function will trigger, we can see the updated count value in the Cloudwatch graph.
Autoscaling will happen and the instance count will scale down to 1
Also, you can set the DBCount range in Autoscaling policies.
Next, we have updated the count value to 25.
As per the policy, the autoscaling will set the instance count to 3.

Hope you found this useful.

Happy scaling! :)

Priyanka Sharma

Priyanka is Senior Cloud and DevOps Engineer. She can churn out CloudFormation templates at a moment's notice and play with Chef/Ansible. Dancing, music, badminton and word games are her hobbies

comments powered by Disqus