EC2 Dynamic Metrics Monitoring

Never lose sight of monitoring your Autoscaled EC2 instances.

Ed Austin
Gecogeco


Photo by Shane Aldendorff on Unsplash

Overview

When using Autoscaling with EC2, you may want to monitor metrics on your instances (CPU Utilization, RequestCount, etc.).

Setting up monitoring manually works, but it is not efficient: Autoscaling removes instances and spawns new ones as part of its normal cycle, so manually created alarms quickly go stale.

One way to address this is to configure a CloudWatch Event with Lambda to dynamically create and remove CloudWatch alarms as instances are launched and terminated.

The diagram below roughly explains the process.

EC2 Dynamic Metrics Monitoring Architecture Overview

High-level process

1. A CloudWatch event is triggered when Autoscaling kicks in.

2. The CloudWatch event is forwarded to the Lambda function for processing. The Lambda function checks the event type using the event's detail-type attribute.

  • If “EC2 Instance Launch Successful”, create a CloudWatch Alarm dynamically
  • If “EC2 Instance Terminate Successful”, remove the CloudWatch Alarm dynamically

3. Based on the event, the CloudWatch Alarm is created or removed dynamically.
After the Lambda function has run, a newly spawned EC2 instance should have the CloudWatch alarm associated with it.

4. A notification is sent to the Slack channel when a metric crosses its threshold.
For example, if CPU Utilization > 50%, the alarm fires and a notification is sent to the Slack channel.
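
The dispatch in step 2 can be sketched in Ruby roughly as follows. This is a minimal illustration and not the author's code: plan_for and the symbols it returns are hypothetical names. Only the detail-type values and the EC2InstanceId field come from the Autoscaling event shown in the Testing section.

```ruby
# Hypothetical helper: decide what to do with an Autoscaling event.
# Returns [action, instance_id], where action is :create_alarm,
# :delete_alarm, or :ignore for any other event type.
def plan_for(event)
  instance_id = event.dig('detail', 'EC2InstanceId')

  action =
    case event['detail-type']
    when 'EC2 Instance Launch Successful'    then :create_alarm
    when 'EC2 Instance Terminate Successful' then :delete_alarm
    else :ignore
    end

  [action, instance_id]
end

launch = {
  'detail-type' => 'EC2 Instance Launch Successful',
  'detail'      => { 'EC2InstanceId' => 'i-012f5fd64120892e3' }
}
p plan_for(launch) # prints [:create_alarm, "i-012f5fd64120892e3"]
```

Returning :ignore for unknown event types keeps the function from touching alarms if the rule ever forwards other Autoscaling events.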

Setup

1. Set an IAM Policy on an IAM Role for the Lambda function

Since we will be using a Lambda function to create the CloudWatch Alarms and check EC2 instance states, we need to give the Lambda function access rights to manage these resources.

Create an IAM Policy for this and attach it to the IAM Role used by the Lambda function.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "cloudwatch:PutMetricAlarm",
        "cloudwatch:DeleteAlarms",
        "cloudwatch:DescribeAlarms"
      ],
      "Resource": "*"
    }
  ]
}

2. Create the lambda function

Now create the Lambda function and reference the role we created in Step 1. The sample source code is written in Ruby; you can find it here.

You can test it by triggering it manually. Refer to the Testing section below.
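
For readers without the linked source at hand, here is a hedged sketch of what such a handler might look like. It is not the author's implementation. The alarm name format, the SNS topic ARN (a placeholder for whatever forwards alarms to Slack), the ALARM_SNS_TOPIC_ARN variable, and the optional cloudwatch: argument are all assumptions made for illustration. The 50% CPU threshold mirrors the example above, and put_metric_alarm and delete_alarms are the CloudWatch client calls that correspond to the IAM actions granted in Step 1.

```ruby
# Assumption: alarms notify an SNS topic that is wired to Slack elsewhere.
# The default ARN is a placeholder, not a real topic.
SNS_TOPIC_ARN = ENV.fetch(
  'ALARM_SNS_TOPIC_ARN',
  'arn:aws:sns:us-east-1:123456789012:example-topic'
)

# Hypothetical naming scheme: one CPU alarm per instance.
def alarm_name(instance_id)
  "cpu-utilization-high-#{instance_id}"
end

# Parameters for an alarm that fires when average CPU exceeds 50%
# over one 5-minute period.
def alarm_params(instance_id)
  {
    alarm_name: alarm_name(instance_id),
    namespace: 'AWS/EC2',
    metric_name: 'CPUUtilization',
    dimensions: [{ name: 'InstanceId', value: instance_id }],
    statistic: 'Average',
    period: 300,
    evaluation_periods: 1,
    threshold: 50.0,
    comparison_operator: 'GreaterThanThreshold',
    alarm_actions: [SNS_TOPIC_ARN]
  }
end

# Lambda entry point. The optional cloudwatch: argument exists only so the
# function can be exercised locally with a stand-in client.
def handler(event:, context:, cloudwatch: nil)
  cloudwatch ||= begin
    require 'aws-sdk-cloudwatch'
    Aws::CloudWatch::Client.new
  end
  instance_id = event.dig('detail', 'EC2InstanceId')

  case event['detail-type']
  when 'EC2 Instance Launch Successful'
    cloudwatch.put_metric_alarm(alarm_params(instance_id))
    "created #{alarm_name(instance_id)}"
  when 'EC2 Instance Terminate Successful'
    cloudwatch.delete_alarms(alarm_names: [alarm_name(instance_id)])
    "deleted #{alarm_name(instance_id)}"
  else
    'ignored'
  end
end
```

Deriving the alarm name from the instance ID means the terminate path can delete the alarm without first looking it up.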

3. Create a CloudWatch Event Rule

Set up the CloudWatch Event Rule as shown in the image below.

CloudWatch Event Rule Setup
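
Since the exact rule settings live in the image, the following is only a guess at an equivalent event pattern, built in Ruby so it can be printed as JSON and pasted into the rule editor. The pattern keys (source, detail-type) and their values match the sample event in the Testing section. EVENT_PATTERN and matches_pattern? are hypothetical names, and the matcher is a simplified local check, not the actual CloudWatch Events matching engine.

```ruby
require 'json'

# Assumed event pattern: match Autoscaling launch and terminate events.
EVENT_PATTERN = {
  'source' => ['aws.autoscaling'],
  'detail-type' => [
    'EC2 Instance Launch Successful',
    'EC2 Instance Terminate Successful'
  ]
}.freeze

# Simplified local check: every key in the pattern must list the event's
# value for that key. Handles top-level string fields only.
def matches_pattern?(event, pattern = EVENT_PATTERN)
  pattern.all? { |key, allowed| allowed.include?(event[key]) }
end

puts JSON.pretty_generate(EVENT_PATTERN)
```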

4. Done!

Once everything is set up, your newly spawned EC2 instances will have your configured alarms.

Testing

You can use the JSON below to test the lambda function. The sample request is for launching an EC2 instance.

For termination testing, change the detail-type attribute of the JSON to “EC2 Instance Terminate Successful”.

{
  "version": "0",
  "id": "3e3c153a-8339-4e30-8c35-687ebef853fe",
  "detail-type": "EC2 Instance Launch Successful",
  "source": "aws.autoscaling",
  "account": "123456789012",
  "time": "2015-11-11T21:31:47Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:autoscaling:us-east-1:123456789012:autoScalingGroup:eb56d16b-bbf0-401d-b893-d5978ed4a025:autoScalingGroupName/sampleLuanchSucASG",
    "arn:aws:ec2:us-east-1:123456789012:instance/i-b188560f"
  ],
  "detail": {
    "StatusCode": "InProgress",
    "AutoScalingGroupName": "sampleLuanchSucASG",
    "ActivityId": "9cabb81f-42de-417d-8aa7-ce16bf026590",
    "Details": {
      "Availability Zone": "us-east-1b",
      "Subnet ID": "subnet-95bfcebe"
    },
    "RequestId": "9cabb81f-42de-417d-8aa7-ce16bf026590",
    "EndTime": "2015-11-11T21:31:47.208Z",
    "EC2InstanceId": "i-012f5fd64120892e3",
    "StartTime": "2015-11-11T21:31:13.671Z",
    "Cause": "At 2015-11-11T21:31:10Z a user request created an AutoScalingGroup changing the desired capacity from 0 to 1. At 2015-11-11T21:31:11Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1."
  }
}
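
If you prefer to script the two test cases instead of editing the JSON by hand, a small Ruby sketch like the one below can derive the termination event from the launch event. The embedded JSON is a trimmed copy of the sample above, keeping only the fields this sketch reads, and termination_event_from is a hypothetical helper name.

```ruby
require 'json'

# Trimmed copy of the sample launch event (only the fields used here).
LAUNCH_EVENT_JSON = <<~JSON
  {
    "detail-type": "EC2 Instance Launch Successful",
    "source": "aws.autoscaling",
    "detail": { "EC2InstanceId": "i-012f5fd64120892e3" }
  }
JSON

# Hypothetical helper: same event, with detail-type switched to the
# termination value. Hash#merge returns a copy, so the launch event
# keeps its original detail-type.
def termination_event_from(launch_event)
  launch_event.merge('detail-type' => 'EC2 Instance Terminate Successful')
end

launch_event    = JSON.parse(LAUNCH_EVENT_JSON)
terminate_event = termination_event_from(launch_event)

puts JSON.pretty_generate(terminate_event)
```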

Conclusion

It is possible to create dynamic CloudWatch Alarms for Autoscaled EC2 instances by using a CloudWatch Event Rule with Lambda. This removes manual intervention, since the Lambda function manages creating and removing the alarms.

That’s all!

Special thanks to Cedric Sarigumba.
