In this article, we’ll walk you through how we cut our AWS EC2 and EBS costs by identifying instances that haven’t been used in the last 30 days and deleting them, along with their associated EBS volumes.
The Problem: EC2 instances & EBS volumes not being used
As our infrastructure grew, so did the number of EC2 instances. Over time, we realized that not all of them were actively being used. Some were remnants of old projects, while others were test instances that had served their purpose. These instances were silently adding to our monthly AWS bill.
The Solution: Lambda function & CloudWatch alarm setup
Our goal was clear: Identify EC2 instances that haven’t been used in the last 30 days and delete them to cut costs. Here’s how we approached the problem:
- Create a Custom Metric:
  - We used a script or an AWS Lambda function to monitor the desired metrics, such as CPU utilization, of our EC2 instances.
  - If an instance had been underutilized for an extended period, we pushed a custom metric to CloudWatch.
- Create a CloudWatch Alarm:
  - Based on the custom metric, we created an alarm in CloudWatch.
- Delete the EC2 instances & EBS volumes:
  - ⚠️ Important Note: This method focuses on detecting instances with low CPU usage. However, there could be instances with minimal CPU activity but significant network usage. It’s crucial to consider all aspects of an instance’s activity before deeming it “unused.”
  - If you don’t want to delete the EBS volumes, this article shows how we also trimmed 35% of our AWS bill by moving unused EBS volumes to ‘colder’ S3 storage.
  - Don’t know what an EBS volume is and how AWS manages it? Check this article.
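The caveat about network traffic can be folded into the idle check itself. The sketch below is one possible way to do that, not the method used in this article: it treats an instance as idle only when every daily CPU average and every daily network total stay below a threshold. The function name `is_idle`, the 5 MB per day network threshold, and the choice to sum `NetworkIn` and `NetworkOut` are assumptions made for illustration.

```python
def is_idle(cpu_daily_avg, network_daily_bytes,
            cpu_threshold=5.0, network_threshold_bytes=5 * 1024 * 1024):
    """Return True only if both CPU and network activity stayed low.

    cpu_daily_avg: daily average CPUUtilization values, in percent.
    network_daily_bytes: daily NetworkIn + NetworkOut totals, in bytes.
    Both thresholds are illustrative assumptions, not AWS recommendations.
    """
    # Without datapoints we cannot judge the instance, so do not flag it.
    if not cpu_daily_avg or not network_daily_bytes:
        return False
    cpu_low = all(v < cpu_threshold for v in cpu_daily_avg)
    network_low = all(v < network_threshold_bytes for v in network_daily_bytes)
    return cpu_low and network_low


# A quiet CPU with heavy traffic is not idle; quiet on both counts is.
print(is_idle([1.2, 0.8, 2.0], [40e9, 35e9, 42e9]))  # False
print(is_idle([1.2, 0.8, 2.0], [1e5, 2e5, 1.5e5]))   # True
```

The network values would come from the same CloudWatch query style used for CPU, with the `Sum` statistic instead of `Average`.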
Here’s a step-by-step guide:
Create a Custom Metric to identify the unused EC2 instances:
The Lambda Function
- Create a Lambda function with permissions to describe EC2 instances and put CloudWatch metrics.
- Use the AWS SDK (e.g., Boto3 for Python) to describe your EC2 instances and their metrics.
- If an instance has low utilization for over 30 days, push a custom metric to CloudWatch.
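The permissions mentioned in the first step could look roughly like the policy below. This is a sketch built from the API calls this guide makes, not a policy taken from our account; the variable name `LAMBDA_POLICY` is hypothetical, and `Resource` is left as `"*"` for simplicity, so tighten it where your setup allows.

```python
import json

# IAM actions matching the Boto3 calls used in this guide:
# describe_instances, describe_volumes, get_metric_data, put_metric_data.
LAMBDA_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:DescribeVolumes",
                "cloudwatch:GetMetricData",
                "cloudwatch:PutMetricData",
            ],
            # Assumption: broad resource scope, kept simple for the example.
            "Resource": "*",
        }
    ],
}

# Print the JSON document to paste into the IAM console or a template.
print(json.dumps(LAMBDA_POLICY, indent=2))
```

The function's execution role usually also needs permission to write its own logs, for example through the `AWSLambdaBasicExecutionRole` managed policy.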
Here’s our Python example code using boto3:
import datetime

import boto3

CPU_THRESHOLD_PERCENT = 5  # A daily average below this counts as idle
LOOKBACK_DAYS = 30


def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    cloudwatch = boto3.client('cloudwatch')
    # Use timezone-aware UTC timestamps so the window does not depend on the local time zone
    end_time = datetime.datetime.now(datetime.timezone.utc)
    start_time = end_time - datetime.timedelta(days=LOOKBACK_DAYS)
    unused_instances = []
    # Get all instances; paginate so large accounts are fully covered
    paginator = ec2.get_paginator('describe_instances')
    for page in paginator.paginate():
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                instance_id = instance['InstanceId']
                instance_type = instance['InstanceType']
                # Get the daily average CPU utilization for the last 30 days
                metrics = cloudwatch.get_metric_data(
                    MetricDataQueries=[
                        {
                            'Id': 'm1',
                            'MetricStat': {
                                'Metric': {
                                    'Namespace': 'AWS/EC2',
                                    'MetricName': 'CPUUtilization',
                                    'Dimensions': [
                                        {
                                            'Name': 'InstanceId',
                                            'Value': instance_id
                                        },
                                    ]
                                },
                                'Period': 86400,  # One day in seconds
                                'Stat': 'Average',
                            },
                            'ReturnData': True,
                        },
                    ],
                    StartTime=start_time,
                    EndTime=end_time,
                )
                values = metrics['MetricDataResults'][0]['Values']
                # Flag the instance only if it reported data and every daily average is below the threshold.
                # Instances with no datapoints in the window are skipped; instances that reported
                # for fewer than 30 days are judged on the days they did report.
                if values and all(value < CPU_THRESHOLD_PERCENT for value in values):
                    ebs_volumes = []
                    for block_device in instance.get('BlockDeviceMappings', []):
                        # Skip any mapping that has no EBS details
                        if 'Ebs' not in block_device:
                            continue
                        volume_id = block_device['Ebs']['VolumeId']
                        volume = ec2.describe_volumes(VolumeIds=[volume_id])
                        ebs_volumes.append({
                            'VolumeId': volume_id,
                            'Size': volume['Volumes'][0]['Size']  # Size in GiB
                        })
                    unused_instances.append({
                        'InstanceId': instance_id,
                        'InstanceType': instance_type,
                        'EBSVolumes': ebs_volumes
                    })
    return unused_instances
Here’s a high level breakdown of our code:
- Data Collection: Using Boto3, we pulled data on all EC2 instances and their 30-day CPU utilization from CloudWatch.
- Idle Detection & EBS Association: Identified EC2 instances with low CPU activity (below 5%) and located their associated EBS volumes, pinpointing potential cost drains.
- Optimization & Savings: By regularly assessing and acting on this data, we streamlined our infrastructure, reducing costs tied to unused EC2 instances and EBS volumes.
This Lambda function, when executed, returns a list of unused instances, providing a clear picture of potential cost savings.
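The function as listed returns the unused instances but does not yet publish the custom metric that the alarm in the next section watches. A minimal sketch of that step follows. The helper name `publish_unused_count` is hypothetical, the metric name `UnusedInstanceCount` comes from the console steps in this guide, and the namespace `UnusedEC2` is an assumption based on those same steps.

```python
def publish_unused_count(cloudwatch, unused_instances,
                         namespace="UnusedEC2",
                         metric_name="UnusedInstanceCount"):
    """Push the number of unused instances as a custom CloudWatch metric.

    cloudwatch: a CloudWatch client, e.g. boto3.client('cloudwatch').
    namespace: assumed value; use whatever namespace your alarm watches.
    """
    count = len(unused_instances)
    # Publish zero as well, so the alarm can return to OK after a cleanup.
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[
            {
                "MetricName": metric_name,
                "Value": count,
                "Unit": "Count",
            }
        ],
    )
    return count
```

Inside `lambda_handler`, this would be called as `publish_unused_count(cloudwatch, unused_instances)` just before the `return` statement.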
Create a CloudWatch Alarm
- Navigate to the CloudWatch console.
- In the left navigation pane, click on Metrics.
- Click on the Custom namespace and then the UnusedEC2 namespace.
- Select the UnusedInstanceCount metric.
- Click on the Create Alarm button.
- Configure the alarm:
  - Name and description.
  - Define the condition. For instance, if you want to be alerted when there’s at least one unused instance, set the threshold to >= 1.
  - Configure actions like sending a notification.
- Click on the Create Alarm button.
Now, whenever the Lambda function identifies unused instances and pushes the metric, the CloudWatch Alarm will trigger if the condition is met.
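If you prefer to script the alarm instead of clicking through the console, the same settings can be sketched with Boto3's `put_metric_alarm`. The helper name, the alarm name, the `Maximum` statistic, the one-day period, and the SNS topic ARN you pass in are all assumptions for illustration; only the metric name and the `>= 1` threshold come from the steps above.

```python
def create_unused_instance_alarm(cloudwatch, sns_topic_arn,
                                 namespace="UnusedEC2",
                                 metric_name="UnusedInstanceCount"):
    """Create an alarm that fires when at least one instance looks unused.

    cloudwatch: a CloudWatch client, e.g. boto3.client('cloudwatch').
    sns_topic_arn: ARN of an existing SNS topic that receives the alert.
    """
    params = {
        "AlarmName": "unused-ec2-instances",  # hypothetical name
        "AlarmDescription": "At least one EC2 instance looks unused",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Maximum",
        "Period": 86400,  # one day, assuming the Lambda runs daily
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # A day without data should not raise the alarm by itself.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
    cloudwatch.put_metric_alarm(**params)
    return params
```

A call would look like `create_unused_instance_alarm(boto3.client('cloudwatch'), topic_arn)`, where `topic_arn` points at a topic you have already created.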
Remember to schedule the Lambda function to run periodically (e.g., daily) using Amazon EventBridge (formerly CloudWatch Events), so that it regularly checks for unused instances and pushes the metric.
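One way to set up that schedule from code is sketched below. It creates a daily rule, allows the rule to invoke the function, and registers the function as the rule's target. The helper name `schedule_daily`, the rule name, and the target ID are hypothetical; the clients are passed in so the sketch stays independent of how you create them.

```python
def schedule_daily(events, lambda_client, function_arn,
                   rule_name="unused-ec2-daily-check"):
    """Run the Lambda function once a day through a scheduled rule.

    events: an EventBridge client, e.g. boto3.client('events').
    lambda_client: a Lambda client, e.g. boto3.client('lambda').
    function_arn: ARN of the function that checks for unused instances.
    """
    # Create (or update) a rule that fires once per day.
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )
    # Allow this specific rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # Point the rule at the function.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "unused-ec2-lambda", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

Note that `add_permission` fails if a statement with the same ID already exists, so a second run of this helper needs either a new statement ID or a cleanup of the old one.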
By implementing this proactive approach, we saw a substantial reduction in our monthly AWS bill. Not only did we save on the costs of the EC2 instances, but also on the storage costs associated with the EBS volumes.
Note: Before deleting any EC2 instances or EBS volumes, always ensure you have backups and have communicated with relevant stakeholders. It’s essential to ensure no critical data or applications are lost in the process.
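To make the backup advice concrete, here is a cautious sketch of the deletion step. It is an assumption about how the cleanup could be scripted, not the exact procedure we ran: it snapshots every listed volume, waits for the snapshots to complete, and only then terminates the instances. The helper name `back_up_and_terminate` is hypothetical, and `dry_run` defaults to `True` so that nothing is changed unless you opt in.

```python
def back_up_and_terminate(ec2, unused_instances, dry_run=True):
    """Snapshot every listed EBS volume, then terminate the instances.

    ec2: an EC2 client, e.g. boto3.client('ec2').
    unused_instances: the list returned by the Lambda function above.
    dry_run: when True (the default), only report what would happen.
    """
    volume_ids = [v["VolumeId"] for i in unused_instances for v in i["EBSVolumes"]]
    instance_ids = [i["InstanceId"] for i in unused_instances]
    plan = {"snapshot": volume_ids, "terminate": instance_ids}
    if dry_run or not instance_ids:
        return plan

    snapshot_ids = []
    for volume_id in volume_ids:
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"Backup of {volume_id} before cleanup",
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    if snapshot_ids:
        # Block until the backups are complete before removing anything.
        ec2.get_waiter("snapshot_completed").wait(SnapshotIds=snapshot_ids)
    ec2.terminate_instances(InstanceIds=instance_ids)
    return plan
```

Terminating an instance only removes the volumes that have `DeleteOnTermination` enabled. Any other attached volume survives the termination and keeps being billed until it is removed with `delete_volume`. The snapshot waiter also gives up after a limited number of polls, so large volumes may need a longer wait than this sketch allows.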