How to extract logs from AWS CloudWatch using Python and Boto3
AWS is a popular cloud service that offers CloudWatch, a powerful tool for monitoring, storing, and accessing log files from various resources like EC2 instances, CloudTrail, ECS, and Route 53.
Knowing how to extract these logs for analysis is essential. In this blog, we will walk through pulling exactly the log data we need.
Let's say we want to extract only the messages that contain “START RequestId:” from the last 7 days of logs.
1. Install the required library (pip install boto3) and import the modules:
import boto3
from datetime import datetime, timedelta
import time
2. Create a CloudWatch Logs client by passing 'logs' as the service name:
client = boto3.client('logs')
3. Define the log group and the time frame:
log_group_name = '/aws/lambda/wby-graphql-af86eaa' # Replace with your log group name
end_time = int(datetime.now().timestamp() * 1000) # Current time in milliseconds
start_time = int((datetime.now() - timedelta(days=7)).timestamp() * 1000) # Time 7 days ago in milliseconds
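The two timestamps above come from two separate calls to datetime.now(), so the window is very slightly longer than 7 days. A minimal sketch of one way to tidy this up is shown below, assuming a hypothetical helper named last_n_days_ms (not part of Boto3) that takes a single "now" value in UTC and derives both bounds from it:

```python
from datetime import datetime, timedelta, timezone


def last_n_days_ms(days, now=None):
    """Return (start_ms, end_ms) for the last `days` days as epoch milliseconds.

    Hypothetical helper for illustration; `now` can be injected for testing.
    """
    # Use one timezone-aware "now" so both bounds share the same reference point
    now = now or datetime.now(timezone.utc)
    end_ms = int(now.timestamp() * 1000)
    start_ms = int((now - timedelta(days=days)).timestamp() * 1000)
    return start_ms, end_ms
```

Calling start_time, end_time = last_n_days_ms(7) should give values that can be used in place of the two assignments above.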
4. Define the pattern to extract from logs:
filter_pattern = '"START RequestId:"' # Replace with your filter pattern like "Response of API: {}"
5. Get a paginator for the filter_log_events operation:
paginator = client.get_paginator('filter_log_events')
6. Create a response iterator using the paginator:
response_iterator = paginator.paginate(
    logGroupName=log_group_name,
    filterPattern=filter_pattern,
    startTime=start_time,
    endTime=end_time
)
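A 7-day window on a busy log group can return a very large number of events. As a rough sketch, the paginator call can be wrapped in a generator that optionally caps the total with the paginator's PaginationConfig argument. The function name iter_filtered_events and its max_items parameter are illustrative choices, and the client is passed in so the function can be exercised without AWS access:

```python
def iter_filtered_events(client, log_group_name, filter_pattern,
                         start_time, end_time, max_items=None):
    """Yield matching log events one at a time.

    Illustrative wrapper: `client` is expected to be a boto3 'logs' client.
    """
    paginator = client.get_paginator('filter_log_events')
    kwargs = {
        'logGroupName': log_group_name,
        'filterPattern': filter_pattern,
        'startTime': start_time,
        'endTime': end_time,
    }
    if max_items is not None:
        # MaxItems limits the total number of events returned across all pages
        kwargs['PaginationConfig'] = {'MaxItems': max_items}
    for page in paginator.paginate(**kwargs):
        # Some pages can be empty even when later pages contain events
        for event in page.get('events', []):
            yield event
```

With the client from step 2, something like iter_filtered_events(client, log_group_name, filter_pattern, start_time, end_time, max_items=1000) would stop after at most 1000 events, which is handy for trying out a filter pattern before running the full extraction.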
7. Create a list of the required messages by iterating over the response iterator:
logs = []
for response in response_iterator:
    for event in response['events']:
        message = event['message']
        logs.append(message)
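Each event returned by filter_log_events also carries a timestamp (in epoch milliseconds) and a logStreamName, which are often useful during analysis. The sketch below keeps them alongside the message. The function name summarize_events is a hypothetical choice, and it accepts any iterable of response pages, so the response iterator from step 6 should work as input:

```python
from datetime import datetime, timezone


def summarize_events(pages):
    """Flatten response pages into (utc_datetime, log_stream, message) tuples."""
    rows = []
    for page in pages:
        for event in page.get('events', []):
            # Event timestamps are epoch milliseconds; convert to an aware datetime
            when = datetime.fromtimestamp(event['timestamp'] / 1000, tz=timezone.utc)
            # Log messages often end with a newline, so strip trailing whitespace
            rows.append((when, event.get('logStreamName'), event['message'].rstrip()))
    return rows
```

Note that a response iterator is consumed as it is read, so it is best to use either this function or the loop from step 7 on a given iterator, not both.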
Full code:
The complete script below extracts the matching log messages from CloudWatch Logs.
import time
import boto3
from datetime import datetime, timedelta
# Create a CloudWatch Logs client
client = boto3.client('logs')
# Define the log group name and the time range for which the logs are to be retrieved
log_group_name = '/aws/lambda/wby-graphql-af86eaa' # Replace with your log group name
end_time = int(datetime.now().timestamp() * 1000) # Current time in milliseconds
start_time = int((datetime.now() - timedelta(days=7)).timestamp() * 1000) # Time 7 days ago in milliseconds
# Function to get logs from CloudWatch
def get_logs(log_group_name, start_time, end_time):
    """
    Retrieves logs from the specified log group within the given time range.
    Parameters:
        log_group_name (str): The name of the log group.
        start_time (int): The start time in milliseconds.
        end_time (int): The end time in milliseconds.
    Returns:
        list: A list of log messages.
    """
    filter_pattern = '"START RequestId:"' # Replace with your filter pattern
    paginator = client.get_paginator('filter_log_events')
    print(f"paginator: {paginator}")
    # Paginate through the log events
    response_iterator = paginator.paginate(
        logGroupName=log_group_name,
        filterPattern=filter_pattern,
        startTime=start_time,
        endTime=end_time
    )
    logs = []
    for response in response_iterator:
        for event in response['events']:
            message = event['message']
            logs.append(message)
    return logs
if __name__ == '__main__':
    # Measure the time taken to retrieve logs
    st = time.time()
    # Retrieve logs
    logs = get_logs(log_group_name, start_time, end_time)
    print(f"Logs: {logs}")
    # Calculate elapsed time
    et = time.time()
    elapsed_time = et - st
    print(f"Elapsed time: {elapsed_time} seconds")
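Once the messages are in a list, a small amount of parsing makes them easier to analyze. As a rough sketch, the request IDs can be pulled out with a regular expression, assuming each matching line has the shape "START RequestId: <id> Version: <version>" that Lambda typically writes. The names START_LINE and extract_request_ids are illustrative:

```python
import re

# Assumed line shape: "START RequestId: <id> Version: <version>"
START_LINE = re.compile(r'^START RequestId:\s*(\S+)')


def extract_request_ids(messages):
    """Return the request IDs found in START lines, in order of appearance."""
    request_ids = []
    for message in messages:
        match = START_LINE.match(message)
        if match:
            # Group 1 is the first whitespace-delimited token after the prefix
            request_ids.append(match.group(1))
    return request_ids
```

Passing the logs list returned by get_logs to extract_request_ids should give one ID per invocation, and lines that do not fit the assumed shape are skipped.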