Question
I have a Lambda function that writes metrics to CloudWatch. While it writes metrics, it also generates some logs in a log group:
INFO:: username: simran+test@abc.com ClinicID: 7667 nodename: MacBook-Pro-2.local
INFO:: username: simran+test2@abc.com ClinicID: 7669 nodename: MacBook-Pro-3.local
INFO:: username: simran+test@abc.com ClinicID: 7668 nodename: MacBook-Pro-4.local
INFO:: username: simran+test3@abc.com ClinicID: 7667 nodename: MacBook-Pro-5.local
INFO:: username: simran+test3@abc.com ClinicID: 7667 nodename: MacBook-Pro-2.local
I need an efficient way to get the distinct values of nodename for a given ClinicID. For example, if I pass in 7667 for ClinicID, I expect:
['MacBook-Pro-2.local', 'MacBook-Pro-5.local']
This is what I tried:
query = "fields @timestamp, @message | parse @message \"username: * ClinicID: * nodename: *\" as username, ClinicID, nodename | filter ClinicID = "+ clinic_id
start_query_response = client.start_query(
    logGroupName=log_group,
    startTime=int(time.mktime((Util.utcnow() - timedelta(hours=hours)).timetuple())),
    endTime=int(time.mktime(Util.utcnow().timetuple())),
    queryString=query,
)
I considered iterating over the results of start_query_response in Python, but I do not like that idea. Since I will be looking at logs spanning 7 days, I need an efficient way to do this instead of iterating over every log entry from the past 7 days for the given ClinicID.
Answer 1:
You can pipe your expression to the stats command and count the occurrences of each nodename.
Add this to the end of your query:
| stats count(*) by nodename
Result will be:
{
    'results': [
        [
            {
                'field': 'nodename',
                'value': 'MacBook-Pro-2.local\n'
            },
            {
                'field': 'count(*)',
                'value': '2'
            }
        ],
        [
            {
                'field': 'nodename',
                'value': 'MacBook-Pro-5.local\n'
            },
            {
                'field': 'count(*)',
                'value': '1'
            }
        ]
    ]
}
See here for more details on various commands: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html
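Putting the pieces together, here is a minimal sketch of how the stats query could be run end to end with boto3 and reduced to a plain list of node names. It assumes that clinic_id is numeric, that the time window is simply "the last N hours", and that polling get_query_results until the query leaves the Scheduled/Running states is acceptable. The helper names build_query, extract_nodenames and distinct_nodenames are made up for this illustration and are not part of boto3.

```python
import time


def build_query(clinic_id):
    # Parse pattern taken from the question, with the stats aggregation appended.
    # Assumption: clinic_id is numeric, so int() is used to validate it.
    return (
        'fields @message '
        '| parse @message "username: * ClinicID: * nodename: *" '
        'as username, ClinicID, nodename '
        f'| filter ClinicID = {int(clinic_id)} '
        '| stats count(*) by nodename'
    )


def extract_nodenames(results):
    # Each result row is a list of {'field': ..., 'value': ...} dicts.
    # strip() removes the trailing newline visible in the sample output.
    names = set()
    for row in results:
        for cell in row:
            if cell["field"] == "nodename":
                names.add(cell["value"].strip())
    return sorted(names)


def distinct_nodenames(client, log_group, clinic_id, hours, poll_seconds=1.0):
    # client is expected to be a CloudWatch Logs client, e.g. boto3.client("logs").
    end = int(time.time())
    start = end - hours * 3600
    query_id = client.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=build_query(clinic_id),
    )["queryId"]

    # start_query only returns a query ID; the rows come from get_query_results.
    while True:
        response = client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError("query ended with status " + response["status"])
    return extract_nodenames(response["results"])
```

Usage would look roughly like distinct_nodenames(boto3.client("logs"), log_group, 7667, hours=24 * 7). Because the aggregation runs inside CloudWatch Logs Insights, Python only sees one row per distinct nodename rather than every matching log line.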
Source: https://stackoverflow.com/questions/59314132/query-cloudwatch-logs-for-distinct-values-using-boto3-in-python