AWS CLI EMR get Master node Instance ID and tag it

Submitted by 荒凉一梦 on 2019-12-30 10:38:31

Question


I want to automate running an EMR cluster, and I use tags to look up attributes of an EC2 instance, such as its instance ID.

The documentation at https://docs.aws.amazon.com/cli/latest/reference/emr/create-cluster.html states:

--tags (list)

A list of tags to associate with a cluster, which apply to each Amazon EC2 instance in the cluster. Tags are key-value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.

You can specify tags in key=value format or you can add a tag without a value using only the key name, for example key. Use a space to separate multiple tags.

So this applies tags to every EC2 instance including the master and slaves. How do I discern which instance is the master node?

Additional info: I am using the following command to get attributes from the AWS CLI based on tags. Replace "Name" and "Prod" with your own tag key and value, respectively.

aws ec2 describe-instances | jq '.Reservations[].Instances | select(.[].Tags[].Value | startswith("Prod") ) |   select(.[].Tags[].Key == "Name") |   {InstanceId: .[].InstanceId, PublicDnsName: .[].PublicDnsName, State: .[].State, LaunchTime: .[].LaunchTime, Tags: .[].Tags}   | [.]' | jq .[].InstanceId
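
For comparison, the same lookup can be sketched in Python. This is only a minimal sketch, assuming the intent of the filter is "instances that carry a tag whose key is Name and whose value starts with Prod", and assuming the output of aws ec2 describe-instances is available as JSON text. The function name instance_ids_by_tag and the sample data are hypothetical.

```python
import json
from typing import List


def instance_ids_by_tag(describe_instances_json: str, key: str, value_prefix: str) -> List[str]:
    """Return IDs of instances that have a tag `key` whose value starts with `value_prefix`."""
    data = json.loads(describe_instances_json)
    matches = []
    for reservation in data.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            tags = instance.get("Tags", [])  # an instance without tags may lack this field
            if any(
                t.get("Key") == key and t.get("Value", "").startswith(value_prefix)
                for t in tags
            ):
                matches.append(instance["InstanceId"])
    return matches


# Hypothetical sample, shaped like `aws ec2 describe-instances` output.
SAMPLE = json.dumps({
    "Reservations": [
        {"Instances": [
            {"InstanceId": "i-0aaaaaaaaaaaaaaa1", "Tags": [{"Key": "Name", "Value": "Prod-emr"}]},
            {"InstanceId": "i-0bbbbbbbbbbbbbbb2", "Tags": [{"Key": "Name", "Value": "Dev-emr"}]},
        ]},
        {"Instances": [
            {"InstanceId": "i-0ccccccccccccccc3"},
        ]},
    ]
})

print(instance_ids_by_tag(SAMPLE, "Name", "Prod"))
```

Unlike the jq pipeline, this checks the key and the value on the same tag, which is an assumption about what was meant.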

Answer 1:


As you noted, when you create an EMR cluster the tags are the same for all nodes (master, core, and task).

You will find that doing this with the AWS CLI alone is complicated. My recommendation is to review the steps below and then write a Python program to do this.

Process to add your own tags to the EC2 instances.

STEP 1: List your EMR Clusters: aws emr list-clusters

This will output JSON:

{
    "Clusters": [
        {
            "Id": "j-ABCDEFGHIJKLM",
            "Name": "'MyCluster'",
            "Status": {
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Cluster ready after last step completed."
                },
                "Timeline": {
                    "CreationDateTime": 1536626095.303,
                    "ReadyDateTime": 1536626568.482
                }
            },
            "NormalizedInstanceHours": 0
        }
    ]
}

STEP 2: Make a note of the Cluster ID from the JSON:

"Id": "j-ABCDEFGHIJKLM",

STEP 3: Describe your EMR Cluster: aws emr describe-cluster --cluster-id j-ABCDEFGHIJKLM

This will output JSON (I have truncated this output to just the MASTER section):

{
    "Cluster": {
        "Id": "j-ABCDEFGHIJKLM",
        "Name": "'Test01'",
....
        "InstanceGroups": [
            {
                "Id": "ig-2EHOYXFABCDEF",
                "Name": "Master Instance Group",
                "Market": "ON_DEMAND",
                "InstanceGroupType": "MASTER",
                "InstanceType": "m3.xlarge",
                "RequestedInstanceCount": 1,
                "RunningInstanceCount": 1,
                "Status": {
                    "State": "RUNNING",
                    "StateChangeReason": {
                        "Message": ""
                    },
                    "Timeline": {
                        "CreationDateTime": 1536626095.316,
                        "ReadyDateTime": 1536626533.886
                    }
                },
                "Configurations": [],
                "EbsBlockDevices": [],
                "ShrinkPolicy": {}
            },
....
        ]
    }
}

STEP 4: InstanceGroups is an array. Find the entry where InstanceGroupType is MASTER. Make note of the Id.
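
This lookup can be scripted. A minimal Python sketch, assuming the output of aws emr describe-cluster has been captured as JSON text. The function name master_group_id is hypothetical, and the sample is trimmed from the output above, with a made-up CORE group added for contrast.

```python
import json


def master_group_id(describe_cluster_json: str) -> str:
    """Return the Id of the instance group whose InstanceGroupType is MASTER."""
    cluster = json.loads(describe_cluster_json)["Cluster"]
    for group in cluster.get("InstanceGroups", []):
        if group.get("InstanceGroupType") == "MASTER":
            return group["Id"]
    raise LookupError("no MASTER instance group in describe-cluster output")


# Trimmed sample; "ig-HYPOTHETICALCORE" is a made-up ID for illustration.
SAMPLE = """
{
    "Cluster": {
        "Id": "j-ABCDEFGHIJKLM",
        "InstanceGroups": [
            {"Id": "ig-HYPOTHETICALCORE", "InstanceGroupType": "CORE"},
            {"Id": "ig-2EHOYXFABCDEF", "InstanceGroupType": "MASTER"}
        ]
    }
}
"""

print(master_group_id(SAMPLE))
```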

"Id": "ig-2EHOYXFABCDEF",

STEP 5: List your cluster instances: aws emr list-instances --cluster-id j-ABCDEFGHIJKLM

This will output JSON (I have truncated the output):

{
    "Instances": [
....
        {
            "Id": "ci-31LGK4KIECHNY",
            "Ec2InstanceId": "i-0524ec45912345678",
            "PublicDnsName": "ec2-52-123-201-221.us-west-2.compute.amazonaws.com",
            "PublicIpAddress": "52.123.201.221",
            "PrivateDnsName": "ip-172-31-41-111.us-west-2.compute.internal",
            "PrivateIpAddress": "172.31.41.111",
            "Status": {
                "State": "RUNNING",
                "StateChangeReason": {},
                "Timeline": {
                    "CreationDateTime": 1536626164.073,
                    "ReadyDateTime": 1536626533.886
                }
            },
            "InstanceGroupId": "ig-2EHOYXFABCDEF",
            "Market": "ON_DEMAND",
            "InstanceType": "m3.xlarge",
            "EbsVolumes": []
        }
    ]
}

STEP 6: Find the matching InstanceGroupId ig-2EHOYXFABCDEF. This will give you the EC2 Instance ID for the MASTER: "Ec2InstanceId": "i-0524ec45912345678"
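
The matching in this step can be sketched in Python as well. This assumes the output of aws emr list-instances has been captured as JSON text. The function name ec2_ids_in_group is hypothetical, and the sample keeps only the two fields that matter, with a made-up core instance added for contrast.

```python
import json
from typing import List


def ec2_ids_in_group(list_instances_json: str, group_id: str) -> List[str]:
    """Return the Ec2InstanceId of every instance whose InstanceGroupId matches."""
    instances = json.loads(list_instances_json)["Instances"]
    return [
        inst["Ec2InstanceId"]
        for inst in instances
        if inst.get("InstanceGroupId") == group_id
    ]


# Trimmed sample; the first entry is a made-up core instance for illustration.
SAMPLE = """
{
    "Instances": [
        {"Ec2InstanceId": "i-0hypothetical00core", "InstanceGroupId": "ig-HYPOTHETICALCORE"},
        {"Ec2InstanceId": "i-0524ec45912345678", "InstanceGroupId": "ig-2EHOYXFABCDEF"}
    ]
}
"""

print(ec2_ids_in_group(SAMPLE, "ig-2EHOYXFABCDEF"))
```

The function returns a list rather than a single ID because an instance group can hold more than one instance.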

STEP 7: Tag your EC2 instance:

aws ec2 create-tags --resources i-0524ec45912345678 --tags Key=EMR,Value=MASTER
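
If you follow the Python route, the same tagging call might look like the sketch below. It assumes boto3, whose EC2 client provides create_tags. The client is passed in as a parameter so the function itself needs no credentials to load; the function name tag_instance and its defaults are hypothetical.

```python
def tag_instance(ec2_client, instance_id: str, key: str = "EMR", value: str = "MASTER") -> None:
    """Attach a single Key/Value tag to one EC2 instance.

    `ec2_client` is expected to be a boto3 EC2 client (or anything exposing
    a compatible `create_tags` method).
    """
    ec2_client.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": key, "Value": value}],
    )
```

With boto3 installed and credentials configured, this would be called as tag_instance(boto3.client("ec2"), "i-0524ec45912345678").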

The above steps might be simpler with CLI Filters and / or jq, but this should be enough information so that you know how to find and tag the EMR Master Instance.




Answer 2:


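The following command returns the master node's EC2 instance ID directly. The ${...} placeholders appear to be Terraform interpolations; substitute your own cluster ID and master instance group ID: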

aws emr list-instances --cluster-id ${aws_emr_cluster.cluster.id} --instance-group-id ${aws_emr_cluster.cluster.master_instance_group.0.id} --query 'Instances[*].Ec2InstanceId' --output text
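
A Python counterpart, as a rough sketch assuming boto3: the EMR client's list_instances call accepts an InstanceGroupTypes filter, so the group ID does not have to be looked up first. The client is passed in as a parameter; the function name master_instance_ids is hypothetical, and pagination (the Marker field in the response) is not handled here.

```python
from typing import List


def master_instance_ids(emr_client, cluster_id: str) -> List[str]:
    """Return the EC2 instance IDs in the MASTER instance group of a cluster.

    `emr_client` is expected to be a boto3 EMR client (or anything exposing
    a compatible `list_instances` method).
    """
    response = emr_client.list_instances(
        ClusterId=cluster_id,
        InstanceGroupTypes=["MASTER"],
    )
    return [inst["Ec2InstanceId"] for inst in response.get("Instances", [])]
```

With boto3 installed and credentials configured, this would be called as master_instance_ids(boto3.client("emr"), "j-ABCDEFGHIJKLM").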


Source: https://stackoverflow.com/questions/52256438/aws-cli-emr-get-master-node-instance-id-and-tag-it
