AWS CLI EMR get Master node Instance ID and tag it

生来不讨喜 2021-01-06 22:20

I want to automate the running of a cluster and can use tags to get attributes of an EC2 instance like its instance-id.

The documentation on https://docs.aws.amazon.

4 Answers
  • 2021-01-06 22:28

    In an environment where the AWS CLI is not available, you can cat the following file:

    cat /mnt/var/lib/info/job-flow.json
    

    Example content:

    {
      "jobFlowId": "j-0000X0X0X00XX",
      "jobFlowCreationInstant": 1579512208006,
      "instanceCount": 2,
      "masterInstanceId": "i-00x0xx0000xxx0x00",
      "masterPrivateDnsName": "localhost",
      "masterInstanceType": "m5.xlarge",
      "slaveInstanceType": "m5.xlarge",
      "hadoopVersion": "2.8.5",
      "instanceGroups": [
        {
          "instanceGroupId": "ig-0XX00XX0X0XXX",
          "instanceGroupName": "Master - 1",
          "instanceRole": "Master",
          "marketType": "OnDemand",
          "instanceType": "m5.xlarge",
          "requestedInstanceCount": 1
        },
        {
          "instanceGroupId": "ig-000X0XXXXXXX",
          "instanceGroupName": "Core - 2",
          "instanceRole": "Core",
          "marketType": "OnDemand",
          "instanceType": "m5.xlarge",
          "requestedInstanceCount": 1
        }
      ]
    }
    

    NOTE: I've masked the IDs, using 0 where a digit is expected and X where a letter is expected.
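
For scripts running on the node itself, the same lookup can be done without shelling out to cat. A minimal Python sketch, assuming the file has the layout shown above; the function names are hypothetical, and only the file path and the `masterInstanceId` key come from the answer:

```python
import json

# Path taken from the answer above; it is expected to exist only on EMR nodes.
JOB_FLOW_PATH = "/mnt/var/lib/info/job-flow.json"


def master_instance_id(job_flow_text):
    """Return the masterInstanceId field from job-flow.json content."""
    job_flow = json.loads(job_flow_text)
    return job_flow["masterInstanceId"]


def read_master_instance_id(path=JOB_FLOW_PATH):
    """Hypothetical convenience wrapper: read the file and extract the ID."""
    with open(path, encoding="utf-8") as handle:
        return master_instance_id(handle.read())
```

On a cluster node, `read_master_instance_id()` would return the ID to pass to a tagging command.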

  • 2021-01-06 22:30

    As you noted, when you create an EMR cluster the tags are the same for all nodes (Master, Slave, Task).

    You will find that doing this with the AWS CLI alone is complicated. My recommendation is to review the examples below and then write a Python program to do it.

    Process to add your own tags to the EC2 instances.

    STEP 1: List your EMR Clusters: aws emr list-clusters

    This will output JSON:

    {
        "Clusters": [
            {
                "Id": "j-ABCDEFGHIJKLM",
                "Name": "'MyCluster'",
                "Status": {
                    "State": "WAITING",
                    "StateChangeReason": {
                        "Message": "Cluster ready after last step completed."
                    },
                    "Timeline": {
                        "CreationDateTime": 1536626095.303,
                        "ReadyDateTime": 1536626568.482
                    }
                },
                "NormalizedInstanceHours": 0
            }
        ]
    }
    

    STEP 2: Make a note of the Cluster ID from the JSON:

    "Id": "j-ABCDEFGHIJKLM",
    

    STEP 3: Describe your EMR Cluster: aws emr describe-cluster --cluster-id j-ABCDEFGHIJKLM

    This will output JSON (I have truncated this output to just the MASTER section):

    {
        "Cluster": {
            "Id": "j-ABCDEFGHIJKLM",
            "Name": "'Test01'",
    ....
            "InstanceGroups": [
                {
                    "Id": "ig-2EHOYXFABCDEF",
                    "Name": "Master Instance Group",
                    "Market": "ON_DEMAND",
                    "InstanceGroupType": "MASTER",
                    "InstanceType": "m3.xlarge",
                    "RequestedInstanceCount": 1,
                    "RunningInstanceCount": 1,
                    "Status": {
                        "State": "RUNNING",
                        "StateChangeReason": {
                            "Message": ""
                        },
                        "Timeline": {
                            "CreationDateTime": 1536626095.316,
                            "ReadyDateTime": 1536626533.886
                        }
                    },
                    "Configurations": [],
                    "EbsBlockDevices": [],
                    "ShrinkPolicy": {}
                },
    ....
            ]
        }
    }
    

    STEP 4: InstanceGroups is an array. Find the entry where InstanceGroupType is MASTER. Make note of the Id.

    "Id": "ig-2EHOYXFABCDEF",

    STEP 5: List your cluster instances: aws emr list-instances --cluster-id j-ABCDEFGHIJKLM

    This will output JSON (I have truncated the output):

    {
        "Instances": [
    ....
            {
                "Id": "ci-31LGK4KIECHNY",
                "Ec2InstanceId": "i-0524ec45912345678",
                "PublicDnsName": "ec2-52-123-201-221.us-west-2.compute.amazonaws.com",
                "PublicIpAddress": "52.123.201.221",
                "PrivateDnsName": "ip-172-31-41-111.us-west-2.compute.internal",
                "PrivateIpAddress": "172.31.41.111",
                "Status": {
                    "State": "RUNNING",
                    "StateChangeReason": {},
                    "Timeline": {
                        "CreationDateTime": 1536626164.073,
                        "ReadyDateTime": 1536626533.886
                    }
                },
                "InstanceGroupId": "ig-2EHOYXFABCDEF",
                "Market": "ON_DEMAND",
                "InstanceType": "m3.xlarge",
                "EbsVolumes": []
            }
        ]
    }
    

    STEP 6: Find the matching InstanceGroupId ig-2EHOYXFABCDEF. This will give you the EC2 Instance ID for the MASTER: "Ec2InstanceId": "i-0524ec45912345678"

    STEP 7: Tag your EC2 instance:

    aws ec2 create-tags --resources i-0524ec45912345678 --tags Key=EMR,Value=MASTER

    The above steps might be simpler with CLI Filters and / or jq, but this should be enough information so that you know how to find and tag the EMR Master Instance.
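
Since the answer suggests Python, here is a rough sketch of STEP 4 and STEP 6 as functions over the JSON that the CLI commands print, plus an untested outline of the live calls through boto3. Assumptions: the helper names are made up for illustration; `tag_master` asks `list_instances` for the MASTER group directly through its `InstanceGroupTypes` parameter instead of walking STEP 3 and STEP 4; and it needs boto3 plus AWS credentials allowed to call EMR and EC2.

```python
def master_group_id(cluster_description):
    """STEP 4: return the Id of the instance group whose type is MASTER.

    Expects the parsed output of `aws emr describe-cluster`.
    """
    for group in cluster_description["Cluster"]["InstanceGroups"]:
        if group["InstanceGroupType"] == "MASTER":
            return group["Id"]
    raise LookupError("no MASTER instance group found")


def ec2_ids_in_group(instance_listing, group_id):
    """STEP 6: return the EC2 instance IDs that belong to the given group.

    Expects the parsed output of `aws emr list-instances`.
    """
    return [
        instance["Ec2InstanceId"]
        for instance in instance_listing["Instances"]
        if instance.get("InstanceGroupId") == group_id
    ]


def tag_master(cluster_id, key="EMR", value="MASTER"):
    """Outline of the live flow (assumes boto3 and valid AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    emr = boto3.client("emr")
    ec2 = boto3.client("ec2")
    # Let the service filter by group type instead of matching IDs by hand.
    listing = emr.list_instances(
        ClusterId=cluster_id, InstanceGroupTypes=["MASTER"]
    )
    instance_ids = [i["Ec2InstanceId"] for i in listing["Instances"]]
    if not instance_ids:
        raise LookupError("no MASTER instances found for " + cluster_id)
    ec2.create_tags(
        Resources=instance_ids, Tags=[{"Key": key, "Value": value}]
    )
    return instance_ids
```

With the output of STEP 3 and STEP 5 parsed into `description` and `listing`, `ec2_ids_in_group(listing, master_group_id(description))` would give the ID to use in STEP 7.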

  • 2021-01-06 22:46

    The example below is for clusters that use instance fleets. It saves the cluster ID, the master instance fleet ID, and the master's private IP as environment variables.

    Replace the cluster name "My-Cluster" with your actual cluster name.

    # Look up the cluster ID by name among active clusters.
    export CLUSTER_ID=$(aws emr list-clusters --active --query 'Clusters[?Name==`My-Cluster`].Id' --output text)
    # Pick the ID of the fleet whose type is MASTER.
    export INSTANCE_FLEET=$(aws emr describe-cluster --cluster-id "$CLUSTER_ID" | jq -r '.[].InstanceFleets | .[] | select(.InstanceFleetType=="MASTER") | .Id')
    # Capture the private IP of the instance(s) in that fleet.
    export PRIVATE_IP=$(aws emr list-instances --cluster-id "$CLUSTER_ID" --instance-fleet-id "$INSTANCE_FLEET" --query 'Instances[*].PrivateIpAddress' --output text)
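
If jq is not installed, the fleet selection can be done in Python instead. A sketch that mirrors the jq filter above, assuming the parsed output of `aws emr describe-cluster` and `aws emr list-instances` is passed in; the function names are hypothetical:

```python
def master_fleet_id(cluster_description):
    """Return the Id of the MASTER instance fleet, or None if there is none."""
    fleets = cluster_description["Cluster"].get("InstanceFleets", [])
    for fleet in fleets:
        if fleet.get("InstanceFleetType") == "MASTER":
            return fleet["Id"]
    return None


def private_ips(instance_listing):
    """Return the private IP addresses found in a list-instances result."""
    return [
        instance["PrivateIpAddress"]
        for instance in instance_listing["Instances"]
        if "PrivateIpAddress" in instance
    ]
```

Returning None for a cluster without fleets is a design choice for this sketch; a cluster built on instance groups has no `InstanceFleets` key to search.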
    
  • 2021-01-06 22:47

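    The command below returns the instance ID directly. The ${...} placeholders appear to be Terraform references to an aws_emr_cluster resource; when running the command by hand, substitute your own cluster ID and master instance group ID.

    aws emr list-instances --cluster-id ${aws_emr_cluster.cluster.id} --instance-group-id ${aws_emr_cluster.cluster.master_instance_group.0.id} --query 'Instances[*].Ec2InstanceId' --output text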
    