Use bootstrap to replace default jar on EMR

Submitted by 烈酒焚心 on 2019-12-23 03:49:26

Question


I am on an EMR cluster with AMI 3.0.4. Once the cluster is up, I SSH to the master node and run the following manually:

cd /home/hadoop/share/hadoop/common/lib/
rm guava-11.0.2.jar
wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
chmod 777 guava-14.0.1.jar

Is it possible to do the above in a bootstrap action? Thanks!


Answer 1:


With EMR 4.0 the Hadoop installation path changed, so the manual replacement of the Guava jar has to be done in a different directory:

cd /usr/lib/hadoop/lib
sudo wget https://repo1.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
sudo rm guava-11.0.2.jar

The bootstrap action in the answer from Sandesh doesn't work for us.

Edit:

We now have a solution for EMR 4.0. You have to provide a spark-config.json in S3 that sets the extra classpath for both the Spark executor and the Spark driver. In the "Edit software settings (optional)" section of the cluster creation page you can specify the location of this config file and load it from S3.

spark-config.json

[
  {
    "classification": "spark",
    "properties": {
      "maximizeResourceAllocation": "true"
    }
  },
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.executor.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar",
      "spark.driver.extraClassPath": "/home/hadoop/lib/guava-14.0.1.jar"
    }
  }
]
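The cluster only reads this file at launch, so a JSON syntax error (a stray trailing comma, for instance) is easy to miss until then. A minimal sketch of a local check before uploading, assuming python3 is installed on the machine you upload from; the function name validate_json is made up for this example:

```shell
#!/bin/bash
# validate_json FILE: succeed only if FILE parses as JSON.
# Assumption: python3 is on PATH (json.tool ships with the Python standard library).
validate_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# When a file name is passed on the command line, check it and report the result.
if [[ $# -ge 1 ]]; then
  if validate_json "$1"; then
    echo "$1: valid JSON"
  else
    echo "$1: invalid JSON" >&2
    exit 1
  fi
fi
```

Saved under a name of your choice, it could be run as `bash check_json.sh spark-config.json` before copying the file to S3.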

The guava-14.0.1.jar needs to be downloaded via the bootstrap script guava_download.sh:

#!/bin/bash
mkdir -p /home/hadoop/lib/
cd /home/hadoop/lib/
wget https://repo1.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
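For anyone launching from the AWS CLI instead of the console, the same two pieces (the config file and the bootstrap script) can probably be wired together as sketched below. The bucket name, cluster name, instance type and instance count are placeholders, not values from this answer, and the script only prints the command unless RUN=1 is set:

```shell
#!/bin/bash
# Placeholder bucket: both spark-config.json and guava_download.sh are assumed
# to have been uploaded to its top level.
BUCKET="my-bucket"

cmd=(
  aws emr create-cluster
  --name "spark-guava-14"
  --release-label emr-4.0.0
  --applications Name=Spark
  --instance-type m3.xlarge
  --instance-count 3
  --use-default-roles
  --configurations "https://s3.amazonaws.com/${BUCKET}/spark-config.json"
  --bootstrap-actions "Path=s3://${BUCKET}/guava_download.sh,Name=DownloadGuava"
)

# Print the command for review; it is executed only when RUN=1 is set.
printf '%q ' "${cmd[@]}"
echo
if [[ "${RUN:-0}" == "1" ]]; then
  "${cmd[@]}"
fi
```

The --configurations option also accepts a local file in the form file://./spark-config.json, which avoids uploading the config to S3 first.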



Answer 2:


Yes, you can add a bootstrap script to do this. Create a shell script, upload it to S3, and then use the script's S3 path in a bootstrap action for the EMR cluster.

For example, you can keep guava-14.0.1.jar in an S3 bucket and download it from there:

#!/bin/bash
hadoop fs -copyToLocal s3n://rootbucket/myjars/guava-14.0.1.jar /home/hadoop/share/hadoop/common/lib/
rm -rf /home/hadoop/share/hadoop/common/lib/guava-11.0.2.jar
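A slightly more defensive variant of the same idea, as a sketch only: it stops on the first error and installs the new jar before deleting the old one, so a failed download does not leave the directory without Guava. The function name swap_jar and the convention of passing the S3 URI (and optionally the lib directory) as script arguments are assumptions made for this example:

```shell
#!/bin/bash
set -euo pipefail

# swap_jar LIB_DIR OLD_JAR NEW_JAR_PATH
# Copy the new jar into LIB_DIR first, then remove the old one.
swap_jar() {
  local lib_dir="$1" old_jar="$2" new_jar="$3"
  cp "$new_jar" "$lib_dir/"
  rm -f "$lib_dir/$old_jar"
}

# Runs only when the jar's S3 URI is passed as the first argument, e.g.
#   s3n://rootbucket/myjars/guava-14.0.1.jar
# The optional second argument overrides the lib directory (AMI 3.x path by default).
if [[ $# -ge 1 ]]; then
  LIB_DIR="${2:-/home/hadoop/share/hadoop/common/lib}"
  staging="$(mktemp -d)"
  hadoop fs -copyToLocal "$1" "$staging/"
  swap_jar "$LIB_DIR" "guava-11.0.2.jar" "$staging/$(basename "$1")"
  rm -rf "$staging"
fi
```

Arguments can be supplied in the bootstrap action definition, for example through the Args=[...] field of the AWS CLI's --bootstrap-actions option.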

I assume you are doing this because your MapReduce code depends on Guava 14.0.1. Alternatively, you can build a fat jar that bundles guava-14.0.1.jar and upload it as the custom jar used to run your job.



Source: https://stackoverflow.com/questions/28207658/use-bootstrap-to-replace-default-jar-on-emr
