Reducing the size of the application JAR by providing the Spark classpath for Maven dependencies:
My cluster has 3 EC2 instances on which Hadoop and Spark are installed.
Finally, I was able to solve the problem. I built the application JAR with "mvn package" instead of "mvn clean compile assembly:single", so the Maven dependencies are not bundled into the JAR (they have to be provided at run time instead). This results in a much smaller JAR, since it only references the dependencies.
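For reference, the difference between the two builds comes down to whether the assembly plugin is involved. Below is a minimal sketch of the kind of maven-assembly-plugin configuration that "assembly:single" relies on; the exact plugin setup in my pom.xml may differ, this is just a typical example:

<!-- Sketch of a typical pom.xml excerpt: this is what makes "assembly:single"
     produce a fat JAR with all Maven dependencies bundled in.
     Plain "mvn package" skips this step and yields a thin JAR. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>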
Then I added the following two parameters to spark-defaults.conf on each node:
spark.driver.extraClassPath /home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:/home/spark/.m2/repository/com/googlecode/json-simple/json-simple/1.1/json-simple-1.1.jar:/home/spark/.m2/repository/com/google/code/gson/gson/2.3.1/gson-2.3.1.jar:/home/spark/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar
spark.executor.extraClassPath /home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:/home/spark/.m2/repository/com/googlecode/json-simple/json-simple/1.1/json-simple-1.1.jar:/home/spark/.m2/repository/com/google/code/gson/gson/2.3.1/gson-2.3.1.jar:/home/spark/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar
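If editing spark-defaults.conf on every node is not convenient, the same settings can also be passed per job on the spark-submit command line via --conf. A sketch (the jar paths are simply the ones listed above, shortened here for readability):

# Per-job alternative to editing spark-defaults.conf:
spark-submit \
  --conf spark.driver.extraClassPath=/home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:... \
  --conf spark.executor.extraClassPath=/home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:... \
  ...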
So the question arises: how will the application JAR get the required Maven dependencies (jars) at run time?
For that, I downloaded all the required dependencies onto each node in advance by running mvn clean compile assembly:single there, which populates the local Maven repository (~/.m2/repository) that the classpath entries above point to.
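Putting it together, the job is then submitted with the small JAR, and the driver and executors resolve the dependency classes from the local ~/.m2 paths configured above. A sketch of the submit command (the main class, JAR name, and master URL are hypothetical placeholders, not my actual values):

# Hypothetical submit command: my-app-1.0.jar is the thin JAR produced by "mvn package",
# and com.example.MyApp stands in for the real main class.
spark-submit \
  --class com.example.MyApp \
  --master spark://<master-host>:7077 \
  /path/to/my-app-1.0.jar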