Question
I am using a Jupyter notebook and want to save a CSV file to a Cassandra DB. There is no problem getting the data and showing it, but when I try to save this CSV data to Cassandra it throws the exception below.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost, executor driver): java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
I downloaded the Maven package manually, both 2.4.0 and 2.4.1, and neither of them worked. I also stated the packages at the top of the code.
import sys
import uuid
import time
import os
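# Pull the Spark-Cassandra connector from Maven when the PySpark session starts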
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.0 pyspark-shell'
try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    from pyspark.sql import SparkSession
    from itertools import islice
    from pyspark.sql import SQLContext
    from pyspark.sql.types import *
    from pyspark.sql import Row
    from datetime import datetime
except ImportError as e:
    print("error importing spark modules", e)
    sys.exit(1)
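# Spark configuration: local master, executor/driver resources, and Cassandra connection/auth settings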
conf = SparkConf().setAppName("Stand Alone Python Script").setMaster("local[*]")\
.setAll([('spark.executor.memory', '8g'),\
('spark.executor.cores', '3'),\
('spark.cores.max', '3'),\
('spark.cassandra.connection.host', 'cassandra_ip'),\
('spark.cassandra.auth.username', 'cassandra_user_name'),\
('spark.cassandra.auth.password', 'cassandra_password'),\
('spark.driver.memory','8g')])
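# Create the SparkContext and SQLContext from the configuration above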
sc = SparkContext(conf=conf)
sql_context = SQLContext(sc)
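# Read the CSV with a header row; leave every column as a string (no schema inference)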
consumer_complaints = sql_context.read.format("csv").option("header", "true").option("inferSchema", "false").load("in/Consumer_Complaints.csv")
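# Append the dataframe to the target Cassandra table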
consumer_complaints.write\
.format("org.apache.spark.sql.cassandra")\
.mode('append')\
.options(table="table_name", keyspace="space_name")\
.save()
sc.stop()
Answer 1:
Hello, I solved my problem with the following steps:
I downloaded the Twitter jsr166e jar and moved it to the $SPARK_HOME/jars directory:
cp /home/jovyan/.m2/repository/com/twitter/jsr166e/1.1.0/jsr166e-1.1.0.jar /usr/local/spark/jars/
Also, because Docker's Jupyter user is jovyan rather than root, I granted permissions on this folder.
I used the statement below directly, but you can use a more restrictive approach:
chmod -R 777 /usr/local/spark/jars/
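As an alternative to copying the jar by hand, you may be able to let Spark resolve the missing class itself by listing its Maven coordinate in --packages. This is only a sketch: it assumes the missing LongAdder class is published on Maven Central as com.twitter:jsr166e:1.1.0 and that this line replaces the PYSPARK_SUBMIT_ARGS assignment from the question.
import os

# Assumption: com.twitter:jsr166e:1.1.0 provides com/twitter/jsr166e/LongAdder;
# listing it after the connector lets Spark download it instead of copying the jar manually.
os.environ['PYSPARK_SUBMIT_ARGS'] = (
    '--packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.0,'
    'com.twitter:jsr166e:1.1.0 pyspark-shell'
)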
Thank you
Source: https://stackoverflow.com/questions/55276840/jupyter-cassandra-save-problem-java-lang-noclassdeffounderror-com-twitter-jsr