Spark-Cassandra Connector: Failed to open native connection to Cassandra

情深已故 2020-12-20 23:54

I am new to Spark and Cassandra. When trying to submit a Spark job, I get an error while connecting to Cassandra.

Details:

Versions:

Spa…

Related tags: [cassandra], [apache-spark]
5 Answers
  • 2020-12-21 00:09

    You did not specify spark.cassandra.connection.host. By default, Spark assumes that the Cassandra host is the same as the Spark master node.

    import com.datastax.spark.connector._   // adds cassandraTable to SparkContext
    import org.apache.spark.{SparkConf, SparkContext}
    
    var sc: SparkContext = _
    val conf = new SparkConf()
      .setAppName("Cassandra Demo")
      .setMaster(master)
      .set("spark.cassandra.connection.host", "192.168.101.11")
    sc = new SparkContext(conf)
    
    // read the "words" table from the "test" keyspace
    val rdd = sc.cassandraTable("test", "words")
    rdd.collect().foreach(println)
    

    It should work if you have properly set the seed node in cassandra.yaml (see the sketch below).
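
    For reference, a minimal cassandra.yaml seed sketch, reusing the host address from the example above:

    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "192.168.101.11"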

  • 2020-12-21 00:16

    The issue is resolved. It was due to a mix-up with the dependencies: I built a single jar with all dependencies and passed it to spark-submit, instead of specifying the dependent jars separately.
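
    A build.sbt sketch of that approach, assuming the sbt-assembly plugin; the versions and the MyApp class name are illustrative, not taken from the answer:

    // project/plugins.sbt
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

    // build.sbt -- mark Spark itself as "provided" so the fat jar
    // bundles only the connector and its transitive dependencies
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.2" % "provided",
      "com.datastax.spark" %% "spark-cassandra-connector" % "1.6.0"
    )

    // build the fat jar, then hand it to spark-submit:
    //   sbt assembly
    //   spark-submit --class MyApp target/scala-2.10/myapp-assembly-1.0.jar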

  • 2020-12-21 00:20

    It finally worked. Steps:

    1. Set listen_address to the private IP of the EC2 instance.
    2. Do not set any broadcast_address.
    3. Set rpc_address to 0.0.0.0.
    4. Set broadcast_rpc_address to the public IP of the EC2 instance (see the sketch after this list).
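
    Put together, a cassandra.yaml sketch of those four settings; the addresses are hypothetical examples:

    listen_address: 10.0.0.5              # private IP of the EC2 instance
    # broadcast_address:                  # intentionally left unset
    rpc_address: 0.0.0.0                  # accept client connections on all interfaces
    broadcast_rpc_address: 54.12.34.56    # public IP of the EC2 instance

    Note that Cassandra requires broadcast_rpc_address to be set whenever rpc_address is 0.0.0.0.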
  • 2020-12-21 00:23

    This is an issue with the version of the cassandra-driver-core jar dependency.

    The version of the running Cassandra is 2.0
    The version of the provided cassandra-driver-core jar is 2.1.5
    

    The driver jar's version should match the version of the Cassandra cluster you are running.

    In this case, the included jar file should be cassandra-driver-core-2.0.0.jar
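
    In sbt, one way to force that is dependencyOverrides, which wins even when the connector pulls in a newer driver transitively; a sketch:

    // pin the driver to the version of the running Cassandra (2.0.0 here)
    dependencyOverrides += "com.datastax.cassandra" % "cassandra-driver-core" % "2.0.0"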
    
  • 2020-12-21 00:34

    I struggled with this issue overnight and finally found a combination that works. I am writing it down for those who may run into a similar issue.

    First of all, this is a version issue with cassandra-driver-core's dependency. But tracking down the exact combination that works took me quite a bit of time.

    Secondly, this is the combination that works for me:

    1. Spark 1.6.2 with Hadoop 2.6, Cassandra 2.1.5 (Ubuntu 14.04, Java 1.8).
    2. In build.sbt (sbt assembly, scalaVersion := "2.10.5"), use:

       libraryDependencies ++= Seq(
         "com.datastax.spark" %% "spark-cassandra-connector" % "1.4.0",
         "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.5"
       )

    Thirdly, let me clarify my frustrations. With spark-cassandra-connector 1.5.0, I can run the assembly via spark-submit with --master "local[2]" on the same machine, against a remote Cassandra connection, without any problem. Any combination of connector 1.5.0 or 1.6.0 with Cassandra 2.0, 2.1, 2.2, or 3.4 works well. But if I try to submit the job to a cluster from the same machine (a NodeManager) with --master yarn --deploy-mode cluster, then I always run into the problem: Failed to open native connection to Cassandra at {192.168.122.12}:9042

    What is going on here? Can anyone from DataStax take a look at this issue? I can only guess it has something to do with "cqlversion", which should match the version of the Cassandra cluster.

    Does anybody know a better solution?
