Spark Streaming to Cassandra, not persisting

Posted by て烟熏妆下的殇ゞ on 2019-12-12 01:38:10

Question


I am trying to persist a Spark stream to Cassandra. Here is my code:

    JavaDStream<BusinessPointNYCT> studentFileDStream = m_JavaStreamingContext
            .textFileStream(new File(fileDir, "BUSINESSPOINTS_NY_CT.csv").getAbsolutePath())
            .map(new BusinessPointMapFunction());

    // Save it to Cassandra
    CassandraStreamingJavaUtil.javaFunctions(studentFileDStream)
            .writerBuilder("spatial_keyspace", "businesspoints_ny_ct", mapToRow(BusinessPointNYCT.class))
            .saveToCassandra();

My application starts without any error or warning, but the data is not persisted to Cassandra. According to the log, it appears to delete the data after storing it:

16/04/14 14:54:30 INFO JobScheduler: Added jobs for time 1460625870000 ms
16/04/14 14:54:30 INFO JobScheduler: Starting job streaming job 1460625870000 ms.0 from job set of time 1460625870000 ms
16/04/14 14:54:31 INFO SparkContext: Starting job: runJob at DStreamFunctions.scala:54
16/04/14 14:54:31 INFO DAGScheduler: Job 0 finished: runJob at DStreamFunctions.scala:54, took 0.001267 s
16/04/14 14:54:31 INFO JobScheduler: Finished job streaming job 1460625870000 ms.0 from job set of time 1460625870000 ms
16/04/14 14:54:31 INFO JobScheduler: Total delay: 1.028 s for time 1460625870000 ms (execution: 0.058 s)
16/04/14 14:54:31 INFO FileInputDStream: Cleared 0 old files that were older than 1460625810000 ms: 
16/04/14 14:54:31 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
16/04/14 14:54:31 INFO ReceiverTracker: Cleanup old received batch data: 1460625810000 ms
16/04/14 14:54:31 INFO InputInfoTracker: remove old batch metadata: 
16/04/14 14:54:40 INFO FileInputDStream: Finding new files took 0 ms
16/04/14 14:54:40 INFO FileInputDStream: New files at time 1460625880000 ms:

16/04/14 14:54:40 INFO JobScheduler: Added jobs for time 1460625880000 ms
16/04/14 14:54:40 INFO JobScheduler: Starting job streaming job 1460625880000 ms.0 from job set of time 1460625880000 ms
16/04/14 14:54:40 INFO SparkContext: Starting job: runJob at DStreamFunctions.scala:54
16/04/14 14:54:40 INFO DAGScheduler: Job 1 finished: runJob at DStreamFunctions.scala:54, took 0.000018 s
16/04/14 14:54:40 INFO JobScheduler: Finished job streaming job 1460625880000 ms.0 from job set of time 1460625880000 ms
16/04/14 14:54:40 INFO JobScheduler: Total delay: 0.022 s for time 1460625880000 ms (execution: 0.010 s)
16/04/14 14:54:40 INFO MapPartitionsRDD: Removing RDD 2 from persistence list
16/04/14 14:54:40 INFO MapPartitionsRDD: Removing RDD 1 from persistence list
16/04/14 14:54:40 INFO BlockManager: Removing RDD 2
16/04/14 14:54:40 INFO FileInputDStream: Cleared 0 old files that were older than 1460625820000 ms: 
16/04/14 14:54:40 INFO BlockManager: Removing RDD 1
16/04/14 14:54:40 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
16/04/14 14:54:40 INFO ReceiverTracker: Cleanup old received batch data: 1460625820000 ms
16/04/14 14:54:40 INFO InputInfoTracker: remove old batch metadata: 
16/04/14 14:54:41 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
16/04/14 14:54:50 INFO FileInputDStream: Finding new files took 1 ms
16/04/14 14:54:50 INFO FileInputDStream: New files at time 1460625890000 ms:
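
A note on the log above: each `New files at time ...` line is followed by an empty list, which suggests the stream never picked up an input file. As far as the Spark Streaming documentation describes it, `textFileStream` expects a directory to monitor and only processes files that appear in that directory after the stream has started, ideally moved in atomically. The following is a minimal sketch of such a drop using only the Java standard library. The class name `AtomicFileDrop`, the method `dropInto`, and the staging directory are hypothetical and not part of the original code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;
import java.time.Instant;

// Hypothetical helper: not part of Spark or of the original question.
public class AtomicFileDrop {

    /**
     * Copies source into stagingDir, refreshes its modification time,
     * then moves it into watchedDir with a single atomic move.
     * Assumption: stagingDir and watchedDir are on the same file system.
     */
    public static Path dropInto(Path source, Path stagingDir, Path watchedDir) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(watchedDir);

        // Write the full copy outside the watched directory first.
        Path staged = stagingDir.resolve(source.getFileName());
        Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);

        // Make the file look new, since file streams select files by modification time.
        Files.setLastModifiedTime(staged, FileTime.from(Instant.now()));

        // The file becomes visible in the watched directory in one step.
        Path target = watchedDir.resolve(source.getFileName());
        return Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: AtomicFileDrop <source-file> <staging-dir> <watched-dir>");
            return;
        }
        Path moved = dropInto(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        System.out.println("Moved to " + moved);
    }
}
```

With this approach the stream would be pointed at the watched directory itself (for example `fileDir`) rather than at the CSV file, and the drop would be run only after the streaming context has started. This has not been verified against the setup in the question.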

I also checked with a Cassandra client, and the query returns no data:

    CassandraSimpleClient client = new CassandraSimpleClient();
    client.connect("127.0.0.1");
    // Session session = cluster.connect("Your keyspace name");
    Session session = client.getActiveCluster().connect("spatial_keyspace");

    ResultSet result = session.execute("SELECT * FROM spatial_keyspace.BUSINESSPOINTS_NY_CT");

I am stuck here. Is Spark Streaming not getting any data from the text file? Any help is appreciated. Thanks.

textFileStream() did not work for me. I think it works only with HDFS, so I changed it to socketTextStream(), and that is working fine.

m_JavaStreamingContext.socketTextStream("IN-6WX6152", 9090);
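
`socketTextStream(host, port)` connects as a TCP client and reads newline-delimited text, so something has to be listening on that port and sending the rows. Below is a minimal sketch of such a feeder using only the Java standard library. `CsvSocketFeeder` and `serveOnce` are hypothetical names, and the sketch serves a single connection and then exits.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: not part of Spark or of the original answer.
public class CsvSocketFeeder {

    /** Waits for one client on the given server socket and sends each line followed by '\n'. */
    public static void serveOnce(ServerSocket server, List<String> lines) throws IOException {
        try (Socket client = server.accept();
             BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.write(line);
                out.write('\n');
            }
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CsvSocketFeeder <csv-file> <port>");
            return;
        }
        // Read the whole CSV up front; fine for a small test file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        try (ServerSocket server = new ServerSocket(Integer.parseInt(args[1]))) {
            serveOnce(server, lines);
        }
    }
}
```

Run on the host named in `socketTextStream` with port 9090, this would hand the CSV rows to the stream one line at a time. The feeder needs to be listening before the streaming context starts, or the receiver has to retry the connection.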

Source: https://stackoverflow.com/questions/36619211/spark-streaming-to-cassandra-not-persisiting
