datastax

NoHostAvailableException with Cassandra & DataStax Java Driver on a Large ResultSet

Submitted by 本小妞迷上赌 on 2020-01-03 20:57:31
Question: The setup: a 2-node Cassandra 1.2.6 cluster, replicas=2, a very large CQL3 table with no secondary index, row key is a UUID.randomUUID().toString(), read consistency set to ONE, using DataStax Java driver 1.0. The request: attempting a table scan via " SELECT some-col from schema.table LIMIT nnn; ". The fail: once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver. It reads like this: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
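
A likely culprit: driver 1.0 against Cassandra 1.2 has no result paging, so a large LIMIT forces the coordinator to materialize the entire result in one response, time out, and surface to the driver as NoHostAvailableException. A minimal sketch of the paged alternative, assuming an upgrade to Cassandra 2.0+ and Java driver 2.x (contact point, keyspace, table, and column names are placeholders):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedScan {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect();
            Statement scan = new SimpleStatement("SELECT some_col FROM my_ks.my_table");
            scan.setFetchSize(500); // stream 500 rows per page instead of one giant response
            ResultSet rs = session.execute(scan);
            for (Row row : rs) {
                // The driver fetches further pages transparently while iterating.
                System.out.println(row.getString("some_col"));
            }
        }
    }
}
```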

CQLSH client - 'module' object has no attribute 'parse_options'

Submitted by 拟墨画扇 on 2020-01-03 08:38:09
Question: I'm trying to access my Cassandra server through a cqlsh client to import a huge CSV file. I'm getting a 'module' object has no attribute 'parse_options' error. I run the following command: cqlsh XXX.XXX.XX.XX XXXX --cqlversion="3.4.2" --execute="copy evolvdso.teste from '2016-10-26 15:25:10.csv' WITH DELIMITER =',' AND HEADER=TRUE --debug"; This is the debug and error message that follows: Starting copy of evolvdso.teste with columns ['ref_equip', 'date', 'load', 'ptd_assoc']. Traceback (most
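
Two observations, hedged: in the command above the --debug flag sits inside the quoted --execute string, so it is passed to COPY rather than parsed as a cqlsh option; and this particular AttributeError often points to a mismatch between the cqlsh version and the Cassandra installation it ships with, so fixing the quoting alone may not be enough. A corrected invocation would look like:

```sh
cqlsh XXX.XXX.XX.XX XXXX --cqlversion="3.4.2" --debug \
  --execute="COPY evolvdso.teste FROM '2016-10-26 15:25:10.csv' WITH DELIMITER=',' AND HEADER=TRUE"
```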

All masters are unresponsive!? Spark master is not responding with DataStax architecture

Submitted by 核能气质少年 on 2020-01-03 04:35:10
Question: I tried using both spark-shell and spark-submit and get this exception: Initializing SparkContext with MASTER: spark://1.2.3.4:7077 ERROR 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. WARN 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application ID is not initialized yet. ERROR 2015-06-11 14:08:30 org.apache.spark.scheduler.TaskSchedulerImpl
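
"All masters are unresponsive" at connect time is commonly either a wrong master URL or a Spark version mismatch between the client and the DSE-managed master. A minimal smoke-test sketch in Java; the master URL below is taken from the log above and should match, character for character, the URL the master advertises in its web UI:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MasterSmokeTest {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("master-smoke-test")
            .setMaster("spark://1.2.3.4:7077"); // must match the master's advertised URL exactly
        JavaSparkContext sc = new JavaSparkContext(conf);
        // A trivial job: if this count succeeds, registration with the master works.
        System.out.println(sc.parallelize(Arrays.asList(1, 2, 3)).count());
        sc.stop();
    }
}
```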

How to create a graph and its schema from Java instead of DataStax Studio?

Submitted by 老子叫甜甜 on 2020-01-03 02:22:13
Question: I was trying to create my first connection with DSE Graph through Java: public static void main(String args[]){ DseCluster dseCluster = null; try { dseCluster = DseCluster.builder() .addContactPoint("192.168.1.43") .build(); DseSession dseSession = dseCluster.connect(); GraphTraversalSource g = DseGraph.traversal(dseSession, new GraphOptions().setGraphName("graph")); GraphStatement graphStatement = DseGraph.statementFromTraversal(g.addV("test")); GraphResultSet grs = dseSession.executeGraph
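
The addV traversal above presupposes that the graph and its schema already exist. A hedged sketch of creating both from Java, assuming the DSE Java driver 1.x API; the contact point and graph name are copied from the snippet, while the property key and vertex label are illustrative:

```java
import com.datastax.driver.dse.DseCluster;
import com.datastax.driver.dse.DseSession;
import com.datastax.driver.dse.graph.SimpleGraphStatement;

public class CreateGraphAndSchema {
    public static void main(String[] args) {
        try (DseCluster cluster = DseCluster.builder()
                .addContactPoint("192.168.1.43")
                .build()) {
            DseSession session = cluster.connect();
            // System query: creates the graph itself, so no graph name is bound to it.
            session.executeGraph(new SimpleGraphStatement(
                "system.graph('graph').ifNotExists().create()").setSystemQuery());
            // Schema statements are plain Gremlin-groovy strings bound to the graph.
            session.executeGraph(new SimpleGraphStatement(
                "schema.propertyKey('name').Text().ifNotExists().create()").setGraphName("graph"));
            session.executeGraph(new SimpleGraphStatement(
                "schema.vertexLabel('test').properties('name').ifNotExists().create()").setGraphName("graph"));
        }
    }
}
```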

How to execute a batch statement and LWT as a transaction in Cassandra

Submitted by 杀马特。学长 韩版系。学妹 on 2020-01-02 11:27:09
Question: I have two tables with the model below: CREATE TABLE IF NOT EXISTS INV ( CODE TEXT, PRODUCT_CODE TEXT, LOCATION_NUMBER TEXT, QUANTITY DECIMAL, CHECK_INDICATOR BOOLEAN, VERSION BIGINT, PRIMARY KEY ((LOCATION_NUMBER, PRODUCT_CODE))); CREATE TABLE IF NOT EXISTS LOOK_INV ( LOCATION_NUMBER TEXT, CHECK_INDICATOR BOOLEAN, PRODUCT_CODE TEXT, CHECK_INDICATOR_DDTM TIMESTAMP, PRIMARY KEY ((LOCATION_NUMBER), CHECK_INDICATOR, PRODUCT_CODE)) WITH CLUSTERING ORDER BY (CHECK_INDICATOR ASC, PRODUCT_CODE ASC); I
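
One constraint worth stating up front: Cassandra only accepts a conditional (LWT) batch when every statement in it targets the same partition, and INV and LOOK_INV are partitioned differently, so a single atomic LWT batch spanning both tables will be rejected. A hedged sketch, assuming Java driver 2.x, of the single-partition conditional batch that is allowed (keyspace name and values are placeholders):

```java
import java.math.BigDecimal;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ConditionalBatchSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("my_ks");
            BatchStatement batch = new BatchStatement();
            // Guarded update: applies only if the stored version still matches.
            batch.add(new SimpleStatement(
                "UPDATE inv SET quantity = ?, version = ? "
                + "WHERE location_number = ? AND product_code = ? IF version = ?",
                BigDecimal.TEN, 2L, "LOC1", "P1", 1L));
            // Adding a LOOK_INV statement here would fail: it lives in a
            // different partition, which conditional batches do not allow.
            boolean applied = session.execute(batch).wasApplied();
            System.out.println("applied = " + applied);
        }
    }
}
```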

RDD not serializable with the Cassandra/Spark connector Java API

Submitted by 别说谁变了你拦得住时间么 on 2020-01-02 07:31:34
Question: I previously had some questions on how to query Cassandra using Spark in a Java Maven project here: Querying Data in Cassandra via Spark in a Java Maven Project. My question was answered and it worked; however, I've now run into an issue while trying to use the DataStax Java API. Here is my code: package com.angel.testspark.test2; import org.apache.commons.lang3.StringUtils; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaRDD; import org.apache
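
The usual cause of an "RDD not serializable" / NotSerializableException in this setup is an anonymous function capturing its non-serializable enclosing class. A hedged sketch of the common fix, assuming the spark-cassandra-connector Java API (japi) with placeholder keyspace, table, and column names: move the function into a static, Serializable class so nothing from the driver program is dragged into the closure.

```java
import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

import com.datastax.spark.connector.japi.CassandraRow;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

public class SerializableMapExample {
    // A static nested class holds no hidden reference to the enclosing
    // driver object, so Spark can serialize it to the executors.
    public static class RowToString implements Function<CassandraRow, String>, Serializable {
        @Override
        public String call(CassandraRow row) {
            return row.getString("email"); // column name is a placeholder
        }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("cassandra-read")
            .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> values = javaFunctions(sc)
            .cassandraTable("my_ks", "users") // placeholder keyspace/table
            .map(new RowToString());
        System.out.println("rows: " + values.count());
        sc.stop();
    }
}
```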

How to connect Spark Streaming with Cassandra?

Submitted by 巧了我就是萌 on 2020-01-01 15:32:12
Question: I'm using Cassandra v2.1.12, Spark v1.4.1, and Scala 2.10, and Cassandra is listening on rpc_address 127.0.1.1, rpc_port 9160. For example, to connect Kafka and Spark Streaming, while listening to Kafka every 4 seconds, I have the following Spark job: sc = SparkContext(conf=conf) stream=StreamingContext(sc,4) map1={'topic_name':1} kafkaStream = KafkaUtils.createStream(stream, 'localhost:2181', "name", map1) And Spark Streaming keeps listening to the Kafka broker every 4 seconds and outputs the contents.
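
One detail that trips people up here: the spark-cassandra-connector talks to Cassandra's native protocol port (9042 by default), not the Thrift rpc_port 9160 listed above. A hedged Java sketch of wiring the connection into a streaming context; the host is taken from the question, everything else is a placeholder:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingToCassandra {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("kafka-to-cassandra")
            // The connector uses the native protocol (default port 9042),
            // not Thrift on rpc_port 9160; point it at the node's address.
            .set("spark.cassandra.connection.host", "127.0.1.1");
        // 4-second batches, mirroring the interval used in the question.
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(4));
        // Build the Kafka stream here as in the question, then persist each
        // batch via the connector's japi helpers, e.g.
        // CassandraJavaUtil.javaFunctions(rdd).writerBuilder(keyspace, table, rowWriter).
        ssc.start();
        ssc.awaitTermination();
    }
}
```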

Why is Spark slower than Sqoop when it comes to JDBC?

Submitted by 依然范特西╮ on 2020-01-01 07:04:14
Question: It is understood that when migrating/loading from an Oracle DB to HDFS/Parquet, it is preferred to use Sqoop rather than Spark with a JDBC driver. Spark is supposed to be 100x faster at processing, right? Then what is wrong with Spark? Why do people prefer Sqoop when loading data from Oracle DB tables? Please suggest what I should do to make Spark faster when loading data from Oracle. Answer 1: Spark is fast when it knows how to parallelize queries. If you're just executing a single query, then Spark
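
Following the answer's point about parallelizing queries, here is a hedged sketch of the JDBC partitioning options that let Spark match Sqoop's --split-by behavior, assuming Spark 1.4+ (URL, table, credentials, and bounds are placeholders):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ParallelJdbcRead {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("parallel-jdbc-read"));
        SQLContext sqlContext = new SQLContext(sc);
        DataFrame df = sqlContext.read()
            .format("jdbc")
            .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // placeholder URL
            .option("dbtable", "SCOTT.EMP")                        // placeholder table
            .option("user", "scott")
            .option("password", "tiger")
            // These four options are what parallelize the read; without them
            // Spark issues one query in one task, which is the usual reason
            // it looks slower than a Sqoop import.
            .option("partitionColumn", "EMPNO")
            .option("lowerBound", "1")
            .option("upperBound", "1000000")
            .option("numPartitions", "8")
            .load();
        System.out.println("rows: " + df.count());
        sc.stop();
    }
}
```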

How to configure access permissions for Cassandra on Linux Ubuntu

Submitted by 半城伤御伤魂 on 2020-01-01 05:10:07
Question: Thank you for reading this. I am stuck at step three of this tutorial on installing Cassandra: http://wiki.apache.org/cassandra/GettingStarted#Step_3:_Start_Cassandra I can only run this software as root (shouting this over fictional helicopter noise). This seems like a terrible way to run the software. When starting the Cassandra server as my normal user I receive the following errors: 1.) 15:46:00,147 |-ERROR in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - openFile(
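
The RollingFileAppender openFile error is logback failing to write its log file, which for a tarball install usually means the normal user lacks write access to Cassandra's data and log directories. A minimal sketch of the usual fix, assuming the default /var/lib/cassandra and /var/log/cassandra paths from cassandra.yaml and the logging configuration:

```sh
# Create the default data/log directories and hand ownership to your user
# ($USER expands to the account you start Cassandra with).
sudo mkdir -p /var/lib/cassandra /var/log/cassandra
sudo chown -R "$USER" /var/lib/cassandra /var/log/cassandra
```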