datastax

NoHostAvailableException with Cassandra & DataStax Java Driver on a Large ResultSet

Submitted by 本小妞迷上赌 on 2020-01-03 20:57:31
Question: The setup: a 2-node Cassandra 1.2.6 cluster, replicas=2, a very large CQL3 table with no secondary index, row key is a UUID.randomUUID().toString(), read consistency set to ONE, using DataStax Java driver 1.0. The request: attempting a table scan via " SELECT some-col from schema.table LIMIT nnn; ". The fail: once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver. It reads like this: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
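
A likely culprit: driver 1.0 against Cassandra 1.2 has no result paging, so a large LIMIT forces the coordinator to materialize the entire result in one response, time out, and surface to the driver as NoHostAvailableException. A minimal sketch of the paged alternative, assuming an upgrade to Cassandra 2.0+ and Java driver 2.x (contact point, keyspace, table, and column names are placeholders):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedScan {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect();
            Statement scan = new SimpleStatement("SELECT some_col FROM my_ks.my_table");
            scan.setFetchSize(500); // stream 500 rows per page instead of one giant response
            ResultSet rs = session.execute(scan);
            for (Row row : rs) {
                // The driver fetches further pages transparently while iterating.
                System.out.println(row.getString("some_col"));
            }
        }
    }
}
```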

CQLSH client - 'module' object has no attribute 'parse_options'

Submitted by 拟墨画扇 on 2020-01-03 08:38:09
Question: I'm trying to access my Cassandra server through a cqlsh client to import a huge CSV file. I'm getting a 'module' object has no attribute 'parse_options' error. I run the following command: cqlsh XXX.XXX.XX.XX XXXX --cqlversion="3.4.2" --execute="copy evolvdso.teste from '2016-10-26 15:25:10.csv' WITH DELIMITER =',' AND HEADER=TRUE --debug"; This is the debug and error message that follows: Starting copy of evolvdso.teste with columns ['ref_equip', 'date', 'load', 'ptd_assoc']. Traceback (most
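
Two observations, hedged: in the command above the --debug flag sits inside the quoted --execute string, so it is passed to COPY rather than parsed as a cqlsh option; and this particular AttributeError often points to a mismatch between the cqlsh version and the Cassandra installation it ships with, so fixing the quoting alone may not be enough. A corrected invocation would look like:

```sh
cqlsh XXX.XXX.XX.XX XXXX --cqlversion="3.4.2" --debug \
  --execute="COPY evolvdso.teste FROM '2016-10-26 15:25:10.csv' WITH DELIMITER=',' AND HEADER=TRUE"
```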

All masters are unresponsive!? Spark master is not responding with DataStax architecture

Submitted by 核能气质少年 on 2020-01-03 04:35:10
Question: I tried using both spark-shell and spark-submit and get this exception: Initializing SparkContext with MASTER: spark://1.2.3.4:7077 ERROR 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. WARN 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application ID is not initialized yet. ERROR 2015-06-11 14:08:30 org.apache.spark.scheduler.TaskSchedulerImpl
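
"All masters are unresponsive" at connect time is commonly either a wrong master URL or a Spark version mismatch between the client and the DSE-managed master. A minimal smoke-test sketch in Java; the master URL below is taken from the log above and should match, character for character, the URL the master advertises in its web UI:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MasterSmokeTest {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("master-smoke-test")
            .setMaster("spark://1.2.3.4:7077"); // must match the master's advertised URL exactly
        JavaSparkContext sc = new JavaSparkContext(conf);
        // A trivial job: if this count succeeds, registration with the master works.
        System.out.println(sc.parallelize(Arrays.asList(1, 2, 3)).count());
        sc.stop();
    }
}
```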

How to create a graph and its schema from Java instead of DataStax Studio?

Submitted by 老子叫甜甜 on 2020-01-03 02:22:13
Question: I was trying to create my first connection with DSE Graph through Java: public static void main(String args[]){ DseCluster dseCluster = null; try { dseCluster = DseCluster.builder() .addContactPoint("192.168.1.43") .build(); DseSession dseSession = dseCluster.connect(); GraphTraversalSource g = DseGraph.traversal(dseSession, new GraphOptions().setGraphName("graph")); GraphStatement graphStatement = DseGraph.statementFromTraversal(g.addV("test")); GraphResultSet grs = dseSession.executeGraph
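
The addV traversal above presupposes that the graph and its schema already exist. A hedged sketch of creating both from Java, assuming the DSE Java driver 1.x API; the contact point and graph name are copied from the snippet, while the property key and vertex label are illustrative:

```java
import com.datastax.driver.dse.DseCluster;
import com.datastax.driver.dse.DseSession;
import com.datastax.driver.dse.graph.SimpleGraphStatement;

public class CreateGraphAndSchema {
    public static void main(String[] args) {
        try (DseCluster cluster = DseCluster.builder()
                .addContactPoint("192.168.1.43")
                .build()) {
            DseSession session = cluster.connect();
            // System query: creates the graph itself, so no graph name is bound to it.
            session.executeGraph(new SimpleGraphStatement(
                "system.graph('graph').ifNotExists().create()").setSystemQuery());
            // Schema statements are plain Gremlin-groovy strings bound to the graph.
            session.executeGraph(new SimpleGraphStatement(
                "schema.propertyKey('name').Text().ifNotExists().create()").setGraphName("graph"));
            session.executeGraph(new SimpleGraphStatement(
                "schema.vertexLabel('test').properties('name').ifNotExists().create()").setGraphName("graph"));
        }
    }
}
```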

How to execute a batch statement and LWT as a transaction in Cassandra

Submitted by 杀马特。学长 韩版系。学妹 on 2020-01-02 11:27:09
Question: I have two tables with the model below: CREATE TABLE IF NOT EXISTS INV ( CODE TEXT, PRODUCT_CODE TEXT, LOCATION_NUMBER TEXT, QUANTITY DECIMAL, CHECK_INDICATOR BOOLEAN, VERSION BIGINT, PRIMARY KEY ((LOCATION_NUMBER, PRODUCT_CODE))); CREATE TABLE IF NOT EXISTS LOOK_INV ( LOCATION_NUMBER TEXT, CHECK_INDICATOR BOOLEAN, PRODUCT_CODE TEXT, CHECK_INDICATOR_DDTM TIMESTAMP, PRIMARY KEY ((LOCATION_NUMBER), CHECK_INDICATOR, PRODUCT_CODE)) WITH CLUSTERING ORDER BY (CHECK_INDICATOR ASC, PRODUCT_CODE ASC); I
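
One constraint worth stating up front: Cassandra only accepts a conditional (LWT) batch when every statement in it targets the same partition, and INV and LOOK_INV are partitioned differently, so a single atomic LWT batch spanning both tables will be rejected. A hedged sketch, assuming Java driver 2.x, of the single-partition conditional batch that is allowed (keyspace name and values are placeholders):

```java
import java.math.BigDecimal;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ConditionalBatchSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("my_ks");
            BatchStatement batch = new BatchStatement();
            // Guarded update: applies only if the stored version still matches.
            batch.add(new SimpleStatement(
                "UPDATE inv SET quantity = ?, version = ? "
                + "WHERE location_number = ? AND product_code = ? IF version = ?",
                BigDecimal.TEN, 2L, "LOC1", "P1", 1L));
            // Adding a LOOK_INV statement here would fail: it lives in a
            // different partition, which conditional batches do not allow.
            boolean applied = session.execute(batch).wasApplied();
            System.out.println("applied = " + applied);
        }
    }
}
```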

RDD not serializable with the Cassandra/Spark connector Java API

Submitted by 别说谁变了你拦得住时间么 on 2020-01-02 07:31:34
Question: I previously had some questions on how to query Cassandra using Spark in a Java Maven project here: Querying Data in Cassandra via Spark in a Java Maven Project. My question was answered and it worked; however, I've now run into an issue while trying to use the DataStax Java API. Here is my code: package com.angel.testspark.test2; import org.apache.commons.lang3.StringUtils; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaRDD; import org.apache
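
The usual cause of an "RDD not serializable" / NotSerializableException in this setup is an anonymous function capturing its non-serializable enclosing class. A hedged sketch of the common fix, assuming the spark-cassandra-connector Java API (japi) with placeholder keyspace, table, and column names: move the function into a static, Serializable class so nothing from the driver program is dragged into the closure.

```java
import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

import com.datastax.spark.connector.japi.CassandraRow;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

public class SerializableMapExample {
    // A static nested class holds no hidden reference to the enclosing
    // driver object, so Spark can serialize it to the executors.
    public static class RowToString implements Function<CassandraRow, String>, Serializable {
        @Override
        public String call(CassandraRow row) {
            return row.getString("email"); // column name is a placeholder
        }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("cassandra-read")
            .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> values = javaFunctions(sc)
            .cassandraTable("my_ks", "users") // placeholder keyspace/table
            .map(new RowToString());
        System.out.println("rows: " + values.count());
        sc.stop();
    }
}
```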

How to connect Spark Streaming with Cassandra?

Submitted by 巧了我就是萌 on 2020-01-01 15:32:12
Question: I'm using Cassandra v2.1.12, Spark v1.4.1, and Scala 2.10, and Cassandra is listening on rpc_address 127.0.1.1, rpc_port 9160. For example, to connect Kafka and Spark Streaming, while listening to Kafka every 4 seconds, I have the following Spark job: sc = SparkContext(conf=conf) stream=StreamingContext(sc,4) map1={'topic_name':1} kafkaStream = KafkaUtils.createStream(stream, 'localhost:2181', "name", map1) And Spark Streaming keeps listening to the Kafka broker every 4 seconds and outputs the contents.
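
One detail that trips people up here: the spark-cassandra-connector talks to Cassandra's native protocol port (9042 by default), not the Thrift rpc_port 9160 listed above. A hedged Java sketch of wiring the connection into a streaming context; the host is taken from the question, everything else is a placeholder:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingToCassandra {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("kafka-to-cassandra")
            // The connector uses the native protocol (default port 9042),
            // not Thrift on rpc_port 9160; point it at the node's address.
            .set("spark.cassandra.connection.host", "127.0.1.1");
        // 4-second batches, mirroring the interval used in the question.
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(4));
        // Build the Kafka stream here as in the question, then persist each
        // batch via the connector's japi helpers, e.g.
        // CassandraJavaUtil.javaFunctions(rdd).writerBuilder(keyspace, table, rowWriter).
        ssc.start();
        ssc.awaitTermination();
    }
}
```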

Why is Spark slower than Sqoop when it comes to JDBC?

Submitted by 依然范特西╮ on 2020-01-01 07:04:14
Question: It is understood that when migrating/loading from an Oracle DB to HDFS/Parquet, it is preferred to use Sqoop rather than Spark with a JDBC driver. Spark is supposed to be 100x faster at processing, right? Then what is wrong with Spark? Why do people prefer Sqoop when loading data from Oracle DB tables? Please suggest what I should do to make Spark faster when loading data from Oracle. Answer 1: Spark is fast when it knows how to parallelize queries. If you're just executing a single query, then Spark
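
Following the answer's point about parallelizing queries, here is a hedged sketch of the JDBC partitioning options that let Spark match Sqoop's --split-by behavior, assuming Spark 1.4+ (URL, table, credentials, and bounds are placeholders):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ParallelJdbcRead {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("parallel-jdbc-read"));
        SQLContext sqlContext = new SQLContext(sc);
        DataFrame df = sqlContext.read()
            .format("jdbc")
            .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // placeholder URL
            .option("dbtable", "SCOTT.EMP")                        // placeholder table
            .option("user", "scott")
            .option("password", "tiger")
            // These four options are what parallelize the read; without them
            // Spark issues one query in one task, which is the usual reason
            // it looks slower than a Sqoop import.
            .option("partitionColumn", "EMPNO")
            .option("lowerBound", "1")
            .option("upperBound", "1000000")
            .option("numPartitions", "8")
            .load();
        System.out.println("rows: " + df.count());
        sc.stop();
    }
}
```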

How to configure access permissions for Cassandra on Linux Ubuntu

Submitted by 半城伤御伤魂 on 2020-01-01 05:10:07
Question: Thank you for reading this. I am stuck at step three of this tutorial on installing Cassandra: http://wiki.apache.org/cassandra/GettingStarted#Step_3:_Start_Cassandra I can only run this software as root (shouting this over fictional helicopter noise). This seems like a terrible way to run the software. When starting the Cassandra server as my normal user I receive the following errors: 1.) 15:46:00,147 |-ERROR in ch.qos.logback.core.rolling.RollingFileAppender[FILE] - openFile(
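
The RollingFileAppender openFile error is logback failing to write its log file, which for a tarball install usually means the normal user lacks write access to Cassandra's data and log directories. A minimal sketch of the usual fix, assuming the default /var/lib/cassandra and /var/log/cassandra paths from cassandra.yaml and the logging configuration:

```sh
# Create the default data/log directories and hand ownership to your user
# ($USER expands to the account you start Cassandra with).
sudo mkdir -p /var/lib/cassandra /var/log/cassandra
sudo chown -R "$USER" /var/lib/cassandra /var/log/cassandra
```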