hortonworks-data-platform

Oraoop disabled for Sqoop import

試著忘記壹切 submitted on 2019-12-13 07:42:24

Question: I'm using the Hortonworks HDP Sandbox and I've installed Oraoop per the instructions, but whenever I run a Sqoop import I get the message "oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled." I'm not sure what else I need to do for Sqoop to pick it up. I have verified that the Oraoop driver is in my Sqoop lib directory. The imports do work, but they just use the plain Oracle driver, and I would like to play around with some of the features you get with Oraoop.
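For reference, the connector is engaged through Sqoop's direct mode and can be force-enabled with the `oraoop.disabled` property. A hedged sketch of an import invocation (host, credentials, table, and target directory are placeholders, and exact behavior depends on your Sqoop/Oraoop versions):

```shell
# Hypothetical invocation: --direct asks Sqoop to hand the job to the
# Oraoop connector; oraoop.disabled=false overrides any site setting.
sqoop import \
  -Doraoop.disabled=false \
  --direct \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott -P \
  --table SCOTT.EMPLOYEES \
  --target-dir /user/maria_dev/employees
```

Note that the connector only handles plain table imports (not free-form `--query` imports), which is one common reason Sqoop silently falls back to the generic Oracle driver.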

Ambari - Unsupported or invalid service in stack

主宰稳场 submitted on 2019-12-13 05:18:19

Question: I am trying to install a custom service using these instructions and these commands to add the service. When I issue the curl command, instead of the service being added, I get this error:

HTTP/1.1 400 Bad Request
Set-Cookie: AMBARISESSIONID=ID;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Content-Length: 139
Server: Jetty(7.6.7.v20120910)

{ "status" : 400, "message" : "Unsupported or invalid service in stack, clusterName=MahiMahi, serviceName=TESTSRV, stackInfo=HDP-2.1" }

What
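This 400 usually means Ambari's stack definition does not contain the service yet. A hedged sketch of the expected order of operations (cluster and service names are taken from the question; paths assume a default Ambari server layout):

```shell
# The service definition must exist under the stack before the REST
# call can succeed (the directory name must match serviceName):
ls /var/lib/ambari-server/resources/stacks/HDP/2.1/services/TESTSRV
# Ambari only re-reads stack definitions on restart:
ambari-server restart
# Then the POST should be accepted:
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  http://localhost:8080/api/v1/clusters/MahiMahi/services/TESTSRV
```

Also check that the `<name>` element in the service's metainfo.xml matches TESTSRV exactly; a mismatch there produces the same error.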

pyspark-java.lang.IllegalStateException: Input row doesn't have expected number of values required by the schema

跟風遠走 submitted on 2019-12-13 04:28:28

Question: I'm running pyspark-sql code on the Hortonworks sandbox (18/08/11 17:02:22 INFO spark.SparkContext: Running Spark version 1.6.3).

# code
from pyspark.sql import *
from pyspark.sql.types import *
rdd1 = sc.textFile("/user/maria_dev/spark_data/products.csv")
rdd2 = rdd1.map(lambda x: x.split(","))
df1 = sqlContext.createDataFrame(rdd2, ["id","cat_id","name","desc","price","url"])
df1.printSchema()
root
 |-- id: string (nullable = true)
 |-- cat_id: string (nullable = true)
 |-- name: string (nullable =
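This IllegalStateException is typically thrown for rows whose field count differs from the schema's six columns, for example when a value contains an unquoted comma. A minimal pure-Python sketch (no Spark required; the sample rows are invented) shows how the naive split produces the mismatch and how to screen for it:

```python
# Hypothetical sample mimicking products.csv: the third row has an
# unquoted comma in its description, so a naive split(",") yields 7
# fields instead of the 6 the schema expects.
raw = (
    "1,2,Bat,Wood bat,34.99,http://a\n"
    "2,2,Ball,Leather ball,9.99,http://b\n"
    "3,2,Glove,Soft, padded glove,49.99,http://c\n"
)

expected = 6
good, bad = [], []
for line in raw.splitlines():
    fields = line.split(",")  # same naive split as rdd1.map above
    (good if len(fields) == expected else bad).append(fields)

print(len(good), len(bad))  # → 2 1
```

In the job itself, the equivalent guard would be a filter such as `rdd2.filter(lambda f: len(f) == 6)` before `createDataFrame`, or parsing each line with the `csv` module instead of `str.split`.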

HiveServer2 generate a lot of directories in hdfs /tmp/hive/hive

限于喜欢 submitted on 2019-12-13 03:00:21

Question: We created a new cluster with HiveServer2 (on the Hortonworks HDP 2.2 distribution). After some time we have more than 1048576 directories in /tmp/hive/hive on HDFS, because the Hive server generates them in this location. Has anyone had a similar problem? Logs from HiveServer2:

2015-08-31 06:48:15,828 WARN [HiveServer2-Handler-Pool: Thread-1104]: conf.HiveConf (HiveConf.java:initialize(2499)) - HiveConf of name hive.heapsize does not exist
2015-08-31 06:48:15,829 WARN [HiveServer2-Handler-Pool: Thread-1104]
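Each HiveServer2 session creates a scratch directory under this location, and crashed or killed sessions leave theirs behind. A hedged cleanup sketch (the 7-day threshold is arbitrary, GNU date is assumed, and you should verify no active session owns a directory before removing it):

```shell
# List scratch dirs whose modification date is older than ~7 days
# and remove them in batches.
CUTOFF=$(date -d '7 days ago' +%Y-%m-%d)
hdfs dfs -ls /tmp/hive/hive | awk -v c="$CUTOFF" '$6 < c {print $8}' \
  | xargs -r -n 50 hdfs dfs -rm -r -skipTrash
```

Newer Hive releases also ship a `cleardanglingscratchdir` service for exactly this, but it may not be present in the Hive version bundled with HDP 2.2.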

How connect to Hortonworks sandbox Hbase using Java Client API

Deadly submitted on 2019-12-12 18:53:10

Question: I have set up a fresh Hortonworks sandbox. I'm trying to connect to HBase using the Java Client API. This is the code I have tried so far, but it did not succeed. I did not change any configuration on the Hortonworks sandbox. Do I need to do any configuration in HBase?

Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.zookeeper.quorum", "127.0.0.1");
configuration.set("hbase.master", "127.0.0.1:600000");
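Two things commonly bite here: the hbase.master port is normally 60000 (the "600000" above has an extra zero), and ZooKeeper returns the sandbox's internal hostname to the client, which the host machine cannot resolve. A hedged sketch of the client-side fix (the IP and hostname are the sandbox defaults; verify yours):

```shell
# /etc/hosts on the machine running the Java client, so the hostname
# handed back by ZooKeeper resolves to the sandbox's forwarded address:
127.0.0.1   sandbox.hortonworks.com

# And in the Java code, point the quorum at that name instead:
#   configuration.set("hbase.zookeeper.quorum", "sandbox.hortonworks.com");
```

Ports 2181 and 60000 (plus the region server port) must also be forwarded by the VM for the connection to complete.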

NoSuchMethodError while running Spark Streaming job on HDP 2.2

淺唱寂寞╮ submitted on 2019-12-12 16:25:54

Question: I am trying to run a simple streaming job on the HDP 2.2 Sandbox but am facing a java.lang.NoSuchMethodError. I am able to run the SparkPi example on this machine without an issue. These are the versions I am using:

<kafka.version>0.8.2.0</kafka.version>
<twitter4j.version>4.0.2</twitter4j.version>
<spark-version>1.2.1</spark-version>
<scala.version>2.11</scala.version>

Code snippet:
val sparkConf = new SparkConf().setAppName("TweetSenseKafkaConsumer").setMaster("yarn-cluster");
val ssc = new
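A NoSuchMethodError at runtime almost always signals a binary-compatibility mismatch rather than a missing dependency; Spark 1.2.1 as shipped in that era is built against Scala 2.10, so mixing in `_2.11` artifacts is a likely culprit. A hedged pom sketch with the versions aligned (the coordinates assume the standard Spark Maven artifacts):

```xml
<!-- Sketch only: the Scala binary version in every artifact suffix
     must match the Scala version the cluster's Spark was built with. -->
<properties>
  <scala.binary.version>2.10</scala.binary.version>
  <spark.version>1.2.1</spark.version>
</properties>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_${scala.binary.version}</artifactId>
  <version>${spark.version}</version>
  <scope>provided</scope>
</dependency>
```

The same suffix rule applies to spark-streaming-kafka and any other Spark or Kafka artifacts on the classpath.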

How to produce messages from public computers to a Kafka installation on a private network?

牧云@^-^@ submitted on 2019-12-12 05:50:11

Question: The system on which my Kafka server is running has two NICs, one with a public IP (135.220.23.45) and the other with a private one (192.168.1.14). The private NIC is connected to a subnet of 7 machines in total (all with addresses 192.168.1.xxx). Kafka has been installed as a service using HDP and has been configured with zookeeper.connect=192.168.1.14:2181 and listeners=PLAINTEXT://192.168.1.14:6667. I have started a consumer on the system that hosts the Kafka server using: [bin
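Kafka distinguishes the address a broker binds from the address it advertises to clients. A hedged server.properties sketch for reaching the broker from outside the private subnet (the public IP is the one from the question; the exact property names depend on the broker version):

```properties
# Bind on every interface, but tell external producers/consumers to
# connect via the public address. The firewall must allow 6667 in.
listeners=PLAINTEXT://0.0.0.0:6667
advertised.listeners=PLAINTEXT://135.220.23.45:6667
```

Older brokers (pre-0.9) use `advertised.host.name` and `advertised.port` instead of `advertised.listeners`; internal clients on the 192.168.1.x subnet will then also connect via the advertised address, so it must be routable from there too.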

Spark Hbase connection issue

99封情书 submitted on 2019-12-12 05:06:42

Question: I am hitting the following error while trying to connect to HBase through Spark (using newAPIHadoopRDD) on HDP 2.4.2. I have already tried increasing the RPC timeout in the hbase-site.xml file, but I still get the same error. Any idea how to fix it?

Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions: Wed Nov 16 14:59:36 IST 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=71216: row 'scores,,00000000000000' on table
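A RetriesExhaustedException wrapping a SocketTimeoutException usually means the client found the region's location via ZooKeeper but cannot actually reach the region server, so raising timeouts rarely helps. A hedged client-side hbase-site.xml sketch (hostnames are placeholders):

```xml
<!-- Hypothetical hostnames; the point is that the Spark driver and
     executors must reach both ZooKeeper and the region servers. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```

Also verify that the region server port (16020 on HBase 1.x, 60020 on older releases) is open from every node running an executor, and that the region server hostnames resolve there.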

Giraph ZooKeeper port problems

[亡魂溺海] submitted on 2019-12-11 20:55:25

Question: I am trying to run the SimpleShortestPathsVertex (aka SimpleShortestPathComputation) example described in the Giraph Quick Start. I am running this on a Hortonworks Sandbox instance (HDP 2.1) using VirtualBox, and I packaged giraph.jar using the hadoop_2.0.0 profile. When I try to run the example using

hadoop jar giraph.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/hue
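A frequently reported cause on the sandbox is Giraph trying to start its own embedded ZooKeeper on a port that is already taken. A hedged variant of the command that points Giraph at an existing ZooKeeper instead (the input/output paths, hostname, and worker count are assumptions for illustration):

```shell
# -ca giraph.zkList tells Giraph to use an external ZooKeeper rather
# than spawning its own; hostname and paths are hypothetical.
hadoop jar giraph.jar org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimpleShortestPathsVertex \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hue/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hue/shortestpaths -w 1 \
  -ca giraph.zkList=sandbox.hortonworks.com:2181
```

With only one node, `-w 1` plus the split-master option may also be needed so the single worker and master can coexist; check the Giraph Quick Start for the exact flags for your version.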

Pig xmlloader error when loading tag with colon

烈酒焚心 submitted on 2019-12-11 19:22:54

Question: I've been using Pig and XMLLoader to load XML files. I've been practising on the BOOK example. However, the XML file I need to process has colons in its tags. When I run a script it says that, due to ':', it cannot be processed (exact log at the end). This is the file I have, modified for the purpose of the ":" case.

BOOKT.xml:
<CATALOG>
<BC:BOOK id="1">
<TITLE>Hadoop Defnitive Guide</TITLE>
<AUTHOR>Tom White</AUTHOR>
<COUNTRY>US</COUNTRY>
<COMPANY>CLOUDERA</COMPANY>
<PRICE>24.90</PRICE>
<YEAR>2012</YEAR>
</BC
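If the loader really cannot cope with the namespace prefix, one workaround is to strip the prefix before the Pig job runs. A small Python sketch (the sample XML mimics BOOKT.xml; the regex is a simplistic assumption that no attribute value contains a pattern like `<tag:`):

```python
import re

# Strip namespace prefixes such as "BC:" from opening and closing
# tags, so <BC:BOOK> / </BC:BOOK> become <BOOK> / </BOOK>.
xml = '<CATALOG><BC:BOOK id="1"><TITLE>Hadoop Defnitive Guide</TITLE></BC:BOOK></CATALOG>'

cleaned = re.sub(r'(</?)[A-Za-z0-9]+:', r'\1', xml)
print(cleaned)
# → <CATALOG><BOOK id="1"><TITLE>Hadoop Defnitive Guide</TITLE></BOOK></CATALOG>
```

The cleaned file can then be loaded with the usual `XMLLoader('BOOK')` invocation; for large files the same substitution could be done with a streaming tool rather than loading the whole document into memory.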