avro

How to use Sqoop to import Oracle CLOB data into Avro files on HDFS

99封情书 submitted on 2020-08-05 06:27:10
Question: I am getting a strange error when Sqooping data from an Oracle DB to HDFS: Sqoop is not able to import CLOB data into Avro files on Hadoop. This is the Sqoop import error:

```
ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005
```

Do we need to add any extra arguments to the sqoop import statement for it to correctly import CLOB data into Avro files?

Answer 1: Update: found the solution. We need to add --map-column-java for the CLOB columns. For example, if the column is named clob, then we pass --map-column-java clob=String.
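A minimal sketch of a full import command under that fix; the connection string, table name MY_TABLE, and CLOB column NOTES are all hypothetical:

```bash
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott \
  --password-file /user/etl/.oracle-pw \
  --table MY_TABLE \
  --map-column-java NOTES=String \
  --as-avrodatafile \
  --target-dir /data/my_table
```

Mapping the CLOB column to a Java String lets Sqoop emit it as an Avro string field instead of failing on SQL type 2005 (the JDBC type code for CLOB).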

Storing null values in Avro files

只谈情不闲聊 submitted on 2020-08-02 06:50:55
Question: I have some JSON data that looks like this:

```json
{
  "id": 1998983092,
  "name": "Test Name 1",
  "type": "search string",
  "creationDate": "2017-06-06T13:49:15.091+0000",
  "lastModificationDate": "2017-06-28T14:53:19.698+0000",
  "lastModifiedUsername": "testuser@test.com",
  "lockedQuery": false,
  "lockedByUsername": null
}
```

I am able to add the lockedQuery null value to a GenericRecord object without issue:

```java
GenericRecord record = new GenericData.Record(schema);
if (json.isNull("lockedQuery")) {
    record.put("lockedQuery", null);
}
```
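For a null to be written to an Avro file at all, the field's schema must be a union that includes "null". A minimal sketch of writing such a record, assuming a hypothetical two-field schema built with SchemaBuilder:

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class NullableFieldDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical schema: "name" is required, "lockedByUsername" is ["null", "string"]
        Schema schema = SchemaBuilder.record("Query").fields()
                .requiredString("name")
                .optionalString("lockedByUsername")
                .endRecord();

        GenericRecord record = new GenericData.Record(schema);
        record.put("name", "Test Name 1");
        record.put("lockedByUsername", null); // legal only because the field's type is a union with "null"

        try (DataFileWriter<GenericRecord> writer =
                     new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, new File("queries.avro"));
            writer.append(record);
        }
    }
}
```

Putting null into a field whose schema is a bare "string" (or any other non-union type) fails at write time, so the union declaration is the crux.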

Unnesting in SQL (Athena): How to convert an array of structs into an array of values plucked from the structs?

余生长醉 submitted on 2020-06-25 08:37:55
Question: I am taking samples from a Bayesian statistical model, serializing them with Avro, uploading them to S3, and querying them with Athena. I need help writing a query that unnests an array in the table. The CREATE TABLE query looks like:

```sql
CREATE EXTERNAL TABLE `model_posterior`(
  `job_id` bigint,
  `model_id` bigint,
  `parents` array<struct<`feature_name`:string, `feature_value`:bigint, `is_zid`:boolean>>,
  `posterior_samples` struct<`parameter`:string, `is_scaled`:boolean, `samples`:array<double>>)
```
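One way to get an array of just one field out of each struct is to UNNEST the array and re-aggregate. A sketch against the table above (the table and column names come from the question; the alias parent_feature_names is mine):

```sql
-- Explode parents into one row per struct, pluck feature_name, and re-collect per model
SELECT job_id,
       model_id,
       array_agg(p.feature_name) AS parent_feature_names
FROM model_posterior
CROSS JOIN UNNEST(parents) AS t(p)
GROUP BY job_id, model_id;
```

On Athena engine versions with lambda support, transform(parents, p -> p.feature_name) expresses the same pluck without the join.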

org.apache.avro.AvroTypeException: Unknown union branch

十年热恋 submitted on 2020-06-22 22:58:44
Question: I'm using this Avro schema, prices-state.avsc (the excerpt is cut off mid-field):

```json
{
  "namespace": "com.company.model",
  "name": "Product",
  "type": "record",
  "fields": [
    { "name": "product_id", "type": "string" },
    { "name": "sale_prices", "type": {
        "name": "sale_prices", "type": "record", "fields": [
          { "name": "default", "type": {
              "name": "default", "type": "record", "fields": [
                { "name": "order_by_item_price_by_item", "type": [
                    "null",
                    { "name": "markup_strategy", "type": "record", "fields": [
                        { "name": "type", "type": { "name":
```
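"Unknown union branch" is what Avro's JSON decoder (used by avro-tools fromjson, among others) raises when a union value is supplied without naming its branch: in Avro's JSON encoding, a non-null union value must be wrapped in a single-key object whose key is the branch's type name, while null is written bare. A minimal illustration with a hypothetical field f of type ["null", "string"]:

```json
{ "f": { "string": "some value" } }
```

and not `{ "f": "some value" }`; a null value is simply `{ "f": null }`. For the named record branch in the schema above, the wrapping key would be the record's full name, com.company.model.markup_strategy.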

Creating sample Avro data for bytes type

╄→尐↘猪︶ㄣ submitted on 2020-05-28 07:19:13
Question: I am trying to create a sample .avro file containing bytes as the type and decimal as the logicalType, but when the Avro file is loaded into a Hive table it results in a different value. What could be the reason?

schema.avsc:

```json
{
  "type": "record",
  "name": "example",
  "namespace": "com.xyz.avro",
  "fields": [
    {
      "name": "cost",
      "type": { "type": "bytes", "logicalType": "decimal", "precision": 38, "scale": 10 }
    }
  ]
}
```

data.json:

```json
{ "cost": "0.0" }
```

Converted to .avro using avro-tools: java -jar avro-tools-1.8
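The likely culprit is how the bytes were produced, not Hive. Avro's decimal logical type stores the unscaled value as big-endian two's-complement bytes, but a JSON string fed to a bytes field is taken as raw characters: "0.0" becomes the three bytes 0x30 0x2E 0x30, which Hive then decodes as the unscaled integer 3157552 at scale 10, i.e. 0.0003157552. A sketch of writing the value correctly from Java (the file names match the question; the class name is mine):

```java
import java.io.File;
import java.math.BigDecimal;
import java.nio.ByteBuffer;
import org.apache.avro.Conversions;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class DecimalBytesDemo {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(new File("schema.avsc"));
        Schema costSchema = schema.getField("cost").schema();

        // Encode 0.0 at scale 10 as the two's-complement bytes of its unscaled value
        BigDecimal cost = new BigDecimal("0.0").setScale(10);
        ByteBuffer bytes = new Conversions.DecimalConversion()
                .toBytes(cost, costSchema, costSchema.getLogicalType());

        GenericRecord record = new GenericData.Record(schema);
        record.put("cost", bytes);

        try (DataFileWriter<GenericRecord> writer =
                     new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, new File("cost.avro"));
            writer.append(record);
        }
    }
}
```

The same caveat applies to any tool that converts JSON to Avro bytes verbatim: the decimal must be encoded as its unscaled two's-complement representation before it will round-trip through Hive.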
