Error when querying avro-backed hive table: java.lang.IllegalArgumentException

杀马特。学长 韩版系。学妹 提交于 2020-01-06 02:52:11

问题


I am trying to create a hive table on azure HDInsight from an avro file exported from raw google analytics data in BigQuery.

It seems to work. I can created the table, and there are no errors when I run DESCRIBE. But when I try to select results, even if I select only two non-nested columns, I get a an error: "java.lang.IllegalArgumentException".

Here's how I created the table:

DROP TABLE IF EXISTS ga_sessions_20150106;
CREATE EXTERNAL TABLE IF NOT EXISTS ga_sessions_20150106
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/upload/ga_sessions'
TBLPROPERTIES ('avro.schema.url'='/upload/ga_sessions.avsc');
describe ga_sessions_20150106;

Here's the avro schema:

{"type":"record","name":"root","fields":[{"name":"visitorId","type":["long","null"]},{"name":"visitNumber","type":["long","null"]},{"name":"visitId","type":["long","null"]},{"name":"visitStartTime","type":["long","null"]},{"name":"date","type":["string","null"]},{"name":"totals","type":[{"type":"record","name":"totals","fields":[{"name":"visits","type":["long","null"]},{"name":"hits","type":["long","null"]},{"name":"pageviews","type":["long","null"]},{"name":"timeOnSite","type":["long","null"]},{"name":"bounces","type":["long","null"]},{"name":"transactions","type":["long","null"]},{"name":"transactionRevenue","type":["long","null"]},{"name":"newVisits","type":["long","null"]},{"name":"screenviews","type":["long","null"]},{"name":"uniqueScreenviews","type":["long","null"]},{"name":"timeOnScreen","type":["long","null"]},{"name":"totalTransactionRevenue","type":["long","null"]}]},"null"]},{"name":"trafficSource","type":[{"type":"record","name":"trafficSource","fields":[{"name":"referralPath","type":["string","null"]},{"name":"campaign","type":["string","null"]},{"name":"source","type":["string","null"]},{"name":"medium","type":["string","null"]},{"name":"keyword","type":["string","null"]},{"name":"adContent","type":["string","null"]},{"name":"adwordsClickInfo","type":[{"type":"record","name":"adwordsClickInfo","fields":[{"name":"campaignId","type":["long","null"]},{"name":"adGroupId","type":["long","null"]},{"name":"creativeId","type":["long","null"]},{"name":"criteriaId","type":["long","null"]},{"name":"page","type":["long","null"]},{"name":"slot","type":["string","null"]},{"name":"criteriaParameters","type":["string","null"]},{"name":"gclId","type":["string","null"]},{"name":"customerId","type":["long","null"]},{"name":"adNetworkType","type":["string","null"]},{"name":"targetingCriteria","type":[{"type":"record","name":"targetingCriteria","fields":[{"name":"boomUserlistId","type":["long","null"]}]},"null"]}]},"null"]}]},"null"]},{"name":"device","type":[{"type":"record","name":"device","fields":[{"name":"browser","type":["string","null"]},{"name":"browserVersion","type":["string","null"]},{"name":"operatingSystem","type":["string","null"]},{"name":"operatingSystemVersion","type":["string","null"]},{"name":"isMobile","type":["boolean","null"]},{"name":"mobileDeviceBranding","type":["string","null"]},{"name":"flashVersion","type":["string","null"]},{"name":"javaEnabled","type":["boolean","null"]},{"name":"language","type":["string","null"]},{"name":"screenColors","type":["string","null"]},{"name":"screenResolution","type":["string","null"]},{"name":"deviceCategory","type":["string","null"]}]},"null"]},{"name":"geoNetwork","type":[{"type":"record","name":"geoNetwork","fields":[{"name":"continent","type":["string","null"]},{"name":"subContinent","type":["string","null"]},{"name":"country","type":["string","null"]},{"name":"region","type":["string","null"]},{"name":"metro","type":["string","null"]}]},"null"]},{"name":"customDimensions","type":{"type":"array","items":{"type":"record","name":"customDimensions","fields":[{"name":"index","type":["long","null"]},{"name":"value","type":["string","null"]}]}}},{"name":"hits","type":{"type":"array","items":{"type":"record","name":"hits","fields":[{"name":"hitNumber","type":["long","null"]},{"name":"time","type":["long","null"]},{"name":"hour","type":["long","null"]},{"name":"minute","type":["long","null"]},{"name":"isSecure","type":["boolean","null"]},{"name":"isInteraction","type":["boolean","null"]},{"name":"isEntrance","type":["boolean","null"]},{"name":"isExit","type":["boolean","null"]},{"name":"referer","type":["string","null"]},{"name":"page","type":[{"type":"record","name":"page","fields":[{"name":"pagePath","type":["string","null"]},{"name":"hostname","type":["string","null"]},{"name":"pageTitle","type":["string","null"]},{"name":"searchKeyword","type":["string","null"]},{"name":"searchCategory","type":["string","null"]}]},"null"]},{"name":"transaction","type":[{"type":"record","name":"transaction","fields":[{"name":"transactionId","type":["string","null"]},{"name":"transactionRevenue","type":["long","null"]},{"name":"transactionTax","type":["long","null"]},{"name":"transactionShipping","type":["long","null"]},{"name":"affiliation","type":["string","null"]},{"name":"currencyCode","type":["string","null"]},{"name":"localTransactionRevenue","type":["long","null"]},{"name":"localTransactionTax","type":["long","null"]},{"name":"localTransactionShipping","type":["long","null"]},{"name":"transactionCoupon","type":["string","null"]}]},"null"]},{"name":"item","type":[{"type":"record","name":"item","fields":[{"name":"transactionId","type":["string","null"]},{"name":"productName","type":["string","null"]},{"name":"productCategory","type":["string","null"]},{"name":"productSku","type":["string","null"]},{"name":"itemQuantity","type":["long","null"]},{"name":"itemRevenue","type":["long","null"]},{"name":"currencyCode","type":["string","null"]},{"name":"localItemRevenue","type":["long","null"]}]},"null"]},{"name":"contentInfo","type":[{"type":"record","name":"contentInfo","fields":[{"name":"contentDescription","type":["string","null"]}]},"null"]},{"name":"appInfo","type":[{"type":"record","name":"appInfo","fields":[{"name":"name","type":["string","null"]},{"name":"version","type":["string","null"]},{"name":"id","type":["string","null"]},{"name":"installerId","type":["string","null"]},{"name":"appInstallerId","type":["string","null"]},{"name":"appName","type":["string","null"]},{"name":"appVersion","type":["string","null"]},{"name":"appId","type":["string","null"]},{"name":"screenName","type":["string","null"]},{"name":"landingScreenName","type":["string","null"]},{"name":"exitScreenName","type":["string","null"]},{"name":"screenDepth","type":["string","null"]}]},"null"]},{"name":"exceptionInfo","type":[{"type":"record","name":"exceptionInfo","fields":[{"name":"description","type":["string","null"]},{"name":"isFatal","type":["boolean","null"]}]},"null"]},{"name":"eventInfo","type":[{"type":"record","name":"eventInfo","fields":[{"name":"eventCategory","type":["string","null"]},{"name":"eventAction","type":["string","null"]},{"name":"eventLabel","type":["string","null"]},{"name":"eventValue","type":["long","null"]}]},"null"]},{"name":"product","type":{"type":"array","items":{"type":"record","name":"product","fields":[{"name":"productSKU","type":["string","null"]},{"name":"v2ProductName","type":["string","null"]},{"name":"v2ProductCategory","type":["string","null"]},{"name":"productVariant","type":["string","null"]},{"name":"productBrand","type":["string","null"]},{"name":"productRevenue","type":["long","null"]},{"name":"localProductRevenue","type":["long","null"]},{"name":"productPrice","type":["long","null"]},{"name":"localProductPrice","type":["long","null"]},{"name":"productQuantity","type":["long","null"]},{"name":"productRefundAmount","type":["long","null"]},{"name":"localProductRefundAmount","type":["long","null"]},{"name":"isImpression","type":["boolean","null"]},{"name":"customDimensions","type":{"type":"array","items":"customDimensions"}},{"name":"customMetrics","type":{"type":"array","items":{"type":"record","name":"customMetrics","fields":[{"name":"index","type":["long","null"]},{"name":"value","type":["long","null"]}]}}}]}}},{"name":"promotion","type":{"type":"array","items":{"type":"record","name":"promotion","fields":[{"name":"promoId","type":["string","null"]},{"name":"promoName","type":["string","null"]},{"name":"promoCreative","type":["string","null"]},{"name":"promoPosition","type":["string","null"]}]}}},{"name":"promotionActionInfo","type":[{"type":"record","name":"promotionActionInfo","fields":[{"name":"promoIsView","type":["boolean","null"]},{"name":"promoIsClick","type":["boolean","null"]}]},"null"]},{"name":"refund","type":[{"type":"record","name":"refund","fields":[{"name":"refundAmount","type":["long","null"]},{"name":"localRefundAmount","type":["long","null"]}]},"null"]},{"name":"eCommerceAction","type":[{"type":"record","name":"eCommerceAction","fields":[{"name":"action_type","type":["string","null"]},{"name":"step","type":["long","null"]},{"name":"option","type":["string","null"]}]},"null"]},{"name":"experiment","type":{"type":"array","items":{"type":"record","name":"experiment","fields":[{"name":"experimentId","type":["string","null"]},{"name":"combination","type":["string","null"]}]}}},{"name":"customVariables","type":{"type":"array","items":{"type":"record","name":"customVariables","fields":[{"name":"index","type":["long","null"]},{"name":"customVarName","type":["string","null"]},{"name":"customVarValue","type":["string","null"]}]}}},{"name":"customDimensions","type":{"type":"array","items":"customDimensions"}},{"name":"customMetrics","type":{"type":"array","items":"customMetrics"}},{"name":"type","type":["string","null"]},{"name":"social","type":[{"type":"record","name":"social","fields":[{"name":"socialInteractionNetwork","type":["string","null"]},{"name":"socialInteractionAction","type":["string","null"]}]},"null"]}]}}},{"name":"fullVisitorId","type":["string","null"]},{"name":"userId","type":["string","null"]}]}

Here's what comes back with DESCRIBE:

visitorid               bigint                  from deserializer   
visitnumber             bigint                  from deserializer   
visitid                 bigint                  from deserializer   
visitstarttime          bigint                  from deserializer   
date                    string                  from deserializer   
totals                  struct<visits:bigint,hits:bigint,pageviews:bigint,timeonsite:bigint,bounces:bigint,transactions:bigint,transactionrevenue:bigint,newvisits:bigint,screenviews:bigint,uniquescreenviews:bigint,timeonscreen:bigint,totaltransactionrevenue:bigint>   from deserializer   
trafficsource           struct<referralpath:string,campaign:string,source:string,medium:string,keyword:string,adcontent:string,adwordsclickinfo:struct<campaignid:bigint,adgroupid:bigint,creativeid:bigint,criteriaid:bigint,page:bigint,slot:string,criteriaparameters:string,gclid:string,customerid:bigint,adnetworktype:string,targetingcriteria:struct<boomuserlistid:bigint>>>   from deserializer   
device                  struct<browser:string,browserversion:string,operatingsystem:string,operatingsystemversion:string,ismobile:boolean,mobiledevicebranding:string,flashversion:string,javaenabled:boolean,language:string,screencolors:string,screenresolution:string,devicecategory:string>    from deserializer   
geonetwork              struct<continent:string,subcontinent:string,country:string,region:string,metro:string>  from deserializer   
customdimensions        array<struct<index:bigint,value:string>>    from deserializer   
hits                    array<struct<hitnumber:bigint,time:bigint,hour:bigint,minute:bigint,issecure:boolean,isinteraction:boolean,isentrance:boolean,isexit:boolean,referer:string,page:struct<pagepath:string,hostname:string,pagetitle:string,searchkeyword:string,searchcategory:string>,transaction:struct<transactionid:string,transactionrevenue:bigint,transactiontax:bigint,transactionshipping:bigint,affiliation:string,currencycode:string,localtransactionrevenue:bigint,localtransactiontax:bigint,localtransactionshipping:bigint,transactioncoupon:string>,item:struct<transactionid:string,productname:string,productcategory:string,productsku:string,itemquantity:bigint,itemrevenue:bigint,currencycode:string,localitemrevenue:bigint>,contentinfo:struct<contentdescription:string>,appinfo:struct<name:string,version:string,id:string,installerid:string,appinstallerid:string,appname:string,appversion:string,appid:string,screenname:string,landingscreenname:string,exitscreenname:string,screendepth:string>,exceptioninfo:struct<description:string,isfatal:boolean>,eventinfo:struct<eventcategory:string,eventaction:string,eventlabel:string,eventvalue:bigint>,product:array<struct<productsku:string,v2productname:string,v2productcategory:string,productvariant:string,productbrand:string,productrevenue:bigint,localproductrevenue:bigint,productprice:bigint,localproductprice:bigint,productquantity:bigint,productrefundamount:bigint,localproductrefundamount:bigint,isimpression:boolean,customdimensions:array<struct<index:bigint,value:string>>,custommetrics:array<struct<index:bigint,value:bigint>>>>,promotion:array<struct<promoid:string,promoname:string,promocreative:string,promoposition:string>>,promotionactioninfo:struct<promoisview:boolean,promoisclick:boolean>,refund:struct<refundamount:bigint,localrefundamount:bigint>,ecommerceaction:struct<action_type:string,step:bigint,option:string>,experiment:array<struct<experimentid:string,combination:string>>,customvariables:array<struct<index:bigint,customvarname:string,customvarvalue:string>>,customdimensions:array<struct<index:bigint,value:string>>,custommetrics:array<struct<index:bigint,value:bigint>>,type:string,social:struct<socialinteractionnetwork:string,socialinteractionaction:string>>>   from deserializer   
fullvisitorid           string                  from deserializer   
userid                  string                  from deserializer   

Error (i can post more of the log if desired. It doesn't have more details after "15 more", but you can see what's happening prior.):

Caused by: java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.avro.io.BinaryDecoder.readBytes(BinaryDecoder.java:288)
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:112)
    at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
    at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:498)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
    ... 15 more

回答1:


OK -- this issue is resolved.

The issue is that when I downloaded the file from google cloud storage using python client, I wrote it to file in text mode (the default) when I needed to use binary mode.

I re-downloaded it, re-uploaded it, and it worked.



来源:https://stackoverflow.com/questions/34727987/error-when-querying-avro-backed-hive-table-java-lang-illegalargumentexception

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!