I get an error when using the Hive TRANSFORM feature

Anonymous (unverified), posted on 2019-12-03 01:18:02


java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:486)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    ... 8 more

The HQL script is as follows:

SELECT
  TRANSFORM (userid, movieid, rating)
  USING 'python /home/daxingyu930/test_data_mapper2.py'
  AS userid, movieid, rating;

The Python script is very simple; it splits each line on \t.
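The script itself is not shown in the question, so the following is only a minimal sketch of what a tab-splitting TRANSFORM mapper could look like. The function names `transform_line` and `main` are hypothetical, and the pass-through behavior (fields written back unchanged) is an assumption made for illustration.

```python
import sys


def transform_line(line):
    """Split one tab-separated input row into fields and re-join them with tabs."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: the fields are passed through unchanged; a real mapper
    # would modify or filter them here.
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams each row to the script as one tab-separated line on stdin
    # and reads tab-separated rows back from stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Writing only result rows to stdout matters here, because everything the script prints there is read back by Hive as output columns.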

I have tested the Python script on Linux with the following shell command:

cat test_data/u_data.txt | python test_data_mapper2.py 
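Piping a file through the script shows its output but not its exit status. As far as I know, Hive treats a non-zero exit status from the script as a failure when it closes the script operator, so it may be worth checking that locally too. The sketch below is one way to do it; the helper name `run_mapper` and the sample rows are hypothetical, and it assumes Python 3.7 or later.

```python
import subprocess
import sys


def run_mapper(command, input_text):
    """Feed input_text to command on stdin; return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Hypothetical sample rows; replace with a few lines from test_data/u_data.txt.
    sample = "196\t242\t3\n186\t302\t3\n"
    code, out, err = run_mapper([sys.executable, "test_data_mapper2.py"], sample)
    print("exit code:", code)
    print(out, end="")
    print(err, end="", file=sys.stderr)
```

An exit code other than 0, or a traceback on stderr, would point at a problem in the script itself rather than in the Hive query.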

Please give me some ideas about this problem. It is driving me crazy and keeping me up at night. Thanks very much.


Before using your custom script, you should add it to the distributed cache.


add file /home/daxingyu930/test_data_mapper2.py;

SELECT
  TRANSFORM (userid, movieid, rating)
  USING 'python test_data_mapper2.py'
  AS userid, movieid, rating;


First make the script executable:

chmod +x test_data_mapper2.py

Then, from the Hive CLI, run the command below:

add file /home/daxingyu930/test_data_mapper2.py;


You should not give the full path to your script in the USING clause. Use only the Python script's file name (.py).
