How can I convert local ORC files to CSV?

给你一囗甜甜゛ submitted on 2019-12-11 09:37:56

Question


I have an ORC file on my local machine and I need to convert it to any reasonable format (e.g. CSV, JSON, YAML, ...).

How can I convert ORC to CSV?


Answer 1:


  1. Download the Apache ORC source release
  2. Extract the files, go to the java folder, and run Maven: mvn install
  3. Use ORC-Tools

This is how I use them - you will likely need to adjust the paths:

java -jar ~/.m2/repository/org/apache/orc/orc-tools/1.5.4/orc-tools-1.5.4-uber.jar data ~/your_file.orc > output.json

The output is in JSON Lines format, which is easy to convert to CSV. I first had to remove the last two lines from the output file. Then:

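import pandas as pd

# Each line of output.json is one JSON record (JSON Lines), hence lines=True
df = pd.read_json('output.json', lines=True)
# index=False leaves the pandas row index out of the CSV
df.to_csv('output.csv', index=False)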
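
If you would rather not trim the output file by hand, here is a minimal sketch of an alternative that uses only the Python standard library. It is not the method used above: instead of dropping a fixed number of trailing lines, it skips every line that does not parse as a JSON object. The function name json_lines_to_csv is hypothetical, and the sketch assumes flat records; nested structs or lists would be written as their Python string representation.

```python
import csv
import json


def json_lines_to_csv(json_path, csv_path):
    """Convert a JSON Lines file to CSV, skipping lines that are not JSON objects.

    Returns the number of records written.
    """
    rows = []
    with open(json_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # ignore lines that are not valid JSON
            if isinstance(record, dict):
                rows.append(record)

    # Collect column names in first-seen order so that records
    # with differing keys still fit into one table.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        # Keys missing from a record are written as empty cells (restval default).
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling json_lines_to_csv('output.json', 'output.csv') would then produce the CSV directly from the untrimmed tool output. Note that this silently drops anything that is not a JSON object, so it is worth comparing the returned record count with what you expect.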


Source: https://stackoverflow.com/questions/54482815/how-can-i-convert-local-orc-files-to-csv
