Question
I have an ORC file on my local machine and I need any reasonable format from it (e.g. CSV, JSON, YAML, ...).
How can I convert ORC to CSV?
Answer 1:
- Download the Apache ORC source release
- Extract the files, go to the java folder and run Maven: mvn install
- Use ORC-Tools
This is how I use them - you will likely need to adjust the paths:
java -jar ~/.m2/repository/org/apache/orc/orc-tools/1.5.4/orc-tools-1.5.4-uber.jar data ~/your_file.orc > output.json
The output is JSON Lines, which is easy to convert to CSV. First I needed to remove the last two lines from the output. Then:
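import pandas as pd
# lines=True makes pandas parse one JSON object per line (JSON Lines)
df = pd.read_json('output.json', lines=True)
# The DataFrame index is written as the first column; pass index=False to omit it
df.to_csv('output.csv')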
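The manual step of removing the last two lines can also be folded into the conversion. The following is only a sketch using the Python standard library instead of pandas. The function name jsonl_to_csv and the parameter trailing_lines_to_drop are hypothetical, and the default of 2 simply mirrors what I had to remove from my output; check your own file and adjust it.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path, trailing_lines_to_drop=2):
    """Convert a JSON Lines dump to CSV, dropping trailing non-data lines.

    Hypothetical helper: the default of 2 is an assumption taken from the
    answer above, not a documented property of the ORC tools output.
    """
    with open(jsonl_path, encoding="utf-8") as src:
        lines = src.read().splitlines()

    # Cut off the last N lines exactly as done by hand in the answer.
    lines = lines[:len(lines) - trailing_lines_to_drop]

    # Parse each remaining non-blank line as one JSON object.
    records = [json.loads(line) for line in lines if line.strip()]

    # Collect column names in first-seen order so that records with
    # missing keys still fit into one table.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns)
        writer.writeheader()
        writer.writerows(records)  # missing keys become empty cells

    return len(records)
```

Calling jsonl_to_csv('output.json', 'output.csv') would then write the CSV and return the number of rows converted. Nested structs or lists end up as their Python string representation in a single cell, so flatten them first if you need separate columns.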
Source: https://stackoverflow.com/questions/54482815/how-can-i-convert-local-orc-files-to-csv