Concatenating and writing Parquet with Python and pandas raises a timestamp error

春和景丽 2021-02-19 00:44

I tried to concat() two Parquet files with pandas in Python.
The concatenation works, but when I try to write the resulting DataFrame to a Parquet file, it displays this error:

5 answers

盖世英雄少女心 (original poster)

2021-02-19 01:08
I think this is a bug and you should do what Wes says. However, if you need working code now, I have a workaround.

The workaround that worked for me was to cast the timestamp columns to millisecond precision. If you need nanosecond precision, this will throw away the sub-millisecond part of your data... but if that's the case, it may be the least of your problems.
```
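import pandas as pd

# Load the two source files.
table1 = pd.read_parquet("path1.parquet")
table2 = pd.read_parquet("path2.parquet")

# Cast the timestamp column to millisecond precision
# (any sub-millisecond detail is dropped).
table1["Date"] = table1["Date"].astype("datetime64[ms]")
table2["Date"] = table2["Date"].astype("datetime64[ms]")

# Concatenate, then write the result as a gzip-compressed Parquet file.
table = pd.concat([table1, table2], ignore_index=True)
table.to_parquet("./file.gzip", compression="gzip")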
```
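Before casting, it may be worth checking whether millisecond precision would actually lose anything. Here is a minimal sketch, assuming the column is already a datetime column; the helper name `loses_precision_at_ms` and the sample values are made up for illustration and are not part of the original data.

```python
import pandas as pd


def loses_precision_at_ms(series: pd.Series) -> bool:
    """Return True if any timestamp has a sub-millisecond component."""
    # Flooring to milliseconds drops microseconds and nanoseconds;
    # any value that changes would be altered by a millisecond cast.
    floored = series.dt.floor("ms")
    return bool((floored != series).any())


# Hypothetical sample data: the first value carries nanosecond detail.
dates = pd.Series(
    pd.to_datetime(
        [
            "2021-02-19 00:44:00.123456789",
            "2021-02-19 01:08:00.500000000",
        ]
    )
)

print(loses_precision_at_ms(dates))                 # True
print(loses_precision_at_ms(dates.dt.floor("ms")))  # False
```

If the check returns False for your "Date" column, the millisecond cast should be harmless for that data; if it returns True, you know up front which files would be affected.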