Faster CSV loading with a datetime index in pandas

后端未结


 1  1640

I am often iterating over financial price data stored in CSV files. I like the convenience of using pandas datetime objects to subset and organize data when all of my analysis


1 Answer

醉酒成梦

2021-01-17 04:06
After testing a few options for loading and parsing a CSV file with 13,811,418 rows and 98 unique date values, we arrived at the snippet below. We found that passing the format parameter with a predefined date format ('%m/%d/%Y' in our case) brought the time down to 2.52 s with pandas 0.15.3.
```
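import pandas as pd

def to_date(dates, lookup=False, **args):
    # With lookup=True, parse each unique value once and map the results back.
    if lookup:
        return dates.map({v: pd.to_datetime(v, **args) for v in dates.unique()})
    # Otherwise parse the whole Series directly (pass format=... for speed).
    return pd.to_datetime(dates, **args)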
```
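As a rough usage sketch of the same idea, the unique date strings can be parsed once with an explicit format and then mapped back onto the column before it becomes the index. The sample data and the column names `date` and `price` are made up for illustration and are not from the original benchmark.

```python
import io
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration.
csv_text = "date,price\n01/02/2020,10.5\n01/02/2020,10.7\n01/03/2020,11.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Parse each unique date string only once, using an explicit format.
unique_dates = df["date"].unique()
parsed = pd.to_datetime(unique_dates, format="%m/%d/%Y")

# Map every row to its already-parsed timestamp.
lookup = dict(zip(unique_dates, parsed))
df["date"] = df["date"].map(lookup)

# Use the parsed column as the datetime index.
df = df.set_index("date")
```

The lookup only pays off when the number of unique dates is small compared with the number of rows, as in the 98-unique-values case described above.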
- Also use coerce=True (or errors='coerce' in later versions) to enable date-format validation, so that values that do not match the format become NaT. Otherwise, in the older versions, the invalid values are retained as strings and lead to an error when any other datetime operation is performed on the DataFrame column.
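A minimal sketch of the validation step, assuming a pandas version in which `pd.to_datetime` accepts `errors="coerce"` (older releases used `coerce=True` instead). The sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical input: one valid date, one non-date string, one impossible date.
raw = pd.Series(["01/17/2021", "not a date", "02/30/2021"])

# errors="coerce" turns values that cannot be parsed with the format into NaT.
parsed = pd.to_datetime(raw, format="%m/%d/%Y", errors="coerce")

# The original strings that failed validation can be inspected afterwards.
bad_rows = raw[parsed.isna()]
```

Checking `bad_rows` before continuing makes it easy to see which inputs were rejected rather than silently carrying NaT values into later steps.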