Read csv with dd.mm.yyyy in Python and Pandas

情到浓时终转凉″ 提交于 2020-03-18 12:33:19

问题


I am reading a csv file with German date format. Seems like it worked ok in this post:

Picking dates from an imported CSV with pandas/python

However, it seems like in my case the date is not recognized as such. I could not find any wrong string in the test file.

import pandas as pd
import numpy as np


%matplotlib inline
import matplotlib.pyplot as plt

from matplotlib import style
from pandas import DataFrame

style.use('ggplot')

df = pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True)
df[:5]

This results in:

So, the Column with the dates is not recognized as such. What am I doing wrong here? Or is this date format simply not compatible?

  • OSX 10.10.3
  • Anaconda conda 3.13.0
  • Python 3.4.3-0
  • iPython notebook 3.1.0

回答1:


If you use parse_dates=True then read_csv tries to parse the index as a date. Therefore, you would also need to declare the first column as the index with index_col=[0]:

In [216]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=True, index_col=[0])
Out[216]: 
            morgens  mittags  abends
Datum                               
2015-03-16      382      452     202
2015-03-17      288      467     192

Alternatively, if you don't want the Datum column to be an index, you could use parse_dates=[0] to explicitly tell read_csv to parse the first column as dates:

In [217]: pd.read_csv('testdata.csv', dayfirst=True, parse_dates=[0])
Out[217]: 
       Datum  morgens  mittags  abends
0 2015-03-16      382      452     202
1 2015-03-17      288      467     192

Under the hood read_csv uses dateutil.parser.parse to parse date strings:

In [218]: import dateutil.parser as DP

In [221]: DP.parse('16.03.2015', dayfirst=True)
Out[221]: datetime.datetime(2015, 3, 16, 0, 0)

Since dateutil.parser has no trouble parsing date strings in DD.MM.YYYY format, you don't have to declare a custom date parser here.




回答2:


use the date_parser parameter of read_csv to pass a custom date parsing function (a lambda that wraps strptime with the relevant date format)

pandas.read_csv




回答3:


May be this will help

    from datetime import datetime as dt
    dtm = lambda x: dt.strptime(str(x), "%d.%m.%Y")
    df["Datum"] = df["Datum"].apply(dtm)


来源:https://stackoverflow.com/questions/30833133/read-csv-with-dd-mm-yyyy-in-python-and-pandas

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!