I am trying to create a DataFrame from a CSV file whose first column looks like this:
\"2013-08-25T00:00:00-0400\";
\"2013-08-25T01:00:00-0400\";
\"2013-08-25T02:00:00-04
The pandas parser uses the timezone offset when one is present: it returns a naive Timestamp (naive means no timezone info) whose value has already been shifted to UTC by that offset.
To keep the timezone information in your DataFrame, first localize the Timestamps as UTC and then convert them to their timezone (which in this case is Etc/GMT+4; the sign is inverted in the Etc/ zone names, so this means UTC-4):
>>> df = pd.read_csv(PeriodC, sep=';', parse_dates=[0], index_col=0)
>>> df.index[0]
Timestamp('2013-08-25 04:00:00', tz=None)
>>> df.index = df.index.tz_localize('UTC').tz_convert('Etc/GMT+4')
>>> df.index[0]
Timestamp('2013-08-25 00:00:00-0400', tz='Etc/GMT+4')
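As a self-contained variant of the steps above, here is a minimal sketch that makes the UTC step explicit with `pd.to_datetime(..., utc=True)`, so it does not rely on how a particular pandas version's CSV parser treats offsets. The two sample rows, the `value` column, and the column names are made up for illustration; only the timestamp layout comes from the question.

```python
import io
import pandas as pd

# Two made-up rows in the same layout as the question's file
# (the "value" column is hypothetical).
csv_text = '"2013-08-25T00:00:00-0400";1\n"2013-08-25T01:00:00-0400";2\n'

df = pd.read_csv(io.StringIO(csv_text), sep=";", header=None,
                 names=["time", "value"])

# utc=True applies each string's offset and yields tz-aware UTC timestamps.
df["time"] = pd.to_datetime(df["time"], utc=True)
df = df.set_index("time")

# Convert from UTC to the zone of the data. Etc/GMT+4 is UTC-4
# because the sign is inverted in the Etc/ zone names.
df.index = df.index.tz_convert("Etc/GMT+4")

first = df.index[0]
print(first)
```

The index stays timezone-aware, so later arithmetic and comparisons against other aware timestamps keep working.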
If you want to discard the timezone information completely, specify a date_parser that splits the string and passes only the datetime portion to the parser:
>>> df = pd.read_csv(file, sep=';', parse_dates=[0], index_col=[0],
...                  date_parser=lambda x: pd.to_datetime(x.rpartition('-')[0]))
>>> df.index[0]
Timestamp('2013-08-25 00:00:00', tz=None)
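The `date_parser` argument is deprecated as of pandas 2.0, so an alternative may be useful. The sketch below gets the same naive wall-clock result without any string splitting: it parses the column as timezone-aware and then drops the timezone with `tz_localize(None)`. It assumes every row carries the same offset; the sample rows and the `value` column are made up for illustration.

```python
import io
import pandas as pd

# Made-up rows in the question's layout (the "value" column is hypothetical).
csv_text = '"2013-08-25T00:00:00-0400";1\n"2013-08-25T01:00:00-0400";2\n'

df = pd.read_csv(io.StringIO(csv_text), sep=";", header=None,
                 names=["time", "value"])

# Parse with the offset kept (tz-aware, fixed offset of -04:00) ...
aware = pd.to_datetime(df["time"])

# ... then drop the timezone while keeping the local wall-clock time.
df["time"] = aware.dt.tz_localize(None)
df = df.set_index("time")

first_naive = df.index[0]
print(first_naive)
```

Because the offset is parsed rather than cut off, this works the same way for positive and negative offsets.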
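The x.rpartition('-') approach from https://stackoverflow.com/a/18912631/4318671 is not robust: it only works when the offset is negative, because with a positive offset the last '-' in the string belongs to the date.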
The datetime strings returned by InfluxDB for 'Asia/Shanghai' look like this:
2019-09-09T12:51:54.46303+08:00
If you are using pandas, you can try:
df['time'] = pd.to_datetime(df['time'])
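A small runnable sketch of this, using the sample string above in a one-row DataFrame (the DataFrame itself is made up for illustration). It also shows what `rpartition('-')` does to a string with a positive offset, which is why that approach falls short here.

```python
import pandas as pd

sample = "2019-09-09T12:51:54.46303+08:00"

# One-row frame built from the sample string above.
df = pd.DataFrame({"time": [sample]})

# to_datetime reads the +08:00 offset and returns tz-aware timestamps.
df["time"] = pd.to_datetime(df["time"])
first = df["time"].iloc[0]

# Optionally attach the named zone instead of the bare fixed offset.
shanghai = df["time"].dt.tz_convert("Asia/Shanghai").iloc[0]

# With a positive offset, rpartition('-') splits inside the date,
# so only the year and month survive.
head = sample.rpartition("-")[0]
print(first, shanghai, head)
```

Here `head` is just `'2019-09'`, so the day and time would be lost if it were passed to the parser.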