ValueError: Cannot cast DatetimeIndex to dtype datetime64[us]

情话喂你 2021-02-20 06:53

I'm trying to create a PostgreSQL table of 30-minute data for the S&P 500 ETF (spy30new, for testing freshly inserted data) from a table of several stocks with 15-minute da…

3 Answers
  • 2021-02-20 07:14

    Actually, this was my data frame:

                                  Biomass  Fossil Brown coal/Lignite  Fossil Coal-derived gas  Fossil Gas  Fossil Hard coal  Fossil Oil  Geothermal  Hydro Pumped Storage  Hydro Run-of-river and poundage  Hydro Water Reservoir  Nuclear   Other  Other renewable    Solar  Waste  Wind Offshore  Wind Onshore
    2018-02-02 00:00:00+01:00   4835.0                    16275.0                    446.0      1013.0            4071.0       155.0         5.0                   7.0                           1906.0                   35.0   8924.0  3643.0            142.0      0.0  595.0         2517.0       19999.0
    2018-02-02 00:15:00+01:00   4834.0                    16272.0                    446.0      1010.0            3983.0       155.0         5.0                   7.0                           1908.0                   71.0   8996.0  3878.0            142.0      0.0  594.0         2364.0       19854.0
    2018-02-02 00:30:00+01:00   4828.0                    16393.0                    446.0      1019.0            4015.0       155.0         5.0    
    

    I was trying to insert this into an SQL database and got the same error as in the question above. What I did was convert the data frame's index into a regular column labeled 'index':

    df.reset_index(level=0, inplace=True)  
    

    Then rename the 'index' column to 'DateTime':

    df = df.rename(columns={'index': 'DateTime'})
    

    Change the datatype to 'datetime64[ns]' (the bare 'datetime64' alias is rejected by newer pandas versions):

    df['DateTime'] = df['DateTime'].astype('datetime64[ns]')
    

    Finally, store it in the SQL database using this code:

    engine = create_engine('mysql+mysqlconnector://root:Password@localhost/generation_data', echo=True)
    df.to_sql(con=engine, name='test', if_exists='replace')
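
    To make the whole sequence reproducible, here is a minimal end-to-end sketch on a toy frame (the engine URL and table name come from the answer above; the to_sql call is left commented out so the sketch runs without a database):

    ```python
    import pandas as pd

    # Toy stand-in for the generation data: a tz-aware 15-minute index.
    idx = pd.date_range("2018-02-02 00:00", periods=3, freq="15min", tz="Europe/Berlin")
    df = pd.DataFrame({"Biomass": [4835.0, 4834.0, 4828.0]}, index=idx)

    df.reset_index(level=0, inplace=True)          # index becomes a column named 'index'
    df = df.rename(columns={"index": "DateTime"})  # give it a clearer name

    # A tz-aware column cannot be cast straight to datetime64[ns] in
    # recent pandas; dropping the timezone first makes the cast safe.
    df["DateTime"] = df["DateTime"].dt.tz_localize(None).astype("datetime64[ns]")
    print(df["DateTime"].dtype)  # datetime64[ns]

    # engine = create_engine('mysql+mysqlconnector://root:Password@localhost/generation_data', echo=True)
    # df.to_sql(con=engine, name='test', if_exists='replace')
    ```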
    
  • 2021-02-20 07:26

    Using pd.to_datetime() on each element worked. Option 4, which does not work, applies pd.to_datetime() to the entire series; it produced the correct output, but sending the DataFrame to Postgres raised the ValueError in the title. Perhaps the Postgres driver understands Python datetime objects but not pandas/NumPy datetime64.

    timesAsPyDt = (spy0030Df['dt']).apply(lambda d: pd.to_datetime(str(d)))
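
    As a self-contained illustration of the element-wise conversion (spy0030Df and its 'dt' column come from the question; the frame below is a made-up stand-in):

    ```python
    import pandas as pd

    # Made-up stand-in for spy0030Df: a 'dt' column of tz-aware timestamps.
    spy0030Df = pd.DataFrame(
        {"dt": pd.date_range("2017-01-03 09:30", periods=2, freq="30min", tz="US/Eastern")}
    )

    # str(d) renders each timestamp and pd.to_datetime() parses it back,
    # one element at a time, yielding individual pd.Timestamp objects.
    timesAsPyDt = spy0030Df["dt"].apply(lambda d: pd.to_datetime(str(d)))
    print(timesAsPyDt.iloc[0])
    ```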
    
  • 2021-02-20 07:31

    I had the same problem, and applying pd.to_datetime() on each element worked as well. However, it is orders of magnitude slower than running pd.to_datetime() on the entire series. For a dataframe with over 1 million rows:

    (df['Time']).apply(lambda d: pd.to_datetime(str(d)))
    

    takes approximately 70 seconds

    and

    pd.to_datetime(df['Time'])
    

    takes approximately 0.01 seconds

    The actual problem is that timezone information is being included. To remove it:

    t = pd.to_datetime(df['Time'])
    t = t.dt.tz_localize(None)  # .dt so the values, not the series index, are localized
    

    This should be much faster!
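
    A runnable sketch of the vectorized fix (the 'Time' column name follows the answer; the frame itself is invented here). Note the .dt accessor, which localizes the column's values rather than the series index:

    ```python
    import pandas as pd

    # Invented frame with a tz-aware 'Time' column.
    df = pd.DataFrame(
        {"Time": pd.date_range("2018-02-02", periods=4, freq="15min", tz="Europe/Berlin")}
    )

    t = pd.to_datetime(df["Time"])   # vectorized parse; cheap even at 1M+ rows
    t = t.dt.tz_localize(None)       # strip the timezone from the values
    print(t.dtype)  # datetime64[ns]
    ```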
