Question
Right, I'm going to be as clear as I can.
Here's my dataframe, called base_varlist2:
completion_date_latest completion_date_original customer_birth_date_1 \
0 07/10/2004 17/05/1996 04/02/1963
1 16/02/2004 16/02/2004 31/10/1968
2 25/03/2004 25/03/2004 18/09/1960
3 10/02/2004 10/02/2004 18/04/1972
4 03/08/2010 25/05/2004 12/09/1960
5 16/04/2004 16/04/2004 27/08/1975
6 05/04/2004 05/04/2004 02/02/1971
7 26/03/2004 26/03/2004 05/05/1959
8 29/07/2004 29/07/2004 10/10/1960
9 14/06/2004 14/06/2004 29/07/1962
10 16/09/2004 16/09/2004 03/11/1969
11 07/07/2004 07/07/2004 01/05/1972
12 10/05/2004 10/05/2004 04/12/1952
13 14/07/2004 14/07/2004 20/08/1967
14 04/08/2006 04/08/2006 22/05/1973
15 10/08/2004 10/08/2004 05/01/1964
16 23/09/2004 23/09/2004 30/07/1970
17 12/03/2010 17/09/2004 15/01/1964
18 16/10/2006 27/09/2004 06/01/1947
19 10/08/2006 10/08/2006 26/03/1964
20 03/08/2004 03/08/2004 31/10/1971
21 06/10/2004 06/10/2004 18/03/1958
22 13/12/2005 13/10/2004 23/01/1986
23 31/08/2004 31/08/2004 22/01/1959
24 08/03/2007 09/03/2007 20/09/1974
customer_birth_date_2 d_start latest_maturity_date \
0 NaN 01-Feb-18 01/03/2027
1 NaN 01-Feb-18 16/04/2021
2 NaN 01-Feb-18 16/03/2024
3 03/01/1972 01-Feb-18 16/02/2029
4 NaN 01-Feb-18 16/07/2026
5 23/05/1975 01-Feb-18 16/04/2027
6 22/11/1972 01-Feb-18 16/04/2029
7 08/10/1959 01-Feb-18 16/03/2016
8 14/09/1961 01-Feb-18 16/07/2024
9 NaN 01-Feb-18 16/07/2020
10 23/01/1966 01-Feb-18 16/02/2034
11 NaN 01-Feb-18 16/07/2029
12 06/08/1961 01-Feb-18 16/05/2018
13 NaN 01-Feb-18 16/07/2029
14 NaN 01-Feb-18 16/08/2026
15 16/09/1966 01-Feb-18 16/08/2029
16 19/07/1968 01-Feb-18 16/06/2026
17 18/08/1969 01-Feb-18 16/10/2022
18 30/07/1957 01-Feb-18 16/09/2021
19 NaN 01-Feb-18 16/08/2028
20 15/10/1964 01-Feb-18 16/08/2029
21 09/02/1959 01-Feb-18 16/10/2022
22 NaN 01-Feb-18 16/01/2037
23 NaN 01-Feb-18 16/08/2023
24 NaN 01-Feb-18 01/03/2027
latest_valuation_date sdate startdt_def
0 08/05/2004 NaN NaN
1 17/01/2004 NaN NaN
2 02/01/2004 NaN NaN
3 30/12/2003 NaN NaN
4 14/06/2010 NaN NaN
5 16/03/2004 NaN NaN
6 17/02/2004 NaN NaN
7 02/03/2004 NaN 01-Sep-16
8 19/05/2004 NaN NaN
9 10/05/2004 NaN NaN
10 01/07/2004 NaN NaN
11 05/02/2004 NaN NaN
12 07/04/2004 NaN NaN
13 22/04/2004 NaN NaN
14 26/04/2006 NaN NaN
15 05/05/2004 NaN NaN
16 21/05/2004 NaN NaN
17 18/02/2010 NaN NaN
18 25/09/2006 NaN NaN
19 26/06/2006 NaN NaN
20 07/07/2004 NaN NaN
21 07/09/2004 NaN NaN
22 29/09/2005 NaN 01-Dec-07
23 02/04/2004 NaN NaN
24 30/01/2007 NaN NaN
varlist2 is a list of all the columns in my dataframe:
In [123]: varlist2
Out[123]:
['completion_date_latest',
'completion_date_original',
'customer_birth_date_1',
'customer_birth_date_2',
'd_start',
'latest_maturity_date',
'latest_valuation_date',
'sdate',
'startdt_def']
I want to convert all dates to datetime. Of course, some of these are NaN.
Here's what I've tried:
mm_dates_base = base_varlist2.copy()
for l in range(0, len(varlist2)):
    date_var = varlist2[l]
    print('MM_Dates transform variable: ' + date_var)
    mm_dates_base[date_var] = mm_dates_base[date_var].fillna('')
    mm_dates_base[date_var] = pd.to_datetime(mm_dates_base[date_var], errors='coerce', dayfirst=True)
So I loop through each element of my list and change missing values to blank. Then I convert the dates in mm_dates_base to datetime. This works swimmingly until it gets to missing values. Nothing errors when I run the for loop, but when I run this:
print(mm_dates_base.iloc[0])
I get a ValueError:
ValueError: cannot convert float NaN to integer
This is bizarre. I even have errors='coerce'... does anyone know what might be going wrong?
Answer 1:
Change your for loop to use errors='ignore', and apply fillna('') after the conversion rather than before it:
for l in range(0, len(varlist2)):
    date_var = varlist2[l]
    print('MM_Dates transform variable: ' + date_var)
    mm_dates_base[date_var] = pd.to_datetime(mm_dates_base[date_var], errors='ignore', dayfirst=True)
    mm_dates_base[date_var] = mm_dates_base[date_var].fillna('')
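Newer pandas releases deprecate errors='ignore', so another option may be worth trying: skip the fillna('') step altogether and let pd.to_datetime turn the missing values into NaT. The snippet below is only a minimal sketch of that idea. The `sample` frame is a hypothetical three-row stand-in for a few columns of base_varlist2, and it has not been checked against the pandas version that raised the ValueError in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring a few columns of base_varlist2 (not the real data)
sample = pd.DataFrame({
    "completion_date_latest": ["07/10/2004", "16/02/2004", "25/03/2004"],
    "customer_birth_date_2": [np.nan, "03/01/1972", np.nan],
    "sdate": [np.nan, np.nan, np.nan],  # a column with no dates at all
})

converted = sample.copy()
for col in converted.columns:
    # No fillna('') here: NaN is passed straight through and becomes NaT
    converted[col] = pd.to_datetime(converted[col], errors="coerce", dayfirst=True)

print(converted.dtypes)
print(converted.iloc[0])
```

With this approach every column ends up with a datetime dtype, and missing entries can be found afterwards with isna() instead of being compared against an empty string.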
Source: https://stackoverflow.com/questions/49756649/pandas-to-datetime-not-working-for-null-values