Filling data using .fillna(), data pulled from Quandl

Submitted by て烟熏妆下的殇ゞ on 2021-02-08 03:49:22

Question


I've pulled some price data from Quandl for both crude oil (WTI) and Caterpillar (CAT). When I concatenate the two dataframes, I'm left with some NaNs. My ultimate goal is to run pearsonr() to assess the correlation (along with p-values), but I can't get pearsonr() to work because of all the NaNs, so I'm trying to clean them up. When I use the .fillna() function, it doesn't seem to work. I've also tried .interpolate() and .dropna(). None of them appear to work. Here is the code I'm working with.

import Quandl
import pandas as pd
import numpy as np


#WTI Data#
WTI_daily = Quandl.get("DOE/RWTC", collapse="daily",trim_start="1986-10-10", trim_end="1986-10-15")
WTI_daily.columns = ['WTI']

#CAT Data
CAT_daily = Quandl.get("YAHOO/CAT.6", collapse = "daily",trim_start="1986-10-10", trim_end="1986-10-15")
CAT_daily.columns = ['CAT']  

#Combine Data Frames
daily_price_df = pd.concat([CAT_daily, WTI_daily], axis=1)
print daily_price_df

#Verify they are dataFrames:
def really_a_df(var):
    if isinstance(var, pd.DataFrame):
        print "DATAFRAME SUCCESS"
    else:
        print "Wahh Wahh"
    return 'done'

print really_a_df(daily_price_df)

#Fill NAs 
#CAN'T GET THIS TO WORK!!
daily_price_df.fillna(method='pad', limit=8)
print daily_price_df

# Try to interpolate
#CAN'T GET THIS TO WORK!!
daily_price_df.interpolate()
print daily_price_df

#Drop NAs
#CAN'T GET THIS TO WORK!!
daily_price_df.dropna(axis=1)
print daily_price_df

For what it's worth I've managed to get the function working when I create a dataframe from scratch using this code:

import pandas as pd
import numpy as np

d = {'a' : 0., 'b' : 1., 'c' : 2.,'d':None,'e':6}
d_series = pd.Series(d, index=['a', 'b', 'c', 'd','e'])
d_df =  pd.DataFrame(d_series)
d_df = d_df.fillna(method='pad')

print d_df

Initially I thought that perhaps my data wasn't in dataframe form, but I used a simple test to confirm that it is in fact a dataframe. The only conclusion that remains (in my opinion) is that it is something about the structure of the Quandl dataframe, or possibly its time-series nature. Please know I'm somewhat new to Python, so structure answers for a beginner/novice. Any help is much appreciated!


Answer 1:


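A guess: have you just forgotten to assign the result or to use the inplace flag? fillna(), interpolate() and dropna() all return a new DataFrame and leave the original unchanged unless inplace=True is passed.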

daily_price_df = daily_price_df.fillna(method='pad', limit=8)
OR
daily_price_df.fillna(method='pad', limit=8, inplace=True)
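
The difference can be checked without Quandl. The sketch below is only an illustration: the prices are made up, and it uses ffill(), which is the forward-fill equivalent of fillna(method='pad') (recent pandas releases deprecate the method argument). The names nans_before, nans_after, filled_df and dropped_df are hypothetical and not from the question.

```python
import numpy as np
import pandas as pd

# Made-up prices on a short date index; the values are illustrative only.
idx = pd.date_range("1986-10-10", periods=6, freq="D")
daily_price_df = pd.DataFrame(
    {
        "CAT": [10.0, np.nan, np.nan, 10.4, 10.1, 10.6],
        "WTI": [14.9, np.nan, np.nan, 15.1, np.nan, 15.3],
    },
    index=idx,
)

# Calling the method without keeping the result changes nothing:
# the filled copy is created and immediately thrown away.
daily_price_df.ffill(limit=8)
nans_before = int(daily_price_df.isna().sum().sum())

# Assigning the returned frame is what actually applies the fill.
filled_df = daily_price_df.ffill(limit=8)
nans_after = int(filled_df.isna().sum().sum())

# Alternative: drop every row (date) that has a NaN in either column.
dropped_df = daily_price_df.dropna(axis=0)

# Pearson correlation of the two cleaned columns (coefficient only).
r = filled_df["CAT"].corr(filled_df["WTI"])

print(nans_before, nans_after, len(dropped_df))
print(r)
```

Note that dropna(axis=1), as written in the question, drops columns rather than rows, so when both columns contain NaNs it removes all the data. Once the NaNs are gone, the two columns can be passed to scipy.stats.pearsonr, which returns the correlation coefficient together with a p-value.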


Source: https://stackoverflow.com/questions/35461548/filling-data-using-fillna-data-pulled-from-quandl
