I'm trying to merge a (Pandas 14.1) dataframe and a series. The series should form a new column, with some NAs (since the index values of the series are a subset of the ind
I tried join and append, but neither of them worked. I used a 'try: ..., except: continue' around that section of my code and it worked perfectly.
My problem was that the two frames had different indices; the following code solved it.
df1.reset_index(drop=True, inplace=True)
df2.reset_index(drop=True, inplace=True)
df = pd.concat([df1, df2], axis=1)
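To illustrate why resetting the index helps, here is a minimal sketch. The frames and their values are made up for the example; the point is only that pd.concat with axis=1 aligns rows by index label, so frames with unrelated labels get padded with NaN unless the labels are discarded first.

```python
import pandas as pd

# Hypothetical frames whose index labels do not overlap at all.
df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 11, 12])
df2 = pd.DataFrame({"b": [4, 5, 6]}, index=[20, 21, 22])

# Alignment by label: no label is shared, so the result has
# one row per distinct label (6 rows), padded with NaN.
by_label = pd.concat([df1, df2], axis=1)

# Alignment by position: drop the old labels first, so both
# frames carry the default 0..n-1 index.
by_position = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)

print(by_label.shape)     # (6, 2)
print(by_position.shape)  # (3, 2)
```

Note that this only makes sense when the rows of both frames are already in the same order, since the original labels are thrown away.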
Aus_lacy's post gave me the idea of trying related methods, of which join does work:
In [196]:
hl.name = 'hl'
Out[196]:
'hl'
In [199]:
df.join(hl).head(4)
Out[199]:
high low loc_h loc_l hl
2014-01-01 17:00:00 1.376235 1.375945 1.376235 1.375945 1.376090
2014-01-01 17:01:00 1.376005 1.375775 NaN NaN NaN
2014-01-01 17:02:00 1.375795 1.375445 NaN 1.375445 1.375445
2014-01-01 17:03:00 1.375625 1.375515 NaN NaN NaN
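For anyone who wants to reproduce this, here is a small self-contained sketch of the same idea. The numbers are copied from the output above for illustration only, and only two of the columns are rebuilt; the real data is not shown in the thread.

```python
import pandas as pd

# Illustrative minute-level index, mirroring the timestamps above.
idx = pd.date_range("2014-01-01 17:00", periods=4, freq="min")

df = pd.DataFrame(
    {
        "high": [1.376235, 1.376005, 1.375795, 1.375625],
        "low": [1.375945, 1.375775, 1.375445, 1.375515],
    },
    index=idx,
)

# The series only covers a subset of the frame's index.
# It needs a name, which join uses as the new column name.
hl = pd.Series([1.376090, 1.375445], index=idx[[0, 2]], name="hl")

# Left join on the index: rows missing from hl get NaN.
joined = df.join(hl)
print(joined)
```

The rows at 17:01 and 17:03 have no matching entry in hl, so they come out as NaN in the new column.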
Some insight into why concat works on the example but not this data would be nice though!
To drop duplicate indices, use df = df.loc[df.index.drop_duplicates()]. Cf. pandas.pydata.org/pandas-docs/stable/generated/… – BallpointBen Apr 18 at 15:25
This is wrong, but I can't reply directly to BallpointBen's comment due to low reputation. The reason it's wrong is that df.index.drop_duplicates() returns the unique index labels, but when you index back into the dataframe with those labels it still returns all records. I think this is because selecting by a duplicated label returns every row that carries that label.
Instead, use df.index.duplicated(), which returns a boolean array (add the ~ to get the non-duplicated records):
df = df.loc[~df.index.duplicated()]
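A quick sketch of the difference, using a made-up three-row frame with one repeated label:

```python
import pandas as pd

# Hypothetical frame: the label "a" appears twice.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

# Selecting by the unique labels still returns every row
# that carries each label, so nothing is dropped.
unique_labels = df.index.drop_duplicates()
still_all = df.loc[unique_labels]

# A boolean mask selects by position, so only the first
# occurrence of each label survives.
mask = ~df.index.duplicated()
deduped = df.loc[mask]

print(len(still_all))  # 3
print(len(deduped))    # 2
```

By default duplicated() marks every occurrence after the first; pass keep="last" if you would rather keep the last one.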
I had a similar problem (join worked, but concat failed).
Check for duplicate index values in df1 and s1 (e.g. df1.index.is_unique).
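Removing duplicate index values (e.g., df.drop_duplicates(inplace=True), though note that this drops rows with duplicate values rather than rows with duplicate index labels) or using one of the methods here https://stackoverflow.com/a/34297689/7163376 should resolve it.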
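As a rough sketch of that check-then-fix workflow: the helper name dedupe_index is hypothetical (not part of pandas), the data is made up, and keeping the first row per label is just one possible policy.

```python
import pandas as pd

# Hypothetical inputs: df1 has a repeated index label, s1 does not.
df1 = pd.DataFrame({"v": [1, 2, 3]}, index=["a", "a", "b"])
s1 = pd.Series([10, 20], index=["a", "b"], name="s")

def dedupe_index(obj):
    """Hypothetical helper: keep the first row for each index label."""
    if obj.index.is_unique:
        return obj
    return obj[~obj.index.duplicated(keep="first")]

print(df1.index.is_unique)  # False
print(s1.index.is_unique)   # True

# With unique indices on both sides, concat can align by label.
result = pd.concat([dedupe_index(df1), dedupe_index(s1)], axis=1)
print(result)
```

Dropping rows like this throws data away, so it is worth looking at the duplicated rows first to decide whether that is acceptable.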
Try sorting the index after concatenating them:
result = pd.concat([df1, df2]).sort_index()
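A small sketch of what that does, with made-up frames. Note that without axis=1, concat stacks the rows of both inputs rather than adding a column, so this applies when the two objects hold different rows of the same table.

```python
import pandas as pd

# Hypothetical frames holding interleaved rows of one table.
df1 = pd.DataFrame({"v": [1, 3]}, index=[1, 3])
df2 = pd.DataFrame({"v": [2, 4]}, index=[2, 4])

# Rows are appended in input order (1, 3, 2, 4) ...
stacked = pd.concat([df1, df2])

# ... and sort_index puts them back in label order.
result = stacked.sort_index()
print(result)
```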