Pandas concat: ValueError: Shape of passed values is blah, indices imply blah2

梦如初夏 2020-11-27 16:41

I'm trying to merge a (Pandas 14.1) dataframe and a series. The series should form a new column, with some NAs (since the index values of the series are a subset of the ind

7 answers
  • 2020-11-27 17:13

    I tried join and append, but neither of them worked. I wrapped that section of my code in 'try: ..., except: continue', and it worked perfectly.

  • 2020-11-27 17:15

    My problem was that the two frames had different indices. The following code solved it:

    df1.reset_index(drop=True, inplace=True)
    df2.reset_index(drop=True, inplace=True)
    df = pd.concat([df1, df2], axis=1)
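
A minimal sketch of why resetting the indices changes the result, assuming two small hypothetical frames (the data and the names `aligned` and `positional` are made up for illustration). With axis=1, concat aligns rows on index labels, so frames with non-overlapping labels do not line up row by row:

```python
import pandas as pd

# Hypothetical frames: same length, but different index labels.
df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 20, 30])
df2 = pd.DataFrame({"b": [4, 5, 6]}, index=[0, 1, 2])

# concat(axis=1) aligns on index labels; no labels match here,
# so the result has one row per distinct label, padded with NaN.
aligned = pd.concat([df1, df2], axis=1)

# After resetting both indices, rows are paired by position instead.
positional = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)

print(aligned.shape, positional.shape)
```

Note that resetting the index discards the original labels, so this only makes sense when the rows are already in matching order.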
    
  • 2020-11-27 17:25

    Aus_lacy's post gave me the idea of trying related methods, of which join does work:

    In [196]:
    
    hl.name = 'hl'
    Out[196]:
    'hl'
    In [199]:
    
    df.join(hl).head(4)
    Out[199]:
                             high       low     loc_h     loc_l        hl
    2014-01-01 17:00:00  1.376235  1.375945  1.376235  1.375945  1.376090
    2014-01-01 17:01:00  1.376005  1.375775       NaN       NaN       NaN
    2014-01-01 17:02:00  1.375795  1.375445       NaN  1.375445  1.375445
    2014-01-01 17:03:00  1.375625  1.375515       NaN       NaN       NaN
    

    Some insight into why concat works on the example but not this data would be nice though!
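
The join approach can be sketched on a small made-up example. The values below are hypothetical stand-ins loosely modelled on the output above, and the sketch does not attempt to reproduce the original concat error. The series needs a name, which becomes the new column label:

```python
import pandas as pd

# Hypothetical minute-level index, similar in spirit to the data above.
idx = pd.date_range("2014-01-01 17:00", periods=4, freq="min")
df = pd.DataFrame(
    {
        "high": [1.376235, 1.376005, 1.375795, 1.375625],
        "low": [1.375945, 1.375775, 1.375445, 1.375515],
    },
    index=idx,
)

# The series covers only a subset of the frame's index values.
hl = pd.Series([1.376090, 1.375445], index=idx[[0, 2]])
hl.name = "hl"  # join uses the series name as the column name

# Left join on the index: every row of df is kept,
# and rows without a matching label in hl get NaN.
joined = df.join(hl)
print(joined)
```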

  • 2020-11-27 17:26

    To drop duplicate indices, use df = df.loc[df.index.drop_duplicates()]. C.f. pandas.pydata.org/pandas-docs/stable/generated/… – BallpointBen Apr 18 at 15:25

    This is wrong, but I can't reply directly to BallpointBen's comment due to low reputation. The reason it's wrong is that df.index.drop_duplicates() returns the unique index labels, but when you index back into the dataframe with those labels it still returns all records. I think this is because selecting by a duplicated label returns every row that carries that label.

    Instead, use df.index.duplicated(), which returns a boolean array (add the ~ to get the not-duplicated records):

    df = df.loc[~df.index.duplicated()]
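
A small sketch contrasting the two approaches, using a hypothetical frame with one repeated label (the names `all_rows` and `first_only` are made up for illustration):

```python
import pandas as pd

# Hypothetical frame in which the label "a" appears twice.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

# Selecting by the unique labels still returns every row with those labels.
all_rows = df.loc[df.index.drop_duplicates()]

# The boolean mask keeps only the first row for each label.
first_only = df.loc[~df.index.duplicated()]

print(len(all_rows), len(first_only))
```

By default duplicated() marks every occurrence after the first; passing keep="last" keeps the last occurrence instead.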
    
  • 2020-11-27 17:30

    I had a similar problem (join worked, but concat failed).

    Check for duplicate index values in df1 and s1 (e.g., df1.index.is_unique).

    Removing the duplicate index values (e.g., df = df[~df.index.duplicated()]; note that df.drop_duplicates(inplace=True) drops rows with duplicate column values, not duplicate index labels) or one of the methods here https://stackoverflow.com/a/34297689/7163376 should resolve it.
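
A rough sketch of the check, on a hypothetical frame (the names `by_values` and `deduped` are illustrative). It also shows that DataFrame.drop_duplicates compares column values, so it can leave duplicate index labels in place:

```python
import pandas as pd

# Hypothetical frame: the label "a" is repeated, but the row values differ.
df1 = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

print(df1.index.is_unique)  # False: "a" appears twice

# drop_duplicates() compares row values, so nothing is dropped here
# and the index still contains the repeated label.
by_values = df1.drop_duplicates()

# Masking on the index keeps the first row for each label.
deduped = df1[~df1.index.duplicated(keep="first")]

print(by_values.index.is_unique, deduped.index.is_unique)
```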

  • 2020-11-27 17:30

    Try sorting the index after concatenating:

    result = pd.concat([df1, df2]).sort_index()
    