pandas merge dataframe with NaN (or “unknown”) for missing values

梦如初夏 2021-02-05 01:27

I have 2 dataframes, one of which has supplemental information for some (but not all) of the rows in the other.

names = df({'names':['bob','frank','james


        
4 Answers
  •  盖世英雄少女心
    2021-02-05 02:07

    For an outer or inner join, the join function can also be used. In the case above, suppose that names is the main table (all rows from this table must appear in the result). Then, to run a left outer join, use:

    what = names.set_index('names').join(info.set_index('names'), how='left')
    

    or, to replace the resulting NaN values with "unknown":

    what = names.set_index('names').join(info.set_index('names'), how='left').fillna("unknown")
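
A minimal, self-contained sketch of the left join plus fillna step above. The sample rows and the position and classification values are assumptions made for illustration, since the question's original data is cut off; only the column names follow the answer.

```python
import pandas as pd

# Hypothetical sample data (the question's original frames are truncated).
names = pd.DataFrame({
    "names": ["bob", "frank", "james"],
    "position": ["dev", "ops", "qa"],
})
info = pd.DataFrame({
    "names": ["frank", "joe"],
    "classification": ["knight", "thief"],
})

# Left join on the shared key: every row of `names` is kept,
# and rows without a match in `info` get NaN.
joined = names.set_index("names").join(info.set_index("names"), how="left")

# Replace the NaN values and turn the key back into an ordinary column.
filled = joined.fillna("unknown").reset_index()
print(filled)
```

The reset_index() call at the end is optional; it only turns names back into a regular column. Note that fillna("unknown") fills every NaN in the frame, so a dict such as fillna({"classification": "unknown"}) may be safer when only some columns should be filled.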
    

    The set_index calls create a temporary index (the same key in both tables). If the dataframes already had such an index, this step would not be necessary. For example:

    # define the index when creating the dataframes
    names = pd.DataFrame({'names':['bob',...],'position':['dev',...]}).set_index('names')
    info = pd.DataFrame({'names':['joe',...],'classification':['thief',...]}).set_index('names')
    
    what = names.join(info, how='left')
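
If the key should stay a regular column in the main table, one possible variation (not part of the answer above, offered only as a sketch) is the on parameter of DataFrame.join, which matches a column of the calling frame against the index of the other frame. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
names = pd.DataFrame({
    "names": ["bob", "frank", "james"],
    "position": ["dev", "ops", "qa"],
})
info = pd.DataFrame({
    "names": ["frank", "joe"],
    "classification": ["knight", "thief"],
}).set_index("names")  # only the lookup table is indexed by the key

# `on="names"` joins the `names` column of the caller to the index of `info`,
# so the main table keeps its original index and column layout.
what = names.join(info, on="names", how="left")
print(what)
```

Here only the lookup table needs set_index, and the result keeps names as an ordinary column.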
    

    To perform other types of join, just change the how argument (left, right, inner, and outer are allowed). See the pandas documentation for DataFrame.join for more information.
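
To make the effect of the how argument concrete, the sketch below counts the rows produced by each join type. The data is hypothetical: three people in names, of whom only frank also appears in info, plus joe who appears only in info.

```python
import pandas as pd

# Hypothetical sample data, indexed by the join key up front.
names = pd.DataFrame({
    "names": ["bob", "frank", "james"],
    "position": ["dev", "ops", "qa"],
}).set_index("names")
info = pd.DataFrame({
    "names": ["frank", "joe"],
    "classification": ["knight", "thief"],
}).set_index("names")

# Run the same join with each value of `how` and record the row count.
row_counts = {
    how: len(names.join(info, how=how))
    for how in ("left", "right", "inner", "outer")
}
print(row_counts)  # {'left': 3, 'right': 2, 'inner': 1, 'outer': 4}
```

A left join keeps all keys of names, a right join keeps all keys of info, an inner join keeps only keys present in both, and an outer join keeps the union of the two.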
