How to merge two dataframes side-by-side?

耶瑟儿~ 2020-11-30 09:47

Is there a way to conveniently merge two data frames side by side?

Both data frames have 30 rows but a different number of columns; say, df1 has 20 columns

3 answers
  • 2020-11-30 10:17
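    • There is a way: you can do it with a Pipeline.

    ** Use a pipeline to transform your numerical data, for example:

    from sklearn.pipeline import Pipeline
    from sklearn.impute import SimpleImputer

    # DataFrameSelector is a custom column-selecting transformer, not part of scikit-learn.
    # Replace numeric_columns with the list of your numerical column names.
    num_pipeline = Pipeline([
        ("select_numeric", DataFrameSelector(numeric_columns)),
        ("imputer", SimpleImputer(strategy="median")),
    ])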
    

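    ** And for categorical data:

    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder

    # Replace categorical_columns with the list of your categorical column names.
    # scikit-learn 1.2+ uses sparse_output=False; older releases use sparse=False.
    cat_pipeline = Pipeline([
        ("select_cat", DataFrameSelector(categorical_columns)),
        ("cat_encoder", OneHotEncoder(sparse_output=False)),
    ])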
    

    ** Then use a FeatureUnion to combine these transformations:

    from sklearn.pipeline import FeatureUnion

    preprocess_pipeline = FeatureUnion(transformer_list=[
        ("num_pipeline", num_pipeline),
        ("cat_pipeline", cat_pipeline),
    ])
    
    • Read more here - https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.FeatureUnion.html
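
    The snippets in this answer rely on a DataFrameSelector class that scikit-learn does not provide. As a rough sketch only, one possible definition is shown below, together with a tiny FeatureUnion that joins two column groups side by side. The class body, the column names a, b, c and the sample values are all assumptions made for illustration, and the result is a NumPy array rather than a DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import FeatureUnion, Pipeline


class DataFrameSelector(BaseEstimator, TransformerMixin):
    """Hypothetical helper: select the named columns and return a NumPy array."""

    def __init__(self, attribute_names):
        self.attribute_names = attribute_names

    def fit(self, X, y=None):
        # Nothing to learn; the column list is fixed at construction time.
        return self

    def transform(self, X):
        return X[self.attribute_names].to_numpy()


# Made-up sample data with one missing value in column "a".
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [7.0, 8.0, 9.0],
})

# Left branch: select "a" and fill the gap with the column median.
left = Pipeline([
    ("select", DataFrameSelector(["a"])),
    ("imputer", SimpleImputer(strategy="median")),
])
# Right branch: select "b" and "c" unchanged.
right = Pipeline([("select", DataFrameSelector(["b", "c"]))])

# FeatureUnion stacks the outputs of both branches column-wise.
union = FeatureUnion(transformer_list=[("left", left), ("right", right)])
combined = union.fit_transform(df)
print(combined.shape)
```

    This route makes sense mainly when the columns also need preprocessing; for simply placing two existing frames next to each other it is heavier than necessary.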
  • 2020-11-30 10:27

    You can use the concat function for this (axis=1 concatenates along columns, placing the frames side by side):

    pd.concat([df1, df2], axis=1)
    

    See the pandas docs on merging/concatenating: http://pandas.pydata.org/pandas-docs/stable/merging.html
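
    As a quick check of the shape, here is a minimal sketch using made-up frames that mirror the question: 30 rows each, 20 columns in df1. The 5 columns in df2, the column names and the fill values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: 30 rows each; df1 has 20 columns, df2 is assumed to have 5.
df1 = pd.DataFrame(np.zeros((30, 20)), columns=[f"a{i}" for i in range(20)])
df2 = pd.DataFrame(np.ones((30, 5)), columns=[f"b{i}" for i in range(5)])

# axis=1 places df2's columns to the right of df1's columns.
combined = pd.concat([df1, df2], axis=1)
print(combined.shape)  # (30, 25)
```

    Note that concat aligns rows by index label, not by position. Both frames here share the default index 0 to 29, so the rows line up one to one.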

  • 2020-11-30 10:27

    I came across your question while I was trying to achieve something like the following:

    Once I had sliced my dataframes, I first made sure their indexes were the same. In your case, both dataframes need to be indexed from 0 to 29. Then I merged the two dataframes on the index.

    df1.reset_index(drop=True).merge(df2.reset_index(drop=True), left_index=True, right_index=True)
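
    To see why the reset matters, here is a small sketch with made-up frames whose index labels do not overlap, as can happen after slicing. The column names a and b and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frames: df1 keeps the labels 5..7 from an earlier slice, df2 starts at 0.
df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[5, 6, 7])
df2 = pd.DataFrame({"b": [10, 20, 30]}, index=[0, 1, 2])

# Without the reset, the default inner join finds no shared labels and returns no rows.
no_reset = df1.merge(df2, left_index=True, right_index=True)

# After the reset both frames are labelled 0..2, so rows pair up by position.
merged = df1.reset_index(drop=True).merge(
    df2.reset_index(drop=True), left_index=True, right_index=True
)
print(no_reset.shape, merged.shape)
```

    With drop=True the old index is discarded instead of being added as a new column.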
    