Python - the best way to create a new dataframe from two other dataframes with different shapes?

Question

Essentially, I'm trying to build a new dataframe from two others, but the situation is a little complicated and I'm not sure what the best way to do it is.

In DF1, each row holds data about one object, identified by its ID, and it looks something like this:

ID  Name  datafield1  datafield2
1   Foo   info1       info2
2   bar   info3       info4
3   Foos  info5       info6

DF2 has monthly data about each object formatted like this:

ID  Name  Month  data
1   Foo   1/20   53.6
1   Foo   2/20   47.2
1   Foo   3/20   12.7
1   Foo   4/20   3.2
2   Bar   1/20   82.2
2   Bar   2/20   65.0
2   Bar   3/20   41.7
2   Bar   4/20   28.4

What I want to do is look up each ID from DF1 in DF2, then put the monthly data from DF2 together with a couple of important columns from DF1 into a new dataframe.

This is what I have so far, but from what I've read this is a bad approach:

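IDs = df1['ID'].unique()

# Empty frame with one row per ID (DataFrame takes index=, not rows=)
df3 = pd.DataFrame(index=IDs)

for id_, group in df1.groupby('ID'):
    matches = df2[df2['ID'] == id_]  # monthly rows in DF2 for this ID
    if not matches.empty:
        pass  # not sure what to put here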

So it sounds like creating an empty dataframe is a bad approach, but I'm not sure how else to do it. How should I create this new dataframe? And which is the smarter layout: converting the monthly data into columns so there is a single row per ID, or keeping one row per month and adding a couple of columns from DF1 to each row?

Answer 1:

Check whether the lines below help you add the columns from DF1 to the new frame. I loaded the frames from Excel files; you can load yours however you like.

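import pandas as pd

# Load the two frames (here from Excel; use whatever source you have)
df1 = pd.read_excel('frame1.xlsx')
df2 = pd.read_excel('frame2.xlsx')

# Left-join the wanted DF1 columns onto every monthly row of DF2, matching on ID
df = pd.merge(df2, df1[['ID', 'datafield1', 'datafield2']], on='ID', how='left')

print(df)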
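
If you don't have the Excel files to hand, here is a rough self-contained sketch of the same idea using the sample rows from the question. It also shows the other layout the question asks about (one row per ID, one column per month). The names long_df, wide_months and wide_df are just illustrative, and Month is kept as a plain string here, which is an assumption; parsing it into real dates is left out.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Foo", "bar", "Foos"],
    "datafield1": ["info1", "info3", "info5"],
    "datafield2": ["info2", "info4", "info6"],
})
df2 = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2, 2],
    "Name": ["Foo"] * 4 + ["Bar"] * 4,
    "Month": ["1/20", "2/20", "3/20", "4/20"] * 2,
    "data": [53.6, 47.2, 12.7, 3.2, 82.2, 65.0, 41.7, 28.4],
})

# Long layout: one row per (ID, Month), DF1 columns repeated on each row
long_df = df2.merge(df1[["ID", "datafield1", "datafield2"]], on="ID", how="left")

# Wide layout: pivot the months into columns, then attach them to DF1
wide_months = df2.pivot(index="ID", columns="Month", values="data").reset_index()
wide_df = df1.merge(wide_months, on="ID", how="left")

print(long_df)
print(wide_df)
```

In the wide layout, ID 3 has no monthly rows in DF2, so its month columns come out as NaN. Which layout is better probably depends on what you do next: the long one tends to be easier for grouping and plotting, the wide one for side-by-side comparison of months.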

Source: https://stackoverflow.com/questions/61923715/python-the-best-way-to-create-a-new-dataframe-from-two-other-dataframes-with-d

Tags

python

pandas

dataframe