Question
I have a dataframe like the following:
<table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>Title</th> <th>ASIN</th> <th>State</th> <th>SellerSKU</th> <th>Quantity</th> <th>FBAStock</th> <th>QuantityToShip</th> </tr> </thead> <tbody> <tr> <th>1</th> <td>Daedal crafters- Pack of Two Gajra (Orange and...</td> <td>B075T64ZWJ</td> <td>WEST BENGAL</td> <td>DC216</td> <td>1</td> <td>0</td> <td>1</td> </tr> <tr> <th>2</th> <td>Daedal Dream Catchers - Intricate Web Design(B...</td> <td>B06XBRRYVK</td> <td>KARNATAKA</td> <td>DDC63BB</td> <td>1</td> <td>24</td> <td>0</td> </tr> <tr> <th>3</th> <td>Daedal Dream Catchers- Blue and White Four Rin...</td> <td>B07428QBJ9</td> <td>MAHARASHTRA</td> <td>12-16RT-1H8B</td> <td>1</td> <td>4</td> <td>0</td> </tr> <tr> <th>4</th> <td>Daedal dream catchers- Crescent wine DDC21</td> <td>B01DI70P9W</td> <td>UTTAR PRADESH</td> <td>70-PK4Z-6VSP</td> <td>1</td> <td>10</td> <td>0</td> </tr> </tbody></table>
The columns are:
Title ASIN State SellerSKU Quantity FBAStock QuantityToShip
I have another dataframe that contains a subset of the rows of the dataframe above. Only the "Quantity" column differs in it, and it has these columns:
ASIN State Quantity
How do I intersect or merge this smaller dataframe with the first one so that the Quantity of the smaller dataframe overwrites the original Quantity wherever the ASIN and State columns match?
If this can be done by merging, how? I'm not familiar with SQL-style merge keywords such as 'inner' and 'left'.
Purpose:
I am modifying the original DF like this:
# Count how many rows share each (State, ASIN, Quantity) combination
new = originalDF.groupby(['State', 'ASIN', 'Quantity']).size().reset_index(name='Count')
# Total quantity per group = unit quantity * number of rows in the group
new['Quantity'] = new['Quantity'] * new['Count']
new = new.drop(columns='Count')
Now I want to bring the remaining columns of originalDF into the new DF, matching on the ASIN and State columns of the new DF (the Quantity column of the new DF is what I want in the final dataframe).
Answer 1:
I believe you need transform with 'size' to get the number of rows in each group as a new per-row value, then multiply the Quantity column by it with *=:
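import pandas as pd

originalDF = pd.DataFrame({'State': list('aaabbb'),
                           'ASIN': list('cfcccc'),
                           'Quantity': [100] * 6})
# Multiply each row's Quantity by the size of its (State, ASIN, Quantity) group
originalDF['Quantity'] *= (originalDF.groupby(['State', 'ASIN', 'Quantity'])['State']
                                     .transform('size'))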
print(originalDF)
  State ASIN  Quantity
0     a    c       200
1     a    f       100
2     a    c       200
3     b    c       300
4     b    c       300
5     b    c       300
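For comparison with the groupby/size code in the question, here is a small sketch of how size() and transform('size') differ on the same sample data. The names keys, per_group and per_row are only illustrative. size() returns one row per group, which is why the question then needs a merge back onto the original rows, while transform('size') returns one value per original row, so it can be multiplied into the column directly.

```python
import pandas as pd

originalDF = pd.DataFrame({'State': list('aaabbb'),
                           'ASIN': list('cfcccc'),
                           'Quantity': [100] * 6})
keys = ['State', 'ASIN', 'Quantity']

# size(): one row per distinct (State, ASIN, Quantity) group
per_group = originalDF.groupby(keys).size().reset_index(name='Count')

# transform('size'): the group size repeated for every original row
per_row = originalDF.groupby(keys)['State'].transform('size')

print(per_group)
print(per_row.tolist())
```

The first result has three rows (one per group), the second has six values aligned with the index of originalDF.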
Detail:
print(originalDF.groupby(['State', 'ASIN', 'Quantity'])['State']
                .transform('size'))
0    2
1    1
2    2
3    3
4    3
5    3
Name: State, dtype: int64
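If you do want the merge-based route the question asks about, a minimal sketch could look like the following. The sample data and the names original, updates and Quantity_new are made up for illustration; only the column names come from the question. how='left' keeps every row of the larger dataframe, and rows without a match in the smaller one get NaN in the new column, so the original Quantity is kept for those.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
original = pd.DataFrame({
    'Title': ['t1', 't2', 't3'],
    'ASIN': ['A1', 'A2', 'A3'],
    'State': ['X', 'Y', 'Z'],
    'Quantity': [1, 1, 1],
})
updates = pd.DataFrame({
    'ASIN': ['A1', 'A3'],
    'State': ['X', 'Z'],
    'Quantity': [5, 7],
})

# Left merge on the key columns; the clashing Quantity column from
# `updates` is renamed to Quantity_new via the suffixes argument.
merged = original.merge(updates, on=['ASIN', 'State'],
                        how='left', suffixes=('', '_new'))

# Use the updated quantity where one exists, else keep the original.
merged['Quantity'] = merged['Quantity_new'].fillna(merged['Quantity']).astype(int)
merged = merged.drop(columns='Quantity_new')
print(merged)
```

This assumes each (ASIN, State) pair appears at most once in the smaller dataframe; if a pair is repeated there, the left merge will duplicate the matching rows of the larger one.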
Source: https://stackoverflow.com/questions/50680259/how-to-do-intersection-of-dataframes-in-pandas