Question
- I need to compare two columns together: "EMAIL" and "LOCATION".
- I'm using email because it's more accurate than name for this issue.
My objective is to find the total number of different locations each person worked at, use that total to select which sheet the data will be written to, and copy the original data over to the new sheet (tab).
I need the original data copied over with all the duplicate locations, which is where this problem stumps me.
Full Excel Sheet
I had to use images because the post was flagged as spam.
The Excel sheet (SAMPLE) I'm reading in as a data frame: Excel Sample Spreadsheet
Example:
TOMAPPLES@EXAMPLE.COM worked at WENDYS, FRANKS HUT, and WALMART - That sums up to 3 different locations, which I would add to a new sheet called SHEET: 3 Different Locations
SJONES22@GMAIL.COM worked at LONDONS TENT and YOUTUBE - That's 2 different locations, which I would add to a new sheet called SHEET: 2 Different Locations
MONTYJ@EXAMPLE.COM worked only at WALMART - This user would be added to SHEET: 1 Location
Outcome:
- data copied to new sheets
Sheet 2: different locations
Sheet 3: different locations
Sheet 4: different locations
Thanks for taking your time looking at my problem =)
Answer 1:
Hi, check whether the lines below work for you.
import pandas as pd
df = pd.read_excel('sample.xlsx')
# Count rows for each (Name, Location, Job) combination
df1 = df.groupby(['Name','Location','Job']).count().reset_index()
# Same idea per (Name, Location, Job, Email), with the counts in renamed columns
df2 = (
    df.groupby(['Name','Location','Job','Email'])
      .agg({'Location':'count','Email':'count'})
      .rename(columns={'Location':'Location Count','Email':'Email Count'})
      .reset_index()
)
print(df1)
print('\n\n')
print(df2)
Below is the output. Change the grouping columns to check other variations.
df1
            Name  Location      Job  Email
0          Monty   Jakarta  Manager      1
1          Monty    Mumbai  Manager      1
2  Sahara Jonesh     Paris     Cook      2
3        Tom App   Jakarta    Buser      1
4        Tom App     Paris    Buser      2
df2 all columns
            Name  Location  ...  Location Count  Email Count
0          Monty   Jakarta  ...               1            1
1          Monty    Mumbai  ...               1            1
2  Sahara Jonesh     Paris  ...               2            2
3        Tom App   Jakarta  ...               1            1
4        Tom App     Paris  ...               2            2
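For the sheet-splitting step the question describes, a minimal sketch could look like the following. It is not tested against the real workbook: the column names EMAIL and LOCATION come from the question, while the inline sample rows, the helper names split_by_location_count and write_sheets, and the exact sheet names are assumptions made for illustration. The ":" from the question's "SHEET: 3 Different Locations" is left out because Excel does not allow that character in sheet names.

```python
import pandas as pd

# Hypothetical sample mirroring the question's example (not the real sheet).
df = pd.DataFrame({
    "EMAIL": [
        "TOMAPPLES@EXAMPLE.COM", "TOMAPPLES@EXAMPLE.COM",
        "TOMAPPLES@EXAMPLE.COM", "TOMAPPLES@EXAMPLE.COM",
        "SJONES22@GMAIL.COM", "SJONES22@GMAIL.COM",
        "MONTYJ@EXAMPLE.COM",
    ],
    "LOCATION": [
        "WENDYS", "FRANKS HUT", "WALMART", "WALMART",
        "LONDONS TENT", "YOUTUBE",
        "WALMART",
    ],
})


def split_by_location_count(frame):
    """Return {sheet name: rows of users with that many distinct locations}."""
    # transform("nunique") puts each email's distinct-location count on every
    # one of that email's rows, so duplicate rows are kept in the result.
    counts = frame.groupby("EMAIL")["LOCATION"].transform("nunique")
    sheets = {}
    for n, rows in frame.groupby(counts):
        name = "1 Location" if n == 1 else f"{n} Different Locations"
        sheets[name] = rows
    return sheets


def write_sheets(sheets, path):
    """Write each group to its own tab (needs an Excel engine such as openpyxl)."""
    with pd.ExcelWriter(path) as writer:
        for name, rows in sheets.items():
            rows.to_excel(writer, sheet_name=name, index=False)


sheets = split_by_location_count(df)
for name, rows in sheets.items():
    print(name, len(rows))
```

To produce the workbook, call write_sheets(sheets, 'output.xlsx') with whatever output path you want; with the real file, replace the sample frame with the one returned by pd.read_excel.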
Source: https://stackoverflow.com/questions/62034736/pythomcompare-2-columns-and-write-data-to-excel-sheets