Question
Is there a conditional "shift" parameter in data frames?
For example,
Assume I own a used car lot and I have data as follows
SaleDate Car
12/1/2016 Wrangler
12/2/2016 Camry
12/3/2016 Wrangler
12/7/2016 Prius
12/10/2016 Prius
12/12/2016 Wrangler
I want to find two things out from this list -
1) For each sale, when was the last day that any car was sold? This is simple in pandas, just a shift as follows:
df['PriorSaleDate'] = df['SaleDate'].shift()
2) For each sale, when was the prior date that the same type of car was sold? So, for example, the Wrangler sale on 12/3 would point two rows back to 12/1 (the last time the "car" value in row 3 was equal to the "car" value in a prior row).
For the Wrangler sold on 12/12, I would want the value of 12/3
Is there a conditional shift parameter that would allow me to get the prior row where the value of df['Car'] equals the value of df['Car'] in the current row?
Thank you so much for your help
Answer 1:
You can use groupby and shift():
import io
import pandas as pd
text = """SaleDate Car
12/1/2016 Wrangler
12/2/2016 Camry
12/3/2016 Wrangler
12/7/2016 Prius
12/10/2016 Prius
12/12/2016 Wrangler"""
# sep=r"\s+" splits on whitespace; delim_whitespace=True is deprecated in recent pandas
df = pd.read_csv(io.StringIO(text), sep=r"\s+", parse_dates=[0])
df["lastSaleDate"] = df.groupby("Car").SaleDate.shift()
The output:
    SaleDate       Car lastSaleDate
0 2016-12-01  Wrangler          NaT
1 2016-12-02     Camry          NaT
2 2016-12-03  Wrangler   2016-12-01
3 2016-12-07     Prius          NaT
4 2016-12-10     Prius   2016-12-07
5 2016-12-12  Wrangler   2016-12-03
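One caveat worth spelling out: shift() works on row order, not on the dates themselves, so the result above relies on the rows already being in chronological order. The following is a minimal sketch, assuming the data might arrive unsorted; the shuffled sample rows are made up for illustration from the same six sales, and the explicit date format is my assumption about the input.

```python
import io
import pandas as pd

# Same six sales as in the question, deliberately shuffled (illustrative only)
text = """SaleDate Car
12/12/2016 Wrangler
12/2/2016 Camry
12/7/2016 Prius
12/1/2016 Wrangler
12/10/2016 Prius
12/3/2016 Wrangler"""

df = pd.read_csv(io.StringIO(text), sep=r"\s+")
# Parse month/day/year explicitly (assumed format of the input)
df["SaleDate"] = pd.to_datetime(df["SaleDate"], format="%m/%d/%Y")

# Put the rows in chronological order first, then shift within each car group
df = df.sort_values("SaleDate", kind="stable").reset_index(drop=True)
df["lastSaleDate"] = df.groupby("Car")["SaleDate"].shift()

print(df)
```

After sorting, the frame is in the same order as in the question, so lastSaleDate should match the output shown above.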
Answer 2:
I'm basically copying HYRY's answer and modifying it slightly. If you like this solution, choose HYRY's answer as your accepted answer.
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
import pandas as pd
text = """SaleDate Car
12/1/2016 Wrangler
12/2/2016 Camry
12/3/2016 Wrangler
12/7/2016 Prius
12/10/2016 Prius
12/12/2016 Wrangler"""
# sep=r"\s+" splits on whitespace; delim_whitespace=True is deprecated in recent pandas
df = pd.read_csv(StringIO(text), sep=r"\s+", parse_dates=[0])
# what you already did
df['PriorSaleDate'] = df['SaleDate'].shift()
# what HYRY did
df["CarSpecificPriorSaleDate"] = df.groupby("Car").SaleDate.shift()
The result looks like this:
    SaleDate       Car PriorSaleDate CarSpecificPriorSaleDate
0 2016-12-01  Wrangler           NaT                      NaT
1 2016-12-02     Camry    2016-12-01                      NaT
2 2016-12-03  Wrangler    2016-12-02               2016-12-01
3 2016-12-07     Prius    2016-12-03                      NaT
4 2016-12-10     Prius    2016-12-07               2016-12-07
5 2016-12-12  Wrangler    2016-12-10               2016-12-03
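To make it clearer what the grouped shift is doing, here is a rough cross-check against an explicit loop that remembers the most recent sale date seen for each car. This is only a sketch for illustration; the helper name prior_same_car is hypothetical and not part of pandas, and the loop assumes the rows are already in chronological order.

```python
import pandas as pd

df = pd.DataFrame({
    "SaleDate": pd.to_datetime([
        "2016-12-01", "2016-12-02", "2016-12-03",
        "2016-12-07", "2016-12-10", "2016-12-12",
    ]),
    "Car": ["Wrangler", "Camry", "Wrangler", "Prius", "Prius", "Wrangler"],
})

def prior_same_car(frame):
    """Hypothetical helper: for each row, the previous SaleDate with the same Car."""
    last_seen = {}  # car name -> most recent sale date seen so far
    result = []
    for date, car in zip(frame["SaleDate"], frame["Car"]):
        result.append(last_seen.get(car, pd.NaT))  # NaT if this car has not sold before
        last_seen[car] = date
    return pd.Series(result, index=frame.index)

explicit = prior_same_car(df)
shifted = df.groupby("Car")["SaleDate"].shift()

# Two entries agree if they are equal or both missing (NaT never equals NaT)
matches = ((explicit == shifted) | (explicit.isna() & shifted.isna())).all()
print(matches)
```

The loop is easier to read, but the groupby version is the one to use in practice since it stays vectorized.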
Source: https://stackoverflow.com/questions/36770814/pandas-and-python-dataframes-and-conditional-shift-function