Question
I'm new to the Featuretools library, and I'm stuck trying to create two kinds of features, both based on previous values. The first is the previous value itself for 'QUANTIDADE', 'VALOR_TOTAL' and 'DATA_NOTA'. The second is the time in days since the previous observation, using 'DATA_NOTA' as the date field.
I don't know whether this is possible with Featuretools. If someone can help me, I would appreciate it.
I have a dataframe (df) as follows:
When I normalize the above df it takes the following basic schema:
As I said, I would like to fetch the previous values and the time since the last observation for 'QUANTIDADE', 'VALOR_TOTAL' and 'DATA_NOTA', but only among rows where the combination of 'CODIGO_PRODUTO' and 'CODIGO_CLIENTE' matches.
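To make the target concrete, here is a minimal sketch of the desired output in plain pandas rather than Featuretools. The sample rows and the PREV_* and DAYS_SINCE_PREV column names are made up for illustration; only the original column names come from the dataframe described above.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "CODIGO_PRODUTO": [1, 1, 2, 1],
    "CODIGO_CLIENTE": [10, 10, 10, 10],
    "DATA_NOTA": pd.to_datetime(
        ["2019-01-01", "2019-01-11", "2019-01-05", "2019-01-31"]
    ),
    "QUANTIDADE": [5, 7, 3, 2],
    "VALOR_TOTAL": [50.0, 70.0, 30.0, 20.0],
})

keys = ["CODIGO_PRODUTO", "CODIGO_CLIENTE"]

# Order rows chronologically inside each product/client pair.
df = df.sort_values(keys + ["DATA_NOTA"]).reset_index(drop=True)

# Previous value of each column within the same product/client pair.
# The first row of every pair has no predecessor and gets NaN/NaT.
for col in ["QUANTIDADE", "VALOR_TOTAL", "DATA_NOTA"]:
    df["PREV_" + col] = df.groupby(keys)[col].shift(1)

# Days elapsed since the previous observation of the same pair.
df["DAYS_SINCE_PREV"] = (df["DATA_NOTA"] - df["PREV_DATA_NOTA"]).dt.days

print(df)
```

The first row of each product/client pair has no earlier observation, so its previous-value and days-since columns are left empty.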
Answer 1:
After some research, I discovered that what I needed could be done with groupby_trans_primitives, as follows:
import featuretools as ft
from featuretools.primitives import TimeSincePrevious

# Report the gap between consecutive observations in days
time_since_previous = TimeSincePrevious(unit="days")

# es is the EntitySet built from df
fm, features = ft.dfs(
    entityset=es,
    target_entity="recordings",
    trans_primitives=[],
    agg_primitives=[],
    max_depth=2,
    verbose=True,
    groupby_trans_primitives=["Diff", time_since_previous],
)
But then I realized that I want to exclude every entity other than recordings, and now I'm stuck trying to exclude them.
I've tried:
from featuretools.primitives import TimeSincePrevious

time_since_previous = TimeSincePrevious(unit="days")

fm, features = ft.dfs(
    entityset=es,
    target_entity="recordings",
    trans_primitives=[],
    agg_primitives=[],
    max_depth=2,
    verbose=True,
    groupby_trans_primitives=["Diff", time_since_previous],
    primitive_options={
        "time_since_previous": {
            "ignore_groupby_entities": ["vendedores", "produtos", "cliente"]
        }
    },
)
But this code still creates unintended features on the "excluded" entities.
I would appreciate any attempt to help me! Thank you!
Source: https://stackoverflow.com/questions/58701802/i-got-stuck-trying-to-fetch-the-previous-value-based-on-a-criteria