Are `normalize_entity()` and `add_relationships()` logically the same in featuretools?

Submitted by 久未见 on 2020-12-15 05:31:26

Question


Example:

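import pandas as pd
import featuretools as ft

# Using normalize_entity(): derive the items entity from the sales entity
buy_log_df = pd.DataFrame(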
    [
        ["2020-01-01", 0, 1, 2, 2, 200],
        ["2020-01-02", 1, 1, 1, 3, 100],
        ["2020-01-02", 2, 2, 1, 1, 100],
        ["2020-01-03", 3, 3, 3, 1, 300],
    ],
    columns=['date', 'sale_id', 'customer_id', "item_id", "quantity", "price"]
)

es = ft.EntitySet(id="sale_set")
es = es.entity_from_dataframe(
    "sales",
    dataframe=buy_log_df,
    index="sale_id",
    time_index='date'
)
es = es.normalize_entity(
    new_entity_id="items",
    base_entity_id="sales",
    index="item_id",
    additional_variables=["price"],
)

# Using add_relationships(): load items as its own entity and link it to sales manually
buy_log_df = pd.DataFrame(
    [
        ["2020-01-01", 0, 1, 2, 2],
        ["2020-01-02", 1, 1, 1, 3],
        ["2020-01-02", 2, 2, 1, 1],
        ["2020-01-03", 3, 3, 3, 1],
    ],
    columns=['date', 'sale_id', 'customer_id', "item_id", "quantity"]
)
item_df = pd.DataFrame(
    [
        [1, 100],
        [2, 200],
        [3, 300],
    ],
    columns=['item_id', 'price']
)

es = ft.EntitySet(id="sale_set")
es = es.entity_from_dataframe(
    "sales",
    dataframe=buy_log_df,
    index="sale_id",
    time_index='date'
)
es = es.entity_from_dataframe(
    "items",
    dataframe=item_df,
    index="item_id",
)
from featuretools import Relationship
es = es.add_relationships(
    [Relationship(es['items']['item_id'], es['sales']['item_id'])],
)

The two resulting entity sets look the same.

Is there any specific case where only normalize_entity() can be used?


Answer 1:


Thanks for the question. That's correct: the two entity sets are the same. There are no cases where only normalize_entity() can be used. The changes this method makes, such as creating the new entity and adding the relationship, can also be done manually.
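To make the "can also be done manually" point concrete, here is a rough pandas-only sketch of the table split that the manual route relies on. It assumes that normalizing on item_id with price as an additional variable amounts to moving one row per unique item_id (with its price) into a parent table and leaving item_id behind as the foreign key. It does not reproduce featuretools' actual implementation, and the names `sales`, `items` and `sales_normalized` are illustrative only.

```python
import pandas as pd

# Same sales log as in the question, with price still stored on each sale
sales = pd.DataFrame(
    [
        ["2020-01-01", 0, 1, 2, 2, 200],
        ["2020-01-02", 1, 1, 1, 3, 100],
        ["2020-01-02", 2, 2, 1, 1, 100],
        ["2020-01-03", 3, 3, 3, 1, 300],
    ],
    columns=["date", "sale_id", "customer_id", "item_id", "quantity", "price"],
)

# Parent table: one row per item_id, carrying the column moved out of sales
items = (
    sales[["item_id", "price"]]
    .drop_duplicates(subset="item_id")
    .sort_values("item_id")
    .reset_index(drop=True)
)

# Child table: item_id stays as the foreign key, price is dropped
sales_normalized = sales.drop(columns=["price"])

print(items)
print(sales_normalized)
```

The resulting `items` and `sales_normalized` tables hold the same data as the hand-written item_df and the second buy_log_df in the question, which is why loading them as separate entities and linking them on item_id gives an equivalent structure.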



Source: https://stackoverflow.com/questions/64892603/do-normalize-entity-add-relationships-are-logically-same-in-featuretool
