Matching values from one CSV file to another and replacing an entire column using pandas/Python

执念已碎 2021-01-24 07:19

Consider the following example:

I have the MovieLens dataset:

u.item.csv

ID|MOVIE NAME (YEAR)|REL.DATE|NULL|IMDB LINK|A|B|C|D|E         


        
1 answer
  • 2021-01-24 08:08

    I think you need map with a Series created by set_index:

    print (df1.set_index('ID')['MOVIE NAME (YEAR)'])
    ID
    1     Toy Story (1995)
    2     GoldenEye (1995)
    3    Four Rooms (1995)
    Name: MOVIE NAME (YEAR), dtype: object
    
    df2['movie_id'] = df2['movie_id'].map(df1.set_index('ID')['MOVIE NAME (YEAR)'])
    print (df2)
       user_id           movie_id  rating  unix_timestamp
    0        1   Toy Story (1995)       5       874965758
    1        1   GoldenEye (1995)       3       876893171
    2        1  Four Rooms (1995)       4       878542960
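
The snippets above assume df1 and df2 are already loaded. A minimal self-contained sketch of the map approach might look like the following. It assumes u.item.csv is pipe-delimited, as the header in the question suggests, trims the file to the two columns actually used, and rebuilds the ratings rows from the printed output. The name id_to_title is introduced here only for illustration.

```python
import io

import pandas as pd

# Stand-in for u.item.csv: pipe-delimited, trimmed to the two columns used here.
# With the real file, usecols=["ID", "MOVIE NAME (YEAR)"] would select them.
item_text = """ID|MOVIE NAME (YEAR)
1|Toy Story (1995)
2|GoldenEye (1995)
3|Four Rooms (1995)
"""
df1 = pd.read_csv(io.StringIO(item_text), sep="|")

# Ratings rows rebuilt from the output printed in the answer.
df2 = pd.DataFrame({
    "user_id": [1, 1, 1],
    "movie_id": [1, 2, 3],
    "rating": [5, 3, 4],
    "unix_timestamp": [874965758, 876893171, 878542960],
})

# Build the ID -> title lookup once, then map it over the ID column.
id_to_title = df1.set_index("ID")["MOVIE NAME (YEAR)"]
df2["movie_id"] = df2["movie_id"].map(id_to_title)
print(df2)
```

Building the lookup Series once and reusing it avoids repeating set_index for every call.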
    

    Or use replace:

    df2['movie_id'] = df2['movie_id'].replace(df1.set_index('ID')['MOVIE NAME (YEAR)'])
    print (df2)
       user_id           movie_id  rating  unix_timestamp
    0        1   Toy Story (1995)       5       874965758
    1        1   GoldenEye (1995)       3       876893171
    2        1  Four Rooms (1995)       4       878542960
    

    The difference is that when an ID has no match, map produces NaN, while replace keeps the original value:

    print (df2)
       user_id  movie_id  rating  unix_timestamp
    0        1         1       5       874965758
    1        1         2       3       876893171
    2        1         5       4       878542960 <- 5 has no match
    
    df2['movie_id'] = df2['movie_id'].map(df1.set_index('ID')['MOVIE NAME (YEAR)'])
    print (df2)
       user_id          movie_id  rating  unix_timestamp
    0        1  Toy Story (1995)       5       874965758
    1        1  GoldenEye (1995)       3       876893171
    2        1               NaN       4       878542960
    

    df2['movie_id'] = df2['movie_id'].replace(df1.set_index('ID')['MOVIE NAME (YEAR)'])
    print (df2)
       user_id          movie_id  rating  unix_timestamp
    0        1  Toy Story (1995)       5       874965758
    1        1  GoldenEye (1995)       3       876893171
    2        1                 5       4       878542960
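
If neither behavior is quite what you want, one option is to use map, list the unmatched IDs, and then fill the gaps explicitly. This is only a sketch: it assumes the same three-title lookup as in the answer, the names id_to_title, unmatched, and filled are illustrative, and the fallback converts the ID to a string so the column stays text.

```python
import pandas as pd

# Lookup with the three titles shown in the answer (ID -> title).
id_to_title = pd.Series(
    ["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"],
    index=pd.Index([1, 2, 3], name="ID"),
    name="MOVIE NAME (YEAR)",
)

# movie_id column in which 5 has no entry in the lookup.
movie_id = pd.Series([1, 2, 5], name="movie_id")

# map leaves NaN wherever the ID is missing from the lookup.
mapped = movie_id.map(id_to_title)

# The NaN positions identify the IDs that failed to match.
unmatched = movie_id[mapped.isna()]
print(unmatched.tolist())

# Fill the gaps with the original ID as text instead of leaving NaN.
filled = mapped.fillna(movie_id.astype(str))
print(filled.tolist())
```

Checking unmatched before filling makes missing IDs visible, whereas replace would pass them through silently.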
    