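Basic Data-Processing Operations in Pandas
The data table used in this walkthrough is shown below. Leave a comment if you need a copy of it.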
import pandas as pd
import numpy as np
# Read the file: movie ratings from different media outlets and websites
fandango = pd.read_csv('fandango_score_comparison.csv')
fandango.head()
| | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | Metacritic_norm | Metacritic_user_nom | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.70 | 4.3 | 3.30 | 3.55 | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
1 | Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | 3.35 | 3.75 | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
2 | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.00 | 4.5 | 3.20 | 4.05 | 3.90 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
3 | Do You Believe? (2015) | 18 | 84 | 22 | 4.7 | 5.4 | 5.0 | 4.5 | 0.90 | 4.2 | 1.10 | 2.35 | 2.70 | 1.0 | 4.0 | 1.0 | 2.5 | 2.5 | 31 | 3136 | 1793 | 0.5 |
4 | Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.70 | 1.4 | 1.45 | 1.70 | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |
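As a minimal sketch of what read_csv does here, the example below parses two rows and three columns copied from the table above. The in-memory buffer is an assumption that stands in for the real file, and the name `sample` is only illustrative.

```python
import io

import pandas as pd

# Two rows and three columns copied from the table above. The StringIO
# buffer stands in for 'fandango_score_comparison.csv' (assumption).
csv_text = """FILM,RottenTomatoes,IMDB
Avengers: Age of Ultron (2015),74,7.8
Cinderella (2015),85,7.1
"""

sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)   # number of (rows, columns)
print(sample.index)   # default integer index created by read_csv
```

When no index column is specified, read_csv builds a default integer index, which is the leftmost column in the table above.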
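Setting the index
A default integer index appears in the leftmost column. set_index replaces it with a column of your choice; below, FILM becomes the index.
fandango.set_index('FILM').head()  # With FILM as the index, the FILM column disappears from the table, because set_index defaults to drop=True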
FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | Metacritic_norm | Metacritic_user_nom | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.70 | 4.3 | 3.30 | 3.55 | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | 3.35 | 3.75 | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.00 | 4.5 | 3.20 | 4.05 | 3.90 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
Do You Believe? (2015) | 18 | 84 | 22 | 4.7 | 5.4 | 5.0 | 4.5 | 0.90 | 4.2 | 1.10 | 2.35 | 2.70 | 1.0 | 4.0 | 1.0 | 2.5 | 2.5 | 31 | 3136 | 1793 | 0.5 |
Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.70 | 1.4 | 1.45 | 1.70 | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |
Changing drop from its default to False keeps FILM as a regular column as well
fandango.set_index('FILM',drop=False,inplace=True)
fandango.head()
FILM | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | Metacritic_norm | Metacritic_user_nom | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Avengers: Age of Ultron (2015) | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.70 | 4.3 | 3.30 | 3.55 | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
Cinderella (2015) | Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | 3.35 | 3.75 | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
Ant-Man (2015) | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.00 | 4.5 | 3.20 | 4.05 | 3.90 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
Do You Believe? (2015) | Do You Believe? (2015) | 18 | 84 | 22 | 4.7 | 5.4 | 5.0 | 4.5 | 0.90 | 4.2 | 1.10 | 2.35 | 2.70 | 1.0 | 4.0 | 1.0 | 2.5 | 2.5 | 31 | 3136 | 1793 | 0.5 |
Hot Tub Time Machine 2 (2015) | Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.70 | 1.4 | 1.45 | 1.70 | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |
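The effect of drop can be checked on a tiny frame. This is a minimal sketch using two rows from the table above; the names `sample`, `dropped`, `kept`, and `restored` are illustrative only.

```python
import pandas as pd

# Two rows taken from the table above.
sample = pd.DataFrame({
    'FILM': ['Cinderella (2015)', 'Ant-Man (2015)'],
    'RottenTomatoes': [85, 80],
})

dropped = sample.set_index('FILM')            # drop=True is the default
kept = sample.set_index('FILM', drop=False)   # FILM is index and column

# reset_index moves the index back into a column and restores 0, 1, ...
restored = dropped.reset_index()

print(list(dropped.columns))
print(list(kept.columns))
print(list(restored.columns))
```

Without inplace=True, set_index returns a new DataFrame and leaves `sample` untouched. Calling reset_index() on `kept` would fail because a FILM column already exists; reset_index(drop=True) avoids that by discarding the index instead of reinserting it.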
Selecting multiple columns
fandango[['FILM','RottenTomatoes']].head()
FILM | FILM | RottenTomatoes |
---|---|---|
Avengers: Age of Ultron (2015) | Avengers: Age of Ultron (2015) | 74 |
Cinderella (2015) | Cinderella (2015) | 85 |
Ant-Man (2015) | Ant-Man (2015) | 80 |
Do You Believe? (2015) | Do You Believe? (2015) | 18 |
Hot Tub Time Machine 2 (2015) | Hot Tub Time Machine 2 (2015) | 14 |
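The double brackets matter: a list of column names returns a DataFrame, while a single name returns a Series. A minimal sketch, with an illustrative two-row frame built from values in the table above:

```python
import pandas as pd

sample = pd.DataFrame({
    'FILM': ['Cinderella (2015)', 'Ant-Man (2015)'],
    'RottenTomatoes': [85, 80],
    'IMDB': [7.1, 7.8],
})

one_column = sample['RottenTomatoes']        # single label: Series
as_frame = sample[['RottenTomatoes']]        # list of one label: DataFrame
two_columns = sample[['FILM', 'IMDB']]       # list of labels: DataFrame

print(type(one_column).__name__)
print(type(as_frame).__name__)
print(two_columns.shape)
```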
Selecting multiple rows by index label
fandango.loc[['Avengers: Age of Ultron (2015)','Hot Tub Time Machine 2 (2015)']]
FILM | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | Metacritic_norm | Metacritic_user_nom | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Avengers: Age of Ultron (2015) | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.7 | 4.3 | 3.30 | 3.55 | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
Hot Tub Time Machine 2 (2015) | Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.7 | 1.4 | 1.45 | 1.70 | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |
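For comparison, loc selects rows by label and iloc selects them by integer position. The sketch below assumes a three-row frame built from values in the table above; the variable names are illustrative.

```python
import pandas as pd

sample = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)',
             'Cinderella (2015)',
             'Ant-Man (2015)'],
    'RottenTomatoes': [74, 85, 80],
}).set_index('FILM')

# Rows come back in the order the labels are listed.
by_label = sample.loc[['Ant-Man (2015)', 'Avengers: Age of Ultron (2015)']]

# The same rows, addressed by position instead of label.
by_position = sample.iloc[[2, 0]]

print(by_label)
print(by_label.equals(by_position))
```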
Slicing multiple rows by label with a step (the default step is 1)
# Slice rows by label with a step of 2; the default step is 1
fandango.loc['Avengers: Age of Ultron (2015)':'Hot Tub Time Machine 2 (2015)':2]
FILM | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | Metacritic_norm | Metacritic_user_nom | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Avengers: Age of Ultron (2015) | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.7 | 4.3 | 3.30 | 3.55 | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
Ant-Man (2015) | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.0 | 4.5 | 3.20 | 4.05 | 3.90 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
Hot Tub Time Machine 2 (2015) | Hot Tub Time Machine 2 (2015) | 14 | 28 | 29 | 3.4 | 5.1 | 3.5 | 3.0 | 0.7 | 1.4 | 1.45 | 1.70 | 2.55 | 0.5 | 1.5 | 1.5 | 1.5 | 2.5 | 88 | 19560 | 1021 | 0.5 |
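One detail worth checking is that a label slice with loc includes its end label, whereas a positional slice with iloc excludes its end position. A minimal sketch on the five films shown above (variable names are illustrative):

```python
import pandas as pd

sample = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)',
             'Cinderella (2015)',
             'Ant-Man (2015)',
             'Do You Believe? (2015)',
             'Hot Tub Time Machine 2 (2015)'],
    'RottenTomatoes': [74, 85, 80, 18, 14],
}).set_index('FILM')

# Label slice: both endpoints are included, every second row is taken.
by_label = sample.loc[
    'Avengers: Age of Ultron (2015)':'Hot Tub Time Machine 2 (2015)':2
]

# Positional slice: position 4 (the last row) is excluded.
by_position = sample.iloc[0:4:2]

print(by_label['RottenTomatoes'].tolist())
print(by_position['RottenTomatoes'].tolist())
```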
Inspecting data types
# Inspect the data type of each column
fandango.dtypes
FILM object
RottenTomatoes int64
RottenTomatoes_User int64
Metacritic int64
Metacritic_User float64
IMDB float64
Fandango_Stars float64
Fandango_Ratingvalue float64
RT_norm float64
RT_user_norm float64
Metacritic_norm float64
Metacritic_user_nom float64
IMDB_norm float64
RT_norm_round float64
RT_user_norm_round float64
Metacritic_norm_round float64
Metacritic_user_norm_round float64
IMDB_norm_round float64
Metacritic_user_vote_count int64
IMDB_user_vote_count int64
Fandango_votes int64
Fandango_Difference float64
dtype: object
Listing the names of all float columns
# List the names of all float64 columns
fandango.dtypes[fandango.dtypes.values == 'float64'].index
Index(['Metacritic_User', 'IMDB', 'Fandango_Stars', 'Fandango_Ratingvalue',
'RT_norm', 'RT_user_norm', 'Metacritic_norm', 'Metacritic_user_nom',
'IMDB_norm', 'RT_norm_round', 'RT_user_norm_round',
'Metacritic_norm_round', 'Metacritic_user_norm_round',
'IMDB_norm_round', 'Fandango_Difference'],
dtype='object')
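pandas also provides select_dtypes, which filters columns by type directly. The sketch below, on an illustrative three-column frame built from values in the table above, suggests that it returns the same column names as the dtype mask.

```python
import pandas as pd

sample = pd.DataFrame({
    'RottenTomatoes': pd.Series([74, 85], dtype='int64'),
    'IMDB': [7.8, 7.1],
    'RT_user_norm': [4.3, 4.0],
})

# Boolean mask over the dtypes Series, as in the expression above.
via_mask = sample.dtypes[sample.dtypes == 'float64'].index

# The same selection with select_dtypes.
via_select = sample.select_dtypes(include='float64').columns

print(list(via_mask))
print(list(via_select))
```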
Changing the data type
# astype returns a converted copy and has no inplace parameter; assign the result back to keep it
fandango['RottenTomatoes'].astype(np.float64).head()
0 74.0
1 85.0
2 80.0
3 18.0
4 14.0
Name: RottenTomatoes, dtype: float64
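Because astype returns a new object, the original column keeps its type until the result is assigned back. A minimal sketch, with an illustrative one-column frame:

```python
import pandas as pd

sample = pd.DataFrame(
    {'RottenTomatoes': pd.Series([74, 85, 80], dtype='int64')}
)

converted = sample['RottenTomatoes'].astype('float64')
dtype_before = sample['RottenTomatoes'].dtype   # still int64

# Assigning the converted Series back makes the change stick.
sample['RottenTomatoes'] = converted
dtype_after = sample['RottenTomatoes'].dtype    # now float64

print(dtype_before, dtype_after)
```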
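Computing the standard deviation of the float columns with a custom function
# Compute the standard deviation of each float64 column with a custom function
# Anonymous functions are covered in my post on lambda and filter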
fandango[fandango.dtypes[fandango.dtypes.values == 'float64'].index].apply(lambda x:np.std(x))
Metacritic_User 1.505529
IMDB 0.955447
Fandango_Stars 0.538532
Fandango_Ratingvalue 0.501106
RT_norm 1.503265
RT_user_norm 0.997787
Metacritic_norm 0.972522
Metacritic_user_nom 0.752765
IMDB_norm 0.477723
RT_norm_round 1.509404
RT_user_norm_round 1.003559
Metacritic_norm_round 0.987561
Metacritic_user_norm_round 0.785412
IMDB_norm_round 0.501043
Fandango_Difference 0.152141
dtype: float64
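Note that np.std computes the population standard deviation (ddof=0) by default, while the pandas std method defaults to the sample standard deviation (ddof=1), so the two can disagree. A minimal sketch using the two user scores of the first film in the table above:

```python
import numpy as np
import pandas as pd

# RT_user_norm and Metacritic_user_nom for the first film in the table.
scores = pd.Series([4.3, 3.55])

population_std = np.std(scores.to_numpy())   # NumPy default: ddof=0
sample_std = scores.std()                    # pandas default: ddof=1
matched_std = scores.std(ddof=0)             # pandas, forced to ddof=0

print(population_std, sample_std, matched_std)
```

Passing ddof explicitly on either side avoids depending on these defaults.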
Converting all int64 columns to float64
# Convert every int64 column to float64 (returns a copy; fandango itself is unchanged)
fandango[fandango.dtypes[fandango.dtypes.values == 'int64'].index].astype(np.float64).head()
| | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes |
---|---|---|---|---|---|---|
0 | 74.0 | 86.0 | 66.0 | 1330.0 | 271107.0 | 14846.0 |
1 | 85.0 | 80.0 | 67.0 | 249.0 | 65709.0 | 12640.0 |
2 | 80.0 | 90.0 | 64.0 | 627.0 | 103660.0 | 12055.0 |
3 | 18.0 | 84.0 | 22.0 | 31.0 | 3136.0 | 1793.0 |
4 | 14.0 | 28.0 | 29.0 | 88.0 | 19560.0 | 1021.0 |
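To make the conversion permanent, one option is to pass astype a mapping from column name to target type and rebind the result. This is a sketch on an illustrative frame, not necessarily how the original notebook proceeded:

```python
import pandas as pd

sample = pd.DataFrame({
    'RottenTomatoes': pd.Series([74, 85], dtype='int64'),
    'Fandango_votes': pd.Series([14846, 12640], dtype='int64'),
    'IMDB': [7.8, 7.1],
})

int_cols = sample.select_dtypes(include='int64').columns

# astype accepts a {column: dtype} mapping; other columns are left alone.
sample = sample.astype({col: 'float64' for col in int_cols})

print(sample.dtypes)
```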
Computing, for each film, the standard deviation of its RT_user_norm and Metacritic_user_nom scores
# axis=1 applies the function to each row, i.e. once per film
fandango.set_index('FILM')[['Metacritic_user_nom','RT_user_norm']].apply(lambda x:np.std(x),axis = 1).head()
FILM
Avengers: Age of Ultron (2015) 0.375
Cinderella (2015) 0.125
Ant-Man (2015) 0.225
Do You Believe? (2015) 0.925
Hot Tub Time Machine 2 (2015) 0.150
dtype: float64
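The same row-wise result can be obtained without apply by calling std with axis=1 and ddof=0. The sketch below uses two films from the table above; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)'],
    'RT_user_norm': [4.3, 4.0],
    'Metacritic_user_nom': [3.55, 3.75],
}).set_index('FILM')

cols = ['Metacritic_user_nom', 'RT_user_norm']

# Row-wise apply with NumPy's population standard deviation (ddof=0).
via_apply = sample[cols].apply(lambda row: np.std(row.to_numpy()), axis=1)

# Built-in row-wise std; ddof=0 matches NumPy's default.
via_std = sample[cols].std(axis=1, ddof=0)

print(via_std)
```

The built-in method is usually the faster choice on large tables because it avoids calling a Python function once per row.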
Source: CSDN
Author: 志存高远脚踏实地
Link: https://blog.csdn.net/weixin_44451032/article/details/99473101