Basic Pandas Data Processing: Resetting the Index, Label and Slice Indexing, and Applying Custom Functions

Posted by 吃可爱长大的小学妹 on 2019-12-03 08:40:11

Basic Pandas data processing operations

The data table used in this exercise is shown below. Leave a comment if you need a copy of the file.

import pandas as pd
import numpy as np
# Read the file: film scores from different media outlets and websites
fandango = pd.read_csv('fandango_score_comparison.csv')
fandango.head()
FILM RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB Fandango_Stars Fandango_Ratingvalue RT_norm RT_user_norm Metacritic_norm Metacritic_user_nom IMDB_norm RT_norm_round RT_user_norm_round Metacritic_norm_round Metacritic_user_norm_round IMDB_norm_round Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes Fandango_Difference
0 Avengers: Age of Ultron (2015) 74 86 66 7.1 7.8 5.0 4.5 3.70 4.3 3.30 3.55 3.90 3.5 4.5 3.5 3.5 4.0 1330 271107 14846 0.5
1 Cinderella (2015) 85 80 67 7.5 7.1 5.0 4.5 4.25 4.0 3.35 3.75 3.55 4.5 4.0 3.5 4.0 3.5 249 65709 12640 0.5
2 Ant-Man (2015) 80 90 64 8.1 7.8 5.0 4.5 4.00 4.5 3.20 4.05 3.90 4.0 4.5 3.0 4.0 4.0 627 103660 12055 0.5
3 Do You Believe? (2015) 18 84 22 4.7 5.4 5.0 4.5 0.90 4.2 1.10 2.35 2.70 1.0 4.0 1.0 2.5 2.5 31 3136 1793 0.5
4 Hot Tub Time Machine 2 (2015) 14 28 29 3.4 5.1 3.5 3.0 0.70 1.4 1.45 1.70 2.55 0.5 1.5 1.5 1.5 2.5 88 19560 1021 0.5

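Resetting the index

pandas adds a default integer index in the leftmost column. set_index replaces it with a column of your choice; below, FILM becomes the index.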

# Once FILM is the index it no longer appears as a column, because set_index defaults to drop=True
fandango.set_index('FILM').head()
RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB Fandango_Stars Fandango_Ratingvalue RT_norm RT_user_norm Metacritic_norm Metacritic_user_nom IMDB_norm RT_norm_round RT_user_norm_round Metacritic_norm_round Metacritic_user_norm_round IMDB_norm_round Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes Fandango_Difference
FILM
Avengers: Age of Ultron (2015) 74 86 66 7.1 7.8 5.0 4.5 3.70 4.3 3.30 3.55 3.90 3.5 4.5 3.5 3.5 4.0 1330 271107 14846 0.5
Cinderella (2015) 85 80 67 7.5 7.1 5.0 4.5 4.25 4.0 3.35 3.75 3.55 4.5 4.0 3.5 4.0 3.5 249 65709 12640 0.5
Ant-Man (2015) 80 90 64 8.1 7.8 5.0 4.5 4.00 4.5 3.20 4.05 3.90 4.0 4.5 3.0 4.0 4.0 627 103660 12055 0.5
Do You Believe? (2015) 18 84 22 4.7 5.4 5.0 4.5 0.90 4.2 1.10 2.35 2.70 1.0 4.0 1.0 2.5 2.5 31 3136 1793 0.5
Hot Tub Time Machine 2 (2015) 14 28 29 3.4 5.1 3.5 3.0 0.70 1.4 1.45 1.70 2.55 0.5 1.5 1.5 1.5 2.5 88 19560 1021 0.5

Setting drop to False keeps FILM as a regular column as well as the index

fandango.set_index('FILM',drop=False,inplace=True)
fandango.head()
FILM RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB Fandango_Stars Fandango_Ratingvalue RT_norm RT_user_norm Metacritic_norm Metacritic_user_nom IMDB_norm RT_norm_round RT_user_norm_round Metacritic_norm_round Metacritic_user_norm_round IMDB_norm_round Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes Fandango_Difference
FILM
Avengers: Age of Ultron (2015) Avengers: Age of Ultron (2015) 74 86 66 7.1 7.8 5.0 4.5 3.70 4.3 3.30 3.55 3.90 3.5 4.5 3.5 3.5 4.0 1330 271107 14846 0.5
Cinderella (2015) Cinderella (2015) 85 80 67 7.5 7.1 5.0 4.5 4.25 4.0 3.35 3.75 3.55 4.5 4.0 3.5 4.0 3.5 249 65709 12640 0.5
Ant-Man (2015) Ant-Man (2015) 80 90 64 8.1 7.8 5.0 4.5 4.00 4.5 3.20 4.05 3.90 4.0 4.5 3.0 4.0 4.0 627 103660 12055 0.5
Do You Believe? (2015) Do You Believe? (2015) 18 84 22 4.7 5.4 5.0 4.5 0.90 4.2 1.10 2.35 2.70 1.0 4.0 1.0 2.5 2.5 31 3136 1793 0.5
Hot Tub Time Machine 2 (2015) Hot Tub Time Machine 2 (2015) 14 28 29 3.4 5.1 3.5 3.0 0.70 1.4 1.45 1.70 2.55 0.5 1.5 1.5 1.5 2.5 88 19560 1021 0.5
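The difference between the two drop settings is easier to see on a tiny table. The sketch below is only an illustration: the three-row DataFrame is a hypothetical sample copied from the output above, not the full CSV, and the variable names are my own.

```python
import pandas as pd

# Hypothetical three-row sample taken from the table above
df = pd.DataFrame({
    "FILM": ["Cinderella (2015)", "Ant-Man (2015)", "Do You Believe? (2015)"],
    "RottenTomatoes": [85, 80, 18],
})

# drop=True is the default: FILM moves out of the columns and into the index
dropped = df.set_index("FILM")
# drop=False: FILM is used as the index and also stays as a column
kept = df.set_index("FILM", drop=False)

print(list(dropped.columns))
print(list(kept.columns))

# reset_index moves the index back into a column and restores 0..n-1
restored = dropped.reset_index()

# With drop=False a FILM column already exists, so discard the index instead
restored_kept = kept.reset_index(drop=True)
print(restored.equals(restored_kept))
```

Without inplace=True, set_index returns a new DataFrame and leaves df untouched, which is why the first call in the post did not change fandango.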

Selecting multiple columns

fandango[['FILM','RottenTomatoes']].head()
FILM RottenTomatoes
FILM
Avengers: Age of Ultron (2015) Avengers: Age of Ultron (2015) 74
Cinderella (2015) Cinderella (2015) 85
Ant-Man (2015) Ant-Man (2015) 80
Do You Believe? (2015) Do You Believe? (2015) 18
Hot Tub Time Machine 2 (2015) Hot Tub Time Machine 2 (2015) 14
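The double brackets matter here. As a small sketch on a hypothetical two-column sample (not the real file), a single label returns a Series, while a list of labels returns a DataFrame, even when the list has only one entry:

```python
import pandas as pd

# Hypothetical sample with values copied from the table above
df = pd.DataFrame({
    "FILM": ["Cinderella (2015)", "Ant-Man (2015)"],
    "RottenTomatoes": [85, 80],
})

as_series = df["RottenTomatoes"]          # single label -> Series
as_frame = df[["RottenTomatoes"]]         # list with one label -> DataFrame
two_cols = df[["RottenTomatoes", "FILM"]] # columns come back in the listed order

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(list(two_cols.columns))
```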

Selecting multiple rows by label

fandango.loc[['Avengers: Age of Ultron (2015)','Hot Tub Time Machine 2 (2015)']]
FILM RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB Fandango_Stars Fandango_Ratingvalue RT_norm RT_user_norm Metacritic_norm Metacritic_user_nom IMDB_norm RT_norm_round RT_user_norm_round Metacritic_norm_round Metacritic_user_norm_round IMDB_norm_round Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes Fandango_Difference
FILM
Avengers: Age of Ultron (2015) Avengers: Age of Ultron (2015) 74 86 66 7.1 7.8 5.0 4.5 3.7 4.3 3.30 3.55 3.90 3.5 4.5 3.5 3.5 4.0 1330 271107 14846 0.5
Hot Tub Time Machine 2 (2015) Hot Tub Time Machine 2 (2015) 14 28 29 3.4 5.1 3.5 3.0 0.7 1.4 1.45 1.70 2.55 0.5 1.5 1.5 1.5 2.5 88 19560 1021 0.5
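A short sketch of the same idea on a hypothetical three-row sample: .loc with a list of labels returns the rows in the order the labels are listed, and a second list restricts the columns at the same time.

```python
import pandas as pd

# Hypothetical sample indexed by film title, values copied from the table above
df = pd.DataFrame(
    {"RottenTomatoes": [74, 85, 80], "IMDB": [7.8, 7.1, 7.8]},
    index=pd.Index(
        ["Avengers: Age of Ultron (2015)", "Cinderella (2015)", "Ant-Man (2015)"],
        name="FILM",
    ),
)

# Rows are returned in the order given in the list, not in table order
rows = df.loc[["Ant-Man (2015)", "Avengers: Age of Ultron (2015)"]]

# A second list selects columns in the same call
rows_and_cols = df.loc[["Cinderella (2015)"], ["IMDB"]]

print(rows)
print(rows_and_cols)
```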

Slicing rows by label with a step (the default step is 1)

# Slice rows by label with a step of 2; the default step is 1
fandango.loc['Avengers: Age of Ultron (2015)':'Hot Tub Time Machine 2 (2015)':2]
FILM RottenTomatoes RottenTomatoes_User Metacritic Metacritic_User IMDB Fandango_Stars Fandango_Ratingvalue RT_norm RT_user_norm Metacritic_norm Metacritic_user_nom IMDB_norm RT_norm_round RT_user_norm_round Metacritic_norm_round Metacritic_user_norm_round IMDB_norm_round Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes Fandango_Difference
FILM
Avengers: Age of Ultron (2015) Avengers: Age of Ultron (2015) 74 86 66 7.1 7.8 5.0 4.5 3.7 4.3 3.30 3.55 3.90 3.5 4.5 3.5 3.5 4.0 1330 271107 14846 0.5
Ant-Man (2015) Ant-Man (2015) 80 90 64 8.1 7.8 5.0 4.5 4.0 4.5 3.20 4.05 3.90 4.0 4.5 3.0 4.0 4.0 627 103660 12055 0.5
Hot Tub Time Machine 2 (2015) Hot Tub Time Machine 2 (2015) 14 28 29 3.4 5.1 3.5 3.0 0.7 1.4 1.45 1.70 2.55 0.5 1.5 1.5 1.5 2.5 88 19560 1021 0.5
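Note that the last film is part of the result: a label slice with .loc includes its end label, unlike a positional slice with .iloc. The sketch below shows the contrast on a hypothetical five-row sample built from the rows shown above.

```python
import pandas as pd

# Hypothetical five-row sample indexed by film title
df = pd.DataFrame(
    {"RottenTomatoes": [74, 85, 80, 18, 14]},
    index=[
        "Avengers: Age of Ultron (2015)",
        "Cinderella (2015)",
        "Ant-Man (2015)",
        "Do You Believe? (2015)",
        "Hot Tub Time Machine 2 (2015)",
    ],
)

# Label slice: both endpoints are included, every second row is kept
by_label = df.loc["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)":2]

# Positional slice: the stop position 4 is excluded, as in plain Python
by_position = df.iloc[0:4:2]

print(by_label["RottenTomatoes"].tolist())     # [74, 80, 14]
print(by_position["RottenTomatoes"].tolist())  # [74, 80]
```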

Checking the data types

# Show the dtype of every column
fandango.dtypes
FILM                           object
RottenTomatoes                  int64
RottenTomatoes_User             int64
Metacritic                      int64
Metacritic_User               float64
IMDB                          float64
Fandango_Stars                float64
Fandango_Ratingvalue          float64
RT_norm                       float64
RT_user_norm                  float64
Metacritic_norm               float64
Metacritic_user_nom           float64
IMDB_norm                     float64
RT_norm_round                 float64
RT_user_norm_round            float64
Metacritic_norm_round         float64
Metacritic_user_norm_round    float64
IMDB_norm_round               float64
Metacritic_user_vote_count      int64
IMDB_user_vote_count            int64
Fandango_votes                  int64
Fandango_Difference           float64
dtype: object

Listing the names of all float64 columns

# Filter the dtypes Series and take its index, which holds the column names
fandango.dtypes[fandango.dtypes.values == 'float64'].index
Index(['Metacritic_User', 'IMDB', 'Fandango_Stars', 'Fandango_Ratingvalue',
       'RT_norm', 'RT_user_norm', 'Metacritic_norm', 'Metacritic_user_nom',
       'IMDB_norm', 'RT_norm_round', 'RT_user_norm_round',
       'Metacritic_norm_round', 'Metacritic_user_norm_round',
       'IMDB_norm_round', 'Fandango_Difference'],
      dtype='object')
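pandas also offers select_dtypes for this kind of filtering. The sketch below, on a hypothetical four-column sample, suggests that both approaches give the same column names; which one reads better is a matter of taste.

```python
import pandas as pd

# Hypothetical sample with one text, one integer and two float columns
df = pd.DataFrame({
    "FILM": ["Cinderella (2015)", "Ant-Man (2015)"],
    "RottenTomatoes": [85, 80],
    "IMDB": [7.1, 7.8],
    "Fandango_Stars": [5.0, 5.0],
})

# Approach used in this post: boolean mask over the dtypes Series
float_cols_mask = df.dtypes[df.dtypes.values == "float64"].index

# Alternative: let pandas filter the columns by dtype
float_cols_select = df.select_dtypes(include="float64").columns

print(list(float_cols_mask))
print(list(float_cols_select))
```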

Changing a data type

# astype returns a new Series; it has no inplace parameter, so the original column is unchanged
fandango['RottenTomatoes'].astype(np.float64).head()
0    74.0
1    85.0
2    80.0
3    18.0
4    14.0
Name: RottenTomatoes, dtype: float64
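Because astype returns a new object, the conversion only sticks if the result is assigned back. A minimal sketch, assuming a hypothetical one-column sample:

```python
import numpy as np
import pandas as pd

# Hypothetical sample; dtype is set explicitly so the starting point is int64
df = pd.DataFrame({"RottenTomatoes": pd.Series([74, 85, 80], dtype="int64")})

converted = df["RottenTomatoes"].astype(np.float64)
dtype_before = df["RottenTomatoes"].dtype   # still int64: the original is untouched

# Assign the result back to make the change permanent
df["RottenTomatoes"] = converted
dtype_after = df["RottenTomatoes"].dtype

print(dtype_before, dtype_after)
```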

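Using a custom function to compute the standard deviation of each float column

# apply calls the function once per column; here it computes the standard deviation of each float64 column
# Anonymous functions are covered in my post on lambda and filter
fandango[fandango.dtypes[fandango.dtypes.values == 'float64'].index].apply(lambda x:np.std(x))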
Metacritic_User               1.505529
IMDB                          0.955447
Fandango_Stars                0.538532
Fandango_Ratingvalue          0.501106
RT_norm                       1.503265
RT_user_norm                  0.997787
Metacritic_norm               0.972522
Metacritic_user_nom           0.752765
IMDB_norm                     0.477723
RT_norm_round                 1.509404
RT_user_norm_round            1.003559
Metacritic_norm_round         0.987561
Metacritic_user_norm_round    0.785412
IMDB_norm_round               0.501043
Fandango_Difference           0.152141
dtype: float64
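One detail worth knowing: np.std computes the population standard deviation (ddof=0), while the pandas std method defaults to the sample version (ddof=1), so the two give slightly different numbers. The sketch below uses a hypothetical two-column sample and calls to_numpy() so that NumPy's default is the one being applied.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with values copied from the first five rows above
scores = pd.DataFrame({
    "IMDB": [7.8, 7.1, 7.8, 5.4, 5.1],
    "Fandango_Stars": [5.0, 5.0, 5.0, 5.0, 3.5],
})

# Custom function applied column by column: population std (divide by n)
population_std = scores.apply(lambda col: np.std(col.to_numpy()))

# Built-in pandas method: sample std (divide by n - 1) unless ddof is changed
sample_std = scores.std()
matching_std = scores.std(ddof=0)

print(population_std)
print(sample_std)
```

Passing ddof=0 to the pandas method should reproduce the np.std result without a lambda.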

Converting all int64 columns to float64

# Select every int64 column and convert it to float64 (returns a new DataFrame)
fandango[fandango.dtypes[fandango.dtypes.values == 'int64'].index].astype(np.float64).head()
RottenTomatoes RottenTomatoes_User Metacritic Metacritic_user_vote_count IMDB_user_vote_count Fandango_votes
0 74.0 86.0 66.0 1330.0 271107.0 14846.0
1 85.0 80.0 67.0 249.0 65709.0 12640.0
2 80.0 90.0 64.0 627.0 103660.0 12055.0
3 18.0 84.0 22.0 31.0 3136.0 1793.0
4 14.0 28.0 29.0 88.0 19560.0 1021.0
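The call above only returns a converted copy of the integer columns. To change them inside the table itself, one option is to pass astype a dict that maps column names to the target dtype. This is a sketch on a hypothetical sample, not necessarily what the original notebook did.

```python
import pandas as pd

# Hypothetical sample; integer columns are created as int64 explicitly
df = pd.DataFrame({
    "FILM": ["Cinderella (2015)", "Ant-Man (2015)"],
    "RottenTomatoes": pd.Series([85, 80], dtype="int64"),
    "Fandango_votes": pd.Series([12640, 12055], dtype="int64"),
    "IMDB": [7.1, 7.8],
})

# Same mask idiom as in the post, applied to int64
int_cols = df.dtypes[df.dtypes.values == "int64"].index

# A dict converts only the named columns and leaves the others alone
df = df.astype({col: "float64" for col in int_cols})

print(df.dtypes)
```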

Computing, for each film, the standard deviation of its RT_user_norm and Metacritic_user_nom scores

# axis=1 applies the function to each row, so every film gets the std of its two scores
fandango.set_index('FILM').loc[:][['Metacritic_user_nom','RT_user_norm']].apply(lambda x:np.std(x),axis = 1).head()
FILM
Avengers: Age of Ultron (2015)    0.375
Cinderella (2015)                 0.125
Ant-Man (2015)                    0.225
Do You Believe? (2015)            0.925
Hot Tub Time Machine 2 (2015)     0.150
dtype: float64
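With axis=1 the lambda receives one row at a time. The sketch below repeats the calculation on the first three films (values copied from the table at the top) and compares it with the built-in row-wise std using ddof=0, which should give the same numbers without a lambda.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the two user-score columns for the first three films
df = pd.DataFrame({
    "FILM": ["Avengers: Age of Ultron (2015)", "Cinderella (2015)", "Ant-Man (2015)"],
    "Metacritic_user_nom": [3.55, 3.75, 4.05],
    "RT_user_norm": [4.3, 4.0, 4.5],
}).set_index("FILM")

# Row-wise custom function: population std of the two scores per film
row_std = df.apply(lambda row: np.std(row.to_numpy()), axis=1)

# Built-in equivalent; ddof=0 matches np.std
row_std_builtin = df.std(axis=1, ddof=0)

print(row_std.round(3).tolist())  # [0.375, 0.125, 0.225]
```

These agree with the first three values in the output above.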