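With the font installed and the data in hand, let's build a word cloud to see what benefits recruiting HR promises to candidates.
Install jieba:
pip install jieba
Install wordcloud:
pip install wordcloud
Here is the code: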
import pandas as pd
data=pd.read_csv("data_df.csv",index_col=0)
data.info()
'''
<class 'pandas.core.frame.DataFrame'>
Index: 450 entries, Java to java高级工程师
Data columns (total 12 columns):
1 450 non-null object
2 450 non-null object
3 450 non-null object
4 450 non-null object
5 450 non-null object
6 450 non-null object
7 450 non-null object
8 450 non-null object
9 399 non-null object
10 450 non-null object
11 435 non-null object
12 415 non-null object
dtypes: object(12)
memory usage: 55.7+ KB
The printed summary shows that columns 9, 11, and 12 have missing values.
'''
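Column 11 will be the data source for the word cloud, so fill in its missing values first. The code is as follows:
from sklearn.impute import SimpleImputer
# Fill missing values with the mode (the most frequent value)
treatment = data[['11']]  # assumed definition: column 11 as a 2-D frame, which SimpleImputer expects
imputer = SimpleImputer(strategy='most_frequent')
filled = imputer.fit_transform(treatment)
filled[:20]
data.loc[:, '11'] = filled.ravel()  # flatten the (n, 1) result back into a single column
data.info()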
'''
9 399 non-null object
10 450 non-null object
11 450 non-null object
12 415 non-null object
The non-null count for column 11 is now 450, which confirms that its missing values were filled with the mode.
'''
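To make the mode-filling step concrete, here is a minimal pure-Python sketch of the idea behind `strategy='most_frequent'`. The helper name `fill_with_mode` and the sample rows are hypothetical, missing entries are represented as `None` rather than the `NaN` that pandas uses, and ties are broken by first occurrence, which may differ from what scikit-learn does.

```python
from collections import Counter


def fill_with_mode(values):
    """Replace None entries with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to learn a mode from, so return the data unchanged.
        return list(values)
    mode_value = Counter(present).most_common(1)[0][0]
    return [mode_value if v is None else v for v in values]


# Hypothetical sample of benefit strings with one missing entry.
sample = ["带薪年假|五险一金", None, "带薪年假|五险一金", "股票期权"]
filled = fill_with_mode(sample)
print(filled)
```

Filling with the mode keeps every row usable for the word cloud, but it also inflates the count of whichever benefit string is already the most common, so the final frequencies lean slightly toward that entry.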
Next, split each entry on "|" and collect the individual benefits into a list:
temp = []
for item in data['11']:
    items = item.split("|")
    for each in items:
        temp.append(each)
temp
# Printed contents of temp
["年底双薪","定期体检","绩效奖金","技能培训","股票期权","带薪年假","交通补助","健身房","股票期权","带薪年假","交通补助","健身房","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","五险一金","年度旅游","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","带薪年假","管理规范","五险一金","技术大牛","两次年度旅游","福利倍儿好","年终奖丰厚","带薪年假","计算机软件","管理规范","定期体检","技能培训","节日礼物","带薪年假","岗位晋升","年底双薪","定期体检","带薪年假","晋升透明","绩效奖金","带薪年假","定期体检","节日礼物","绩效奖金 ","五险一金 ","带薪年假 ","年度旅游 ","绩效奖金","带薪年假","定期体检","节日礼物","年底双薪","定期体检","绩效奖金","技能培训","带薪年假","计算机软件","管理规范","定期体检","六险一金","扁平化管理","丰厚年终","丰富技术交流","带薪年假","美女多","领导好","帅哥多","绩效奖金","专项奖金","五险一金","带薪年假","绩效奖金","专项奖金","五险一金","带薪年假","绩效奖金","带薪年假","定期体检","节日礼物","六险一金","扁平化管理","丰厚年终","丰富技术交流","年底双薪","节日礼物","技能培训","绩效奖金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","带薪年假","年终分红","定期体检","六险一金","扁平化管理","丰厚年终","丰富技术交流","股票期权","带薪年假","交通补助","健身房","绩效奖金","专项奖金","五险一金","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","扁平管理","领导好","五险一金","绩效奖金","年底双薪","定期体检","绩效奖金","技能培训","技能培训","年度旅游","岗位晋升","五险一金","绩效奖金","年底双薪","五险一金","带薪年假","节日礼物","技能培训","绩效奖金","岗位晋升","技能培训","年度旅游","岗位晋升","五险一金","技能培训","年度旅游","岗位晋升","五险一金","扁平管理","领导好","五险一金","绩效奖金","扁平管理","弹性工作","大厨定制三餐","就近租房补贴","年底双薪","带薪年假","定期体检","绩效奖金","节日礼物","技能培训","免费班车","带薪年假","带薪年假","计算机软件","管理规范","定期体检","绩效奖金","交通补助","定期体检","通讯津贴","年底双薪","带薪年假","房屋补贴","零食饮料","扁平管理","领导好","五险一金","绩效奖金","技能培训","年度旅游","岗位晋升","管理规范","扁平管理","弹性工作","大厨定制三餐","就近租房补贴","年底双薪","带薪年假","房屋补贴","零食饮料","绩效奖金","年底双薪","五险一金","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","扁平管理","弹性工作","大厨定制三餐","就近租房补贴","技能培训","股票期权","带薪年假","岗位晋升","股票期权","扁平管理","弹性工作","五险一金","年底双薪","专项奖金","交通补助","午餐补助","绩效奖金","年底双薪","五险一金","带薪年假","年底双薪","节日礼物","年度旅游","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","定期体检","六险一金","婚育礼金","绩效奖金","带薪年假","领导好","扁平管理","绩效奖金","年底双薪","五险一金","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","定期体检","六险一金","婚育礼金","带薪年假","定期体检","六险一金","婚育礼金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","带薪年假","领导好","扁平管理","年底双薪","带薪年假","股票期权","绩效奖金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年
假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","福利完善","国际化平台","定期体检","绩效奖金","朝十晚七","周末双休","五险一金","股票期权","绩效奖金","岗位晋升","年度旅游","股票期权","绩效奖金","岗位晋升","年度旅游","绩效奖金","领导好","帅哥多","美女多","社区o2o","原始股权","弹性工作制","互联网+","优秀同事","股票期权","年终分红","带薪年假","优秀同事","股票期权","年终分红","带薪年假","持续盈利","国际龙头企业","SaaS平台","极客氛围","股票期权","重职业规划","团队有趣高效","大数据+AI","股票期权","重职业规划","团队有趣高效","大数据+AI","年底双薪","带薪年假","弹性工作","扁平管理","年底双薪","带薪年假","年度旅游","岗位晋升","扁平管理","股权激励","五险一金","创业蓝海","技能培训","年底双薪","节日礼物","绩效奖金","合伙人","地图社交","扁平管理","岗位晋升","五险一金","技能培训","绩效奖金","带薪年假","管理规范","五险一金","优秀同事","股票期权","年终分红","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","年底双薪","五险一金","通讯津贴","交通补助","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技术大牛","长期激励","弹性考勤","绩效奖金","高额六险一金","年底双薪","绩效奖金","N项经费","弹性工作","领导好","五险一金","技能培训","节日礼物","带薪年假","岗位晋升","持续盈利","国际龙头企业","SaaS平台","极客氛围","带薪年假","年底双薪","定期体检","节日礼物","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","扁平管理","技能培训","岗位晋升","Bat团队","免费班车","成长空间","年度旅游","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","年底双薪","股票期权","专项奖金","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","午餐补助","定期体检","年底双薪","带薪年假","午餐补助","定期体检","年底双薪","五险一金","弹性工作","岗位晋升","顶尖团队","年底双薪","节日礼物","技能培训","岗位晋升","技能培训","岗位晋升","扁平管理","专项奖金","专项奖金","股票期权","岗位晋升","年度旅游","股票期权","专项奖金","扁平管理","年度旅游","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","免费班车","成长空间","年度旅游","岗位晋升","岗位晋升","顶尖团队","福利优厚","股票期权","年底双薪","节日礼物","绩效奖金","年度旅游","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","节日礼物","五险一金","大小周工作制","技能培训","带薪年假","岗位晋升","五险一金","年度旅游","岗位晋升","帅哥多","定期体检","技能培训","节日礼物","带薪年假","岗位晋升","股票期权","绩效奖金","岗位晋升","年度旅游","年底双薪","专项奖金","年终分红","定期体检","技能培训","岗位晋升","扁平管理","专项奖金","扁平管理","股权激励","五险一金","创业蓝海","节日礼物","年底双薪","带薪年假","定期体检","扁平管理","股权激励","五险一金","创业蓝海","节日礼物","年底双薪","带薪年假","定期体检","技能培训","年底双薪","节日礼物","绩效奖金","足球场","提供住宿","绩效奖金","岗位晋升","带薪年假","扁平管理","五险一金","绩效奖金","五险一金","通讯津贴","带薪年假","年底双薪","五险一金","通讯津贴","带薪年假","年底双薪","技能培训","节日礼
物","带薪年假","岗位晋升","专项奖金","技能培训","岗位晋升","五险一金","带薪年假","扁平管理","五险一金","绩效奖金","定期体检","绩效奖金","带薪年假","专项奖金","带薪年假","绩效奖金","定期体检","专业培训","技能培训","年底双薪","节日礼物","落户办理","技能培训","年底双薪","节日礼物","落户办理","技能培训","年底双薪","节日礼物","落户办理","带薪年假","年终分红","绩效奖金","交通补助","年底双薪","股票期权","带薪年假","绩效奖金","年底双薪","交通补助","技能培训","年底双薪","节日礼物","绩效奖金","岗位晋升","顶尖团队","福利优厚","股票期权","免费班车","丰厚年终奖","定期体检","节日礼金","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","年底双薪","定期体检","节日礼物","股票期权","绩效奖金","扁平管理","高速发展","技能培训","年底双薪","节日礼物","绩效奖金","节日礼物","技能培训","绩效奖金","岗位晋升","开源软件","全球运营","股票期权","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","股票期权","扁平管理","弹性工作","五险一金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","免费班车","丰厚年终奖","定期体检","节日礼金","年底双薪","节日礼物","绩效奖金","年度旅游","定期体检","年终分红","绩效奖金","技能培训","年底双薪","节日礼物","绩效奖金","绩效奖金","带薪年假","管理规范","五险一金","带薪年假","年终分红","绩效奖金","交通补助","金融科技","上市公司","六险一金","带薪年假","足球场","提供住宿","绩效奖金","岗位晋升","足球场","提供住宿","绩效奖金","岗位晋升","技术大牛","长期激励","弹性考勤","绩效奖金","股票期权","绩效奖金","扁平管理","高速发展","免费班车","成长空间","年度旅游","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","带薪年假","午餐补助","年度旅游","技能培训","节日礼物","专项奖金","带薪年假","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","年终分红","绩效奖金","交通补助","零食水果供应","带薪年假","绩效奖金","扁平管理","工程师氛围","弹性工作","扁平管理","上班不打卡","五险一金","岗位晋升","扁平管理","带薪年假","绩效奖金","年度旅游","领导好","五险一金","年底双薪","股票期权","带薪年假","绩效奖金","年底双薪","股票期权","带薪年假","绩效奖金","带薪年假","午餐补助","年度旅游","节日礼物","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","工程师氛围","弹性工作","扁平管理","上班不打卡","技能培训","节日礼物","带薪年假","岗位晋升","工程师氛围","弹性工作","扁平管理","上班不打卡","绩效奖金","定期体检","带薪年假","弹性工作","全薪病假","股票期权","年度体检","节日礼物","全薪病假","股票期权","年度体检","节日礼物","弹性工作","领导好","五险一金","年底双薪","带薪年假","股票期权","绩效奖金","年底双薪","绩效奖金","带薪年假","定期体检","五险一金","通讯津贴","带薪年假","定期体检","五险一金","岗位晋升","扁平管理","带薪年假","年底双薪","带薪年假","定期体检","绩效奖金","绩效奖金","扁平管理","美女多","领导好","绩效奖金","午餐补助","定期体检","弹性工作","年底双薪","免费班车","带薪年假","岗位晋升","零食水果供应","带薪年假","绩效奖金","扁平管理","年底双薪","绩效奖金","带薪年假","股票期权","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","带薪年假","弹性工作时间","年度旅游","岗位晋升","节日礼物","年底双薪","
绩效奖金","岗位晋升","节日礼物","年底双薪","绩效奖金","岗位晋升","专项奖金","绩效奖金","年终分红","股票期权","股票期权","定期体检","下午茶","年度旅游","节日礼物","年底双薪","股票期权","扁平管理","五险一金","弹性工作","丰盛三餐","发展空间大","绩效奖金","定期体检","交通补助","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","五险一金","年底双薪","带薪年假","弹性工作","年底双薪","带薪年假","定期体检","绩效奖金","股票期权","定期体检","下午茶","年度旅游","5星办公环境","年底奖金","团队牛B","五险一金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","年底双薪","带薪年假","全牌照券商","双A级","上市公司","优质培训体系","全牌照券商","双A级","上市公司","优质培训体系","五险一金","通讯津贴","带薪年假","定期体检","技能培训","节日礼物","带薪年假","岗位晋升","节日礼物","技能培训","带薪年假","绩效奖金","节日礼物","技能培训","绩效奖金","扁平管理","技能培训","节日礼物","带薪年假","岗位晋升","节日礼物","技能培训","年度旅游","岗位晋升","节日礼物","技能培训","绩效奖金","岗位晋升","五险一金","弹性工作","丰盛三餐","发展空间大","股票期权","扁平管理","弹性工作","五险一金","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","弹性工作","领导好","五险一金","节日礼物","年底双薪","股票期权","扁平管理","带薪年假","定期体检","弹性工作","年度旅游","技能培训","节日礼物","带薪年假","岗位晋升","年底双薪","专项奖金","定期体检","带薪年假","技能培训","节日礼物","免费班车","股票期权","带薪年假","年底三薪","定期体检","五险一金","节日礼物","年底双薪","绩效奖金","岗位晋升","技能培训","节日礼物","带薪年假","岗位晋升","绩效奖金","美女多","领导好","弹性工作","技能培训"]
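Some entries in the printed list carry a trailing space (for example "绩效奖金 "), so they would be counted separately from the same term without the space. The sketch below shows one way to normalize the terms while splitting. The helper name `split_benefits` and the sample rows are hypothetical, and whether the original data needs this cleanup beyond those few entries is an assumption.

```python
def split_benefits(rows, sep="|"):
    """Split each row on sep, strip surrounding whitespace, drop empty terms."""
    tags = []
    for row in rows:
        for tag in row.split(sep):
            tag = tag.strip()
            if tag:
                tags.append(tag)
    return tags


# Hypothetical rows; the first one mimics the trailing spaces seen in the output.
rows = ["绩效奖金 |五险一金 ", "带薪年假|年度旅游"]
tags = split_benefits(rows)
print(tags)
```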
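Count how often each term occurs, sort by frequency, and take the top 20 as the vocabulary for the word cloud:
counts = {}  # dictionary mapping each benefit to its number of occurrences
for word in temp:
    counts[word] = counts.get(word, 0) + 1
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # sort in descending order of count
wordlist = list()
for i in range(20):
    word, count = items[i]
    print("{0:<10}{1:<5}".format(word, count))  # print the top 20 terms with their counts
    wordlist.append(word)  # collect the term into a list
wl = ' '.join(wordlist)
wl
# Top 20 terms by frequency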
'''
带薪年假 255
岗位晋升 174
节日礼物 171
技能培训 166
绩效奖金 138
年底双薪 97
五险一金 79
定期体检 74
股票期权 66
扁平管理 59
年度旅游 46
弹性工作 43
专项奖金 36
领导好 21
交通补助 18
年终分红 18
免费班车 17
通讯津贴 16
管理规范 14
午餐补助 12
'带薪年假 岗位晋升 节日礼物 技能培训 绩效奖金 年底双薪 五险一金 定期体检 股票期权 扁平管理 年度旅游 弹性工作 专项奖金 领导好 交通补助 年终分红 免费班车 通讯津贴 管理规范 午餐补助'
'''
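The same counting and ranking can be written more compactly with `collections.Counter` from the standard library. This is an alternative sketch, not the code used above; the helper name `top_words` and the sample list are hypothetical.

```python
from collections import Counter


def top_words(words, n=20):
    """Return the n most frequent words as (word, count) pairs, highest first."""
    return Counter(words).most_common(n)


# Hypothetical sample standing in for the full temp list.
sample = ["带薪年假", "岗位晋升", "带薪年假", "节日礼物", "带薪年假", "岗位晋升"]
top = top_words(sample, n=2)
for word, count in top:
    print("{0:<10}{1:<5}".format(word, count))
```

One practical difference: `most_common(n)` simply returns fewer pairs when there are fewer than `n` distinct words, whereas indexing `items[i]` inside `range(20)` raises an `IndexError` in that case.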
Generate the word cloud:
from wordcloud import WordCloud
import matplotlib.pyplot as plt
wc = WordCloud(
    background_color="black",  # background color
    max_words=100,  # maximum number of words
    font_path=r'/usr/share/fonts/simsun.ttc',  # SimSun font file (simsun.ttc); must be installed beforehand
    height=2400,  # canvas height in pixels
    width=1600,  # canvas width in pixels
    max_font_size=1000,  # maximum font size
    random_state=1000,  # random seed, so the layout and colors are reproducible
)
myword = wc.generate(wl)
plt.imshow(myword)
plt.axis("off")
plt.show()
wc.to_file('1.jpg')
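The figure shown above is too small. Copy the image saved on the server (1.jpg) to your local machine; that version is large enough to read.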
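Because the string `wl` contains each of the 20 terms exactly once, the word sizes in the cloud do not reflect the counts computed earlier. If frequency-scaled sizes are wanted, one option is to pass the counts directly through `WordCloud.generate_from_frequencies`. The following is a sketch under assumptions: the helper names `to_frequencies` and `render_cloud` and the output file name `benefits.jpg` are hypothetical, and only the first three counts from the table above are used as sample data.

```python
def to_frequencies(pairs):
    """Turn (word, count) pairs into the dict that WordCloud expects."""
    return {word: count for word, count in pairs}


def render_cloud(frequencies, font_path, out_path):
    """Draw a word cloud whose word sizes follow the given counts."""
    # Imported inside the function so to_frequencies works without wordcloud.
    from wordcloud import WordCloud

    wc = WordCloud(
        font_path=font_path,       # a font that covers Chinese glyphs
        background_color="black",
        width=1600,
        height=2400,
        max_words=100,
    )
    wc.generate_from_frequencies(frequencies)
    wc.to_file(out_path)
    return wc


# The three most frequent terms and their counts from the table above.
top = [("带薪年假", 255), ("岗位晋升", 174), ("节日礼物", 171)]
freqs = to_frequencies(top)
# Uncomment to render; requires wordcloud and the font file to be installed.
# render_cloud(freqs, r'/usr/share/fonts/simsun.ttc', 'benefits.jpg')
```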
Source: CSDN
Author: li_yan_spring
Link: https://blog.csdn.net/li_yan_sring/article/details/104080043