pd

LINQ: inserting, deleting, and updating data

两盒软妹~` submitted on 2019-12-21 13:04:05
Insert:
    ProductDataContext pa = new ProductDataContext();
    PandC pd = new PandC();
    pd.CNAME = "aaaa";
    pd.PID = 100;
    pd.PNAME = "aaaa";
    pa.PandC.InsertOnSubmit(pd);
    pa.SubmitChanges();
---------- or
    ProductDataContext pa = new ProductDataContext();
    PandC pd = new PandC { CNAME = "bbb", PID = 100, PNAME = "abbbaaa" };
    pa.PandC.InsertOnSubmit(pd);
    pa.SubmitChanges();
PandC is both the table name and the class name: instantiate the class, set its fields, then insert the object.
Delete:
    ProductDataContext pa = new ProductDataContext();
    var deleted = from pc in pa.PandC where pc.CNAME == "bbb" select pc;
    foreach (var detail in deleted)
    {
        pa.PandC.DeleteOnSubmit(detail);
    }
    pa.SubmitChanges();

Time Series / Date functionality

萝らか妹 submitted on 2019-12-17 15:57:48
For details, see: http://pandas.pydata.org/pandas-docs/stable/timeseries.html
Below are some code snippets that may come in handy:
Code 1
    import pandas as pd
    df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5], 'hour': [2, 3]})
    print(pd.to_datetime(df))
    0   2015-02-04 02:00:00
    1   2016-03-05 03:00:00
    dtype: datetime64[ns]
    print(pd.to_datetime(df[['year', 'month', 'day']]))
    0   2015-02-04
    1   2016-03-05
    dtype: datetime64[ns]
Code 2
    stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='D')
    print(stamps)
    DatetimeIndex(['2012-10-08 18:15:05', '2012-10-09 18:15:05', '2012-10-10 18:15:05', '2012-10-11 18:15:05'], dtype='datetime64[ns]', freq='D')
Code 3

Get count of occurrences inside two columns inside a csv [duplicate]

我与影子孤独终老i submitted on 2019-12-13 02:39:43
Question
This question already has answers here: DataFrame: add column with the size of a group (2 answers). Closed last year.
Hello, I have the following set of data in a CSV file:
    Group     Size    Some_other_column1  Some_other_column2
    Short     Small   blabla1             blabla6
    Moderate  Medium  babla3              blabla8
    Short     Small   blabla2             blabla7
    Moderate  Small   blabla4             blabla9
    Tall      Large   blabla5             blabla10
    Short     Medium  blabla11            blabla12
I would like to get the following result using Python code:
    Group Size Count Some_other_column1 Some
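
A minimal sketch of one common way to get such a count, assuming the rows shown in the question and using groupby with transform("size"). The DataFrame is built inline here instead of being read from the CSV, and the Count column is placed after Size only because the truncated target header suggests that order.

```python
import pandas as pd

# Rows copied from the question; in practice they would come from pd.read_csv(...)
df = pd.DataFrame({
    "Group": ["Short", "Moderate", "Short", "Moderate", "Tall", "Short"],
    "Size": ["Small", "Medium", "Small", "Small", "Large", "Medium"],
    "Some_other_column1": ["blabla1", "babla3", "blabla2", "blabla4", "blabla5", "blabla11"],
    "Some_other_column2": ["blabla6", "blabla8", "blabla7", "blabla9", "blabla10", "blabla12"],
})

# transform("size") returns one value per original row:
# the number of rows that share that row's (Group, Size) pair
counts = df.groupby(["Group", "Size"])["Group"].transform("size")

# Insert the counts as the third column, right after Size
df.insert(2, "Count", counts)
print(df)
```

Unlike a plain groupby(...).size(), transform keeps the original row count, so the other columns stay attached to each row.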

Merging multiple worksheets

南笙酒味 submitted on 2019-12-10 09:17:29
I had to join four years of Excel data files, so I wrote a short script to save time. It is only a small thing I ran into at work, but I want to record it so that the same task can be handled quickly next time. Take a look if you need it.
    import xlrd
    import pandas as pd
    import time

    filename = ['本地户存活周期研究(2015-08-01至2016-07-31)', '本地户存活周期研究(2016-08-01至2017-07-31)', '本地户存活周期研究(2017-08-01至2018-07-31)', '本地户存活周期研究(2018-08-01至2019-07-31)']
    desktop_root_direct = "C:/Users/liuqiping/Desktop/"

    def read_data():
        # Read the Excel data
        data_list = []
        for file in filename:
            workbook = xlrd.open_workbook(desktop_root_direct+file+".xlsx")
            worksheet = workbook.sheet_by_index(0)
            nrows = worksheet.nrows
            ncols = worksheet.ncols
            print(nrows, ncols)
            dicts =
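
The same stacking step can be sketched with pandas alone. This is not the author's code: the two small frames, their column names (account, months_active), and the period labels are hypothetical stand-ins for the yearly workbooks. With real files, each frame would come from pd.read_excel(path, sheet_name=0); note that xlrd 2.0 and later read only .xls files, so .xlsx input needs another engine such as openpyxl.

```python
import pandas as pd

# Hypothetical stand-ins for the first sheet of each yearly workbook
yearly_frames = {
    "2015-08-01 to 2016-07-31": pd.DataFrame({"account": ["a1", "a2"], "months_active": [3, 12]}),
    "2016-08-01 to 2017-07-31": pd.DataFrame({"account": ["a3"], "months_active": [7]}),
}

frames = []
for period, frame in yearly_frames.items():
    # Tag every row with the period it came from before stacking
    frames.append(frame.assign(period=period))

# Stack the sheets vertically and renumber the rows from 0
merged = pd.concat(frames, ignore_index=True)
print(merged)
```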

python-pandas module (detailed and categorized), pd.concat (to be added later)

a 夏天 submitted on 2019-12-10 05:06:05
I. The pandas module
    import pandas as pd    (pd is the conventional alias)
1. Official documentation
    https://pandas.pydata.org/pandas-docs/stable/?v=20190307135750
2. Turning one-dimensional data into a table
  1. pd.Series
    import numpy as np
    import pandas as pd
    arr = np.array([1, 2, 3, 4, np.nan, ])
    s = pd.Series(arr)
    print(s)
    # Converting to np.array first is optional; the author recommends it on the grounds that it reduces memory use
    # arr = np.array([1, 2, 3, 4, np.nan, ])
    s = pd.Series([1, 2, 3, 4, np.nan, ])
    print(s)
3. Turning two-dimensional data into a table
  1. pd.DataFrame
    df = pd.DataFrame(data, index=row_labels, columns=column_labels)
    # data is typically a list or an np.array; the author prefers np.array to reduce memory use
    # By convention the resulting table is named df
    # Reading values from df
  2. pd.DataFrame parameter table
    Attribute  Details
    dtype
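
A small runnable sketch of the two constructors described above. The row and column labels (row_a, x, and so on) are made up for illustration.

```python
import numpy as np
import pandas as pd

# One-dimensional data: a Series; the NaN forces a floating-point dtype
s = pd.Series(np.array([1, 2, 3, 4, np.nan]))
print(s.dtype)

# Two-dimensional data: a DataFrame with explicit row labels (index)
# and column labels (columns)
df = pd.DataFrame(
    np.arange(6).reshape(2, 3),      # [[0, 1, 2], [3, 4, 5]]
    index=["row_a", "row_b"],
    columns=["x", "y", "z"],
)

# Label-based lookup: row "row_b", column "y"
value = df.loc["row_b", "y"]
print(value)
```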

Attendance data cleaning

杀马特。学长 韩版系。学妹 submitted on 2019-12-06 16:37:15
    import pandas as pd
    import numpy as np
    data=pd.read_excel("C:/Users/mgxx/Desktop/工作簿1.xlsx")  # fill in the file path
    data.dropna(axis = 0)  # note: the result is not assigned, so data itself is unchanged
    # Fill the whole row with the name
    for i in data.index:
        if i % 2==0:
            data.iloc[i:i+2:2]=data[11].at[i]
    # Build an empty list
    list1=[]
    for i in range(0,len(data[11]),2):
        for j in range(1,32):
            list1.append(j)
    # Column names: "日期" = date, "姓名" = name, "时间" = time
    aa = pd.DataFrame((x for x in list1),columns=["日期"])
    aa["姓名"]=pd.DataFrame((str(x) for x in list1))
    aa["时间"]=pd.DataFrame((str(x) for x in list1))
    # Extract the names in order into a list
    name=[]
    for i in range(len(data[11])):
        for j in range(1,32):
            if i % 2 ==0:
                name.append(str(data[j].at[i]))
    # Extract the clock-in times in order into a list
    time=[]
    # Assemble the extracted data into a table
    for i in range(len(data[11

Pandas Exercises for Data Analysis (Continuously updated)

最后都变了- submitted on 2019-12-05 16:42:01
# 1. How to import pandas and check the version?
    import pandas as pd
    print(pd.__version__)
    pd.show_versions(as_json=True)  # prints the report itself and returns None
    0.23.4
    {'system': {'commit': None, 'python': '3.7.0.final.0', 'python-bits': 64, 'OS': 'Windows', 'OS-release': '10', 'machine': 'AMD64', 'processor': 'Intel64 Family 6 Model 142 Stepping 10, GenuineIntel', 'byteorder': 'little', 'LC_ALL': 'None', 'LANG': 'None', 'LOCALE': 'None.None'}, 'dependencies': {'pandas': '0.23.4', 'pytest': '3.8.0', 'pip': '19.2.1', 'setuptools': '40.2.0', 'Cython': '0.28.5', 'numpy': '1.17.2', 'scipy': '1.1.0', 'pyarrow': None,

pandas: loading files

三世轮回 submitted on 2019-12-05 11:02:52
Reading CSV
Using read_csv:
    sms = pd.read_csv('./data/SMSSpamCollection', sep='\t', header=None)
    sep: the delimiter
    header=None: the file has no header row
Reading txt:
    pd.read_csv('./type-.txt', sep='-', header=None)
Using read_table:
    pd.read_table('./data/SMSSpamCollection', header=None)
Reading an Excel sheet:
    pd.read_excel('./read_xlsx.xlsx', sheet_name=2)
Reading a SQLite file:
    import sqlite3
    conn = sqlite3.connect('./data.sqlite')
    weather_2017 = pd.read_sql('select * from weather_2017 limit 30', conn, index_col='index')
    - index_col sets the row index
Writing files:
    weather_2017.to_csv('./weather_2017.csv')
    weather_2017.to_json('./weather_2017.json')
    weather_2017.to_html('./weather_2017.html')
    weather
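
The calls above depend on files that are not part of this excerpt, so here is a self-contained sketch that exercises the same parameters against in-memory data. The two SMS rows, the temp column, and the in-memory SQLite database are assumptions made up for the example.

```python
import io
import sqlite3

import pandas as pd

# Tab-separated text without a header row, as with SMSSpamCollection
raw = "ham\tsee you at five\nspam\tyou won a prize\n"
sms = pd.read_csv(io.StringIO(raw), sep="\t", header=None)
print(sms.shape)  # with header=None the columns are numbered 0, 1

# A throwaway in-memory database standing in for ./data.sqlite
conn = sqlite3.connect(":memory:")
pd.DataFrame({"temp": [1.5, 2.5]}).to_sql(
    "weather_2017", conn, index=True, index_label="index"
)

# index_col turns the stored "index" column back into the row index
weather_2017 = pd.read_sql(
    "select * from weather_2017 limit 30", conn, index_col="index"
)
conn.close()
print(weather_2017)
```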

Series vs. list

时光总嘲笑我的痴心妄想 submitted on 2019-12-05 03:11:38
1. Indexing
1.1 Index order
A list is indexed from 0 to n-1, and those indices cannot be changed.
A Series is indexed from 0 to n-1 when no index is given. When an index is given, that index is used, and it can be changed later.
    import pandas as pd
    series_1 = pd.Series([1, 2, 3])
    series_2 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
    list_1 = list([1, 2, 3])
    print(series_1)
    print(series_2)
    print(list_1)
1.2 Looking up values by index
1.2.1 When the index exists
    import pandas as pd
    series_1 = pd.Series([1, 2, 3])
    series_2 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
    list_1 = list([1, 2, 3])
    result_1 = series_1[1]
    result_2 = series_2['b']
    result_3 = list_1[1]
    print('result_1', result_1)
    print('result_2', result_2)
    print('result_3', result_3)
1.2.2 When the index does not exist
Both raise an error.
2. Arithmetic operations
2.1
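
A short sketch of the missing-index case from section 1.2.2 and of relabeling a Series index from section 1.1. The labels "z" and "x", "y", "z" are arbitrary choices for the example.

```python
import pandas as pd

series_2 = pd.Series([1, 2, 3], index=["a", "b", "c"])
list_1 = [1, 2, 3]

errors = []

# A label that is not in the Series index raises KeyError
try:
    series_2["z"]
except KeyError as exc:
    errors.append(type(exc).__name__)

# A position past the end of a list raises IndexError
try:
    list_1[5]
except IndexError as exc:
    errors.append(type(exc).__name__)

print(errors)

# Unlike a list, a Series can be given new index labels after creation
series_2.index = ["x", "y", "z"]
relabeled_value = series_2["z"]
print(relabeled_value)
```

Both lookups fail, but with different exception types, which matters when the error is caught explicitly.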

Quickly extracting standard HTML tables from a web page with Python

半世苍凉 submitted on 2019-12-04 20:24:57
    from html_table_parser import HTMLTableParser

    def tableParse(value):
        p = HTMLTableParser()
        p.feed(value)
        print(p.tables)

    from io import StringIO
    import pandas as pd
    from bs4 import BeautifulSoup

    def framParse(value):
        soup = BeautifulSoup(value, 'html.parser')
        tables = soup.select('table')
        print(tables)
        df_list = []
        for table in tables:
            # Wrap the markup in StringIO: newer pandas versions deprecate passing literal HTML strings
            frames = pd.read_html(StringIO(table.prettify()))
            print(frames)
            df_list.append(pd.concat(frames))
        df = pd.concat(df_list)
        df.to_excel('vscode快捷键大全.xlsx')

Either of the two approaches above can parse standard tables.
Source: https://www.cnblogs.com/jestin/p/11881557.html
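
For comparison, a dependency-light sketch of the same idea that swaps in the standard library's html.parser for html_table_parser and pd.read_html. The SimpleTableParser class and the sample markup are hypothetical, and the parser only handles simple, non-nested tables.

```python
from html.parser import HTMLParser

import pandas as pd


class SimpleTableParser(HTMLParser):
    """Collects the cell text of <tr>/<td>/<th> elements, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample markup standing in for a downloaded page
html = (
    "<table><tr><th>key</th><th>action</th></tr>"
    "<tr><td>Ctrl+P</td><td>open file</td></tr></table>"
)
parser = SimpleTableParser()
parser.feed(html)

# Treat the first row as the header and the rest as data
header, *body = parser.rows
df = pd.DataFrame(body, columns=header)
print(df)
```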