I. Querying by condition with df.loc[]
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 2434, in main
rev = ExpenseApiDown(cfg_path, log=logging)
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 793, in __init__
self.done = self.split() # the merge needs a hierarchy check; generates 层级.json (the hierarchy file)
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 1288, in split
done = self.boss_split()
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 1617, in boss_split
self.get_two_rounds_for_data()
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 1234, in get_two_rounds_for_data
df = self.calculate_two_rounds_of_data(input_sheet_name,columns_dict)
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 1201, in calculate_two_rounds_of_data
df1 = self.two_rounds_util_for_president_df(df1,df2,sale_rate_df,screen_column,screen_column_value,columns_dict)
File "/Users/jiangmengliang/PycharmProjects/jiang.ml/budget/cost_budget/expense_down_api.py", line 1058, in two_rounds_util_for_president_df
df1.loc[(df1["brand_no"] == brand_no) & (df1["fy_year"] == self.fin_year), column] = result
File "/Users/jiangmengliang/PycharmProjects/workspacespython3/venv/lib/python3.7/site-packages/pandas/core/ops/__init__.py", line 1229, in wrapper
res = na_op(values, other)
File "/Users/jiangmengliang/PycharmProjects/workspacespython3/venv/lib/python3.7/site-packages/pandas/core/ops/__init__.py", line 1115, in na_op
result = method(y)
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
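This warning has two possible causes:
1. Several conditions are joined with & but not every condition is wrapped in (). Because & binds more tightly than ==, the unparenthesized comparison is evaluated in the wrong order.
Incorrect example: df1.loc[(df1["brand_no"] == brand_no) & df1["fy_year"] == self.fin_year, column]
2. The fy_year column was converted to a numeric type by default when the df was built, while self.fin_year is a string. A numeric column cannot be compared with a string element by element, so the comparison falls back to a single scalar result and the warning is raised. Converting one side so that both have the same type, for example int(self.fin_year), avoids it.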
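Both fixes can be sketched together as follows. This is a minimal illustration, not the original project's code: the sample data, the amount column and the local fin_year variable (standing in for self.fin_year) are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; fy_year is stored as integers, as described above.
df1 = pd.DataFrame({
    "brand_no": ["A", "A", "B"],
    "fy_year": [2020, 2021, 2020],
    "amount": [0.0, 0.0, 0.0],
})

brand_no = "A"
fin_year = "2020"  # a string, standing in for self.fin_year

# Cast the scalar to the column's type so both sides of == are integers.
year = int(fin_year)

# Each comparison gets its own parentheses because & binds tighter than ==.
mask = (df1["brand_no"] == brand_no) & (df1["fy_year"] == year)

# Assign only to the rows selected by the mask.
df1.loc[mask, "amount"] = 100.0
```

Building the mask as a separate variable also makes it easy to inspect how many rows matched before assigning.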
II. Use pd.merge(df1, df2, how='left', on=[columns]) to match data across Excel sheets, once each sheet has been loaded into its own DataFrame
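A small sketch of such a left join is given here. The two frames stand in for two sheets that have already been read into pandas; their names, columns and values are made up for the example.

```python
import pandas as pd

# Hypothetical stand-ins for two sheets of one workbook.
orders = pd.DataFrame({"brand_no": ["A", "B", "C"], "cost": [10, 20, 30]})
brands = pd.DataFrame({"brand_no": ["A", "B"], "brand_name": ["Alpha", "Beta"]})

# how="left" keeps every row of orders; brand_name is NaN where no match exists.
merged = pd.merge(orders, brands, how="left", on=["brand_no"])
```

Rows of the left frame without a partner in the right frame are kept, which is usually what a cross-sheet lookup expects.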
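III. Use df.groupby([columns]).sum().reset_index() for a simple per-group total over the fields in columns. sum() adds up the numeric columns within each group, and reset_index() turns the group keys back into ordinary columns. On pandas 2.0 and later, non-numeric columns are no longer dropped automatically, so pass numeric_only=True to sum() or select the numeric columns first.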
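A minimal sketch of the grouping, with made-up data and column names. It selects the column to total explicitly, which is one way to get the same result regardless of how a given pandas version treats non-numeric columns.

```python
import pandas as pd

# Hypothetical data; note is a text column that should not be summed.
df = pd.DataFrame({
    "brand_no": ["A", "A", "B"],
    "fy_year": [2020, 2020, 2020],
    "cost": [10, 5, 7],
    "note": ["x", "y", "z"],
})

# Group by the key columns, total only cost, then restore the keys as columns.
summary = df.groupby(["brand_no", "fy_year"])[["cost"]].sum().reset_index()
```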
IV. Converting between JSON and a df
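json.loads(df.to_json(orient="records"))  # convert the df to a list of dicts, one per row
Specifying orient="records" produces one object per row, with the column names as keys and the cell values as values. The encoding argument of json.loads is omitted here because it was ignored and has been removed in Python 3.9.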
df1 = pd.read_json(json.dumps(data_json, ensure_ascii=False), orient="records")
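This reads the JSON back into a df; the column names are the keys of the key-value pairs. pandas 2.1 and later deprecate passing a literal JSON string to pd.read_json, so on those versions wrap the string in io.StringIO.
ensure_ascii=False makes json.dumps write non-ASCII characters as they are instead of escaping them as \uXXXX sequences.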
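The round trip can be sketched as follows, assuming a small made-up frame with one non-ASCII column. The StringIO wrapper is an addition that is not in the original snippet; it keeps the call working on recent pandas versions.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical frame with non-ASCII text to show the effect of ensure_ascii.
df = pd.DataFrame({"brand_no": ["A", "B"], "name": ["品牌一", "品牌二"]})

# DataFrame -> list of dicts, one dict per row.
records = json.loads(df.to_json(orient="records"))

# list of dicts -> JSON text that keeps the Chinese characters readable.
text = json.dumps(records, ensure_ascii=False)

# JSON text -> DataFrame; the dict keys become the column names.
df1 = pd.read_json(StringIO(text), orient="records")
```

If the data is already a list of dicts in memory, pd.DataFrame(records) builds the same frame without going through a JSON string.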
V. Appending one df to the end of another
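result_df = df1.append(df2, ignore_index=True, sort=True)
ignore_index=True discards the original row labels and renumbers the rows of the result from 0. Columns are always matched by name, with or without this option.
sort=True sorts the columns of the result by name when the columns of the two frames do not line up. It does not reorder the rows.
DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. On those versions, pd.concat([df1, df2], ignore_index=True, sort=True) does the same thing.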
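A short sketch of the same operation using pd.concat, which works on both old and new pandas versions. The two frames and their columns are made up so that the effect of each option is visible.

```python
import pandas as pd

# Hypothetical frames whose columns only partly overlap and are not in order.
df1 = pd.DataFrame({"b": [1, 2], "a": [3, 4]})
df2 = pd.DataFrame({"a": [5], "c": [6]})

# ignore_index=True renumbers the rows 0..2; sort=True orders the columns a, b, c.
# Cells with no source value (b for df2, c for df1) are filled with NaN.
result_df = pd.concat([df1, df2], ignore_index=True, sort=True)
```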
Source: oschina
Link: https://my.oschina.net/jiangmengl/blog/3234832