Sorting Excel column with Python

Submitted by 对着背影说爱祢 on 2021-02-18 14:00:28

Question


Let's say I have a list like this:

time    type    value
80      1A      10
100     1A      20
60      18      56
80      18      7
80      2A      10
100     2A      10
80      28      10
100     28      20

and I need to change it to be like this:

            time        
type    60  80  100
1A          10  20
1B      56  7   
2A          10  10
2B          10  20

So far what I did is just basic sorting of the column:

from xlrd import open_workbook
import xlwt

target_column = 0  # index of the column to sort by

# Read every row of the first sheet
book = open_workbook('result.xls')
sheet = book.sheets()[0]
data = [sheet.row_values(i) for i in range(sheet.nrows)]
labels = data[0]  # header row
data = data[1:]
data.sort(key=lambda x: x[target_column])

# Write the header and the sorted rows to a new workbook
bk = xlwt.Workbook()
sheet = bk.add_sheet(sheet.name)
for idx, label in enumerate(labels):
    sheet.write(0, idx, label)

for idx_r, row in enumerate(data):
    for idx_c, value in enumerate(row):
        sheet.write(idx_r+1, idx_c, value)

bk.save('resul.xls')

How can I do this with Python?


Answer 1:


You can use pandas.DataFrame.pivot() to do that:

Code:

df.pivot(index='type', columns='time', values='value')

Test Code:

import pandas as pd
from io import StringIO

df = pd.read_fwf(StringIO(u"""
    time    type    value
    80      1A      10
    100     1A      20
    60      18      56
    80      18      7
    80      2A      10
    100     2A      10
    80      28      10
    100     28      20"""), header=1)
print(df)

print(df.pivot(index='type', columns='time', values='value'))

Results:

   time type  value
0    80   1A     10
1   100   1A     20
2    60   18     56
3    80   18      7
4    80   2A     10
5   100   2A     10
6    80   28     10
7   100   28     20

time   60    80    100
type                  
18    56.0   7.0   NaN
1A     NaN  10.0  20.0
28     NaN  10.0  20.0
2A     NaN  10.0  10.0
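
One caveat worth knowing: pivot() raises a ValueError when the same type/time pair occurs more than once. A minimal sketch of one possible fallback is pivot_table() with an aggregation function. The duplicate row below is made up for illustration and is not part of the question's data, and summing the duplicates is only an assumption about what would be wanted.

```python
import pandas as pd

# Sample rows; the last row is a made-up duplicate of (time=80, type='1A').
df = pd.DataFrame({
    "time": [80, 100, 80],
    "type": ["1A", "1A", "1A"],
    "value": [10, 20, 5],
})

try:
    df.pivot(index="type", columns="time", values="value")
    pivot_failed = False
except ValueError:
    # pivot() cannot reshape when an index/column pair is duplicated.
    pivot_failed = True

# pivot_table() aggregates duplicates instead; summing is an assumption here.
table = df.pivot_table(index="type", columns="time", values="value", aggfunc="sum")
print(table)
```

If the data is known to have exactly one value per type/time pair, as in the question, plain pivot() is enough.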



Answer 2:


This is just for educational purposes. The right answer is the pandas approach by @Stephen Rauch.

from xlrd import open_workbook
from openpyxl import Workbook


book = open_workbook('pivot.xls')
sheet = book.sheet_by_index(0)

# Build a nested dict: {type: {time: value}}
pivot = {}
for row_index in range(1, sheet.nrows):  # skip the header row
    time = sheet.cell(row_index, 0).value
    type_ = sheet.cell(row_index, 1).value  # trailing underscore avoids shadowing the built-in type()
    value = sheet.cell(row_index, 2).value
    pivot.setdefault(type_, {})[time] = value

wb = Workbook()
ws1 = wb.active
# Write both header rows first, then merge B1:D1 so "time" spans the time columns
ws1.append(("", "time"))
ws1.append(("type", "60", "80", "100"))
ws1.merge_cells('B1:D1')
for type_, values in pivot.items():
    ws1.append((type_, values.get(60), values.get(80), values.get(100)))
wb.save('out.xlsx')
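
The same nested-dict idea can be tried without any Excel files. The sketch below is only an illustration: pivot_rows is a hypothetical helper name, the rows are typed in by hand from the question's table, and the time columns are collected from the data rather than hardcoded as 60, 80 and 100.

```python
def pivot_rows(rows):
    """Group (time, type, value) rows into {type: {time: value}}."""
    table = {}
    for time, row_type, value in rows:
        table.setdefault(row_type, {})[time] = value
    # Collect every time seen, sorted, to use as column headers.
    times = sorted({t for cells in table.values() for t in cells})
    return times, table


rows = [
    (80, "1A", 10), (100, "1A", 20),
    (60, "18", 56), (80, "18", 7),
    (80, "2A", 10), (100, "2A", 10),
    (80, "28", 10), (100, "28", 20),
]
times, table = pivot_rows(rows)

# Build the output grid: a header row, then one row per type.
grid = [["type"] + times]
for row_type, cells in table.items():
    # Missing type/time combinations become None (an empty cell).
    grid.append([row_type] + [cells.get(t) for t in times])

for line in grid:
    print(line)
```

Each row of grid could then be passed to ws1.append() in place of the hardcoded tuples.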



Answer 3:


import pandas as pd
df = pd.read_excel('pivot.xls')
df_pivot = df.pivot(index='type', columns='time', values='value')
df_pivot.to_excel('output.xlsx')
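
pivot() returns the row labels in sorted order, which differs from the order in which the types appear in the question's table. If the original order matters, one possible approach (a sketch, with the data built in memory instead of read from pivot.xls) is to reindex the result by the first appearance of each type.

```python
import pandas as pd

# The question's table, typed in by hand for illustration.
df = pd.DataFrame({
    "time": [80, 100, 60, 80, 80, 100, 80, 100],
    "type": ["1A", "1A", "18", "18", "2A", "2A", "28", "28"],
    "value": [10, 20, 56, 7, 10, 10, 10, 20],
})

df_pivot = df.pivot(index="type", columns="time", values="value")

# unique() keeps first-appearance order, so reindex restores the input row order.
df_ordered = df_pivot.reindex(df["type"].unique())
print(df_ordered)
```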


Source: https://stackoverflow.com/questions/48982201/sorting-excel-column-with-python
