Question
I want to take a certain part of data from a sheet and copy it to another sheet.
So far, I have a dictionary with key as start row and value as end row.
Using this, I would like to do the following:
-Get the first range from sheet0 and append it to sheet1
-Get the second range from sheet0 and append it to sheet2
-Get the third range from sheet0 and append it to sheet3
I tried the following:
#First range starts at 1 and ends at 34, second range from 34-52 and third from 52-75
myDict = {1: 34, 34: 52, 52: 75}
#store all the sheets, ignoring main sheet
sheet = wb.worksheets[1:]
for item in myDict:
    for col in ws.iter_cols(min_row=item, max_row=myDict[item], min_col=1, max_col=ws.max_column):
        for cell in col:
            for z in sheet:
                z.append(col)
Another approach was to use a function and lists:
startRow = [1, 34, 52]
endRow = [34, 52, 75]
def addRange(first, second):
    for col in ws.iter_cols(min_row=first, max_row=second, min_col=1, max_col=ws.max_column):
        for cell in col:
            for z in sheet:
                z.append(col)
#Call function
for start, end in zip(startRow, endRow):
    addRange(start, end)
But on both occasions, I get the following error "ValueError: Cells cannot be copied from other worksheets"
Does anyone have a clue what I am missing here?
Thanks in advance!
Answer 1:
from openpyxl import load_workbook
from itertools import product
filename = 'wetransfer-a483c9/testFile.xlsx'
wb = load_workbook(filename)
sheets = wb.sheetnames[1:]
sheets
['Table 1', 'Table 2', 'Table 3']
#access the main worksheet
ws = wb['Main']
#get span for Table
#this allows us determine the boundaries for each table
#where one starts and the other stops
span = []
for row in ws:
    for cell in row:
        if (cell.value
            and (cell.column == 2) #restrict search to column 2, which is where the Table entries are
            #this also avoids a TypeError from numeric cells in other columns, since "in" does not work on an int
            and ("Table" in cell.value)):
            span.append(cell.row)
#add one past the last row of the sheet as the final boundary
#this lets us capture the end of the last table as well
span.append(ws.max_row + 1)
span
[1, 29, 42, 58]
#pair consecutive boundaries as (start, next start)
#each table ends one row before the next one starts, hence end-1
#(the max_row + 1 entry added above keeps the last row of the sheet included)
#openpyxl uses 1-based row numbers
#and accepts a row range as a string such as '1:28', so convert to string format
boundaries = [":".join(map(str, (start, end-1))) for start, end in zip(span, span[1:])]
boundaries
['1:28', '29:41', '42:57']
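The pairing step can be checked on its own, without a workbook. This is only a sketch: `row_ranges` and its parameters are hypothetical names, not part of openpyxl, and the numbers are the ones shown in the output above.

```python
def row_ranges(starts, last_row):
    """Turn table start rows into 'start:end' row-range strings."""
    # Sentinel: one past the final row, so the last table gets an end too.
    edges = list(starts) + [last_row + 1]
    # Each table ends one row before the next table starts.
    return [f"{start}:{end - 1}" for start, end in zip(edges, edges[1:])]


print(row_ranges([1, 29, 42], 57))  # ['1:28', '29:41', '42:57']
```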
#create a cartesian of the main sheet, the boundaries and the other sheets
#note that boundaries and sheets are zipped - essentially they are a pair
#as such, we paired each table with a boundary -
#table 1 is bound to 1:28,
#table 2 is bound to 29:41, ...
#next we combine the main sheet with the pair
#so main sheet is paired with (table 1, 1:28)
#same main sheet is paired with (table 2, 29:41) ...
for main, (ref, table) in product([ws], zip(boundaries, sheets)):
    #get the data within the ranges
    #since we have successfully paired the main sheet with every pair of table and boundary
    #we can safely get the data for that particular region
    #and shift it to the particular table
    #that is all this part does
    #so for table 1, the main sheet picks only 1:28, since it is bound to it for this particular table
    #when it's done with table 1, it goes back to the loop, moves on to table 2 and picks only 29:41, since that is the boundary for that table, and so on
    sheet_content = main[ref]
    #append row to the specified table
    for row in sheet_content:
        #here we iterate through the main sheet
        #get one row of data
        #append it to the table
        #move to the next row, append to the table beneath the previous one
        #and repeat the process till the boundary has been exhausted
        wb[table].append([cell.value for cell in row])
#the file should be closed in Excel before saving
#else the save might fail
wb.save(filename)
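As for the error in the question: as far as I can tell, `append` raises "Cells cannot be copied from other worksheets" when it is handed `Cell` objects that belong to a different sheet, which is why the code above appends `cell.value` instead. If the row ranges are already known, as in the question, a shorter variant might look like the sketch below. It assumes openpyxl 2.6 or later (for `values_only`), builds a small in-memory workbook in place of the real file, and uses a hypothetical helper name, `copy_row_ranges`. Note that the ranges are inclusive, so the question's 1-34 and 34-52 would copy row 34 twice.

```python
from openpyxl import Workbook


def copy_row_ranges(source, targets, ranges):
    """Append each inclusive (start, end) row range of `source` to the matching target sheet."""
    for target, (start, end) in zip(targets, ranges):
        # values_only=True yields tuples of plain values rather than Cell objects,
        # so nothing tied to the source sheet is passed to append().
        for row in source.iter_rows(min_row=start, max_row=end, values_only=True):
            target.append(row)


# Small in-memory workbook standing in for the real file (illustrative data only).
wb = Workbook()
main = wb.active
main.title = "Main"
for i in range(1, 7):
    main.append([f"r{i}", i])

targets = [wb.create_sheet(f"Table {n}") for n in (1, 2, 3)]
copy_row_ranges(main, targets, [(1, 2), (3, 4), (5, 6)])
```

Each target sheet then holds two rows of values; calling `wb.save(...)` afterwards would write them out as in the answer above.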
Source: https://stackoverflow.com/questions/61888310/openpyxl-transfer-range-of-rows-from-a-worksheet-to-another