Can anyone tell me how to read an Excel file without its hidden columns in Python, using pandas or any other module?
When I try to read excel file using Pandas, for exam
I don't think pandas does it out of the box.
Input
Unfortunately, you will have to read the file twice. openpyxl can tell you which columns are hidden:
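import openpyxl
import pandas as pd

loc = 'sample.xlsx'

# First pass: openpyxl exposes each column's "hidden" flag
wb = openpyxl.load_workbook(loc)
ws = wb['Sheet1']  # get_sheet_by_name() is deprecated
hidden_cols = []
for colLetter, colDimension in ws.column_dimensions.items():
    if colDimension.hidden:
        hidden_cols.append(colLetter)

# Second pass: pandas reads the data
df = pd.read_excel(loc)

# This relies on the headers matching the column letters (A, B, C, ...),
# as in the sample file; a set difference does not preserve column order
unhidden = list(set(df.columns) - set(hidden_cols))
df = df[unhidden]
print(df)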
Output
    C   A
0   1   7
1   9   7
2   5  10
3   7   7
4   4   8
5   4   6
6   9   9
7  10   3
8   1   2
Explanation
Read the file first using openpyxl:
loc = 'sample.xlsx'
wb = openpyxl.load_workbook(loc)
ws = wb['Sheet1']  # get_sheet_by_name() is deprecated
Look for the hidden property on the column dimensions, not on individual cells (this is where the hidden columns are captured):
hidden_cols = []
for colLetter, colDimension in ws.column_dimensions.items():
    if colDimension.hidden:
        hidden_cols.append(colLetter)
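One caveat with this step: as far as I know, openpyxl can store a run of adjacent hidden columns as a single column_dimensions entry, keyed by the first letter, with min and max giving the range. The loop would then only pick up the first column of the run. Here is a minimal sketch of expanding such ranges. The helper names (column_letter, column_index, expand_hidden) are my own, not part of openpyxl, and the dims dictionary is just a stand-in for ws.column_dimensions.

```python
from types import SimpleNamespace


def column_letter(index):
    """1-based column index -> Excel letter (1 -> 'A', 27 -> 'AA')."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def column_index(letter):
    """Excel letter -> 1-based column index ('A' -> 1, 'AA' -> 27)."""
    index = 0
    for char in letter.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


def expand_hidden(column_dimensions):
    """Return the letters of all hidden columns, expanding grouped ranges."""
    hidden = set()
    for letter, dim in column_dimensions.items():
        if not dim.hidden:
            continue
        # min/max may be unset on a dimension that was never grouped,
        # so fall back to the letter the entry is keyed by.
        first = dim.min or column_index(letter)
        last = dim.max or first
        hidden.update(range(first, last + 1))
    return [column_letter(i) for i in sorted(hidden)]


# Stand-in for ws.column_dimensions: columns B to D hidden as one group,
# column F hidden on its own, column A visible.
dims = {
    "A": SimpleNamespace(hidden=False, min=1, max=1),
    "B": SimpleNamespace(hidden=True, min=2, max=4),
    "F": SimpleNamespace(hidden=True, min=None, max=None),
}
print(expand_hidden(dims))
```

With a real worksheet you would pass ws.column_dimensions instead of dims. openpyxl.utils also provides get_column_letter and column_index_from_string, which could replace the two small converters here.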
Read the same file using pandas:
df = pd.read_excel(loc)
Find the unhidden columns by subtracting the hidden ones from all the columns:
unhidden = list(set(df.columns) - set(hidden_cols))
Finally, keep only the unhidden columns:
df = df[unhidden]
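The set subtraction compares header names with column letters, so it only works when the headers happen to be A, B, C and so on, and it can shuffle the column order. If your headers are real names, a position-based filter may be safer. A rough sketch follows, assuming the data starts in column A so that DataFrame position i corresponds to sheet column i + 1. The function names (column_index, visible_positions, drop_hidden) are hypothetical helpers, not pandas or openpyxl API.

```python
def column_index(letter):
    """Excel letter -> 1-based column index ('A' -> 1, 'AA' -> 27)."""
    index = 0
    for char in letter.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


def visible_positions(n_columns, hidden_letters):
    """0-based positions of the columns that are not hidden, in sheet order."""
    hidden = {column_index(letter) - 1 for letter in hidden_letters}
    return [i for i in range(n_columns) if i not in hidden]


def drop_hidden(df, hidden_letters):
    """Keep only the DataFrame columns whose sheet position is not hidden.

    Assumes the first DataFrame column came from sheet column A.
    """
    keep = visible_positions(df.shape[1], hidden_letters)
    return df.iloc[:, keep]


# Four columns (A to D) with B and D hidden: positions 0 and 2 remain.
print(visible_positions(4, ["B", "D"]))
```

In the answer's code this would replace the last two steps with df = drop_hidden(df, hidden_cols), and the remaining columns keep their original order.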
P.S.
I know I could have done colDimension.hidden == False, or simply if not colDimension.hidden. The goal here is to capture the hidden columns and then do the filtering accordingly. Hope this helps!