Identify external workbook links using openpyxl

Submitted by 我怕爱的太早我们不能终老 on 2021-01-27 11:21:28

Question


I am trying to identify all cells that contain external workbook references, using openpyxl in Python 3.4, but so far without success. My first attempt was:

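def find_external_value(cell):
    # Identifies an external link in a given cell.
    # Non-string values (numbers, dates, empty cells) cannot hold a link.
    has_external_reference = False
    if isinstance(cell.value, str) and '.xls' in cell.value:
        has_external_reference = True

    return has_external_reference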

However, when I print the values of the cells that contain external references to the console, I get this:

=[1]Sheet1!$B$4
=[2]Sheet1!$B$4

So openpyxl evidently does not present formulas containing external references the way I imagined: the workbook file name does not appear, only a bracketed index. Since square brackets are also used in table formulas, there is no sense in trying to pick up on external links in this manner.

I dug a little deeper and found the detect_external_links function in the openpyxl.workbook.names.external module (reference). I have no idea if one can actually call this function to do what I want.

From the console results it seems that openpyxl understands that there are references and keeps them in a list of some sort. But can one access this list, or detect whether such a list exists?

Either way, all I need is to figure out whether a cell contains a link to an external workbook.


Answer 1:


I have found a solution to this. Use the openpyxl library to load the xlsx file:

import openpyxl

wb = openpyxl.load_workbook("Myworkbook.xlsx")

# _external_links is a private attribute; len(links) gives the count of linked workbooks
links = wb._external_links
for link in links:
    target = link.file_link.Target
    target = target.replace("file:///", "")
    print(target.replace("%20", " "))


----------------------------
Out[01]: ##Indicates that the workbook has 4 external workbook links##
/Users/myohannan/AppData/Local/Temp/49/orion/Extension Workpapers_Learning Extension Calc W_83180610.xlsx
/Users/lmmeyer/AppData/Local/Temp/orion/Complete Set of Workpapers_PPS Workpapers 123112_111698213.xlsx
\\SF-DATA-2\IBData\TEMP\ie5\Temporary Internet Files\OLK8A\LBO Models\PIGLET Current.xls
/WINNT/Temporary Internet Files/OLK3/WINDOWS/Temporary Internet Files/OLK8304/DEZ.XLS     
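
To get from this workbook-level list back to individual cells, one possible approach is to look for the `[1]`, `[2]` prefixes shown in the question. The sketch below assumes those bracketed numbers are 1-based indices into the workbook's list of external links; the function name `external_link_indices` and the regular expression are illustrative, not part of openpyxl. Because the bracket must contain only digits and be followed by a sheet name and `!`, ordinary table references such as `Table1[Sales]` should not match.

```python
import re

# Matches an external-workbook prefix such as "[1]Sheet1!" or "'[2]My Sheet'!".
# The bracket must hold only digits and be followed by an optional sheet name
# and "!", so table references like Table1[Sales] are not matched.
EXTERNAL_REF = re.compile(r"'?\[(\d+)\][^\[\]!']*'?!")


def external_link_indices(value):
    """Return the sorted 1-based link indices referenced by a cell value.

    Hypothetical helper: non-string values and non-formula strings
    yield an empty list.
    """
    if not isinstance(value, str) or not value.startswith("="):
        return []
    return sorted({int(n) for n in EXTERNAL_REF.findall(value)})


if __name__ == "__main__":
    for formula in ("=[1]Sheet1!$B$4", "=SUM(Table1[Sales])"):
        print(formula, external_link_indices(formula))
```

With openpyxl, this could be applied to `cell.value` for every cell of a worksheet loaded without `data_only=True`; under the assumption above, index `n` would correspond to `wb._external_links[n - 1]`.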



Answer 2:


I decided to veer outside of openpyxl to achieve my goal: even though openpyxl has numerous functions that refer to external links, I was unable to find a simple way to do it.

Instead I decided to use ZipFile to open the workbook in memory, then search for the externalLink1.xml file. If it exists, then the workbook contains external links:

import tkinter as tk
from tkinter import filedialog
from zipfile import ZipFile
import xml.etree.ElementTree

root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()

with ZipFile(file_path) as myzip:
    try:
        # ZipFile.open raises KeyError if the archive has no such member
        with myzip.open('xl/externalLinks/externalLink1.xml') as my_file:
            e = xml.etree.ElementTree.parse(my_file).getroot()
        print('Has external references')
    except KeyError:
        print('No external references')

Once I have the XML file, I can proceed to identify the cell address, value and other information by running through the XML tree using ElementTree.
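
As a rough sketch of that next step, the function below lists every `externalLinkN.xml` part in the archive (not only `externalLink1.xml`) and reads the linked workbook's path from the matching `.rels` part. The function name `external_link_targets` is hypothetical, and the `_rels` location and relationships namespace are assumptions based on the usual layout of an xlsx package, so treat this as a starting point rather than a complete reader.

```python
import re
import xml.etree.ElementTree as ET
from zipfile import ZipFile

# Namespace used by package relationship (.rels) parts (assumed standard layout).
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
LINK_PART = re.compile(r"xl/externalLinks/externalLink(\d+)\.xml$")


def external_link_targets(path_or_file):
    """Map each externalLinkN.xml part number to the targets in its .rels part."""
    targets = {}
    with ZipFile(path_or_file) as archive:
        names = set(archive.namelist())
        for name in names:
            match = LINK_PART.match(name)
            if not match:
                continue
            number = match.group(1)
            rels_name = "xl/externalLinks/_rels/externalLink%s.xml.rels" % number
            found = []
            if rels_name in names:
                with archive.open(rels_name) as rels:
                    root = ET.parse(rels).getroot()
                # Each Relationship element carries the linked file in "Target".
                found = [rel.get("Target") for rel in root.iter(RELS_NS + "Relationship")]
            targets[int(number)] = found
    return targets
```

An empty dictionary means no external link parts were found. Note that the values cached inside an `externalLinkN.xml` part describe cells of the linked workbook, not the cells of this workbook that contain the formulas.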




Answer 3:


There is no way to do what you want from within openpyxl. You are free to try and use the library to work with a file archive yourself but this will entail working closely with the file format specification.



Source: https://stackoverflow.com/questions/34096323/identify-external-workbook-links-using-openpyxl
