Using PDFMiner (Python) with online PDF files. Encode the URL?

鱼传尺愫 2021-01-14 08:37

I want to extract the content of PDF files that are available online using PDFMiner.

My code is based on the one available in the documentation used to ext

1 answer
  • 2021-01-14 09:21

    Well, I finally found a solution.

    I resorted to Request and StringIO and got rid of the open('my_file', 'rb') call:

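    # Python 2 (urllib2 and StringIO are Python 2 modules)
    from urllib2 import Request, urlopen
    from StringIO import StringIO
    from pdfminer.pdfparser import PDFParser

    url = 'my_url'

    # Download the PDF and keep its raw content in memory
    # (named pdf_data so the built-in open() is not shadowed)
    pdf_data = urlopen(Request(url)).read()
    memoryFile = StringIO(pdf_data)

    # PDFParser only needs a file-like object, not a file on disk
    parser = PDFParser(memoryFile)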
    

    That way Python treats the downloaded content as a file (so to speak).
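
For anyone on Python 3, here is a rough sketch of the same idea. It assumes the pdfminer.six fork is installed and swaps urllib2/StringIO for urllib.request and io.BytesIO (the PDF content is bytes, not text). The helper names open_remote_pdf and extract_text_from_url are made up for illustration, and the 30-second timeout is an arbitrary choice.

```python
import io
import urllib.request


def open_remote_pdf(url, timeout=30):
    """Download the resource at `url` and return it as a seekable in-memory file."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()  # raw bytes, not decoded text
    return io.BytesIO(data)


def extract_text_from_url(url):
    """Extract the text of a remote PDF (assumes pdfminer.six is installed)."""
    # Imported here so open_remote_pdf() still works without pdfminer.six.
    from pdfminer.high_level import extract_text

    # extract_text accepts a path or a binary file-like object.
    return extract_text(open_remote_pdf(url))
```

Calling extract_text_from_url('my_url') should then return the document text as a string. If the lower-level API is needed instead, the BytesIO object can be passed to PDFParser just like memoryFile above.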
