How to extract text from a PDF file in Python?

感情败类 2020-12-13 05:15

How can I extract text from a PDF file in Python?

I tried the following:

import sys
import pyPdf

def convertPdf2String(path):
    content = ""
          


        
1 Answer
  • 2020-12-13 05:43

    If you are running Linux or macOS, you can call the ps2ascii command (part of Ghostscript) from your code:

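    import subprocess

    pdf_path = "someFile.pdf"
    txt_path = "out.txt"
    # Pass the arguments as a list so paths containing spaces are handled safely.
    subprocess.run(["ps2ascii", pdf_path, txt_path], check=True)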
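
To get the text back as a Python string rather than only as a file on disk, the ps2ascii call above could be wrapped roughly as follows. This is a minimal sketch, assuming ps2ascii is installed and on PATH; the names build_ps2ascii_command and pdf_to_text are hypothetical helpers made up for illustration, not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def build_ps2ascii_command(pdf_path, txt_path):
    """Return the argument list for ps2ascii (hypothetical helper)."""
    return ["ps2ascii", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, txt_path):
    """Convert pdf_path to plain text at txt_path and return the text.

    Raises RuntimeError if ps2ascii is not found on PATH, and
    subprocess.CalledProcessError if the conversion exits with an error.
    """
    # Assumption: ps2ascii is provided by a Ghostscript installation.
    if shutil.which("ps2ascii") is None:
        raise RuntimeError("ps2ascii not found; install Ghostscript first")
    subprocess.run(build_ps2ascii_command(pdf_path, txt_path), check=True)
    # Replace undecodable bytes instead of failing on odd encodings.
    return Path(txt_path).read_text(errors="replace")
```

Calling pdf_to_text("someFile.pdf", "out.txt") would then write out.txt and return its contents. Output quality depends on how the PDF was produced; scanned documents without a text layer will likely yield little or nothing.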
    