How can I extract text from a PDF file in Python?
I tried the following:
    import sys
    import pyPdf

    def convertPdf2String(path):
        content = ""
If you are running Linux or Mac, you can call the ps2ascii command (part of Ghostscript) from your code:
    import os

    input = "someFile.pdf"
    output = "out.txt"
    os.system("ps2ascii %s %s" % (input, output))
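One caveat with the os.system call: the file names are pasted into a shell string, so a path containing spaces or quotes will break it, and a failure of ps2ascii goes unnoticed. Here is a rough sketch of the same idea using subprocess instead. It assumes ps2ascii is installed and on your PATH; the helper names build_ps2ascii_command and pdf_to_text are made up for illustration, and reading the output as UTF-8 with replacement is an assumption about the output encoding, not something ps2ascii guarantees.

```python
import shutil
import subprocess


def build_ps2ascii_command(pdf_path, txt_path, executable="ps2ascii"):
    """Return the argument list for: ps2ascii <pdf_path> <txt_path>."""
    return [executable, str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, txt_path, executable="ps2ascii"):
    """Run ps2ascii on pdf_path, write txt_path, and return the text."""
    # Fail early with a clear message if the tool is not installed.
    if shutil.which(executable) is None:
        raise FileNotFoundError(
            f"{executable} not found on PATH (it ships with Ghostscript)"
        )
    # Passing a list (no shell) keeps paths with spaces or quotes intact;
    # check=True raises CalledProcessError if ps2ascii exits with an error.
    subprocess.run(
        build_ps2ascii_command(pdf_path, txt_path, executable), check=True
    )
    # Assumption: decode leniently, since the output encoding is not guaranteed.
    with open(txt_path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

With that in place, pdf_to_text("someFile.pdf", "out.txt") should write out.txt just like the snippet above and also hand the text back as a string.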