Question
I have spent a good amount of time trying to work out exactly what is going wrong with the code I am using to convert PDF to DOCX (and DOC to DOCX) with LibreOffice.
I have tested the relevant commands both from the Windows Run dialog and from Python, and neither works.
I have LibreOffice v6.0.2 installed on Windows. I have been using variations of the following code to try to convert some PDFs to DOCX (the specific PDF file is not really relevant):
import subprocess
lowriter='C://Program Files/LibreOffice/program/swriter.exe'
subprocess.run('{} --invisible --convert-to docx --outdir "{}" "{}"'
               .format(lowriter, 'dir',
                       'filepath.pdf',), shell=True)
I have tried this code both from the Windows Run dialog and through Python using the code above, with no luck. I have also tried without --outdir, in case I was writing that incorrectly, but I always get a return code of 1:
CompletedProcess(args='C://Program Files/LibreOffice/program/swriter.exe
--invisible --convert-to docx --outdir "{dir}"
{filepath.pdf}"', returncode=1)
The dir and filepath.pdf values are placeholders I have substituted.
I have a similar problem with the doc to docx conversion.
Answer 1:
There are a number of problems here. You should first get the --convert-to call to work from the command line, as @CristiFati commented, and then implement it in Python.
Here is the code that works on my system. There is no // in the path, and quotes are needed. Also, the folder is LibreOffice 5 on my system.
import subprocess
lowriter = 'C:/Program Files (x86)/LibreOffice 5/program/swriter.exe'
subprocess.run(
    '"{}" --convert-to docx --outdir "{}" "{}"'
    .format(lowriter, 'dir', 'filepath.doc',), shell=True)
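As a variation on the call above, the arguments can also be passed as a list without shell=True, so that Python handles the quoting of the path containing spaces. This is only a minimal sketch: the helper names build_convert_command and convert are hypothetical, and capturing the output is an addition meant to make a non-zero return code easier to diagnose.

```python
import subprocess


def build_convert_command(executable, target_format, outdir, source):
    """Return the argument list for a LibreOffice --convert-to call.

    Hypothetical helper: each argument is its own list item, so no manual
    quoting is needed even when the executable path contains spaces.
    """
    return [executable, "--convert-to", target_format,
            "--outdir", outdir, source]


def convert(executable, target_format, outdir, source):
    """Run the conversion without a shell and return the CompletedProcess."""
    cmd = build_convert_command(executable, target_format, outdir, source)
    # capture_output requires Python 3.7+; stdout/stderr end up on the result
    return subprocess.run(cmd, capture_output=True, text=True)
```

With the same placeholders as above, this would be called as convert(lowriter, 'docx', 'dir', 'filepath.doc'), and the result's returncode, stdout and stderr attributes can then be inspected.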
Finally, it looks like converting from PDF to DOCX is not supported. LibreOffice Draw can open a PDF file and save it in ODG format.
EDIT:
Here is working code to convert from PDF. I upgraded to LO 6, so the version number ("LibreOffice 5") is no longer required in the path.
import subprocess
loffice = 'C:/Program Files/LibreOffice/program/soffice.exe'
subprocess.run(
    '"{}" --convert-to odg --outdir "{}" "{}"'
    .format(loffice, 'dir', 'filepath.pdf',), shell=True)
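To confirm that a conversion like the one above actually produced a file, one option is to check for the expected output path afterwards. The sketch below assumes that LibreOffice writes the result into the --outdir folder under the source's base name with the new extension. The helper names expected_output_path and conversion_succeeded are hypothetical, and treating the return code alone as inconclusive is a cautious assumption rather than documented behaviour.

```python
from pathlib import Path


def expected_output_path(source, target_format, outdir):
    """Guess where the converted file should appear.

    Assumption: output is <outdir>/<source stem>.<extension>. The target
    format may carry a filter name after a colon (extension:filter), so
    only the part before the first colon is used as the extension.
    """
    extension = target_format.split(":", 1)[0]
    return Path(outdir) / (Path(source).stem + "." + extension)


def conversion_succeeded(completed, source, target_format, outdir):
    """Return True if the process exited with 0 and the output file exists."""
    output = expected_output_path(source, target_format, outdir)
    return completed.returncode == 0 and output.is_file()
```

Here completed is the CompletedProcess returned by subprocess.run, so the check can follow the call directly, for example conversion_succeeded(result, 'filepath.pdf', 'odg', 'dir').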
Source: https://stackoverflow.com/questions/49739245/having-trouble-using-python-and-libreoffice-to-convert-pdf-to-docx-and-doc-to-do