Question
I have many PDFs in a folder and want to extract the text from each of them using xpdf. For example:
- example1.pdf extract to example1.txt
- example2.pdf extract to example2.txt
- and so on
Here is my code:
<?php
$path = 'C:/AppServ/www/pdfs/';
$dir = opendir($path);
$f = readdir($dir);
while ($f = readdir($dir)) {
    if (eregi("\.pdf",$f)){
        $content = shell_exec('C:/AppServ/www/pdfs/pdftotext '.$f.' ');
        $read = strtok ($f,".");
        $testfile = "$read.txt";
        $file = fopen($testfile,"r");
        if (filesize($testfile)==0){}
        else{
            $text = fread($file,filesize($testfile));
            fclose($file);
            echo "</br>"; echo "</br>";
        }
    }
}
I get a blank result. What's wrong with my code?
Answer 1:
Try using this:
$dir = opendir($path);
// Call readdir() only in the loop condition so that no directory entry is skipped.
while (($filename = readdir($dir)) !== false) {
    // eregi() was removed in PHP 7; preg_match() with the i flag replaces it.
    if (preg_match('/\.pdf$/i', $filename)) {
        $content = shell_exec('C:/AppServ/www/pdfs/pdftotext '.$filename.' ');
        $read = strtok($filename, ".");
        $testfile = "$read.txt";
        $file = fopen($testfile, "r");
        if (filesize($testfile) == 0) {}
        else {
            $text = fread($file, filesize($testfile));
            fclose($file);
            echo "</br>"; echo "</br>";
        }
    }
}
closedir($dir);
Answer 2:
You do not have to create a temporary .txt file:
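// "-" as the output name makes pdftotext write the text to stdout instead of a file.
$command = '/AppServ/www/pdfs/pdftotext ' . escapeshellarg($filename) . ' -';
// exec() appends each output line to the $text array and returns only the last line.
$a = exec($command, $text, $retval);
// $text is an array, so join the lines before printing.
echo implode("\n", $text);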
If it does not work, check the server's error logs.
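If the .txt files from the question are still wanted, the whole folder could be converted along the following lines. This is only a minimal sketch: it assumes pdftotext sits at the path used in the question and accepts the output file as its second argument, and the helper names txtPathFor and buildCommand are hypothetical, introduced here for illustration.

```php
<?php
// Hypothetical helper: map "dir/name.pdf" to "dir/name.txt".
// pathinfo() strips only the last extension, so "my.report.pdf" keeps "my.report"
// (strtok($f, ".") would cut it down to "my").
function txtPathFor(string $pdfPath): string
{
    $info = pathinfo($pdfPath);
    return $info['dirname'] . '/' . $info['filename'] . '.txt';
}

// Hypothetical helper: build the pdftotext command line with every argument quoted.
function buildCommand(string $binary, string $pdfPath, string $txtPath): string
{
    return escapeshellarg($binary) . ' ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($txtPath);
}

// Paths taken from the question; adjust them to your setup.
$dir    = 'C:/AppServ/www/pdfs';
$binary = 'C:/AppServ/www/pdfs/pdftotext';

// glob() may return false on error, so fall back to an empty list.
foreach (glob($dir . '/*.pdf') ?: [] as $pdf) {
    $txt = txtPathFor($pdf);
    exec(buildCommand($binary, $pdf, $txt), $output, $status);
    // A non-zero exit status or a missing output file means the conversion failed.
    if ($status !== 0 || !is_file($txt)) {
        echo 'Conversion failed for ' . $pdf . "<br>";
        continue;
    }
    echo $pdf . ' -> ' . $txt . ' (' . filesize($txt) . " bytes)<br>";
}
```

Using full paths for both the PDF and the text file avoids depending on the script's working directory, which is one way the original code can end up opening a .txt file that does not exist.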
Answer 3:
The file contents are read into $text but never printed. The lines
echo "</br>";
echo "</br>";
should be
echo "</br>";
echo $text."</br>";
Hope this helps.
Source: https://stackoverflow.com/questions/9286036/how-to-extract-texts-from-pdfs-using-xpdf