I'm using XFDF files to fill out PDF forms server-side with PHP and pdftk, but my problem is that no non-English characters (ä, ö, å etc.) are printed to the form fields.
Here is the function I use to generate the XFDF file:
function createFDF($file,$info,$enc='UTF-8'){
    // XML prolog and XFDF root element
    $data='<?xml version="1.0" encoding="'.$enc.'"?>'."\n".
        '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">'."\n".
        '<fields>'."\n";
    // One <field> per form field; array values produce several <value> elements
    foreach($info as $field => $val){
        $data.='<field name="'.$field.'">'."\n";
        if(is_array($val)){
            foreach($val as $opt)
                $data.='<value>'.htmlentities($opt,ENT_COMPAT,$enc).'</value>'."\n";
        }else{
            $data.='<value>'.htmlentities($val,ENT_COMPAT,$enc).'</value>'."\n";
        }
        $data.='</field>'."\n";
    }
    // Document ids and a reference to the source PDF
    $data.='</fields>'."\n".
        '<ids original="'.md5($file).'" modified="'.time().'" />'."\n".
        '<f href="'.$file.'" />'."\n".
        '</xfdf>'."\n";
    return $data;
}
And the resulting XFDF file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>
<field name="loadman-pudotuspainolaitteen-mittaustulosten-tallenne">
<value>1201</value>
</field>
<field name="tutkittavarakenne-rivi1">
<value>a</value>
</field>
<field name="tutkittavarakenne-rivi2">
<value></value>
</field>
<field name="tutk-pvm">
<value>11.12.2012</value>
</field>
<field name="mittauksen_suorittaja">
<value>o</value>
</field>
<field name="vast-tyonjohtaja">
<value>ö</value>
</field>
<field name="rakennemateriaali">
<value>ä</value>
</field>
<field name="laatuvaatimukset">
<value>å</value>
</field>
<field name="mittauspaikan_tiivistysmenetelma">
<value>á</value>
</field>
<field name="pohjalevy">
<value>é</value>
</field>
<field name="pohjamaa-alusrakenne">
<value>í</value>
</field>
<field name="mittauspaikan-tiivistysmenetelma">
<value>è</value>
</field>
<field name="emoduli">
<value>ö</value>
</field>
<field name="tiiveys">
<value>öä</value>
</field>
<field name="huomautukset_ja_loppupaatelmat1">
<value>öä</value>
</field>
<field name="huomautukset_ja_loppupaatelmat2">
<value>öä</value>
</field>
<field name="huomautukset_ja_loppupaatelmat3">
<value>öä</value>
</field>
<field name="empa1">
<value>ö</value>
</field>
<field name="empa1-e">
<value>ö</value>
</field>
<field name="empa2">
<value>ö</value>
</field>
<field name="empa2-e">
<value>ö</value>
</field>
<field name="allekirjoitus">
<value>Einomies Porkkakoski</value>
</field>
</fields>
<ids original="84b0ff7a04b017303be186faa0d1254a" modified="1343290963" />
<f href="assets/loadman.pdf" />
</xfdf>
The fields with English letters print perfectly, but letters with acute or grave accents or Scandinavian diacritics won't transfer to the PDF file. EXCEPT for some reason
<field name="huomautukset_ja_loppupaatelmat1">
<value>öä</value>
</field>
works perfectly and prints öä!
The command I run is
pdftk <pdf-file> fill_form <xfdf-file> output <output file> flatten
This does not produce any errors.
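For reference, a minimal sketch of how this invocation might be assembled from PHP. The helper name buildPdftkCommand and the file names are hypothetical; it only uses the fill_form, output and flatten operations shown in the command above, and quotes each path with escapeshellarg.

```php
<?php
// Hypothetical helper: builds the pdftk command line shown above.
// Paths are quoted with escapeshellarg() so that spaces or shell
// metacharacters in file names cannot break the command.
function buildPdftkCommand($pdf, $xfdf, $out, $flatten = true)
{
    $cmd = 'pdftk ' . escapeshellarg($pdf)
         . ' fill_form ' . escapeshellarg($xfdf)
         . ' output ' . escapeshellarg($out);
    if ($flatten) {
        $cmd .= ' flatten';
    }
    return $cmd;
}

// Example (file names are placeholders):
$cmd = buildPdftkCommand('assets/loadman.pdf', 'data.xfdf', 'filled.pdf');
// exec($cmd, $output, $status); // $status would hold pdftk's exit code
```

Passing false as the fourth argument leaves the form fields editable, which makes it easier to inspect what was actually written into them.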
I'm using Debian 6.0, PHP 5.3.3-7+squeeze13 and the pdftk version is 1.44-5
UPDATE: I noticed that if I don't flatten the generated file and open it, the characters are displayed correctly while the field is focused but hidden again when it loses focus. If I manually type anything into the field, the special characters also show up. However, a saved and reopened file doesn't show the text unless some text is added again.
UPDATE 2 Got the damn thing fixed. Originally the forms were made with Adobe Acrobat Pro on OSX Snow Leopard. Now I remade the forms with LibreOffice + Oracle PDF Import plugin and everything seems to be working!
I think you will have more luck if you use the following list:
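&#196; for Ä (instead of &Auml;)
&#197; for Å (instead of &Aring;)
&#214; for Ö (instead of &Ouml;)
&#220; for Ü (instead of &Uuml;)
&#223; for ß (instead of &szlig;)
&#228; for ä (instead of &auml;)
&#229; for å (instead of &aring;)
&#246; for ö (instead of &ouml;)
&#252; for ü (instead of &uuml;)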
I'll let you yourself find out how to extend that list until it reaches completeness :-)
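Rather than extending such a list by hand, the conversion to numeric character references (for example &#228; for ä) could be generated. The following is a minimal sketch, not a tested fix for the pdftk issue: the function name xfdfNumericValue is made up for illustration, and it assumes the mbstring extension is available and the input is valid UTF-8.

```php
<?php
// Sketch: escape a UTF-8 string for use inside an XFDF <value> element,
// writing every non-ASCII character as a decimal character reference.
// Assumes the mbstring extension is loaded and $text is valid UTF-8.
function xfdfNumericValue($text)
{
    // Escape the XML-significant characters (&, <, >, ") first.
    $escaped = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');
    // Map U+0080..U+10FFFF to &#NNN; while leaving ASCII untouched,
    // so the &amp; produced above is not encoded a second time.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

echo xfdfNumericValue('öä & å'), "\n"; // prints: &#246;&#228; &amp; &#229;
```

In createFDF this would take the place of the htmlentities call on each value.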
It's because you use htmlentities in your PHP script. That converts the accented symbols to &xxxx;
Set your XML encoding to iso-8859-1 or WINDOWS-1252 and leave out the htmlentities in your PHP script
Another thing to try is to use utf8_encode instead of htmlentities (and not modify the XML-encoding)
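A minimal sketch of the "leave out htmlentities" idea, under the assumption that the input strings are already UTF-8. The function name xfdfValue is hypothetical. It escapes only the characters XML treats as markup and, if a non-UTF-8 encoding is declared in the prolog, converts the bytes to match (this needs mbstring). Note that utf8_encode converts from ISO-8859-1 to UTF-8, so it would only help if the input were Latin-1, and it is deprecated as of PHP 8.2.

```php
<?php
// Sketch: escape a UTF-8 value for XFDF without htmlentities().
// $enc is the encoding declared in the XML prolog; the input is
// assumed to be UTF-8 (an assumption, adjust to your data source).
function xfdfValue($val, $enc = 'UTF-8')
{
    // Only &, <, > and " are replaced; accented letters stay as-is.
    $escaped = htmlspecialchars($val, ENT_COMPAT, 'UTF-8');
    if (strcasecmp($enc, 'UTF-8') !== 0) {
        // Requires mbstring: re-encode so the bytes match the prolog.
        $escaped = mb_convert_encoding($escaped, $enc, 'UTF-8');
    }
    return $escaped;
}

// htmlentities() emits a named HTML entity, which XML does not define:
var_dump(htmlentities('ö', ENT_COMPAT, 'UTF-8') === '&ouml;'); // bool(true)
// htmlspecialchars() keeps the letter and escapes only markup:
var_dump(xfdfValue('ö & <') === 'ö &amp; &lt;');               // bool(true)
```

Whether pdftk then renders the characters still depends on the form itself, so this only removes the entity problem from the XFDF side.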
To support any UTF-8 characters, I wrote PdfFormFillerUTF-8: http://sourceforge.net/projects/pdfformfiller2/
Source: https://stackoverflow.com/questions/11665394/pdftk-xfdf-php-cant-handle-umlauts