Data encoding when submitting a PDF form using AcroForm technology

遥遥无期 2020-12-01 22:40

When I create a PDF form (for instance using Acrobat) that contains text fields in AcroForm format (PDF dictionaries, no XFA), and I submit the data to a server, how can I s

1 answer
  • 2020-12-01 23:00

    I've just found the answer to my main question myself. I didn't find anything in ISO 32000-1 or the ISO 32000-2 draft, but while studying the Acrobat JavaScript reference I found the cCharset parameter of the submitForm() method. The reference describes that parameter as follows:

    The encoding for the values submitted. String values are utf-8, utf-16, Shift-JIS, BigFive, GBK, and UHC. If not passed, the current Acrobat behavior applies. For XML-based formats, utf-8 is used. For other formats, Acrobat tries to find the best host encoding for the values being submitted. XFDF submission ignores this value and always uses utf-8.

    In other words, no cCharset value was passed in my case, so Acrobat picked GBK as the best fit for the Chinese characters being submitted. One can force UTF-8 instead by calling the submitForm() JavaScript method with cCharset set to utf-8.
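
    As a rough illustration of that call, here is a minimal sketch in Acrobat JavaScript. The helper name submitAsUtf8 and the URL are hypothetical, and the sketch assumes the form is submitted as URL-encoded HTML form data (cSubmitAs set to "HTML"); only cCharset comes from the reference quoted above.

```javascript
// Sketch only: force UTF-8 when submitting AcroForm field values.
// `doc` is the Acrobat Doc object (`this` inside a document or button script).
// The helper name and the URL used below are hypothetical.
function submitAsUtf8(doc, url) {
  doc.submitForm({
    cURL: url,          // target that receives the field values
    cSubmitAs: "HTML",  // assumption: URL-encoded name/value pairs, not FDF
    cCharset: "utf-8"   // overrides Acrobat's "best host encoding" choice
  });
}

// Inside Acrobat, for example in a button's Mouse Up action:
// submitAsUtf8(this, "https://example.com/submit");
```

    Passing the Doc object in explicitly keeps the helper independent of where the script is attached. Per the reference text quoted above, XFDF submission ignores cCharset and always uses utf-8, so the parameter only matters for the other formats.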

    Based on this question, I asked the ISO committee to fix this problem in ISO 32000-2. As a result, an additional entry was added to the table entitled "Additional entries specific to a submit-form action" in section 12.7.6.2:

    CharSet: string

    (Optional; inheritable) Possible values include: utf-8, utf-16, Shift-JIS, BigFive, GBK, or UHC.

    Starting with PDF 2.0, the encoding can be declared in the submit-form action itself, so this problem should no longer exist.

    Update: my suggestion made it into ISO 32000-2 (aka PDF 2.0):

    The CharSet key doesn't exist in ISO 32000-1; it was introduced in ISO 32000-2.
