How do I mark an empty translation (msgstr) as translated in gettext PO files?

天涯浪人 2021-02-08 02:11

I found that if the translation (msgstr) for a string (msgid) is empty, all gettext tools consider the string untranslated.

Is there a workaround for this? I do want to ha

3 answers
  •  清酒与你
    2021-02-08 03:00

    I realize this is an old question, but I wanted to point out an alternate approach:

    msgid "This is a string"
    msgstr "\0"
    

    Since gettext strings are NUL-terminated, and PO files honor C escape sequences, I would guess that this works and results in an empty-string translation. It seemed to work in my program (based on GNU libintl), but I can't tell whether this is actually standard or permitted. As I understand it, the gettext PO format is not formally specified, so there may be no authoritative answer other than reading the source code...

    https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html

    Putting embedded nulls in strings is often not a nice thing to do to other programmers, but it might work in your case. Arguably it's less evil than the zero-width-space trick, since it actually results in a string whose size is zero.


    Edit:

    Basically, the worst that can happen is a segfault or other bad behavior when running msgfmt, if it assumes strings have no embedded nulls, gets confused about their size, and overflows a buffer somewhere.

    Assuming that msgfmt can tolerate this though, libintl is going to have to do the right thing with it because the only means it has to return strings is char *, so the final application can only see up to the null character no matter what.
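
To illustrate the `char *` point, here is a minimal sketch in Python using `ctypes` rather than libintl itself. The byte string `stored` is a made-up stand-in for what a catalog might hold for `msgstr "\0"`, and the trailing `ignored` text is added only to make the truncation visible; neither comes from gettext.

```python
import ctypes

# Hypothetical catalog bytes: the embedded NUL from msgstr "\0",
# some filler that a C caller should never see, and a final terminator.
stored = b"\x00ignored\x00"

# Copy the bytes into a C char array of exactly that size.
buf = ctypes.create_string_buffer(stored, len(stored))

# .raw exposes the whole array; .value reads it the way C code reads a
# char *, stopping at the first NUL byte.
visible = buf.value

print(len(buf.raw), repr(visible))
```

Here `buf.raw` still holds all nine bytes, while `visible` is the empty byte string, which is all a caller going through a NUL-terminated interface would get.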

    For what it's worth, my po-parser library spirit-po explicitly supports this :)

    https://github.com/cbeck88/spirit-po


    Edit: The gettext documentation does mention the possibility of embedded nulls in MO files, and says the issue "has been strongly debated":

    https://www.gnu.org/software/gettext/manual/html_node/MO-Files.html

    Nothing prevents a MO file from having embedded NULs in strings. However, the program interface currently used already presumes that strings are NUL terminated, so embedded NULs are somewhat useless. But the MO file format is general enough so other interfaces would be later possible, if for example, we ever want to implement wide characters right in MO files, where NUL bytes may accidentally appear. (No, we don’t want to have wide characters in MO files. They would make the file unnecessarily large, and the ‘wchar_t’ type being platform dependent, MO files would be platform dependent as well.)

    This particular issue has been strongly debated in the GNU gettext development forum, and it is expectable that MO file format will evolve or change over time. It is even possible that many formats may later be supported concurrently. But surely, we have to start somewhere, and the MO file format described here is a good start. Nothing is cast in concrete, and the format may later evolve fairly easily, so we should feel comfortable with the current approach.

    So, at the least, it's not like they're going to say "man, embedded null in message string? We never thought of that!" Most likely it works; if msgfmt doesn't crash, I would assume it's kosher.
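
The reason the MO format can carry an embedded NUL at all is that each string is described by an explicit length and offset, as the manual page quoted above lays out. The following is a rough sketch of that layout, not msgfmt's actual output: `build_mo` and `read_msgstr` are hypothetical helpers, the hash table is left empty, and the usual header entry with an empty msgid is omitted.

```python
import struct

MAGIC = 0x950412DE  # MO magic number given in the gettext manual


def build_mo(msgid: bytes, msgstr: bytes) -> bytes:
    """Pack a one-entry, little-endian MO image (simplified sketch)."""
    header_size = 7 * 4            # magic, revision, N, O, T, S, H
    orig_table = header_size       # O: (length, offset) rows for msgids
    trans_table = orig_table + 8   # T: (length, offset) rows for msgstrs
    strings = trans_table + 8      # string data follows both tables
    id_off = strings
    str_off = id_off + len(msgid) + 1  # skip the msgid's NUL terminator
    # Hash table size 0; its offset is unused here, so point it at the strings.
    header = struct.pack("<7I", MAGIC, 0, 1, orig_table, trans_table, 0, strings)
    tables = struct.pack("<4I", len(msgid), id_off, len(msgstr), str_off)
    return header + tables + msgid + b"\0" + msgstr + b"\0"


def read_msgstr(mo: bytes) -> bytes:
    """Read the first translation by its stored length, not by NUL."""
    magic, _rev, _n, _o, trans_table, _s, _h = struct.unpack_from("<7I", mo, 0)
    if magic != MAGIC:
        raise ValueError("not a little-endian MO image")
    length, offset = struct.unpack_from("<2I", mo, trans_table)
    return mo[offset:offset + length]


# msgstr "\0" in the PO file becomes a single NUL byte of translation data.
mo = build_mo(b"This is a string", b"\0")
print(read_msgstr(mo))
```

A length-aware reader like this one gets back the one-byte string containing the NUL, whereas an interface that hands out NUL-terminated strings would present the same entry as empty.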
