TMX (Translation Memory eXchange) files in Python

Posted by 烈酒焚心 on 2020-04-12 20:38:32

Question


Is there a module for handling TMX (Translation Memory eXchange) files in Python? If not, what would be another way to do it?

As it stands, I have a giant 2 GB file with French-English subtitles. Is it even possible to handle a file that size, or would I have to break it down?


Answer 1:


As @hurrial said, you can use translate-toolkit.

Install

The toolkit is published on PyPI, so you can install it with pip:

pip install translate-toolkit

Usage

Assume that you have the following simple sample.tmx file:

<tmx version="1.4">
  <header
    creationtool="XYZTool" creationtoolversion="1.01-023"
    datatype="PlainText" segtype="sentence"
    adminlang="en-us" srclang="en"
    o-tmf="ABCTransMem"/>
  <body>
    <tu>
      <tuv xml:lang="en">
        <seg>Hello world!</seg>
      </tuv>
      <tuv xml:lang="ar">
        <seg>اهلا بالعالم!</seg>
      </tuv>
    </tu>
  </body>
</tmx>

You can parse this simple file like so:

>>> from translate.storage.tmx import tmxfile
>>>
>>> with open("sample.tmx", 'rb') as fin:
...     tmx_file = tmxfile(fin, 'en', 'ar')
>>>
>>> for node in tmx_file.unit_iter():
...     print(node.source, node.target)
Hello world! اهلا بالعالم!

For more information, see the official translate-toolkit documentation.
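
On the 2 GB file from the question: as far as I can tell, the example above loads the whole document into memory, which may be too heavy for a file that size. A minimal sketch of one alternative, not part of the answer above, is to stream the file with the standard library's xml.etree.ElementTree.iterparse and discard each <tu> once it has been read. The helper name iter_tmx_pairs and its parameters are made up for illustration, and the sketch assumes un-namespaced TMX tags and language codes that match exactly apart from case:

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree gives to the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_tmx_pairs(source, src_lang="en", tgt_lang="ar"):
    """Yield (source_text, target_text) for each <tu>, one unit at a time.

    `source` is a filename or a binary file object. Hypothetical helper,
    not part of translate-toolkit.
    """
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    body = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag == "body":
                body = elem  # remember <body> so finished units can be dropped
            continue
        if elem.tag != "tu":
            continue
        segs = {}
        for tuv in elem.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also picks up text nested inside inline markup.
                segs[lang] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]
        elem.clear()  # release the contents of the unit just processed
        if body is not None:
            body.clear()  # detach finished units so memory use stays flat
```

With this, a loop such as for fr, en in iter_tmx_pairs(path, "fr", "en") would walk the file without splitting it first, where path stands for wherever your file lives. Units that lack either language are skipped.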




Answer 2:


You may check the following links:

  • pretranslate: http://translate-toolkit.readthedocs.org/en/latest/commands/pretranslate.html
  • Translate toolkit: http://en.wikipedia.org/wiki/Translate_Toolkit
  • Translate toolkit package: https://pypi.python.org/pypi/translate-toolkit
  • Translate API: https://github.com/translate/translate

Cheers,



Source: https://stackoverflow.com/questions/20356149/tmxtranslation-memory-exchange-files-in-python
