How to diff .odt files with difftool? kdiff3 diff outputs unreadable characters

*爱你&永不变心* submitted on 2019-12-07 18:16:40

Question


In Git I'm trying to use .gitattributes to compare .odt files (LibreOffice Writer files) with difftool. Following this guide: http://www-verimag.imag.fr/~moy/opendocument/ I made a .gitattributes file with this:

*.ods diff=odf
*.odt diff=odf
*.odp diff=odf

*.ods difftool=odf
*.odt difftool=odf
*.odp difftool=odf

This made git diff compare the text in .odt files. However, when git difftool launches kdiff3 to compare the .odt files, I get this pop-up error:

Some input characters could not be converted to valid unicode.
You might be using the wrong codec. (e.g. UTF-8 for non UTF-8 files).
Don't save the result if unsure. Continue at your own risk.
Affected input files are in A, B.

...and all of the characters in the files are mumbo jumbo.

What went wrong? How do I fix this?

PS:

I don't know if this is important, but I guess I haven't configured 'diff.tool', because every time I run:

$ git difftool 

I get this output:

This message is displayed because 'diff.tool' is not configured.
See 'git difftool --tool-help' or 'git help config' for more details.
'git difftool' will now attempt to use one of the following tools:
opendiff kdiff3 tkdiff xxdiff meld kompare gvimdiff diffuse diffmerge ecmerge p4merge araxis bc codecompare emerge vimdiff

Viewing (1/1): 'diffexperiment.odt'
Launch 'kdiff3' [Y/n]:

Could that be why kdiff3 doesn't seem to work with odt2txt?

EDIT: I retried this with Microsoft Word documents and got a little further here.

I played around with the .kdiff3rc configuration, but none of the options I added made the unreadable characters readable. I then changed the comparison tool to vimdiff; when I ran git difftool on Microsoft Word documents, vimdiff displayed a list of files ending in .xml instead of unreadable characters.

When I pressed Enter on one of the files, this was displayed:

<?xml version="1.0" encoding="UTF-8"?>
  " Browsing zipfile /tmp/4LMJbj_HI I am writing something here..docx                          |<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Override PartName
  " Select a file with cursor and press ENTER                                                  |="/_rels/.rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/><Overr
                                                                                               |ide PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.w
  _rels/.rels                                                                                  |ordprocessingml.settings+xml"/><Override PartName="/word/_rels/document.xml.rels" ContentType=
  word/settings.xml                                                                            |"application/vnd.openxmlformats-package.relationships+xml"/><Override PartName="/word/fontTabl
  word/_rels/document.xml.rels                                                                 |e.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+x
  word/fontTable.xml                                                                           |ml"/><Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officed
  word/numbering.xml                                                                           |ocument.wordprocessingml.styles+xml"/><Override PartName="/word/document.xml" ContentType="app
  word/styles.xml                                                                              |lication/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/><Override Part
  word/document.xml                                                                            |Name="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-p
  docProps/app.xml                                                                             |roperties+xml"/><Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlfo
  docProps/core.xml                                                                            |rmats-package.core-properties+xml"/>
  [Content_Types].xml                                                                          |</Types>

I posted a new question on this issue here.


Answer 1:


In addition to the .gitattributes file, you need to configure what odf means:

git config diff.odf.textconv odt2txt

And you need odt2txt (a simple converter from OpenDocument Text to plain text) in your $PATH (Linux/Mac) or %PATH% (Windows).

There is no need to configure difftool, since kdiff3 is picked by default.
However, kdiff3 needs to open a text file, hence the need for odt2txt (to convert the document into a text file first).
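
One likely reason for the unreadable characters, offered as an assumption since the thread does not confirm it: git difftool hands the raw file contents to the external tool and does not run the textconv filter, so kdiff3 still receives the zipped .odt bytes. A minimal sketch of a workaround follows. The helper name textconv_difftool is hypothetical; git cat-file --textconv is an existing Git option that prints a blob after passing it through the configured diff.<driver>.textconv program.

```shell
#!/bin/sh
# Hypothetical helper: compare text-converted versions of one tracked file.
# Usage: textconv_difftool <old-rev> <new-rev> <file> [tool]
# <file> is given relative to the repository root; tool defaults to kdiff3.
textconv_difftool() {
    old_rev=$1
    new_rev=$2
    file=$3
    tool=${4:-kdiff3}
    tmp=$(mktemp -d) || return 1
    status=0
    # Write both revisions as text, using the textconv filter configured
    # for the file's diff driver (for example diff.odf.textconv).
    if git cat-file --textconv "$old_rev:$file" > "$tmp/old.txt" &&
       git cat-file --textconv "$new_rev:$file" > "$tmp/new.txt"
    then
        # Hand the two plain-text files to the comparison tool.
        "$tool" "$tmp/old.txt" "$tmp/new.txt" || status=$?
    else
        status=1
    fi
    rm -rf "$tmp"
    return "$status"
}
```

With the .gitattributes entries and diff.odf.textconv in place, something like textconv_difftool HEAD~1 HEAD diffexperiment.odt should open the two converted texts in kdiff3 instead of the binary files.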


For more on textconv, see "Performing text diffs of binary files":

Sometimes it is desirable to see the diff of a text-converted version of some binary files. For example, a word processor document can be converted to an ASCII text representation, and the diff of the text shown.
Even though this conversion loses some information, the resulting diff is useful for human viewing (but cannot be applied directly).

The textconv config option is used to define a program for performing such a conversion. The program should take a single argument, the name of a file to convert, and produce the resulting text on stdout.

Note

The text conversion is generally a one-way conversion; this means that diffs generated by textconv are not suitable for applying.

For this reason, only git diff and the git log family of commands (i.e., log, whatchanged, show) will perform text conversion.
git format-patch will never generate this output.

If you want to send somebody a text-converted diff of a binary file (e.g., because it quickly conveys the changes you have made), you should generate it separately and send it as a comment in addition to the usual binary diff that you might send.
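
The contract quoted above (one file name as the argument, plain text on stdout) can be illustrated with a crude stand-in for odt2txt. This is only a sketch under stated assumptions: the function names strip_tags and odf_to_text are hypothetical, unzip must be installed, and the tag stripping is far rougher than what odt2txt does (it does not decode XML entities and keeps no formatting).

```shell
#!/bin/sh
# Hypothetical stand-in for odt2txt that follows the textconv contract:
# one argument (the file to convert), plain text on stdout.

# Replace every XML tag with a line break, then drop blank lines.
strip_tags() {
    awk '{ gsub(/<[^>]*>/, "\n"); print }' | sed '/^[[:space:]]*$/d'
}

# An OpenDocument file is a ZIP archive; the document body is content.xml.
odf_to_text() {
    unzip -p "$1" content.xml | strip_tags
}

# Convert only when exactly one file name is given,
# so the functions can also be sourced without side effects.
if [ "$#" -eq 1 ]; then
    odf_to_text "$1"
fi
```

Saved as an executable script on the PATH, it could be named in diff.odf.textconv in place of odt2txt; the real odt2txt remains the better choice when it is available.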


The OP Jack mentions in the comments:

On Linux I ran in my home directory:

$ git config diff.odf.textconv odt2txt

I had odt2txt installed... and I assume odt2txt is in $PATH, because when I run $ odt2txt, I get information on odt2txt.
However, none of those things seem to make git diff .odt files for some reason.
When I run $ git diff fileone.odt filetwo.odt, I still get the output Binary files fileone.odt and filetwo.odt differ instead of how the text differs.
Not sure why it's not working.
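
The thread leaves this comment unresolved. One thing worth ruling out: git config without --global writes to the current repository's configuration, so running it in a home directory that is not a repository does not produce a usable setting. The sketch below checks each prerequisite in turn from inside a repository. The function name check_textconv_setup is hypothetical; git rev-parse, git check-attr and git config --get are standard Git commands.

```shell
#!/bin/sh
# Hypothetical diagnostic: report which textconv prerequisite is missing.
# Usage: check_textconv_setup <file>
check_textconv_setup() {
    file=$1
    # 1. Are we inside a Git repository at all?
    git rev-parse --git-dir >/dev/null 2>&1 || {
        echo "not inside a git repository"; return 1; }
    # 2. Does .gitattributes assign a diff driver to this file?
    driver=$(git check-attr diff -- "$file" | sed 's/^.*: diff: //')
    case $driver in
        ''|unspecified|set|unset)
            echo "no diff driver assigned to $file"; return 1 ;;
    esac
    # 3. Is a textconv program configured for that driver?
    cmd=$(git config --get "diff.$driver.textconv") || {
        echo "diff.$driver.textconv is not set"; return 1; }
    # 4. Is the program (first word of the setting) on the PATH?
    prog=${cmd%% *}
    command -v "$prog" >/dev/null 2>&1 || {
        echo "$prog not found on PATH"; return 1; }
    echo "ok: $file uses driver '$driver' with '$cmd'"
}
```

Running check_textconv_setup fileone.odt in the directory where git diff was tried would show which of the four steps fails first.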




Answer 2:


My guess is that, in your case, kdiff3 shows the message

Some input characters could not be converted to valid unicode. You might be using the wrong codec. (e.g. UTF-8 for non UTF-8 files)....

because it cannot find a glyph for certain characters in the current font, i.e. it cannot draw them.

kdiff3 has lots of configuration options that can be set in the ~/.kdiff3rc configuration file (here is an example). I would play with the ones related to encoding and font. For example, start by changing the font:

Font=Arial

BTW, when you open these .odt files in your editor, which font makes them readable for you?

PS: Options can also be passed to kdiff3 on the command line: kdiff3 --cs "Option1=Val1" --cs "Option2=Val2" --cs ...



Source: https://stackoverflow.com/questions/33448260/how-to-diff-odt-files-with-difftool-kdiff3-diff-outputs-unreadable-characters
