Is it possible to diff PowerPoint files version-controlled with git?

太阳男子 2021-02-06 10:39

I have some PowerPoint documents that I keep version-controlled with git. I want to know what the differences are between versions of a file. Text is most important, images and form

4 answers
  • 2021-02-06 11:18

    I was unable to install python-pptx, as suggested by the accepted answer, so I looked for a Node.js solution (which may also work for the other file formats it can handle).

    Install https://github.com/dbashford/textract (npm install --global textract).

    Define a "textract" diff driver in your .git/config. On my Windows machine:

    [diff "textract"]
        binary = true
        textconv=textract.cmd
    

    Define in your .gitattributes that *.pptx files should use the "textract" diff driver:

    *.pptx diff=textract
    

    git diff happily.
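
    Conceptually, a textconv driver has git convert both versions of the file to text and then diff the converted output. A minimal Python sketch of that idea using the standard difflib module; diff_converted and the sample strings are illustrative names, not part of git or textract:

```python
import difflib


def diff_converted(old_text, new_text, name="slides.pptx"):
    """Return unified-diff lines for two already-converted text dumps."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="a/" + name,
        tofile="b/" + name,
        lineterm="",
    ))


if __name__ == "__main__":
    # Hypothetical text dumps of two versions of the same deck.
    old = "Quarterly results\nRevenue grew 5%\n"
    new = "Quarterly results\nRevenue grew 7%\n"
    print("\n".join(diff_converted(old, new)))
```

    The real work is done by git itself; this only shows why a plain-text dump of each version is enough to get a readable diff.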

  • 2021-02-06 11:18

    Not really. A PowerPoint file is essentially a zip archive of a folder full of files. Git will treat it as a binary file (because it is one).

    Maybe there's a third-party extension that does it, but I've never heard of one.
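
    Because a .pptx is a zip archive, its slide text can be read with nothing but the Python standard library. A rough sketch, assuming the usual layout where slides live under ppt/slides/slideN.xml and text runs are DrawingML a:t elements; pptx_text is an illustrative name, and the slides are ordered by file number, which may not match the order shown in PowerPoint:

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for text runs (<a:t>) inside slide XML.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def pptx_text(path_or_file):
    """Return a sorted list of (slide_file_number, [text runs]) read from the zip."""
    slides = []
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            match = SLIDE_RE.match(name)
            if not match:
                continue  # skip themes, layouts, media, and other parts
            root = ET.fromstring(archive.read(name))
            runs = [node.text or "" for node in root.iter(A_NS + "t")]
            slides.append((int(match.group(1)), runs))
    return sorted(slides)


if __name__ == "__main__" and len(sys.argv) == 2:
    for number, runs in pptx_text(sys.argv[1]):
        print("--- slide %d ---" % number)
        for text in runs:
            print(text)
```

    A script like this could serve as a textconv command, though it ignores paragraph boundaries and speaker notes.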

  • 2021-02-06 11:36

    I can't speak directly to git as we use Visual Studio + TFS at work. However, a bit of research reveals this should work. What I do on VS is to integrate WinMerge and its plugin which supports a text comparison of MS Office and PDF files. This allows me to do diffs of pptx, docx, pdf, etc. files published to version control.

    For git, the way it should work is:

    1) Get WinMerge with the xdocdiff plugin: http://freemind.s57.xrea.com/xdocdiffPlugin/en/index.html
    2) Integrate WinMerge with git: https://coderwall.com/p/76wmzq/winmerge-as-git-difftool-on-windows

    Hopefully this will allow you to see the text-based diffs for your PowerPoint.

  • 2021-02-06 11:38

    I wrote this for use with git on the command-line (requires Python and the python-pptx library):

    """
    Setup -- Add these lines to the following files:
    --- .gitattributes
    *.pptx diff=pptx
    
    --- .gitconfig (or repo\.git\config    or your_user_home\.gitconfig) (change the path to point to your local copy of the script)
    [diff "pptx"]
        binary = true
        textconv = python C:/Python27/Scripts/git-pptx-textconv.py
    
    usage:
    git diff your_powerpoint.pptx
    
    
    Thanks to the  python-pptx docs and this snippet:
    http://python-pptx.readthedocs.org/en/latest/user/quickstart.html#extract-all-text-from-slides-in-presentation
    """
    
    import sys
    from pptx import Presentation
    
    
    if __name__ == '__main__':
        if len(sys.argv) != 2:
            sys.exit("Usage: git-pptx-textconv file.pptx")

        # Written for Python 3.7+; force UTF-8 so non-ASCII text survives the pipe to git
        sys.stdout.reconfigure(encoding='utf-8')

        # Convert left and right-hand quotes from Unicode to ASCII
        # found http://stackoverflow.com/questions/816285/where-is-pythons-best-ascii-for-this-unicode-database
        # go here if more power is needed  http://code.activestate.com/recipes/251871/
        # or here                          https://pypi.python.org/pypi/Unidecode/0.04.1
        punctuation = {0x2018: 0x27, 0x2019: 0x27, 0x201C: 0x22, 0x201D: 0x22}

        prs = Presentation(sys.argv[1])

        for slide in prs.slides:
            for shape in slide.shapes:
                if not shape.has_text_frame:
                    continue
                for paragraph in shape.text_frame.paragraphs:
                    par_text = ''
                    for run in paragraph.runs:
                        # Flatten control characters so each paragraph stays on one line
                        s = run.text
                        for ch in '\n\r\t\v':
                            s = s.replace(ch, ' ')
                        # Replace curly quotes with straight ones
                        par_text += s.translate(punctuation)
                    print(par_text)
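
    One possible refinement of the script above is to print a header per slide, so each diff hunk shows which slide changed. A sketch under that assumption; slide_lines is an illustrative name, and it relies only on the python-pptx attributes already used above (slides, shapes, has_text_frame, text_frame.paragraphs, runs, text):

```python
import sys


def slide_lines(prs):
    """Yield text lines for a Presentation-like object, one header per slide."""
    for number, slide in enumerate(prs.slides, start=1):
        yield "=== Slide %d ===" % number
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue
            for paragraph in shape.text_frame.paragraphs:
                text = "".join(run.text for run in paragraph.runs)
                if text.strip():  # skip empty paragraphs to keep the diff compact
                    yield text


if __name__ == "__main__" and len(sys.argv) == 2:
    # python-pptx is imported only when the file is run as a script.
    from pptx import Presentation
    for line in slide_lines(Presentation(sys.argv[1])):
        print(line)
```

    Dropping empty paragraphs is a design choice: it hides pure whitespace changes, which may or may not be what you want.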
    