How does Open Office compress its files?

Submitted by 孤人 on 2019-11-30 05:16:13

This worked for me:

  1. Uncompress the original document file (it's a normal zip file) to some directory:

    $ mkdir document
    $ cd document
    $ unzip ../document.odt
    
  2. Modify the uncompressed data.

  3. Create a new odt:

    $ zip -0 -X ../document2.odt mimetype
    $ zip -r ../document2.odt * -x mimetype
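
To check that the repacked file has the expected layout, you can list its entries and look at the first one. This is only a sketch: it assumes Info-ZIP's unzip is installed (its -Z1 zipinfo mode prints one entry name per line), and first_entry is a helper name made up for this example.

```shell
#!/bin/sh
# first_entry is a hypothetical helper, not part of zip/unzip.
# It prints the name of the first entry stored in a zip-based package.
first_entry() {
    # -Z1: zipinfo mode, entry names only, in archive order
    unzip -Z1 "$1" | head -n 1
}

# Usage (assumes document2.odt was built with the steps above):
#   first_entry ../document2.odt    # should print "mimetype"
```

If anything other than mimetype comes out first, the archive was probably created in a single zip call rather than in two passes.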
    

Section 17 of the OASIS OpenDocument specification defines how OpenDocument packages need to be packaged.

Section 17.4 MIME Type Stream reads like this:

If a MIME type for a document that makes use of packages is existing, then the package SHOULD contain a stream called "mimetype". This stream SHOULD be first stream of the package's zip file, it MUST NOT be compressed, and it MUST NOT use an 'extra field' in its header (see [ZIP]).

The purpose is to allow packaged files to be identified through 'magic number' mechanisms, such as Unix's file/magic utility. If a ZIP file contains a stream at the beginning of the file that is uncompressed, and has no extra data in the header, then the stream name and the stream content can be found at fixed positions. More specifically, one will find:

  • a string 'PK' at position 0 of all zip files
  • a string 'mimetype' at position 30 of all such package files
  • the mimetype itself at position 38 of such a package.
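
The fixed positions quoted above can be checked by hand with dd and od. The following is a rough sketch, not a full ZIP parser: odf_mimetype is a hypothetical helper name, and it assumes that the MIME type is shorter than 256 bytes and that its length is recorded in the compressed-size field of the first local file header (byte 18 in the standard ZIP header layout), which should be the case for a file written with zip -0 -X.

```shell
#!/bin/sh
# odf_mimetype is a hypothetical helper written for this example.
# It prints the MIME type of an OpenDocument package by reading fixed
# byte offsets, and returns 1 if the expected markers are not found.
odf_mimetype() {
    pkg=$1
    # Offset 0: the ZIP local file header signature starts with "PK".
    sig=$(dd if="$pkg" bs=1 count=2 2>/dev/null | tr -d '\000')
    # Offset 30: name of the first entry, expected to be "mimetype".
    name=$(dd if="$pkg" bs=1 skip=30 count=8 2>/dev/null | tr -d '\000')
    if [ "$sig" != "PK" ] || [ "$name" != "mimetype" ]; then
        return 1
    fi
    # Offset 18: low byte of the compressed size (assumes length < 256
    # and that the size is stored in the header, not a data descriptor).
    size=$(od -An -tu1 -j18 -N1 "$pkg" | tr -d '[:space:]')
    # Offset 38: the MIME type string itself.
    dd if="$pkg" bs=1 skip=38 count="$size" 2>/dev/null
    echo
}

# Usage:
#   odf_mimetype document.odt || echo "no mimetype stream at the fixed offsets"
```

A package whose mimetype entry is compressed, not first, or carries an extra field will fail this check, which is exactly the situation the specification text is trying to prevent.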

I tried tokland's suggestion, but I found that LibreOffice 4 requires a specific order (perhaps only for the first entries?):

  1. mimetype (uncompressed)
  2. meta.xml
  3. settings.xml
  4. content.xml
  5. Thumbnails/thumbnail.png
  6. Configurations2/images/Bitmaps/
  7. Configurations2/popupmenu/
  8. Configurations2/toolpanel/
  9. Configurations2/statusbar/
  10. Configurations2/progressbar/
  11. Configurations2/toolbar/
  12. Configurations2/menubar/
  13. Configurations2/accelerator/current.xml
  14. Configurations2/floater/
  15. styles.xml
  16. META-INF/manifest.xml

I wrote a script to do that, folder2od.sh:

#!/bin/sh

# Convert folder (unzipped OpenDocument file) to OpenDocument file (odt, ods, etc.)
# Usage: ./folder2od.sh "path/to/folder" "file.odt"

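# Resolve absolute paths before changing directory.
cmdfolder=$(cd "$(dirname "$0")" && pwd -P)
folder=$(cd "$(dirname "$2")" && pwd -P)
file=$(basename "$2")
absfile="$folder/$file"

cd "$1" || exit 1
# mimetype must be the first entry: stored (-0), without extra fields (-X).
# Write it to "$absfile" so the later zip calls append to the same archive.
zip -0 -X "$absfile" "mimetype"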

list=$(cat <<'END_HEREDOC'
meta.xml
settings.xml
content.xml
Thumbnails/thumbnail.png
Configurations2/images/Bitmaps/
Configurations2/popupmenu/
Configurations2/toolpanel/
Configurations2/statusbar/
Configurations2/progressbar/
Configurations2/toolbar/
Configurations2/menubar/
Configurations2/accelerator/current.xml
Configurations2/floater/
styles.xml
META-INF/manifest.xml
END_HEREDOC
)

for f in $list
do
    zip "$absfile" "$f"
done

cd "$cmdfolder"

I found some useful information here: http://www.jejik.com/articles/2010/03/how_to_correctly_create_odf_documents_using_zip/

The shell script worked for me, too :) I had problems zipping an odt file back up after unzipping it. I guess the manifest part was what was missing.

The shell script above did not handle inline pictures/graphics, however, so I made some small adjustments which worked for me (also, the script had a bug in that END_HEREDOC was not on a dedicated line):

#!/bin/sh

# Convert folder (unzipped OpenDocument file) to OpenDocument file (odt, ods, etc.)
# Usage: ./folder2od.sh "path/to/folder" "file.odt"

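# Resolve absolute paths before changing directory.
cmdfolder=$(cd "$(dirname "$0")" && pwd -P)
folder=$(cd "$(dirname "$2")" && pwd -P)
file=$(basename "$2")
absfile="$folder/$file"

cd "$1" || exit 1
# mimetype must be the first entry: stored (-0), without extra fields (-X).
# Write it to "$absfile" so the later zip calls append to the same archive.
zip -0 -X "$absfile" "mimetype"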

list=$(cat <<'END_HEREDOC'
meta.xml
settings.xml
content.xml
Pictures/
Thumbnails/
Configurations2/
styles.xml
manifest.rdf
META-INF/manifest.xml
END_HEREDOC
)

for f in $list
do
    zip -r "$absfile" "$f"
done

cd "$cmdfolder"