How to zip a WordprocessingML folder into readable docx

Solution 1

The most common problem when manually zipping Open XML documents is zipping the directory itself instead of its contents, which produces a file that Word will not open. In other words, the [Content_Types].xml file and the word, docProps, and _rels directories need to reside at the root level of the zip file.
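
A minimal Python sketch of that rule, assuming the unpacked package lives in a directory such as unzipped/. The function name zip_package and the paths are illustrative and not part of the original answer. Every file is stored under a name relative to the package directory, so [Content_Types].xml lands at the archive root rather than under a parent folder.

```python
import os
import zipfile


def zip_package(src_dir, out_path):
    """Zip the *contents* of src_dir so its files sit at the archive root."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(folder, name)
                # Name relative to src_dir, e.g. "word/document.xml",
                # never "unzipped/word/document.xml".
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, arcname)
```

Write the output file outside src_dir (for example zip_package("unzipped", "rezipped.docx")), otherwise the archive would try to include itself.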

Solution 2

Here are steps to unzip my.docx and re-zip:

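% mkdir unzipped
% cd unzipped/
% unzip ../my.docx                # unpack the package into the current directory
% zip -r ../rezipped.docx *       # zip the contents, not the enclosing directory
% open ../rezipped.docx           # macOS: open the result in the default application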
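
For a program rather than a shell session, the same round trip can be sketched in Python with the standard zipfile module. The function name rezip_docx is hypothetical, and the sketch assumes a well-formed package with ordinary part names.

```python
import os
import tempfile
import zipfile


def rezip_docx(src_docx, dst_docx):
    """Unpack src_docx into a temporary directory and zip it again."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(src_docx) as zin:
            names = zin.namelist()  # remember the original entry order
            zin.extractall(tmp)
        with zipfile.ZipFile(dst_docx, "w", zipfile.ZIP_DEFLATED) as zout:
            for name in names:
                if name.endswith("/"):  # skip explicit directory entries
                    continue
                # Store each file under its original root-relative name.
                zout.write(os.path.join(tmp, *name.split("/")), name)
```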

Solution 3

The container is an ordinary ZIP archive, and its entries are typically compressed with Deflate. No Base64 encoding is involved.

7-Zip seems to offer this, though I have not tested it.
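
One way to check what a given file actually uses is to read the compression method recorded for each ZIP entry. The following is a small sketch with Python's zipfile module; the helper name compression_methods is made up for illustration.

```python
import zipfile

# Human-readable names for the two methods most often seen in packages.
METHODS = {zipfile.ZIP_STORED: "stored", zipfile.ZIP_DEFLATED: "deflate"}


def compression_methods(path):
    """Map each entry name in a ZIP-based package to its compression method."""
    with zipfile.ZipFile(path) as zf:
        return {
            info.filename: METHODS.get(info.compress_type, str(info.compress_type))
            for info in zf.infolist()
        }
```

Running this on a Word-produced file and on a hand-zipped one shows whether the two differ in compression method at all.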

Solution 4

Further to what Mica said, the contents of the ZIP file are organised according to the Open Packaging Conventions (OPC); cf. Microsoft's Essentials of the Open Packaging Conventions.

You can use the .NET System.IO.Packaging namespace to make and manipulate .docx files; it is also implemented in the Mono project.
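
Outside .NET, the package layout can be inspected with any ZIP library. The Python sketch below checks that [Content_Types].xml and _rels/.rels sit at the archive root and then looks up the main document part named in _rels/.rels. The function name main_document_part is hypothetical, and the relationship type shown is the one used by ordinary (transitional) Word files; strict-conformance files use a different type URI, so treat this as an assumption.

```python
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Assumption: transitional relationship type, as written by typical Word files.
OFFICE_DOC = ("http://schemas.openxmlformats.org/officeDocument/2006/"
              "relationships/officeDocument")


def main_document_part(path):
    """Return the main part named by _rels/.rels, or None if it cannot be found."""
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
        if "[Content_Types].xml" not in names or "_rels/.rels" not in names:
            return None  # required parts are not at the archive root
        root = ET.fromstring(zf.read("_rels/.rels"))
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOC:
            return rel.get("Target", "").lstrip("/")
    return None
```

A file that was zipped one directory too high returns None here, because its parts are stored under a folder prefix instead of at the root.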

Author by

Admin

Updated on June 02, 2022

Comments

  • Admin
    Admin almost 2 years

    I have been trying to write a simple Markdown -> docx parser/writer, but am completely stuck with the last part, which should be the easiest: i.e. compressing the folder into a .docx that Word, or any other .docx reader, will recognize.

    My parser-writer is irrelevant really: I have this problem if I simply unzip any old Word-produced *.docx and then try to recompress it with the usual compression utilities, giving it the file-ending docx. Is there some mysterious header I should be adding, or do I need a special OPC compression utility, or what?

    I don't so much want a tool that will do this, as to figure out what is supposed to be there. It seems to be independent of the WordprocessingML specification.

    Needless to say I don't know anything about compression. Everything I can find via Google has to do with fancy utilities you can use in business, but I'm making a little executable that would be GPLd or something, and should work on anything.

  • applicative
    applicative about 13 years
    Hi, I am the original poster, but I lost this S.O. account, else I would mark this as the 'right answer'. You are right that my mistake was to zip the directory that contained all the material, thinking I needed the right incantation, form of compression ... some subtlety. MS Word is quite willing to open the file if I add all the relevant files (including whole subdirectories like word, which must themselves sit at the root level) to a single zip file. So far I have tried this on OS X without incident. I will study things more.
  • Lei Yang
    Lei Yang over 10 years
    Truly open: self-made docx files zipped with WinZip and WinRAR are all readable!