How to open and manipulate Word document/template in Java?

日久生厌 2020-12-13 20:35

I need to open a .doc/.dot/.docx/.dotx (I'm not picky, I just want it to work) document, parse it for placeholders (or something similar), put my own data, an

5 Answers
  • 2020-12-13 21:06

    I recently dealt with a similar problem: "A tool which accepts a template '.docx' file, processes the file by evaluating a passed parameter context, and outputs a '.docx' file as the result of the process."

    Eventually we came across scriptlet4docx :). Its key features are: 1. Groovy code injection as scripts in the template file (parameter injection, etc.) 2. looping over collection items in a table

    It has many other features as well. However, when I checked, the last commit to the project was about a year old, so it may no longer be getting new features or bug fixes. Whether to use it is your choice.

  • 2020-12-13 21:07

    I've been in more or less the same situation as you: I had to modify a whole bunch of MS Word merge templates at once. After a lot of googling for a Java solution, I finally installed Visual Studio 2010 Express, which is free, and did the job in C#.

  • 2020-12-13 21:13

    I ended up relying on Apache POI 3.12 and processing paragraphs. Paragraphs inside tables, headers/footers, and footnotes have to be extracted separately, because XWPFDocument.getParagraphs() does not return them.

    The processing code (~100 lines) and unit tests are here on GitHub.

  • 2020-12-13 21:22

    Since a docx file is merely a ZIP archive of XML files (plus any binary files for embedded objects such as images), we met that requirement by unpacking the ZIP file, feeding document.xml to a template engine (we used FreeMarker) that does the merging for us, and then zipping the output to get the new docx file.

    The template document is then simply an ordinary docx with embedded FreeMarker expressions / directives, and can be edited in Word.

    Since (un)zipping can be done with the JDK, and FreeMarker is open source, you don't incur any licence fees, not even for Word itself.

    The limitation is that this approach can only emit docx or rtf files, and the output document will have the same filetype as the template. If you need to convert the document to another format (such as pdf) you'll have to solve that problem separately.
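
    The unpack / merge / repack cycle described above can be sketched with the JDK alone. This is only an illustration under stated assumptions: the class name DocxTemplateSketch and its methods are hypothetical, and a plain literal string replacement stands in for the FreeMarker merge step. Entries are streamed from the template archive straight into the output archive, so no temporary directory is needed.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;
    import java.util.zip.ZipOutputStream;

    // Hypothetical helper, not part of any library mentioned in this thread
    public class DocxTemplateSketch {

        private static final String MAIN_PART = "word/document.xml";

        /** Copies every entry of the template archive, merging values into the main document part only. */
        public static void fill(InputStream template, OutputStream result, Map<String, String> values)
                throws IOException {
            try (ZipInputStream zin = new ZipInputStream(template);
                 ZipOutputStream zout = new ZipOutputStream(result)) {
                ZipEntry entry;
                while ((entry = zin.getNextEntry()) != null) {
                    // A fresh ZipEntry avoids carrying over size/CRC data from the source entry
                    zout.putNextEntry(new ZipEntry(entry.getName()));
                    if (!entry.isDirectory()) {
                        byte[] content = readAll(zin);
                        if (MAIN_PART.equals(entry.getName())) {
                            String xml = new String(content, StandardCharsets.UTF_8);
                            content = merge(xml, values).getBytes(StandardCharsets.UTF_8);
                        }
                        zout.write(content);
                    }
                    zout.closeEntry();
                }
            }
        }

        /** Stand-in for the template engine: literal replacement of each key (values must not be null). */
        public static String merge(String xml, Map<String, String> values) {
            for (Map.Entry<String, String> e : values.entrySet()) {
                xml = xml.replace(e.getKey(), e.getValue());
            }
            return xml;
        }

        /** Reads the remainder of the current ZIP entry into memory. */
        private static byte[] readAll(InputStream in) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        }
    }
    ```

    To use a real template engine, only merge() would need to change; the archive handling stays the same.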

  • 2020-12-13 21:23

    I know it's been a long time since I've posted this question, and I said that I would post my solution when I'm finished. So here it is.

    I hope that it will help someone someday. This is a full working class (it relies on the JSF and Servlet APIs to send the response). All you have to do is put it in your application and place the template directory (TEMPLATE_DIRECTORY_ROOT, i.e. TEMPLATES_DIRECTORY/) with your .docx templates in your root directory.

    Usage is very simple. You put placeholders (keys) in your .docx file, and then pass the file name and a Map containing the corresponding key-value pairs for that file.

    Enjoy!

    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.BufferedReader;
    import java.io.Closeable;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.URI;
    import java.util.Deque;
    import java.util.Enumeration;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.Map;
    import java.util.UUID;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipFile;
    import java.util.zip.ZipOutputStream;
    
    import javax.faces.context.ExternalContext;
    import javax.faces.context.FacesContext;
    import javax.servlet.http.HttpServletResponse;
    
    public class DocxManipulator {
    
        private static final String MAIN_DOCUMENT_PATH = "word/document.xml";
        private static final String TEMPLATE_DIRECTORY_ROOT = "TEMPLATES_DIRECTORY/";
    
    
        /*    PUBLIC METHODS    */
    
        /**
         * Generates .docx document from given template and the substitution data
         * 
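         * @param templateName
         *            File name of the template inside the template directory
         * @param substitutionData
         *            Hash map with the set of key-value pairs that represent
         *            substitution data
         * @return true if the document was generated and sent, false if an I/O error occurred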
         */
        public static Boolean generateAndSendDocx(String templateName, Map<String,String> substitutionData) {
    
            String templateLocation = TEMPLATE_DIRECTORY_ROOT + templateName;
    
            String userTempDir = UUID.randomUUID().toString();
            userTempDir = TEMPLATE_DIRECTORY_ROOT + userTempDir + "/";
    
            try {
                try {
                    // Unzip .docx file
                    unzip(new File(templateLocation), new File(userTempDir));

                    // Change data
                    changeData(new File(userTempDir + MAIN_DOCUMENT_PATH), substitutionData);

                    // Rezip .docx file
                    zip(new File(userTempDir), new File(userTempDir + templateName));

                    // Send HTTP response
                    sendDOCXResponse(new File(userTempDir + templateName), templateName);
                }
                finally {
                    // Clean temp data even when one of the steps above fails
                    deleteTempData(new File(userTempDir));
                }
            }
            catch (IOException ioe) {
                System.out.println(ioe.getMessage());
                return false;
            }

            return true;
        }
    
    
        /*    PRIVATE METHODS    */
    
        /**
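         * Unzips specified ZIP file to specified directory
         * 
         * @param zipfile
         *            Source ZIP file
         * @param directory
         *            Destination directory
         * @throws IOException
         */
        private static void unzip(File zipfile, File directory) throws IOException {

            // try-with-resources closes the archive; a ZipFile left open can block deletion on Windows
            try (ZipFile zfile = new ZipFile(zipfile)) {
                Enumeration<? extends ZipEntry> entries = zfile.entries();
                String targetRoot = directory.getCanonicalPath() + File.separator;

                while (entries.hasMoreElements()) {
                    ZipEntry entry = entries.nextElement();
                    File file = new File(directory, entry.getName());
                    // Reject entries such as "../x" that would be written outside the destination
                    if (!file.getCanonicalPath().startsWith(targetRoot)) {
                        throw new IOException("Entry outside destination directory: " + entry.getName());
                    }
                    if (entry.isDirectory()) {
                        file.mkdirs();
                    }
                    else {
                        file.getParentFile().mkdirs();
                        try (InputStream in = zfile.getInputStream(entry)) {
                            copy(in, file);
                        }
                    }
                }
            }
        }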
    
    
        /**
         * Substitutes keys found in target file with corresponding data
         * 
         * @param targetFile
         *            Target file
         * @param substitutionData
         *            Map of key-value pairs of data
         * @throws IOException
         */
        private static void changeData(File targetFile, Map<String,String> substitutionData) throws IOException {

            // Read the whole file as characters; reading line by line would drop the line breaks
            StringBuilder content = new StringBuilder();
            try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(targetFile), "UTF-8"))) {
                char[] buffer = new char[8192];
                int count;
                while ((count = br.read(buffer)) != -1)
                    content.append(buffer, 0, count);
            }

            String docxTemplate = content.toString();
            for (Map.Entry<String,String> pair : substitutionData.entrySet()) {
                // "NEDOSTAJE" ("missing") marks placeholders that have no value.
                // Values are inserted verbatim, so they must already be valid XML text.
                String value = pair.getValue() != null ? pair.getValue() : "NEDOSTAJE";
                docxTemplate = docxTemplate.replace(pair.getKey(), value);
            }

            // FileOutputStream truncates the existing file, so no explicit delete is needed
            try (FileOutputStream fos = new FileOutputStream(targetFile)) {
                fos.write(docxTemplate.getBytes("UTF-8"));
            }
        }
    
        /**
         * Zips specified directory and all its subdirectories
         * 
         * @param directory
         *            Specified directory
         * @param zipfile
         *            Output ZIP file name
         * @throws IOException
         */
        private static void zip(File directory, File zipfile) throws IOException {
    
            URI base = directory.toURI();
            Deque<File> queue = new LinkedList<File>();
            queue.push(directory);
            OutputStream out = new FileOutputStream(zipfile);
            Closeable res = out;
    
            try {
              ZipOutputStream zout = new ZipOutputStream(out);
              res = zout;
              while (!queue.isEmpty()) {
                directory = queue.pop();
                for (File kid : directory.listFiles()) {
                  String name = base.relativize(kid.toURI()).getPath();
                  if (kid.isDirectory()) {
                    queue.push(kid);
                    name = name.endsWith("/") ? name : name + "/";
                    zout.putNextEntry(new ZipEntry(name));
                  } 
                  else {
                    if(kid.getName().contains(".docx"))
                        continue;  
                    zout.putNextEntry(new ZipEntry(name));
                    copy(kid, zout);
                    zout.closeEntry();
                  }
                }
              }
            } 
            finally {
              res.close();
            }
          }
    
        /**
         * Sends HTTP Response containing .docx file to Client
         * 
         * @param generatedFile
         *            Path to generated .docx file
         * @param fileName
         *            File name of generated file that is being presented to user
         * @throws IOException
         */
        private static void sendDOCXResponse(File generatedFile, String fileName) throws IOException {
    
            FacesContext facesContext = FacesContext.getCurrentInstance();
            ExternalContext externalContext = facesContext.getExternalContext();
            HttpServletResponse response = (HttpServletResponse) externalContext
                    .getResponse();
    
            response.reset();
            // MIME type for .docx; "application/msword" is the type of the legacy .doc format
            response.setHeader("Content-Type", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + fileName + "\"");
            response.setHeader("Content-Length", String.valueOf(generatedFile.length()));

            // try-with-resources closes both streams even if copying fails
            try (BufferedInputStream input = new BufferedInputStream(new FileInputStream(generatedFile), 10240);
                 BufferedOutputStream output = new BufferedOutputStream(response.getOutputStream(), 10240)) {

                byte[] buffer = new byte[10240];
                for (int length; (length = input.read(buffer)) > 0;) {
                    output.write(buffer, 0, length);
                }

                output.flush();
            }
    
            // Inform JSF not to proceed with rest of life cycle
            facesContext.responseComplete();
        }
    
    
        /**
         * Deletes directory and all its subdirectories
         * 
         * @param file
         *            Specified directory
         * @throws IOException
         */
        public static void deleteTempData(File file) throws IOException {
    
            if (file.isDirectory()) {
    
                // directory is empty, then delete it
                if (file.list().length == 0)
                    file.delete();
                else {
                    // list all the directory contents
                    String files[] = file.list();
    
                    for (String temp : files) {
                        // construct the file structure
                        File fileDelete = new File(file, temp);
                        // recursive delete
                        deleteTempData(fileDelete);
                    }
    
                    // check the directory again, if empty then delete it
                    if (file.list().length == 0)
                        file.delete();
                }
            } else {
                // if file, then delete it
                file.delete();
            }
        }
    
        private static void copy(InputStream in, OutputStream out) throws IOException {
    
            byte[] buffer = new byte[1024];
            while (true) {
              int readCount = in.read(buffer);
              if (readCount < 0) {
                break;
              }
              out.write(buffer, 0, readCount);
            }
          }
    
          private static void copy(File file, OutputStream out) throws IOException {
            InputStream in = new FileInputStream(file);
            try {
              copy(in, out);
            } finally {
              in.close();
            }
          }
    
          private static void copy(InputStream in, File file) throws IOException {
            OutputStream out = new FileOutputStream(file);
            try {
              copy(in, out);
            } finally {
              out.close();
            }
         }
    
    }
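
    The substitution step performed by changeData can be tried in isolation, without JSF or a real .docx file. The sketch below is a hedged illustration, not the class above: PlaceholderSketch, its method names, the "#COMPANY#" style of key, and the fallback parameter are all assumptions made for the example. It also escapes the values, because text inserted verbatim into document.xml breaks the document if it contains characters such as & or <.

    ```java
    import java.util.Map;

    // Hypothetical stand-alone helper for experimenting with placeholder keys
    public class PlaceholderSketch {

        /** Escapes the characters that are significant in XML text content. */
        public static String escapeXml(String value) {
            return value.replace("&", "&amp;")   // must run first so later entities are not re-escaped
                        .replace("<", "&lt;")
                        .replace(">", "&gt;");
        }

        /** Replaces each key found in xml with its escaped value; null values become the fallback text. */
        public static String substitute(String xml, Map<String, String> data, String fallback) {
            String result = xml;
            for (Map.Entry<String, String> pair : data.entrySet()) {
                String value = pair.getValue() != null ? pair.getValue() : fallback;
                result = result.replace(pair.getKey(), escapeXml(value));
            }
            return result;
        }
    }
    ```

    With the keys #COMPANY# mapped to "Smith & Sons" and #CITY# mapped to null, calling substitute("<w:t>#COMPANY#, #CITY#</w:t>", data, "MISSING") returns "<w:t>Smith &amp;amp; Sons, MISSING</w:t>" as it would appear in the XML source, that is, the ampersand is stored as the entity &amp;amp; and Word displays it as a plain &.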
    