Simplify/Clean up XML of a DOCX Word document

甜味超标 2021-02-04 02:40

I have a Microsoft Word Document (docx) and I use Open XML SDK 2.0 Productivity Tool to generate C# code from it.

I want to programmatically insert some database values

4 Answers
  •  梦毁少年i
    2021-02-04 03:22

    I have found a solution: the Open XML PowerTools Markup Simplifier.

    I followed the steps described at http://ericwhite.com/blog/2011/03/09/getting-started-with-open-xml-powertools-markup-simplifier/, but they didn't work exactly as written (possibly because PowerTools is now at version 2.2). So I compiled PowerTools 2.2 in "Release" mode and added a reference to OpenXmlPowerTools.dll in my TestMarkupSimplifier.csproj. In Program.cs I changed only the path to my DOCX file. After running the program once, my document appears to be fairly clean.

    Code quoted from Eric's blog in the link above:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using OpenXmlPowerTools;
    using DocumentFormat.OpenXml.Packaging;
    
    class Program
    {
        static void Main(string[] args)
        {
            // Open the document for editing (second argument: isEditable = true).
            using (WordprocessingDocument doc = WordprocessingDocument.Open("Test.docx", true))
            {
                // Choose which kinds of markup the simplifier should strip.
                SimplifyMarkupSettings settings = new SimplifyMarkupSettings
                {
                    RemoveComments = true,
                    RemoveContentControls = true,
                    RemoveEndAndFootNotes = true,
                    RemoveFieldCodes = false, // keep field codes
                    RemoveLastRenderedPageBreak = true,
                    RemovePermissions = true,
                    RemoveProof = true,
                    RemoveRsidInfo = true,
                    RemoveSmartTags = true,
                    RemoveSoftHyphens = true,
                    ReplaceTabsWithSpaces = true,
                };
                // Rewrite the document's markup according to the settings above.
                MarkupSimplifier.SimplifyMarkup(doc, settings);
            }
        }
    }
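
    To check whether the simplifier actually changed anything, one option is to count the rsid attributes in the document body before and after running it. The following is only a minimal sketch and is not part of Eric's post: the class name RsidCounter and the helper CountRsidAttributes are made up for illustration, it assumes the body is stored at word/document.xml (the usual location), and on .NET Framework it assumes a reference to System.IO.Compression.FileSystem.

    ```csharp
    using System;
    using System.IO.Compression;
    using System.Linq;
    using System.Xml.Linq;

    static class RsidCounter
    {
        // Hypothetical helper: counts attributes whose local name starts
        // with "rsid" (for example w:rsidR, w:rsidRPr, w:rsidRDefault).
        public static int CountRsidAttributes(XDocument xml)
        {
            return xml.Descendants()
                      .Attributes()
                      .Count(a => a.Name.LocalName.StartsWith("rsid", StringComparison.Ordinal));
        }

        static void Main(string[] args)
        {
            // Path defaults to the same file name used in the answer's code.
            string path = args.Length > 0 ? args[0] : "Test.docx";

            // A .docx file is a ZIP package, so it can be read without the Open XML SDK.
            using (ZipArchive zip = ZipFile.OpenRead(path))
            {
                // Assumption: the main document part is at word/document.xml.
                ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
                if (entry == null)
                {
                    Console.Error.WriteLine("word/document.xml not found in " + path);
                    return;
                }

                using (var stream = entry.Open())
                {
                    XDocument xml = XDocument.Load(stream);
                    Console.WriteLine("rsid attributes: " + CountRsidAttributes(xml));
                }
            }
        }
    }
    ```

    Running this against the DOCX before and after the simplifier gives a rough measure of the cleanup; with RemoveRsidInfo = true the count should drop.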
    
