Batch conversion of docx to clean HTML

别那么骄傲 2020-12-14 03:05

I'm starting to wonder if this is even possible. I've searched for solutions on Google and come up with nothing that works exactly how I'd like it to.

I think it\

3 Answers
  • 2020-12-14 03:44

    Hi, I'm not sure what the rules are on promoting your own solutions, so do let me know if I am out of line.

    I am a web developer who had the same issues, so I created my own tool: http://www.convertwordtohtml.com

    We are also working on a new version that will have better conversion quality and one-click conversion: e.g., you can right-click a Word file and it will be converted directly to HTML, with the code placed on the clipboard. The current version also supports command-line access, and the new version will have a server version too.

    There is a free trial version downloadable from the site, and if you have any questions, do contact me any time.

  • 2020-12-14 03:49

    Since I'm a big fan of Aspose.Words, a commercial library to create/process Word documents, I would do something like:

    1. Open the Word document with Aspose.Words.
    2. Save the Word document as HTML.
    3. Use something like SgmlReader or HTML Agility Pack (or even Regular Expressions if it is suitable) to remove unwanted HTML tags/attributes.
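
A rough sketch of step 3 only, assuming steps 1 and 2 have already produced Word-style HTML. It swaps in Python's standard-library `html.parser` for SgmlReader or HTML Agility Pack (which are .NET libraries), so it illustrates the idea rather than those tools' APIs. The whitelist contents and the names `HtmlCleaner` and `clean_html` are hypothetical choices for illustration:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tags to keep, and the attributes allowed on each.
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "a", "b", "i",
                "strong", "em", "table", "tr", "td", "th", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
# Elements whose entire content is discarded, not just the tag itself.
DROP_WITH_CONTENT = {"head", "style", "script"}
VOID_TAGS = {"br"}


class HtmlCleaner(HTMLParser):
    """Re-emits only whitelisted tags/attributes; other tags are unwrapped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_WITH_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag but keep any text inside it
        allowed = ALLOWED_ATTRS.get(tag, ())
        attr_text = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in allowed and value is not None
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_html(raw: str) -> str:
    """Return `raw` with non-whitelisted tags and attributes removed."""
    cleaner = HtmlCleaner()
    cleaner.feed(raw)
    cleaner.close()  # flush any buffered text
    return "".join(cleaner.out)
```

A whitelist is used here instead of regular expressions because Word's HTML output tends to vary; anything not explicitly allowed (such as `class`, `style`, or `span` wrappers) is dropped by default. For a batch run, the same function could be applied to each saved HTML file in turn.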

    Since you wrote that you work at a university, I'm not sure whether commercial packages are an option, though.

  • 2020-12-14 04:01

    This looks like just what you need: http://msdn.microsoft.com/en-us/library/ff628051(v=office.14).aspx

    The author, Eric White, blogged about his experiences developing that tool. You can see the list of posts on his blog here: http://blogs.msdn.com/b/ericwhite/archive/2008/10/20/eric-white-s-blog-s-table-of-contents.aspx#Open_XML_to_XHtml
