Why does Office OpenXML split text between tags, and how can it be prevented?

Submitted by 隐身守侯 on 2019-12-04 10:27:14

I have worked with this problem a lot:

In Word, the document can be saved like this:

  <w:t>{</w:t>...
  <w:t>variable</w:t>
  <w:t>}</w:t>

I have therefore created a JS library that works even if variable names are split: DocXgenjs (it works server-side too). What I found out during development about when variable names get split:

  • The text is not split if it is composed only of a-zA-Z characters (no {, $ or }).
  • The text might be split if it wasn't written in one stroke. For example, if you make a spelling mistake and write ${varuable}, then edit it to ${variable}, the text inside the XML is very likely to be split. Basically, you have to write your variable names in one stroke, and if you wish to edit one, rewrite the variable name completely.

I don't think there's a way to fix a docx document with one command in Word, but rewriting the variables in one stroke should work.
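
A program that reads the document does not have to depend on the placeholder sitting in a single <w:t>. The following is a minimal Python sketch, not how DocXgenjs is implemented: it joins the text of all <w:t> elements within each paragraph and then searches the joined string. The function names, the {name} placeholder pattern, and the SAMPLE fragment (including where the <w:proofErr> markers fall) are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical fragment: one paragraph whose placeholder is split over three runs.
SAMPLE = (
    '<w:document xmlns:w="' + W_NS + '"><w:body><w:p>'
    '<w:r><w:t>{</w:t></w:r>'
    '<w:proofErr w:type="spellStart"/>'
    '<w:r><w:t>variable</w:t></w:r>'
    '<w:proofErr w:type="spellEnd"/>'
    '<w:r><w:t>}</w:t></w:r>'
    '</w:p></w:body></w:document>'
)


def paragraph_texts(document_xml):
    """Return the concatenated <w:t> text of every <w:p> in the XML string."""
    root = ET.fromstring(document_xml)
    texts = []
    for para in root.iter("{" + W_NS + "}p"):
        pieces = [t.text or "" for t in para.iter("{" + W_NS + "}t")]
        texts.append("".join(pieces))
    return texts


def find_placeholders(document_xml):
    """Find {name} placeholders, even when Word split them across runs."""
    pattern = re.compile(r"\{(\w+)\}")
    found = []
    for text in paragraph_texts(document_xml):
        found.extend(pattern.findall(text))
    return found


if __name__ == "__main__":
    print(paragraph_texts(SAMPLE))
    print(find_placeholders(SAMPLE))
```

This only locates placeholders. Replacing them in place is harder, because the replacement text has to be written back into one of the original runs while the others are emptied.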

The primary cause of this is the proofErr element: Word identifies something that it deems misspelt and brackets it with <w:proofErr> markers, inevitably splitting the original text.

If this happens to you, I recommend the following. It is tedious, but it is the only sure-fire way:

  1. Rename .docx to .zip.
  2. Extract contents of the archive.
  3. Open word\document.xml.
  4. Make the corrections (i.e. put the split text together) and save.
  5. Compress the contents back into a .zip archive and rename it to .docx.
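
The steps above can be sketched in Python with the standard zipfile module, since a .docx file is a ZIP archive. This is a rough sketch under assumptions: rewrite_docx and strip_proof_errors are hypothetical names, the document part is assumed to be UTF-8, and the example transform only deletes <w:proofErr> markers. It does not merge the neighbouring runs, so putting the split text back together would still need a further transform or a manual edit.

```python
import io
import re
import zipfile

# Inside the archive the part name uses forward slashes.
DOCUMENT_PART = "word/document.xml"

# Matches self-closing markers such as <w:proofErr w:type="spellStart"/>.
PROOF_ERR = re.compile(r"<w:proofErr\b[^>]*/>")


def strip_proof_errors(xml_text):
    """Example transform: drop the proofing markers, keep everything else."""
    return PROOF_ERR.sub("", xml_text)


def rewrite_docx(src, dst, transform=strip_proof_errors):
    """Copy the archive src to dst, passing word/document.xml through transform.

    src and dst may be file paths or binary file-like objects.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == DOCUMENT_PART:
                # Assumption: the part is UTF-8 encoded.
                data = transform(data.decode("utf-8")).encode("utf-8")
            zout.writestr(item, data)


if __name__ == "__main__":
    # Hypothetical file names; work on a copy and keep the original.
    rewrite_docx("template.docx", "template-clean.docx")
```

Writing to a new file rather than overwriting the source keeps the original available if Word rejects the result.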

EDIT

This Visual Studio Extension lets you edit the contents of the OpenXML package directly. This allows you to skip steps 1 & 2.

JD from AT

Word does this for several reasons, e.g. to mark spelling errors, or to keep track of changes and achieve a better result when merging documents based on rsid numbers (http://blogs.msdn.com/b/brian_jones/archive/2006/12/11/what-s-up-with-all-those-rsids.aspx).

And here you can find a solution to get the document cleaned up: https://stackoverflow.com/a/7768161
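
To illustrate the rsid point, here is a small Python sketch that removes the w:rsid* attributes from a piece of document XML, so that runs differing only in their revision identifiers end up with identical tags. It is not the method from the linked answer. The function name and the sample attribute values are made up, and the sketch assumes the revision-tracking information is not needed afterwards. It does not merge runs by itself.

```python
import re

# Matches attributes such as w:rsidR="00A1B2C3" or w:rsidRPr="00D4E5F6",
# including the whitespace in front of them.
RSID_ATTR = re.compile(r'\sw:rsid\w*="[^"]*"')


def strip_rsids(xml_text):
    """Remove every w:rsid* attribute from the given XML text."""
    return RSID_ATTR.sub("", xml_text)


if __name__ == "__main__":
    run = '<w:r w:rsidR="00A1B2C3" w:rsidRPr="00D4E5F6"><w:t>variable</w:t></w:r>'
    print(strip_rsids(run))
```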
