SAX parsing and encoding

Submitted by 你离开我真会死。 on 2019-11-28 11:24:24

The characters() method is not guaranteed to deliver the complete text content of an element in a single call: the parser may split the text into several chunks, for example where it spans the parser's buffer boundaries. You need to buffer the characters yourself between the start and end element events.

e.g.

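import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class TextCollectingHandler extends DefaultHandler {
   // Accumulates the chunks delivered by successive characters() calls.
   private StringBuilder builder;

   @Override
   public void startElement(String uri, String localName, String qName, Attributes atts) {
      builder = new StringBuilder();   // fresh buffer for each element
   }

   @Override
   public void characters(char[] ch, int start, int length) {
      builder.append(ch, start, length);
   }

   @Override
   public void endElement(String uri, String localName, String qName) {
      String theFullText = builder.toString();   // complete text is available only here
   }
}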

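XML entities generate special events in SAX. You can catch them with a LexicalHandler, though it's generally not necessary. But this explains why you can't assume that you will receive only one characters event per tag. Use a buffer as explained in the other answers.

For instance, hello&amp;world may generate the following sequence (startEntity and endEntity reach only a registered LexicalHandler, and not every parser reports them for predefined entities such as &amp;):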

  • startElement
  • characters hello
  • startEntity
  • characters &
  • endEntity
  • characters world

Have a look at the Auxiliary SAX interfaces if you want some more examples. Other special events cover external entities, comments, CDATA sections, etc.
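To see which of these events a given parser actually emits, one option is to log them. The following is only a sketch, assuming the JDK's built-in SAX parser: the class name LexicalEventLogger and the sample document are made up for illustration. It relies on DefaultHandler2, which implements LexicalHandler, and registers it through the standard lexical-handler property. How the text is split across characters calls, and which entities are reported, can differ between parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records content and lexical events in arrival order.
final class LexicalEventLogger extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("characters " + new String(ch, start, length));
    }

    // The callbacks below belong to LexicalHandler.
    @Override
    public void startEntity(String name) {
        events.add("startEntity " + name);
    }

    @Override
    public void endEntity(String name) {
        events.add("endEntity " + name);
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        events.add("comment " + new String(ch, start, length));
    }

    @Override
    public void startCDATA() {
        events.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        events.add("endCDATA");
    }

    static List<String> parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LexicalEventLogger logger = new LexicalEventLogger();
        reader.setContentHandler(logger);
        // A LexicalHandler is not a regular handler; it is set as a property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", logger);
        reader.parse(new InputSource(new StringReader(xml)));
        return logger.events;
    }

    public static void main(String[] args) throws Exception {
        // Sample input (an assumption): an internal entity, a comment and a CDATA section.
        String xml = "<!DOCTYPE a [<!ENTITY e \"E\">]><a>x&e;y<!--c--><![CDATA[z]]></a>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

Whatever the log shows, the text of a single element can arrive in several characters events, so the buffering approach still applies.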

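How are you passing the input to SAX: as an InputStream (recommended) or as a Reader? With an InputStream the parser detects the encoding itself from the byte-order mark or the XML declaration, whereas a Reader hands it characters that have already been decoded, possibly with the wrong charset. So, starting from your byte[], try wrapping it in a ByteArrayInputStream.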
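A minimal sketch of that suggestion, assuming the bytes carry an XML declaration that names their encoding. The class name ByteArrayParseDemo, the method textOf and the sample document are invented for illustration. The parser receives raw bytes and decodes them itself, and the handler buffers every characters chunk.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: feed a byte[] to SAX and collect all character data.
final class ByteArrayParseDemo {

    static String textOf(byte[] xmlBytes) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times; append every chunk.
                text.append(ch, start, length);
            }
        };
        // Passing an InputStream lets the parser pick the charset
        // from the XML declaration instead of relying on a Reader.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xmlBytes), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample input (an assumption): ISO-8859-1 bytes that declare their encoding.
        byte[] latin1 = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(textOf(latin1));
    }
}
```

If the same bytes were wrapped in a Reader created with a different charset, the accented character would already be decoded incorrectly before the parser saw it.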
