I am using StAX for the first time to parse an XML String. I have found some examples but can't get my code to work. This is the latest version of my code:
Here is an example with XMLStreamReader:
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
Map<String, String> elements = new HashMap<>();
try {
    // "file" is the XML source, e.g. an InputStream or a Reader
    XMLStreamReader xmlReader = inputFactory.createXMLStreamReader(file);
    while (xmlReader.hasNext()) {
        int xmlEventType = xmlReader.next();
        switch (xmlEventType) {
            // Check for start elements
            case XMLStreamConstants.START_ELEMENT:
                // Get the current element name
                String elementName = xmlReader.getLocalName();
                // Reset per element so an earlier value is not stored under another name
                String elementValue = "";
                if (elementName.equals("td")) {
                    // Get the element's text; the cursor ends on the matching END_ELEMENT
                    elementValue = xmlReader.getElementText();
                }
                // Add the new start element to the map
                elements.put(elementName, elementValue);
                break;
            default:
                break;
        }
    }
    // Close the reader
    xmlReader.close();
} catch (Exception e) {
    log.error(e.getMessage(), e);
}
I faced a similar issue: I was getting an "IllegalStateException: Not a textual event" message. Looking through your code, I found that you had a condition like this:
if (event == XMLStreamConstants.START_ELEMENT) {
    ....
    addressId = reader.getText(); // it throws the exception here
    ....
}
(Please note: StaXMan did point this out in his answer!)
This happens because getText() may only be called while the XMLStreamReader is positioned on a textual event such as XMLStreamConstants.CHARACTERS. On START_ELEMENT it throws.
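To see the difference in isolation, here is a minimal sketch. The class and method names are my own, and it assumes the XML is available as a String. The first method calls getText() while the cursor is still on START_ELEMENT; the second moves to the next event first.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class GetTextDemo {
    private static XMLStreamReader open(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Report adjacent character data as a single CHARACTERS event
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        return factory.createXMLStreamReader(new StringReader(xml));
    }

    /** Returns true if getText() is rejected while the cursor is on START_ELEMENT. */
    static boolean getTextFailsOnStartElement(String xml) throws XMLStreamException {
        XMLStreamReader reader = open(xml);
        try {
            reader.nextTag();   // cursor is now on the first START_ELEMENT
            reader.getText();   // START_ELEMENT is not a textual event
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } finally {
            reader.close();
        }
    }

    /** Moves past the first start tag and returns the text found there, or null. */
    static String textAfterFirstStartTag(String xml) throws XMLStreamException {
        XMLStreamReader reader = open(xml);
        try {
            reader.nextTag();
            // Advance from START_ELEMENT to the following event before asking for text
            return reader.next() == XMLStreamConstants.CHARACTERS ? reader.getText() : null;
        } finally {
            reader.close();
        }
    }
}
```

With an input such as <a>hello</a>, the first method should report the failure and the second should return the element's text.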
There may be a better way to do this, but here is a quick and dirty fix (I have only shown the lines of code that may be of interest). Modify your code slightly:
// this flag tells the loop when it is appropriate to read the text
boolean pickupText = false;
while (reader.hasNext()) {
    int event = reader.next();
    if (event == XMLStreamConstants.START_ELEMENT) {
        // STATUS is a String constant with the element name; add more names with || as needed
        if (reader.getLocalName().equals(STATUS)) {
            // indicate that the text has to be picked up soon!
            pickupText = true;
        }
    } else if (event == XMLStreamConstants.CHARACTERS && pickupText) {
        String textFromXML = reader.getText();
        // process textFromXML ...
        //...
        // set pickupText back to false
        pickupText = false;
    }
}
Hope that helps!
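For completeness, the flag approach can be sketched as a self-contained method. This is only an illustration: the class name, the method name and the value "status" for the STATUS constant are assumptions of mine, and the XML is read from a String. It also clears the flag on END_ELEMENT so that an empty element does not cause a later, unrelated text node to be picked up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FlagBasedTextReader {
    // Hypothetical element name; the real value of STATUS is not shown above
    static final String STATUS = "status";

    /** Collects the text of every STATUS element in document order. */
    static List<String> readStatusTexts(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Deliver each run of text as one CHARACTERS event instead of several pieces
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<>();
        boolean pickupText = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    // Only text directly after a STATUS start tag is of interest
                    pickupText = STATUS.equals(reader.getLocalName());
                } else if (event == XMLStreamConstants.CHARACTERS && pickupText) {
                    texts.add(reader.getText());
                    pickupText = false;
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    // Empty element: there was no text to pick up
                    pickupText = false;
                }
            }
        } finally {
            reader.close();
        }
        return texts;
    }
}
```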
I found a solution that uses XMLEventReader instead of XMLStreamReader:
public MyObject parseXML(String xml)
        throws XMLStreamException, UnsupportedEncodingException
{
    byte[] byteArray = xml.getBytes("UTF-8");
    ByteArrayInputStream inputStream = new ByteArrayInputStream(byteArray);
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    XMLEventReader reader = inputFactory.createXMLEventReader(inputStream);
    MyObject object = new MyObject();
    while (reader.hasNext())
    {
        XMLEvent event = reader.nextEvent();
        if (event.isStartElement())
        {
            StartElement element = event.asStartElement();
            if (element.getName().getLocalPart().equals("ElementOne"))
            {
                event = reader.nextEvent();
                if (event.isCharacters())
                {
                    String elementOne = event.asCharacters().getData();
                    object.setElementOne(elementOne);
                }
                continue;
            }
            if (element.getName().getLocalPart().equals("ElementTwo"))
            {
                event = reader.nextEvent();
                if (event.isCharacters())
                {
                    String elementTwo = event.asCharacters().getData();
                    object.setElementTwo(elementTwo);
                }
                continue;
            }
        }
    }
    return object;
}
I would still be interested in seeing a solution using XMLStreamReader.
Make sure you read the Javadocs for StAX: since it is a fully streaming parsing mode, only the information contained in the current event is available. There are some exceptions, however: getElementText(), for example, must be called on START_ELEMENT, but will then combine all textual tokens inside the current element, and when it returns, the cursor points to the matching END_ELEMENT.
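A small sketch of that behavior, with a class and method name made up for illustration and the XML assumed to be a String: it reads the first element's text with getElementText() and then checks that the cursor has moved on to END_ELEMENT.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementTextDemo {
    /** Returns the combined text of the first element in the document. */
    static String readFirstElementText(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            reader.nextTag();                       // must be on START_ELEMENT
            String text = reader.getElementText();  // gathers the text up to the end tag
            // The reader has advanced: it now sits on the matching END_ELEMENT
            if (!reader.isEndElement()) {
                throw new IllegalStateException("expected END_ELEMENT after getElementText()");
            }
            return text;
        } finally {
            reader.close();
        }
    }
}
```

Note that getElementText() only works for text-only elements; if it meets a nested start tag it throws an XMLStreamException.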
Conversely, getText() on START_ELEMENT will not return anything useful (since START_ELEMENT refers to the tag, not to the child text tokens/nodes 'inside' the start/end element pair). If you want to use it instead, you have to explicitly move the cursor in the stream by calling streamReader.next(), whereas getElementText() does that for you.
So what is causing the error? After you have consumed all start/end-element pairs, the next token will be END_ELEMENT (matching whatever was the parent tag). So you must check for the case where you get END_ELEMENT instead of yet another START_ELEMENT.
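One way to handle this with XMLStreamReader can be sketched as follows. It is a rough example under some assumptions: the class and method names are invented, the input is a String, and the document is a root element whose children contain only text. The loop stops as soon as nextTag() returns the parent's END_ELEMENT instead of another START_ELEMENT.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ChildElementReader {
    /** Reads the text of each child of the root element, keyed by element name. */
    static Map<String, String> readChildren(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        Map<String, String> values = new LinkedHashMap<>();
        try {
            reader.nextTag(); // move onto the root START_ELEMENT
            // nextTag() skips whitespace and yields START_ELEMENT for each child;
            // once the children are used up it yields the root's END_ELEMENT.
            while (reader.nextTag() == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                // getElementText() leaves the cursor on the child's END_ELEMENT
                values.put(name, reader.getElementText());
            }
        } finally {
            reader.close();
        }
        return values;
    }
}
```

For a document with children named ElementOne and ElementTwo, the returned map would hold one entry per child, which could then be copied into an object like the MyObject used in the XMLEventReader version.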