I am using a SAX parser to parse an XML response, but it throws an exception.
ExpatParser$ParseException : (not well formed) invalid token
Is there any
First Answer
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in your XML output in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively.
The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
Please check your XML; it seems to contain one of these special characters (&, <, >) in unescaped form.
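To make the escaping rule concrete, here is a minimal sketch using the plain JDK SAX parser rather than Android's Expat-based one. The class name XmlEscapeDemo and the helpers escapeText and isWellFormed are made up for this illustration; the idea is only that a raw & in text content is rejected, while the escaped form parses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original question.
class XmlEscapeDemo {

    // Escapes the characters that are significant as markup in XML text content.
    static String escapeText(String s) {
        return s.replace("&", "&amp;")   // must run first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Returns true if the JDK SAX parser accepts the document, false on a parse error.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            // I/O or parser configuration problems are not well-formedness errors
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "Tom & Jerry";
        System.out.println(isWellFormed("<name>" + raw + "</name>"));
        System.out.println(isWellFormed("<name>" + escapeText(raw) + "</name>"));
    }
}
```

If the server output cannot be changed, wrapping the text in a CDATA section (as in the sample file in this thread) is the other way to carry such characters.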
After discussion with Vaibhav Jani
Here is the sample xml file
<?xml version="1.0"?>
<first_screen>
<first_screen_object id="1">
<name><![CDATA[मानक हिन्दी]]></name>
<desc><![CDATA[मानक हिन्दीमानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी]]></desc>
</first_screen_object>
<first_screen_object id="2">
<name><![CDATA[मानक हिन्दी]]></name>
<desc><![CDATA[मानक हिन्दीमानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी]]></desc>
</first_screen_object>
<first_screen_object id="3">
<name><![CDATA[मानक हिन्दी]]></name>
<desc><![CDATA[मानक हिन्दीमानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी]]></desc>
</first_screen_object>
</first_screen>
And this is the SAX parser for the sample XML
import java.io.InputStream;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import android.sax.Element;
import android.sax.EndTextElementListener;
import android.sax.RootElement;
import android.util.Xml;
public class HindiParser {

    // Constructor
    public HindiParser() {
    }

    // Fetches the URL with an HTTP GET and returns the response body stream,
    // or null if the request fails.
    public static InputStream getInputStreamFromUrl(String url) {
        InputStream content = null;
        try {
            HttpGet httpGet = new HttpGet(url);
            HttpClient httpclient = new DefaultHttpClient();
            // Execute HTTP Get Request
            HttpResponse response = httpclient.execute(httpGet);
            content = response.getEntity().getContent();
        } catch (Exception e) {
            // Log the failure instead of silently swallowing it
            e.printStackTrace();
        }
        return content;
    }
/*
* <?xml version="1.0"?> <first_screen> <first_screen_object id="1">
* <name><![CDATA[मानक हिन्दी]]></name> <desc><![CDATA[मानक हिन्दीमानक
* हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक
* हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी]]></desc>
* </first_screen_object>
*
* <first_screen_object id="2"> <name><![CDATA[मानक हिन्दी]]></name>
* <desc><![CDATA[मानक हिन्दीमानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी
* मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक
* हिन्दी]]></desc> </first_screen_object>
*
*
* <first_screen_object id="3"> <name><![CDATA[मानक हिन्दी]]></name>
* <desc><![CDATA[मानक हिन्दीमानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी
* मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक हिन्दी मानक
* हिन्दी]]></desc> </first_screen_object>
*
* </first_screen>
*/
    public void parse() {
        try {
            RootElement root = new RootElement("first_screen");
            Element firstScreenElement = root.getChild("first_screen_object");
            // Print the text of each <name> element
            firstScreenElement.getChild("name").setEndTextElementListener(
                    new EndTextElementListener() {
                        public void end(String body) {
                            System.out.println("Name is " + body);
                        }
                    });
            // Print the text of each <desc> element
            firstScreenElement.getChild("desc").setEndTextElementListener(
                    new EndTextElementListener() {
                        public void end(String body) {
                            System.out.println("Description is " + body);
                        }
                    });
            // Parse the response as UTF-8, dispatching to the listeners above
            Xml.parse(
                    getInputStreamFromUrl("http://pastebin.com/raw.php?i=M6zrbJ0W"),
                    Xml.Encoding.UTF_8, root.getContentHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Try with android.util.Xml.parse()
First argument InputStream => HttpResponse.getEntity().getContent()
Second argument Xml.Encoding => Xml.Encoding.UTF_8
Last argument ContentHandler => your handler
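The "your handler" argument is an org.xml.sax.ContentHandler. As a rough sketch of what such a handler could look like, the class below collects the text of every name element. NameHandler and parseNames are names invented for this example, and it is driven here by the JDK SAX parser so it can run off-device; on Android the same handler object would presumably be passed as the last argument of Xml.parse().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; collects the text of each <name> element.
class NameHandler extends DefaultHandler {
    final List<String> names = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <name> element

    // Namespace-aware parsers fill localName; others only fill qName.
    private static String tag(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("name".equals(tag(localName, qName))) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current != null && "name".equals(tag(localName, qName))) {
            names.add(current.toString());
            current = null;
        }
    }

    // Convenience wrapper that runs the handler over an XML string.
    static List<String> parseNames(String xml) throws Exception {
        NameHandler handler = new NameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<first_screen>"
                + "<first_screen_object id=\"1\"><name><![CDATA[A]]></name><desc>x</desc></first_screen_object>"
                + "<first_screen_object id=\"2\"><name>B</name><desc>y</desc></first_screen_object>"
                + "</first_screen>";
        System.out.println(parseNames(xml));
    }
}
```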
What encoding are you using?
If you are using ISO-8859-1, try using UTF-8:
<?xml version="1.0" encoding="UTF-8"?>
I'm not entirely sure that it will solve your problem, but I'd set the charset on the InputSource using its setEncoding() method.
InputSource inputSource = new InputSource(byteArrayInputStream);
inputSource.setEncoding("UTF-8");
xr.parse(inputSource);
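As a rough check of why the encoding passed to setEncoding() matters, the sketch below parses the same UTF-8 bytes twice with the JDK SAX parser, once as UTF-8 and once as ISO-8859-1, and compares the extracted text with the original. EncodingDemo and parseText are hypothetical names for this example, and the \u escapes in main spell the word मानक from the sample file. It assumes the parser honours the encoding set on the InputSource, which the JDK's bundled parser does for byte streams as far as I know.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original answer.
class EncodingDemo {

    // Parses the bytes with the given encoding forced on the InputSource
    // and returns all character data found in the document.
    static String parseText(byte[] xmlBytes, String encoding) throws Exception {
        final StringBuilder text = new StringBuilder();
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        source.setEncoding(encoding);
        SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String word = "\u092E\u093E\u0928\u0915"; // Devanagari text, written as escapes
        byte[] bytes = ("<name><![CDATA[" + word + "]]></name>")
                .getBytes(StandardCharsets.UTF_8);
        // Matching encoding: the text should round-trip unchanged
        System.out.println(parseText(bytes, "UTF-8").equals(word));
        // Mismatched encoding: each UTF-8 byte is read as a separate Latin-1 character
        System.out.println(parseText(bytes, "ISO-8859-1").equals(word));
    }
}
```

The encoding given to the InputSource should be whatever the server actually sends, so it is worth checking the response's Content-Type header or XML declaration before hard-coding one.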
This should solve the problem:
InputSource inputSource = new InputSource(is);
inputSource.setEncoding("ISO-8859-1");