Question
Sample XML
<feed xmlns="http://www.w3.org/2005/Atom">
<title>NDTV News - Top Stories</title>
<link>http://www.ndtv.com/</link>
<description>Latest entries</description>
<language>en</language>
<pubDate>Wed, 31 Jul 2013 22:33:00 GMT</pubDate>
<lastBuildDate>Wed, 31 Jul 2013 22:33:00 GMT</lastBuildDate>
<entry>
<title>Narendra Modi to be BJP's PM candidate, announcement before crucial assembly polls: sources</title>
<link>http://feedproxy.google.com/~r/NdtvNews-TopStories/~3/XN7dMIDe5YI/story01.htm</link>
<published>Wed, 31 Jul 2013 13:58:31 GMT</published>
<author>
<name>user42715</name>
</author>
<content type="html"><![CDATA[<div align="center"><a href="http://www.ndtv.com/news/images/topstory_thumbnail/ Shatrughan_Sinha_agency_120.jpg"><img border="0" src="http://www.ndtv.com/news/images/topstory_thumbnail/Shatrughan_Sinha_agency_120.jpg" alt="2013-07-29-08-43-05" /></a></div><p><span style="font-size: large;">The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September.</span><br /><br /><span style="font-size: large;"> The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September. </span><br /><br /><span style="font-size: large;">The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September. </span><br /><br /></p>]]></content>
</entry>
</feed>
With the code below I was able to retrieve the title, published and author values within the entry tag.
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser parser = factory.newPullParser();
InputStream urlStream = downloadUrl(urlString);
parser.setInput(urlStream, null); // null lets the parser detect the encoding
int eventType = parser.getEventType();
boolean done = false;
while (eventType != XmlPullParser.END_DOCUMENT && !done) {
    tagName = parser.getName();
    switch (eventType) {
        case XmlPullParser.START_DOCUMENT:
            break;
        case XmlPullParser.START_TAG:
            if (tagName.equals("entry")) {
                // start of a new entry; nothing to do yet
            }
            if (tagName.equals("title")) {
                title = parser.nextText();
                Log.i(TITLE, title);
            }
            if (tagName.equals("published")) {
                pubDate = parser.nextText();
                Log.i(PUBLISHEDDATE, pubDate);
            }
            if (tagName.equals("author")) {
                readAuthor(parser);
                Log.i(AUTHOR, author);
            }
            break;
        case XmlPullParser.END_TAG:
            if (tagName.equals("feed")) {
                done = true;
            } else if (tagName.equals("entry")) {
                rssFeed = new RssFeedStructure(title);
                rssFeedList.add(rssFeed);
            }
            break;
    }
    eventType = parser.next();
}
private String readAuthor(XmlPullParser parser) throws IOException,
        XmlPullParserException {
    // Move from <author> to its <name> child and read the text.
    parser.nextTag();
    parser.require(XmlPullParser.START_TAG, null, "name");
    author = parser.nextText();
    parser.require(XmlPullParser.END_TAG, null, "name");
    return author;
}
From the content tag, how can I retrieve the "href" value of the a element and the text value (The BJP is likely to anoint Narendra Modi.....) from the span tag?
Answer 1:
You can use jsoup. Download it from http://jsoup.org/download and add the jar to the libs folder.
To test the parser, I copied the RSS feed to a local XML file in the assets folder.
XmlPullParser xpp = factory.newPullParser();
InputStream is = this.getAssets().open("xmlparser.xml");
xpp.setInput(is, "UTF-8");
Since you already have the URL, you can use the code below instead. I have shown how to extract the link and the content; extract the contents of the other tags as you normally would.
// jsoup imports: org.jsoup.Jsoup, org.jsoup.nodes.Document,
// org.jsoup.nodes.Element, org.jsoup.select.Elements
try {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    XmlPullParser xpp = factory.newPullParser();
    xpp.setInput(urlStream, null);
    boolean insideItem = false;
    // Returns the type of the current event: START_TAG, END_TAG, etc.
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.START_TAG) {
            if (xpp.getName().equalsIgnoreCase("entry")) {
                insideItem = true;
            } else if (xpp.getName().equalsIgnoreCase("content")) {
                if (insideItem) {
                    // nextText() returns the CDATA section as a plain HTML string
                    Document doc = Jsoup.parse(xpp.nextText());
                    Elements links = doc.select("a[href]"); // <a> elements that have an href
                    for (Element link : links) {
                        Log.i("........", link.attr("abs:href"));
                    }
                    Element firstSpan = doc.select("span").first(); // null if there is no <span>
                    if (firstSpan != null) {
                        Log.i("..........", firstSpan.text());
                    }
                }
            }
        } else if (eventType == XmlPullParser.END_TAG
                && xpp.getName().equalsIgnoreCase("entry")) {
            insideItem = false;
        }
        eventType = xpp.next(); // move to the next event
    }
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (XmlPullParserException e1) {
    e1.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
Log:
08-03 08:03:04.413: I/........(1524): http://www.ndtv.com/news/images/topstory_thumbnail/ Shatrughan_Sinha_agency_120.jpg
08-03 08:03:04.423: I/..........(1524): The BJP is likely to anoint Narendra Modi as its prime ministerial candidate for the 2014 elections and make a formal announcement to that effect by September.
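Why a second parser is needed at all: a pull parser hands back a CDATA section as one plain string, markup included, so it never reports the a or span inside it as tags. Android's XmlPullParser is not part of the standard JDK, so the rough sketch below uses the JDK's StAX XMLStreamReader as a stand-in (my substitution, not the API used above) to show the same behaviour on a desktop JVM. The class name CdataTextDemo, the method firstContent and the tiny inline feed are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class CdataTextDemo {

    // Returns the text of the first <content> element, or null if there is none.
    static String firstContent(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "content".equals(reader.getLocalName())) {
                    // Comparable to nextText(): the CDATA payload comes back as character data.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical miniature feed, shaped like the sample in the question.
        String xml = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><entry>"
                + "<content type=\"html\"><![CDATA[<p><span>Hello</span></p>]]></content>"
                + "</entry></feed>";
        System.out.println(firstContent(xml)); // prints <p><span>Hello</span></p>
    }
}
```

The returned string still contains the HTML tags, which is why it has to go through an HTML parser such as jsoup before the href and the span text can be read.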
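Edit: To loop through the span elements:
Elements divcontent = doc.select("span");
// k starts at 1 because the first span was already logged above; use k = 0 to include it.
for (int k = 1; k < divcontent.size(); k++) {
    String spancontent = divcontent.get(k).text();
    Log.i("..........", spancontent);
}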
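If adding jsoup is not an option, and only if the HTML inside the CDATA section happens to be well-formed XML (the sample in the question is: every tag is closed and there are no entities such as &nbsp;), the fragment could be parsed with the JDK's DOM parser instead. This is a different technique from the jsoup code above and will throw on typical real-world HTML, so treat it as a rough sketch. FragmentDomDemo, parseFragment, hrefs, spanTexts and the example.com URL are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FragmentDomDemo {

    // Wraps the fragment in a synthetic root, because it may have several top-level elements.
    static Document parseFragment(String fragment) throws Exception {
        String wrapped = "<root>" + fragment + "</root>";
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));
    }

    // Collects the href attribute of every <a> element that has one.
    static List<String> hrefs(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            if (a.hasAttribute("href")) {
                result.add(a.getAttribute("href"));
            }
        }
        return result;
    }

    // Collects the trimmed text of every <span>, starting from the first one.
    static List<String> spanTexts(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList spans = doc.getElementsByTagName("span");
        for (int i = 0; i < spans.getLength(); i++) {
            result.add(spans.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseFragment(
                "<div><a href=\"http://example.com/a.jpg\">x</a></div>"
                + "<p><span>First</span><br /><span>Second</span></p>");
        System.out.println(hrefs(doc));     // prints [http://example.com/a.jpg]
        System.out.println(spanTexts(doc)); // prints [First, Second]
    }
}
```

jsoup remains the safer choice for feeds you do not control, since it tolerates unclosed tags and HTML entities that an XML parser rejects.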
Source: https://stackoverflow.com/questions/18030164/parsing-the-cdata-section-in-xml-using-xml-pull-parser