How to parse an RSS feed with XmlPullParser?

Posted by ∥☆過路亽.° on 2019-11-29 19:14:18

Question


I would like to parse an RSS feed. My question is how I can parse all tags between the <item> and </item> tags.

Given this very simple XML:

<?xml version="1.0" ?>
<rss version="2.0">
<channel>
  <title>MyRSSPage</title>
  <link>http://www.example.com</link>
  <item>
  <link>www.example.com/example1</link>
  <title>Example title 1</title>
  </item>
  <item>
  <link>www.example.com/example2</link>
  <title>Example title 2</title>
  </item>
</channel>
</rss>

I would like to parse just the stuff between the <item>...</item> tags.

List<RssMessage> messages = new ArrayList<RssMessage>();

// parser is an XmlPullParser instance
while (parser.next() != XmlPullParser.END_DOCUMENT) {
    if (parser.getEventType() != XmlPullParser.START_TAG) {
        continue;
    }
    String name = parser.getName();
    // START OF HEADER
    if (name.equals("title")) {
        title = parser.nextText();
    }
    else if (name.equals("link")) {
        link = parser.nextText();
    }
    else if (name.equals("description")) {
        description = parser.nextText();
    }
    else if (name.equals("language")) {
        language = parser.nextText();
    }
    else if (name.equals("copyright")) {
        copyright = parser.nextText();
    }
    else if (name.equals("pubDate")) {
        pubdate = parser.nextText();
    }
    // END OF HEADER

    else if (name.equals("item")) {
        RssMessage rssMessage = processItem(parser);
        messages.add(rssMessage);
    }
}

In the method below I would like to parse only the tags within the <item>...</item> tags. How do I construct a loop that goes through just the elements between <item> and </item>?

EDIT
This almost works, but sometimes not all fields are set even though the corresponding elements do exist in the RSS XML. Is something wrong with the code below?

private RssMessage processItem(XmlPullParser parser) throws IOException, XmlPullParserException {
    RssMessage rssMessage = new RssMessage();
    parser.require(XmlPullParser.START_TAG, ns, "item");
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            continue;
        }
        String name = parser.getName();
        if (name.equals("link")) {
            rssMessage.setLink(parser.nextText());
        }
        else if (name.equals("guid")) {
            rssMessage.setGuid(parser.nextText());
        }
        else if (name.equals("category")) {
            rssMessage.setCategory(parser.nextText());
        }
        else if (name.equals("title")) {
            rssMessage.setTitle(parser.nextText());
        }
        else if (name.equals("pubDate")) {
            rssMessage.setPubDate(parser.nextText());
        }
    }
    return rssMessage;
}

Answer 1:


Try the below.

try {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(false);
    XmlPullParser xpp = factory.newPullParser();
    xpp.setInput(url.openConnection().getInputStream(), "UTF-8");
    //xpp.setInput(getInputStream(url), "UTF-8");

    boolean insideItem = false;

    // Returns the type of current event: START_TAG, END_TAG, etc..
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.START_TAG) {

            if (xpp.getName().equalsIgnoreCase("item")) {
                insideItem = true;
            } 
            else if(xpp.getName().equalsIgnoreCase("title")) 
            {

            }
        }
        eventType = xpp.next(); //move to next element
    }

} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (XmlPullParserException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

Edit:

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(false);
XmlPullParser xpp = factory.newPullParser();
// 'open' is presumably an InputStream for the feed; a null encoding lets the parser detect it
xpp.setInput(open, null);
// xpp.setInput(getInputStream(url), "UTF-8");

boolean insideItem = false;

// Returns the type of current event: START_TAG, END_TAG, etc..
int eventType = xpp.getEventType();
while (eventType != XmlPullParser.END_DOCUMENT) {
    if (eventType == XmlPullParser.START_TAG) {

        if (xpp.getName().equalsIgnoreCase("item")) {
            insideItem = true;
        } else if (xpp.getName().equalsIgnoreCase("title")) {
            if (insideItem)
                Log.i("....",xpp.nextText()); // extract the headline
        } else if (xpp.getName().equalsIgnoreCase("link")) {
            if (insideItem)
                Log.i("....",xpp.nextText());  // extract the link of article
        }
    } else if (eventType == XmlPullParser.END_TAG && xpp.getName().equalsIgnoreCase("item")) {
        insideItem = false;
    }

    eventType = xpp.next(); // move to next element
}

Output

www.example.com/example1
Example title 1
www.example.com/example2
Example title 2
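
On the follow-up in the question's EDIT (fields sometimes left unset): in processItem the loop stops at the first END_TAG that next() returns. After nextText() the parser sits on the END_TAG of the child it just read, so the following next() moves past it. For a child that no branch handles (for example <description>), nothing consumes it, so the loop sees its TEXT and then its END_TAG and exits before reaching the remaining children. A common remedy is to skip unhandled elements explicitly. The sketch below illustrates that idea under some assumptions: it is not the original poster's or answerer's code, and it uses the JDK's built-in StAX reader (javax.xml.stream.XMLStreamReader) in place of XmlPullParser so that it runs on a plain JVM. getLocalName() and getElementText() stand in for getName() and nextText(); the class name ItemSkipSketch and the helpers readItem, skip, and parse are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: StAX stands in for XmlPullParser here.
class ItemSkipSketch {

    // Reads one <item>; the reader must be positioned on its start tag.
    static Map<String, String> readItem(XMLStreamReader r) throws XMLStreamException {
        Map<String, String> item = new LinkedHashMap<>();
        while (r.next() != XMLStreamConstants.END_ELEMENT) {
            if (r.getEventType() != XMLStreamConstants.START_ELEMENT) {
                continue; // whitespace or comments between children
            }
            String name = r.getLocalName();
            if (name.equals("link") || name.equals("title")) {
                // getElementText() leaves the reader on the child's end tag
                item.put(name, r.getElementText());
            } else {
                skip(r); // consume the unhandled child, including its end tag
            }
        }
        return item;
    }

    // Skips the current element and everything nested inside it.
    static void skip(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Collects every <item> in the feed; channel-level tags are ignored.
    static List<Map<String, String>> parse(String xml) {
        List<Map<String, String>> items = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && r.getLocalName().equals("item")) {
                    items.add(readItem(r));
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed feed", e);
        }
        return items;
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>MyRSSPage</title>"
                + "<item><description>not handled</description>"
                + "<link>www.example.com/example1</link>"
                + "<title>Example title 1</title></item>"
                + "</channel></rss>";
        System.out.println(parse(xml));
    }
}
```

With XmlPullParser the same shape applies: add a final else branch in processItem that calls a skip helper which counts START_TAG and END_TAG events until the depth returns to zero.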


Source: https://stackoverflow.com/questions/17434135/how-to-parse-an-rss-feed-with-xmlpullparser
