Need to parse image src from HTML page then display it

一整个雨季 2021-01-15 21:05

I'm currently developing an app that visits the following site (http://lulpix.com), parses the HTML, and gets the img src from the following section

3 answers
  • 2021-01-15 21:46

    There's no need to use a WebView now; check this sample project:

    https://github.com/meetmehdi/HTMLImageParser.git

    In this sample project I parse the HTML and its image tag, then extract the image URL. The image is downloaded and displayed.

  • 2021-01-15 21:57

    I recently used jsoup to parse invalid HTML, and it works well! Do something like...

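        Document doc = Jsoup.parse(str);
        Element img = doc.body().select("div[class=pic rounded-8] img").first();
        // first() returns null when nothing matches, so guard before reading the attribute
        String src = (img != null) ? img.attr("src") : null;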
    

    Play with the "selector string" to get it right, but I think the above will work. It first selects the outer div based on the value of its class attribute, and then any descendant img element.

  • 2021-01-15 21:59

    Here's an AsyncTask that connects to lulpix and fakes a referrer and user agent (lulpix apparently tries to block scraping with some pretty lame checks). Start it like this in your Activity:

    new ForTheLulz().execute();
    

    The resulting Bitmap is downloaded in a pretty lame way (no caching, and no check whether the image has already been downloaded), and error handling is pretty much non-existent, but the basic concept should be OK.

    import android.graphics.Bitmap;
    import android.graphics.BitmapFactory;
    import android.os.AsyncTask;
    import android.widget.ImageView;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    // Inner class of the Activity, so findViewById is available in onPostExecute.
    class ForTheLulz extends AsyncTask<Void, Void, Bitmap> {
        @Override
        protected Bitmap doInBackground(Void... args) {
            Bitmap result = null;
            try {
                // Fetch the page with a faked referrer and user agent.
                Document doc = Jsoup.connect("http://lulpix.com")
                        .referrer("http://www.google.com")
                        .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                        .get();
                // Find the container div, then the first img inside it.
                Elements elems = doc.getElementsByAttributeValue("class", "pic rounded-8");
                if (!elems.isEmpty()) {
                    elems = elems.first().getElementsByTag("img");
                    if (!elems.isEmpty()) {
                        // absUrl resolves a relative src against the page URL
                        // and returns an empty string if it cannot.
                        String src = elems.first().absUrl("src");
                        if (!src.isEmpty()) {
                            InputStream is = new URL(src).openStream();
                            try {
                                result = BitmapFactory.decodeStream(is);
                            } finally {
                                is.close();
                            }
                        }
                    }
                }
            } catch (IOException e) {
                // Error handling goes here
            }
            return result;
        }

        @Override
        protected void onPostExecute(Bitmap result) {
            ImageView lulz = (ImageView) findViewById(R.id.lulpix);
            if (result != null) {
                lulz.setImageBitmap(result);
            } else {
                // Your fallback drawable resource goes here
                // lulz.setImageResource(R.drawable.nolulzwherehad);
            }
        }
    }
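
    The src attribute may hold a relative URL, which new URL(src) cannot open on its own. Below is a minimal sketch of resolving it against the page address using only java.net.URI. The class and method names (ImageUrls, resolve) are hypothetical, and the image paths are made up for illustration; they are not taken from lulpix.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of jsoup or Android.
public class ImageUrls {

    // Resolves a possibly relative src against the URL of the page it came from.
    // An already absolute src is returned unchanged.
    // URI.create throws IllegalArgumentException if either string is not a valid URI
    // (for example, if src contains an unescaped space).
    public static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        // Root-relative path (made-up example).
        System.out.println(resolve("http://lulpix.com/", "/images/a.jpg"));
        // Path relative to the page's directory (made-up example).
        System.out.println(resolve("http://lulpix.com/funny/page.html", "pics/c.png"));
        // Absolute URL passes through as-is.
        System.out.println(resolve("http://cdn.example.com/", "http://cdn.example.com/b.jpg"));
    }
}
```

    Pass the full page URL, including its path, as the first argument. The resolved string can then be handed to new URL(...) before opening the stream.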
    