Question
When we enter a word in Google image search, the results page shows many pictures as thumbnails. I want to save the URLs of these images in my MySQL database. I need to do this in PHP, and I only want the URLs of the first 10 images. I am building a dynamic page that will read these image URLs from the database and display the images.
I have already tried a lot. The problem is that the complete URL is not saved, because it contains many characters that are invalid for MySQL. I have searched many sites and found different functions, but I am still not clear on how to do this. Can somebody give me some PHP code or a procedure for it?
Answer 1:
I would suggest using PHP's DOM library. It is very powerful and can parse any DOM structure. Look at some of its examples and you should be able to implement this easily.
The idea is to study the HTML structure of the page returned by Google and then use the DOM library to parse the specific tags. From what I see, the images are organized in <ul> and <li> tags, like:
<ul class="rg_ul" data-pg="1" data-cnt="6">
<li class="rg_li" data-row="1" style="width:216px;height:162px"></li>
<li class="rg_li" style="width:231px;height:162px"></li>
<li class="rg_li" style="width:218px;height:162px"></li>
<li class="rg_li" style="width:216px;height:162px"></li>
<li class="rg_li" style="width:216px;height:162px"></li>
<li class="rg_li" style="width:217px;height:162px"></li>
</ul>
Within each <li> tag there are additional tags, one of which is <a>. This tag seems to have two attributes, "imgrefurl" and "imgurl", that might give you the image you need. Which of the two you need is for you to explore.
Alternatively, there is an <img> tag within each <li> that has a "src" attribute containing the actual image binary, so you may parse that as well. Please note that the binary is for the thumbnail you see on the search page, not the original image.
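If the "src" value turns out to be an inline data URI, it could be decoded roughly as below. This is only a sketch: the base64 "data:" format is my assumption about how the thumbnail is embedded, and decodeDataUri() is a made-up helper name.

```php
<?php
// Sketch: decode an inline thumbnail carried in an <img> "src" attribute.
// Assumption: the value is a base64 data URI such as "data:image/jpeg;base64,...".
// decodeDataUri() is a hypothetical helper, not part of PHP.

function decodeDataUri(string $src): ?array
{
    // Split the URI into its MIME type and its base64 payload.
    if (!preg_match('#^data:([\w/+.-]+);base64,(.*)$#s', $src, $m)) {
        return null; // An ordinary URL, or an encoding this sketch does not handle.
    }

    // Strict mode makes base64_decode() return false on invalid input.
    $bytes = base64_decode($m[2], true);
    if ($bytes === false) {
        return null;
    }

    return ['mime' => $m[1], 'bytes' => $bytes];
}
```

A null result means the attribute holds something else, most likely a normal URL that can be stored as it is.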
For some pointers to DOM, these methods might be useful for reading all <li> tags and then picking out the ones that use class "rg_li": http://www.php.net/manual/en/domelement.getelementsbytagname.php and http://www.php.net/manual/en/domelement.hasattribute.php
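To make that concrete, here is a rough sketch of the parsing step, not a definitive implementation. It assumes the markup looks like the <ul>/<li> sample above and that "imgurl" shows up as a query parameter in the href of the <a> inside each <li>; Google's markup may well differ or change, so treat both as assumptions. extractImageUrls() is a name I made up.

```php
<?php
// Sketch: collect image URLs from result markup shaped like the sample list.
// Assumption: each <li class="rg_li"> contains an <a> whose href carries an
// "imgurl" query parameter. extractImageUrls() is a hypothetical helper.

function extractImageUrls(string $html, int $limit = 10): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string.
    }

    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('li') as $li) {
        if (count($urls) >= $limit) {
            break;
        }
        // Keep only result cells: <li> elements whose class list contains "rg_li".
        $classes = preg_split('/\s+/', trim($li->getAttribute('class')));
        if (!in_array('rg_li', $classes, true)) {
            continue;
        }
        foreach ($li->getElementsByTagName('a') as $a) {
            if (!$a->hasAttribute('href')) {
                continue;
            }
            // getAttribute() returns the href with HTML entities already decoded.
            $query = parse_url($a->getAttribute('href'), PHP_URL_QUERY);
            if (!is_string($query)) {
                continue;
            }
            parse_str($query, $params); // also URL-decodes each value
            if (isset($params['imgurl']) && is_string($params['imgurl']) && $params['imgurl'] !== '') {
                $urls[] = $params['imgurl'];
                break; // one image URL per result cell
            }
        }
    }

    return $urls;
}
```

The default limit of 10 matches what the question asks for. If "imgrefurl" turns out to be the value you want, only the parameter name needs to change.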
I hope the above makes sense.
Answer 2:
You can achieve this with PHP's cURL library and the DOMDocument class, then use the MySQL or MySQLi library to connect to the database. MySQL help can be found here: MySQL Doc.
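A minimal sketch of the fetch and store steps might look like this. Note that it swaps in PDO with a prepared statement instead of the Mysql/Mysqli functions, because bound parameters mean the "invalid characters" in a URL never need manual escaping. The function names, the image_urls table and its url column are all made up for illustration.

```php
<?php
// Sketch of the fetch-and-store steps. Everything named here is hypothetical:
// fetchHtml(), saveImageUrls(), and the image_urls table with a url column.

function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 15,   // seconds
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

function saveImageUrls(PDO $pdo, array $urls, int $limit = 10): int
{
    // The placeholder keeps the URL out of the SQL text, so characters such as
    // quotes, "&" or "%" are stored exactly as given.
    $stmt = $pdo->prepare('INSERT INTO image_urls (url) VALUES (:url)');

    $saved = 0;
    foreach (array_slice($urls, 0, $limit) as $url) {
        $stmt->execute([':url' => $url]);
        $saved++;
    }
    return $saved; // number of rows inserted
}
```

For MySQL, the connection would be created with a DSN along the lines of 'mysql:host=localhost;dbname=mydb;charset=utf8mb4' (placeholder values), and the url column should be wide enough for long URLs, for example TEXT.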
Source: https://stackoverflow.com/questions/8862540/save-image-url-from-google-search-to-mysql