simple-html-dom | 易学教程

How to get hover data(ajax) by any crawler php

Question: I am crawling a website's data. I am able to get the whole content of a page, but some data on the page only appears after hovering over certain icons and is shown as tooltips. I need that data as well. Is this possible with any crawler? I am using PHP and simplehtmldom for parsing/crawling the page.

Answer 1: Hover data can't be obtained by any crawler. A crawler fetches the web page and gets the whole data (the HTML page source). That is the view we see as soon as we hit the URL. Hovering needs a mouse-move action over an HTML element on

Simple HTML DOM wildcard in attribute

Question: I have the following tags: <div class="col *">Text</div>, where * is anything. I want to get all div tags whose class attribute contains col (as in my example) using Simple HTML DOM.

Answer 1: Simple HTML DOM already has a way of selecting attributes that contain a certain value. For example: $html->find("div[class*=col]", 0)->outertext Or you could retrieve only the div nodes whose class starts with col, like so: $html->find("div[class^=col]", 0)->outertext And for safe keeping you
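
The "contains" and "starts with" attribute matches described above can be tried without the library. The following is a minimal sketch that swaps in PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM; the sample markup and variable names are assumptions made for illustration.

```php
<?php
// Assumed sample markup: two divs whose class starts with "col", one without.
$markup = '<div class="col a">One</div><div class="row">Two</div><div class="col b">Three</div>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// contains() plays the role of [class*=col]; starts-with() that of [class^=col].
$containing = $xpath->query('//div[contains(@class, "col")]');
$starting   = $xpath->query('//div[starts-with(@class, "col")]');

// Collect the text of every div whose class contains "col".
$texts = [];
foreach ($containing as $div) {
    $texts[] = $div->textContent;
}

echo implode(', ', $texts), "\n";
```

Both forms are plain substring tests, so a class such as "column" or "protocol" would also match; a stricter check on whole class names may be needed if that matters.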

str_get_html is not loading a valid html string

Question: I receive an HTML string using curl: curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); $html_string = curl_exec($ch); When I echo it, I see perfectly good HTML, as required for my parsing needs. But when I pass this string to the HTML DOM parser function str_get_html($html_string), it does not load it (the call returns false). I tried saving it to a file and opening it with file_get_html, but the same thing occurs. What can be the cause of this? As I said, the

simple htmldom parser not parsing microdata

Question: I am trying to parse the price $30 from the microdata below: <div itemprop="offers" itemscope="" itemtype="http://schema.org/Offer"> <span itemprop="price"><strong>$30.00</strong></span></div> Here is the code I am trying, but it throws the error Fatal error: Call to a member function find(): $url="http://somesite.com"; $html=file_get_html($url); foreach($html->find('span[itemprop=price]') as $price) echo $price; Any suggestions as to where it is going wrong? I am not sure how to parse with

Answer 1: You
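
A fatal "Call to a member function find()" error generally means the variable holding the document is not an object, for example because the load step failed. The sketch below shows a guarded extraction of the price using PHP's built-in DOM extension rather than Simple HTML DOM; the function name extractPrice is hypothetical, and the markup is the snippet from the question.

```php
<?php
// Hypothetical helper: returns the text of the first itemprop="price" span, or null.
function extractPrice(string $markup): ?string
{
    // An empty document cannot be parsed, so bail out early.
    if ($markup === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//span[@itemprop="price"]')->item(0);

    // item(0) is null when nothing matched, so check before reading from it.
    return $node === null ? null : trim($node->textContent);
}

$sample = '<div itemprop="offers" itemscope="" itemtype="http://schema.org/Offer"> '
        . '<span itemprop="price"><strong>$30.00</strong></span></div>';

echo extractPrice($sample), "\n";
```

The same idea applies when using Simple HTML DOM: test the value returned by the load call before calling find() on it.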

How to check if a SimpleHTMLDom element does not exist

Question: SimpleHtmldom can be used to extract the contents of the first element with class description: $html = str_get_html($html); $html->find('.description', 0) However, if this class does not exist, PHP throws the error Trying to get property of non-object. I tried if(!isset($html->find('.description', 0))) { echo 'not set'; } and if(!empty($html->find('.description', 0))) { echo 'not set'; } but both give the error Can't use method return value in write context. What is the proper way to check
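
One way to avoid both errors is to store the lookup result in a variable first and then test that variable, since isset() only accepts variables and not the result of a method call. The sketch below illustrates the pattern with PHP's built-in DOMXPath, where item(0) returns null when nothing matches; the sample markup is an assumption, and the same store-then-test pattern should carry over to the value returned by Simple HTML DOM's find().

```php
<?php
// Assumed sample markup that has no element with class "description".
$doc = new DOMDocument();
$doc->loadHTML('<div class="summary">No description here</div>');
$xpath = new DOMXPath($doc);

// Store the lookup result first; it is null when nothing matched.
$description = $xpath->query('//*[@class="description"]')->item(0);

// Test the variable, and only read from it when it is an object.
if ($description === null) {
    $result = 'not set';
} else {
    $result = $description->textContent;
}

echo $result, "\n";
```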

php: Get plain text from html - simplehtmldom or php strip_tags?

Question: I am looking at getting the plain text from HTML. Which one should I choose: PHP's strip_tags or simplehtmldom's plaintext extraction? One advantage of simplehtmldom is its support for invalid HTML. Is that sufficient in itself?

Answer 1: You should probably use simplehtmldom, for the reason you mentioned and because strip_tags may also leave you with non-text content such as JavaScript or CSS contained within script/style blocks. You would also be able to filter out text from elements that aren't displayed (inline style
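
The point about script and style blocks can be checked directly. This is a small sketch with assumed sample markup: strip_tags removes the tags but keeps the text inside them, so one option is to drop whole script/style blocks with a regular expression before stripping. The regex is a simplification for illustration, not a full HTML parser.

```php
<?php
// Assumed sample markup with visible text plus a script and a style block.
$markup = '<p>Hello <b>world</b></p><script>var x = 1;</script><style>p { color: red; }</style>';

// strip_tags alone keeps the bodies of the script and style elements.
$naive = strip_tags($markup);

// Remove complete script/style blocks first, then strip the remaining tags.
$withoutBlocks = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $markup);
$clean = trim(strip_tags($withoutBlocks));

echo $naive, "\n";
echo $clean, "\n";
```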

Only first word of title retrieved with PHP showing

Question: I am trying to display a list from another website on mine. It all works fine, but only the first word of the 'title' attribute is stored. I know that the whole title is retrieved from the other website, so how do I get it to store all of it? Here is the code, if it helps: <?php include "simple_html_dom.php"; $page = file_get_html("http://www.blade-edge.com/images/KSA/Flights/craft.asp?r=true&db=dunai"); echo "<table id=list>"; foreach($page->find('html/body/div/div[2]/ol/a') as $key=>

Find text inside javascript tag using PHP Simple HTML DOM Parser

Question: I'm trying to find text that changes regularly inside a JavaScript tag: <script type="text/javascript"> jwplayer("mediaplayer").setup({ flashplayer: "player.swf", file:"filename", provider: "rtmp", streamer:"rtmp://192.168.1.1/file?wmsAuthSign=RANDOM-114-Character==", height:500, width:500, }); </script> How can I get RANDOM-114-Character (or the full value of the 'streamer' flashvar) using PHP Simple HTML DOM Parser? I have no idea how to do this.

Answer 1: You can do it with a regular expression: preg_match (
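
Since the answer's own pattern is cut off, the following is only a hypothetical sketch of the regular-expression approach, applied to the script text from the question. The patterns and variable names are assumptions; the script body is taken as a plain string, however it was obtained.

```php
<?php
// Script body from the question, held as a plain string.
$script = 'jwplayer("mediaplayer").setup({ flashplayer: "player.swf", file:"filename", '
        . 'provider: "rtmp", streamer:"rtmp://192.168.1.1/file?wmsAuthSign=RANDOM-114-Character==", '
        . 'height:500, width:500, });';

$streamer = null;
$token = null;

// Capture everything between the quotes that follow the streamer key.
if (preg_match('/streamer\s*:\s*"([^"]+)"/', $script, $m)) {
    $streamer = $m[1];

    // Within that URL, capture the wmsAuthSign value up to the next & or the end.
    if (preg_match('/wmsAuthSign=([^&]+)/', $streamer, $t)) {
        $token = $t[1];
    }
}

echo $streamer, "\n";
echo $token, "\n";
```

Matching the key name rather than a fixed position keeps the pattern working if the order of the setup options changes.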

How should parse with PHP (simple html dom parser) background images and other images of webpage?

Question: How should I parse the background images and other images of a web page with PHP (simple html dom, etc.)? Case 1: inline CSS <div id="id100" style="background:url(/mycar1.jpg)"></div> Case 2: CSS inside the HTML page <div id="id100"></div> <style type="text/css"> #id100{ background:url(/mycar1.jpg); } </style> Case 3: separate CSS file <div id="id100" style="background:url(/mycar1.jpg);"></div> external.css #id100{ background:url(/mycar1.jpg); } Case 4: image inside an img tag. Solution to case 4 as it appears in
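
Cases 1, 2 and 4 can be approached by collecting the CSS text from style attributes and style elements, pulling out url(...) values with a regular expression, and reading src from img tags. The sketch below does this with PHP's built-in DOM extension rather than Simple HTML DOM; the sample markup, the extra file names and the regex are assumptions for illustration. Case 3 would additionally require downloading each linked stylesheet, which is not shown.

```php
<?php
// Assumed sample markup covering inline CSS, a style block and an img tag.
$markup = '<div id="id100" style="background:url(/mycar1.jpg)"></div>'
        . '<style type="text/css"> #id200{ background:url("/mycar2.jpg"); } </style>'
        . '<img src="/mycar3.jpg">';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Gather CSS text from style attributes (case 1) and style elements (case 2).
$cssChunks = [];
foreach ($xpath->query('//@style') as $attr) {
    $cssChunks[] = $attr->value;
}
foreach ($xpath->query('//style') as $style) {
    $cssChunks[] = $style->textContent;
}

// Pull every url(...) value out of the collected CSS, with or without quotes.
$images = [];
foreach ($cssChunks as $css) {
    if (preg_match_all('/url\(\s*[\'"]?([^\'")]+?)\s*[\'"]?\s*\)/i', $css, $m)) {
        $images = array_merge($images, $m[1]);
    }
}

// Add the src of every img tag (case 4).
foreach ($xpath->query('//img/@src') as $src) {
    $images[] = $src->value;
}

print_r($images);
```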

PHP String to Int Conversion Losing Value (using simple_html_dom)

Question: I'm using simple_html_dom to parse a page for elements with a particular class. I successfully retrieve those elements but cannot seem to convert them into usable variables (i.e. an integer, so I can use it in an 'if' statement). They seem to be objects of some kind, and I've searched everywhere for hours with no luck. There doesn't seem to be much support for simple_html_dom. Here's my code: /////////////// $html = new simple_html_dom(); // Load a file $html->load_file($getURL); $getName