I want to get a DIV from an external website with pure PHP.
External website: http://www.isitdownrightnow.com/youtube.com.html
Div text I want from isitdownr
This is what I always use:
$url = 'https://somedomain.com/somesite/';
$content = file_get_contents($url);
$first_step = explode( '<div id="thediv">' , $content );
$second_step = explode("</div>" , $first_step[1] );
echo $second_step[0];
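The explode approach above fails with an undefined-index notice when the div is not on the page. A minimal sketch of the same idea with guards follows; the helper name extract_between and the inline sample HTML are assumptions made for illustration, and the sample string stands in for the result of file_get_contents($url).

```php
<?php
// Hypothetical helper (not part of PHP): returns the text between $open and
// the next occurrence of $close, or null when either marker is missing.
function extract_between(string $html, string $open, string $close): ?string
{
    // Split once at the opening marker; a single part means it was not found
    $parts = explode($open, $html, 2);
    if (count($parts) < 2) {
        return null;
    }
    // Split the remainder once at the closing marker
    $rest = explode($close, $parts[1], 2);
    if (count($rest) < 2) {
        return null;
    }
    return $rest[0];
}

// Inline sample standing in for file_get_contents($url)
$content = '<html><body><div id="thediv">Hello</div><div>other</div></body></html>';

$inner = extract_between($content, '<div id="thediv">', '</div>');
echo $inner === null ? 'div not found' : $inner, PHP_EOL;
```

Like the original, this stops at the first closing </div>, so a div that contains nested divs will be cut short.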
This may be a little overkill, but you'll get the gist.
<?php
$doc = new DOMDocument;
// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;
// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('http://www.isitdownrightnow.com/check.php?domain=youtube.com');
$xpath = new DOMXPath($doc);
$query = "//div[@class='statusup']";
$entries = $xpath->query($query);
var_dump($entries->item(0)->textContent);
?>
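Here is a variant of the DOMXPath approach that can be tried without network access. It is only a sketch: the inline HTML and its status text are made up for illustration, and loadHTML on a string replaces loadHTMLFile on the live URL. It also checks that the query matched something before reading item(0), and uses libxml_use_internal_errors to keep warnings about invalid markup out of the output.

```php
<?php
// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);

// Made-up sample page; note the unclosed <p>, which the parser tolerates
$html = '<html><body><div class="statusup">Sample status text</div><p>unclosed</body></html>';

$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$entries = $xpath->query("//div[@class='statusup']");

// query() returns false for a malformed expression and an empty list when
// nothing matches, so check both before touching item(0)
$status = null;
if ($entries !== false && $entries->length > 0) {
    $status = trim($entries->item(0)->textContent);
}

echo $status ?? 'no matching div', PHP_EOL;
```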
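// $url must already hold the address of the page to fetch
$contents = file_get_contents($url);
// Keep everything after the opening tag, then cut at the first closing </div>
$title = explode('<div class="entry-content">', $contents);
$title = explode('</div>', $title[1]);
// Save the fragment to s.php and include it to print it.
// Caution: require_once also executes any PHP code found in the fetched text.
$fp = fopen('s.php', 'w+');
fwrite($fp, $title[0]);
fclose($fp);
require_once('s.php');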
I used the XPath method proposed by @mightyuhu, and it worked great with his addition of the assignment. The query you need depends on the web page you are reading and on whether the tag you want has an 'id' or 'class' that identifies it. If the tag has an 'id' assigned to it, you can use this (the sample extracts the USD exchange rate):
$query = "//div[@id='USD']";
However, site developers rarely make it that easy, so there will usually be several more 'unnamed' tags to dig through. In my example:
<div id="USD" class="tab">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td>Ask Rate</td>
<td align="right">1.77400</td>
</tr>
<tr class="even">
<td>Bid Rate</td>
<td align="right">1.70370</td>
</tr>
<tr>
<td>BNB Fixing</td>
<td align="right">1.735740</td>
</tr>
</tbody>
</table>
</div>
So I had to change the query to get the 'Ask Rate':
$doc->loadHTMLFile('http://www.fibank.bg/en');
$xpath = new DOMXPath($doc);
$query = "//div[@id='USD']/table/tbody/tr/td";
The query above returns every matching cell in document order, so I changed the item from 0 to 1 to get the second cell, which holds the exchange rate (the first cell contains the text 'Ask Rate'):
$entries = $xpath->query($query);
$usdrate = $entries->item(1)->textContent;
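To see why item(1) is the right index, the sample markup from this answer can be loaded from a string and the cells listed. This is a sketch under the assumption that the live page still looks like the snippet shown; the $rates array is something I added for illustration.

```php
<?php
libxml_use_internal_errors(true);

// The markup from the answer, inlined so the example runs offline
$html = <<<HTML
<div id="USD" class="tab">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr><td>Ask Rate</td><td align="right">1.77400</td></tr>
<tr class="even"><td>Bid Rate</td><td align="right">1.70370</td></tr>
<tr><td>BNB Fixing</td><td align="right">1.735740</td></tr>
</tbody>
</table>
</div>
HTML;

$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Matches all six cells in document order: label, value, label, value, ...
$cells = $xpath->query("//div[@id='USD']/table/tbody/tr/td");

// item(0) is the 'Ask Rate' label, item(1) is its value
$usdrate = $cells->item(1)->textContent;

// Pair each label with the value that follows it
$rates = [];
for ($i = 0; $i + 1 < $cells->length; $i += 2) {
    $label = trim($cells->item($i)->textContent);
    $rates[$label] = trim($cells->item($i + 1)->textContent);
}

print_r($rates);
```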
Another method is to address the value directly within the query. When the tags have no names or styles, this is done by indexing them. I learned this from my Maxthon browser's "Inspect element" feature combined with the "Copy XPath" option in the right-click menu (neat, yeah?):
"//*[@id='USD']/table/tbody/tr[1]/td[2]"
Notice it also inserts an asterisk (*) after the //, which I have not dug into. In this case you should again get the value with item(0), since there will be no other values.
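A short sketch of the indexed query, run against a trimmed copy of the sample markup (the inline HTML is an assumption standing in for the live page). In XPath the asterisk is a wildcard that matches an element of any name, and positions such as tr[1] and td[2] count from 1, not 0. One thing to watch for: browsers add a tbody element to tables that lack one in the page source, while DOMDocument parses the source as delivered, so a copied path may need its /tbody step removed on other pages.

```php
<?php
libxml_use_internal_errors(true);

// Trimmed copy of the sample markup, inlined so the example runs offline
$html = '<div id="USD"><table><tbody>'
      . '<tr><td>Ask Rate</td><td align="right">1.77400</td></tr>'
      . '<tr><td>Bid Rate</td><td align="right">1.70370</td></tr>'
      . '</tbody></table></div>';

$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Single quotes inside the XPath so it fits in a double-quoted PHP string.
// tr[1]/td[2] selects the second cell of the first row only.
$entries = $xpath->query("//*[@id='USD']/table/tbody/tr[1]/td[2]");

// Exactly one node matches, so item(0) is the value
$usdrate = $entries->length > 0 ? $entries->item(0)->textContent : null;

echo $usdrate, PHP_EOL;
```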
If you need, you can make any changes to the string you extracted, for example changing the number format to match your preference:
$usdrate = number_format($usdrate, 5, ',', ' ');
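For reference, a small sketch of what the number_format arguments do: the second is the number of decimals, the third the decimal separator and the fourth the thousands separator. The explicit (float) cast is my addition, since textContent returns a string; the second input value is made up to show the thousands separator.

```php
<?php
// String as returned by textContent
$usdrate = '1.77400';

// 5 decimals, comma as decimal separator, space as thousands separator
$formatted = number_format((float) $usdrate, 5, ',', ' ');
echo $formatted, PHP_EOL; // 1,77400

// A larger, made-up value shows the thousands separator at work
$large = number_format(1234567.891, 2, ',', ' ');
echo $large, PHP_EOL; // 1 234 567,89
```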
I hope someone finds this as helpful as I found the answers above, and that it saves them time searching for the correct query and syntax.