Hello, I found a few and tried a few, but nothing really works for me. The best one I found was able to extract the title of the page, but there are many title tags on the page and it extrac
Try this solution:
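$text = file_get_contents("http://www.example.com");
// Non-greedy match of every <title>...</title> pair; "i" ignores case, "s" lets "." span newlines
preg_match_all('/<title>.*?<\/title>/is', $text, $matches);
// $matches[0] holds the full matches, tags included
foreach($matches[0] as $m)
{
echo htmlentities($m)."<br />\n";
}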
For example:
// input text
$text = <<<EOT
<title>Lorem ipsum dolor</title>
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua.
Ut enim <title>ad minim</title> veniam,
quis nostrud exercitation ullamco laboris nisi ut
aliquip <title>ex ea</title> commodo consequat.
EOT;
// solution
// $matches[1] holds only the captured text between the tags
preg_match_all('/<title>(.+?)<\/title>/is', $text, $matches);
foreach($matches[1] as $m)
{
echo htmlentities($m)."<br />\n";
}
Output:
Lorem ipsum dolor
ad minim
ex ea
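If you would rather get the titles back as an array than echo them, the regular-expression approach can be wrapped in a small helper. This is only a sketch: the function name extract_titles and the sample string are made up for illustration, and the pattern additionally tolerates attributes on the opening tag, which the snippets above do not.

```php
<?php
// Hypothetical helper: returns the text inside every <title>...</title> pair.
function extract_titles(string $html): array
{
    // (.*?) is non-greedy, so each pair is matched separately;
    // "i" ignores case and "s" lets "." span line breaks.
    if (!preg_match_all('/<title\b[^>]*>(.*?)<\/title>/is', $html, $matches)) {
        return [];
    }
    // $matches[1] holds only the captured text, without the tags.
    return array_map('trim', $matches[1]);
}

// Made-up sample input with two title tags.
$sample = "<title>Lorem ipsum dolor</title> sit amet <TITLE>\n ad minim </TITLE>";
print_r(extract_titles($sample));
```

Keep in mind that a regex only sees text, so it will also pick up title tags inside comments or scripts.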
POST UPDATED (to reflect the changes in the question).
For example, suppose you want to load an "a.html" file:
Lorem ipsum dolor
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua.
Then, you have to write the script as follows:
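<?php
$dom = new DOMDocument();
// load() parses the file as XML, so it must be well-formed;
// loadHTMLFile() is more forgiving with ordinary HTML
$dom->load('a.html');
// Print the title attribute of every <a> element
foreach ($dom->getElementsByTagName('a') as $tag) {
echo $tag->getAttribute('title')."<br />\n";
}
?>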
This outputs:
Ravellavegas.com Analysis
Articlesiteslist.com Analysis
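If the markup comes from a string (for example from file_get_contents) and is not well-formed, a variation along these lines may be more robust. It is a sketch under assumptions: the helper name anchor_titles and the sample markup are invented for illustration, and it uses loadHTML() instead of the load() call shown above.

```php
<?php
// Hypothetical helper: collects the title attribute of every <a> element.
function anchor_titles(string $html): array
{
    // Real-world HTML is rarely well-formed; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    foreach ($dom->getElementsByTagName('a') as $tag) {
        // Skip links that have no title attribute.
        if ($tag->hasAttribute('title')) {
            $titles[] = $tag->getAttribute('title');
        }
    }
    return $titles;
}

// Made-up sample markup: two links with a title, one without.
$html = '<p><a href="#" title="First link">one</a> <a href="#">two</a> '
      . '<a href="#" title="Second link">three</a></p>';
print_r(anchor_titles($html));
```

Returning an array instead of echoing leaves the formatting (line breaks, escaping with htmlentities) to the caller.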