PHP script that can extract text between multiple title tags of a certain website?

独厮守ぢ 2021-01-16 18:28

Hello, I found a few scripts and tried a few, but nothing really works for me. The best one I found was able to extract the title of the page, but there are many title tags on the page and it extrac

4 answers

臣服心动 (original poster)

2021-01-16 19:18

Try this solution

<pre><code>$text = file_get_contents("http://www.example.com");
// Match every <title>...</title> pair; /i ignores case, /s lets . span newlines
preg_match_all('/<title>.*?<\/title>/is', $text, $matches);
foreach($matches[0] as $m)
{
    echo htmlentities($m)."<br />";
}
</code></pre>

<p>For example:</p>

<pre><code>// input text
$text = <<<EOT
<title>Lorem ipsum dolor</title>
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua.
Ut enim <title>ad minim</title> veniam,
quis nostrud exercitation ullamco laboris nisi ut
aliquip <title>ex ea</title> commodo consequat.
EOT;

// solution
preg_match_all('/<title>(.+?)<\/title>/is', $text, $matches);
foreach($matches[0] as $m)
{
    echo htmlentities($m)."<br />";
}
</code></pre>

<p>Output:</p>

<pre><code><title>Lorem ipsum dolor</title>
<title>ad minim</title>
<title>ex ea</title>
</code></pre>
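
If you only want the text between the tags rather than the whole match, the capture group can be read from `$matches[1]`. The following is a minimal sketch, assuming the tags are plain `<title>` elements; the helper name `extractTitleTexts` is made up for illustration and is not part of the answer above.

```php
<?php
// Hypothetical helper: returns the text found between each <title>...</title> pair.
function extractTitleTexts(string $html): array
{
    // Group 1 captures the inner text; /i ignores case, /s lets . span newlines.
    preg_match_all('/<title\b[^>]*>(.*?)<\/title>/is', $html, $matches);
    // $matches[1] holds only the captured groups; trim stray whitespace from each.
    return array_map('trim', $matches[1]);
}

$text = "<title>Lorem ipsum dolor</title> sit amet, <title>ad minim</title> veniam, <title>ex ea</title> commodo.";

foreach (extractTitleTexts($text) as $title) {
    echo htmlentities($title), "\n";
}
```

Reading `$matches[1]` instead of `$matches[0]` drops the surrounding tags, so no extra string stripping is needed afterwards.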

POST UPDATED (to reflect the changes in the question).

For example, say you want to load a file named "a.html":



Lorem ipsum dolor

sit amet, consectetur adipisicing elit, sed do eiusmod tempor

incididunt ut labore et dolore magna aliqua.

Then, you have to write the script as follows:

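<pre><code><?php
$dom = new DOMDocument();
// load() parses the file as XML; for loosely formed HTML use loadHTMLFile() instead
$dom->load('a.html');

foreach ($dom->getElementsByTagName('a') as $tag) {
    // Print the title attribute of each link
    echo $tag->getAttribute('title').'<br />';
}

?>
</code></pre>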

This outputs:

Ravellavegas.com Analysis
Articlesiteslist.com Analysis
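
The same idea can be tried without a file on disk. Below is a rough sketch, assuming the markup arrives as a string and is not necessarily valid XML; the helper name `extractLinkTitles` and the `href` values are placeholders invented for the example, and only the two titles are taken from the output above.

```php
<?php
// Hypothetical helper: collects the title attribute of every <a> element in an HTML string.
function extractLinkTitles(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML often triggers parser warnings; buffer them instead of printing.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    foreach ($dom->getElementsByTagName('a') as $tag) {
        // Skip links that carry no title attribute.
        if ($tag->hasAttribute('title')) {
            $titles[] = $tag->getAttribute('title');
        }
    }
    return $titles;
}

// The href values are placeholders; only the titles come from the example output.
$html = '<p><a href="http://www.example.com/1" title="Ravellavegas.com Analysis">Lorem ipsum dolor</a> '
      . '<a href="http://www.example.com/2" title="Articlesiteslist.com Analysis">sit amet</a></p>';

foreach (extractLinkTitles($html) as $title) {
    echo htmlentities($title), "\n";
}
```

`loadHTML()` is more forgiving than `load()` when the page is not well-formed, which is usually the case for pages fetched with `file_get_contents()`.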
