Question
I have some code that pulls HTML from an external source:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = @simplexml_import_dom($doc); // just to make XPath queries simpler
$images = $xml->xpath('//img');
$sources = array();
Then, if I add all of the sources with this code:
foreach ($images as $i) {
    array_push($sources, $i['src']);
}
echo "<pre>";
print_r($sources);
die();
I get this result:
Array
(
    [0] => SimpleXMLElement Object
        (
            [0] => /images/someimage.gif
        )
    [1] => SimpleXMLElement Object
        (
            [0] => /images/en/someother.jpg
        )
    ....
)
But when I use this code:
foreach ($images as $i) {
    $sources[] = (string)$i['src'];
}
I get this result (which is what is desired):
Array
(
    [0] => /images/someimage.gif
    [1] => /images/en/someother.jpg
    ...
)
What is causing this difference? What is so different about array_push()?
Thanks,
EDIT: While I realize the answers match what I asked (I've accepted one), what I really wanted to know is why both array_push() and the [] notation add a SimpleXMLElement object rather than a string when the value isn't cast. I knew that explicitly casting to a string would give me a string. See the follow-up question here: Why aren't these values being added to my array as strings?
Answer 1:
The difference is not caused by array_push(), but by the type cast you are using in the second case.
In your first loop, you are using:
array_push($sources, $i['src']);
Which means you are adding SimpleXMLElement objects to your array.
Whereas in the second loop, you are using:
$sources[] = (string)$i['src'];
Which means that, thanks to the cast to string, you are adding strings to your array and no longer SimpleXMLElement objects.
For reference, see the relevant section of the manual: Type Casting.
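To make the type difference concrete, here is a minimal sketch that inspects a single attribute before and after the cast. The sample markup in $html is a hypothetical stand-in for the external HTML; the rest follows the code in the question.

```php
<?php
// Hypothetical sample markup standing in for the external HTML.
$html = '<html><body><img src="/images/someimage.gif"><img src="/images/en/someother.jpg"></body></html>';

$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);
$images = $xml->xpath('//img');

// Attribute access on a SimpleXMLElement returns another SimpleXMLElement.
$attr = $images[0]['src'];
var_dump($attr instanceof SimpleXMLElement); // bool(true)

// The (string) cast converts that object to its text content.
$asString = (string) $attr;
var_dump(is_string($asString)); // bool(true)
echo $asString, PHP_EOL;        // /images/someimage.gif
```

Both array_push() and the [] syntax store whatever value they are given, so the object ends up in the array unless it is cast first.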
Answer 2:
Sorry, I just noticed the better answers above, but the regex itself is still valid. Are you trying to get all images in the HTML markup? I know you are using PHP, but you can convert this C# example to see where to go:
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
List<string> links = new List<string>();
if (!string.IsNullOrEmpty(htmlSource))
{
    // Capture group 1 holds the value of each src attribute.
    string regexImgSrc = @"<img[^>]*?src\s*=\s*[""']?([^'"" >]+?)[ '""][^>]*?>";
    MatchCollection matchesImgSrc = Regex.Matches(htmlSource, regexImgSrc, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    foreach (Match m in matchesImgSrc)
    {
        string href = m.Groups[1].Value;
        links.Add(href);
    }
}
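A rough PHP port of the C# snippet might look like the sketch below. It assumes preg_match_all() with the "i" and "s" modifiers as counterparts of RegexOptions.IgnoreCase and RegexOptions.Singleline; the sample markup in $htmlSource is hypothetical. Regex-based HTML parsing tends to be fragile, so the DOM approach from the question is usually the safer choice.

```php
<?php
// Hypothetical sample markup; $htmlSource mirrors the variable name in the C# snippet.
$htmlSource = '<p><img src="/images/someimage.gif" alt="a"><img alt="b" src=\'/images/en/someother.jpg\'/></p>';

$links = array();
if ($htmlSource !== '') {
    // Same pattern as the C# version, with PHP delimiters and quote escaping.
    $regexImgSrc = '/<img[^>]*?src\s*=\s*["\']?([^\'" >]+?)[ \'"][^>]*?>/is';
    if (preg_match_all($regexImgSrc, $htmlSource, $matches)) {
        // Capture group 1 holds each src value, already as plain strings.
        $links = $matches[1];
    }
}
print_r($links);
```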
Answer 3:
In your first example, you should use:
array_push($sources, (string) $i['src']);
Your second example gives an array of strings because you are converting the SimpleXMLElements to strings using the (string) cast. In your first example you are not, so you get an array of SimpleXMLElements instead.
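As a quick check that the cast is the only thing that matters, the sketch below fills one array with array_push() and another with the [] syntax, casting in both cases, and compares them. The sample markup and the names $pushed and $appended are assumptions made for illustration.

```php
<?php
// Hypothetical sample markup standing in for the external HTML.
$html = '<html><body><img src="/images/someimage.gif"><img src="/images/en/someother.jpg"></body></html>';

$doc = new DOMDocument();
@$doc->loadHTML($html);
$images = simplexml_import_dom($doc)->xpath('//img');

$pushed = array();
$appended = array();
foreach ($images as $i) {
    array_push($pushed, (string) $i['src']); // function call, with cast
    $appended[] = (string) $i['src'];        // bracket syntax, with cast
}

// Both arrays hold the same plain strings in the same order.
var_dump($pushed === $appended); // bool(true)
print_r($pushed);
```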
Source: https://stackoverflow.com/questions/5667526/why-am-i-getting-an-array-of-simplexmlelement-objects-here