Question
I tried find('meta[http-equiv="Content-type"]'), but it failed to retrieve that information.
Answer 1:
SimpleHTMLDom doesn't use quoted string literals in the selector. It's just elem[attr=value]. And the comparison of the value seems to be case-sensitive (there may be a way to make it case-insensitive, but I don't know of one)*
E.g.
require 'simple_html_dom.php';
$html = file_get_html('http://www.google.com/');
// most likely only one element, but foreach doesn't hurt
foreach ($html->find('meta[http-equiv=content-type]') as $ct) {
    echo $ct->content, "\n";
}
prints text/html; charset=ISO-8859-1.
*edit: yes, there is a way to perform a case-insensitive match: use *= instead of =
find('meta[http-equiv*=content-type]')
edit2: btw, that http-equiv*=content-type thingy would also match <meta http-equiv="haha-no-content-types"... (it only tests whether the string occurs somewhere in the attribute's value). But it's the only case-insensitive function/operator I could find. I guess you can live with it in this case ;-)
edit 3: It uses preg_match('.../i') and the pattern/selector is passed directly to that function. Therefore you could do something like http-equiv*=^content-type$ to match http-equiv="Content-type" but not http-equiv="xyzContent-typeabc". But I don't know if this is a guaranteed feature.
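The behaviour described in the edits above can be reproduced with plain preg_match(). This is only a sketch of that behaviour, not the library's actual code: attrContains() is a hypothetical helper that builds a case-insensitive pattern from the selector value without escaping it, which is why regex anchors such as ^ and $ remain active.

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM: it imitates the behaviour
// described above, where the selector value is used as a case-insensitive regex.
function attrContains(string $pattern, string $value): bool
{
    // No preg_quote() on purpose: metacharacters such as ^ and $ stay active.
    return preg_match('/' . $pattern . '/i', $value) === 1;
}

var_dump(attrContains('content-type', 'Content-Type'));          // bool(true)
var_dump(attrContains('content-type', 'haha-no-content-types')); // bool(true)
var_dump(attrContains('^content-type$', 'Content-type'));        // bool(true)
var_dump(attrContains('^content-type$', 'xyzContent-typeabc'));  // bool(false)
```

The second call shows the over-matching mentioned in edit2, and the last two show how anchoring narrows the match to the whole attribute value.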
Answer 2:
The Content-Type is typically part of the HTTP response headers, not the body. Where did you get the XML document from?
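If the value is wanted from the response headers rather than from the markup, a minimal sketch could look like the following. It assumes the header lines are available as an array of strings, which is the shape PHP's get_headers($url) returns. contentTypeFromHeaders() is a hypothetical helper and the sample lines are illustrative data, not a real response.

```php
<?php
// Hypothetical helper: returns the Content-Type value from raw header lines,
// or null when no such header is present.
function contentTypeFromHeaders(array $headerLines): ?string
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (stripos($line, 'Content-Type:') === 0) {
            return trim(substr($line, strlen('Content-Type:')));
        }
    }
    return null;
}

// Illustrative header lines in the shape returned by get_headers($url).
$sample = [
    'HTTP/1.1 200 OK',
    'Date: Mon, 01 Jan 2024 00:00:00 GMT',
    'content-type: text/html; charset=ISO-8859-1',
];

echo contentTypeFromHeaders($sample), "\n"; // text/html; charset=ISO-8859-1
```

For a live page, the array could come from get_headers('http://www.google.com/') instead of the sample data.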
Answer 3:
I would run a foreach over $this->find('meta'); in case content-type is written differently. I think browsers aren't case-sensitive here, while PHP might be.
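A minimal sketch of that loop, with the comparison made case-insensitive explicitly. To keep the example self-contained it uses PHP's bundled DOM extension (DOMDocument) instead of Simple HTML DOM, so it assumes ext-dom is enabled. metaContentType() is a hypothetical helper and the HTML string is sample input.

```php
<?php
// Hypothetical helper built on PHP's bundled DOM extension, not on Simple HTML DOM.
function metaContentType(string $html): ?string
{
    $doc = new DOMDocument();
    // @ suppresses warnings that libxml emits for imperfect real-world markup.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // strcasecmp() returns 0 when both strings are equal ignoring case.
        if (strcasecmp($meta->getAttribute('http-equiv'), 'content-type') === 0) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

$html = '<html><head><meta HTTP-EQUIV="Content-Type" content="text/html; charset=UTF-8"></head><body></body></html>';
echo metaContentType($html), "\n"; // text/html; charset=UTF-8
```

The same strcasecmp() check could be applied inside a foreach over the elements returned by find('meta').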
Source: https://stackoverflow.com/questions/2213675/how-to-get-content-type-using-html-simple-dom