PHP simplexml_load_file fails

*爱你&永不变心* submitted on 2019-12-06 16:44:27

For some reason, the PubMed server is returning that entire XML file as an HTML file with a single <pre> tag containing the XML. The content is also not a single XML document but multiple XML fragments (there are several <PubmedArticle> elements with no container around them). Clearly this is intended to be processed by some wacky custom code.

You could "unwrap" the XML by calling SimpleXML twice, like so:

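// Parse the HTML wrapper; its root element is the single <pre> tag
$outer_xml = simplexml_load_file($local);
if ( $outer_xml === false )
{
    die('Could not parse ' . $local);
}
// The text content of <pre> is the XML fragments; wrap them in one root element
$inner_xml = simplexml_load_string('<dummyContainer>' . (string)$outer_xml . '</dummyContainer>');
foreach ( $inner_xml->PubmedArticle as $article )
{
    // etc
}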

To explain:

  • the outer "XML document" is the HTML, which has a single outer element of <pre>
  • casting that to string (which I've done explicitly with (string) for clarity and good habit) will give you the contents of that <pre> tag, i.e. all the <PubmedArticle> elements
  • wrapping that content in a <dummyContainer> tag will give you a valid XML document, with each of the <PubmedArticle> elements as a top-level child in the document
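
To try the double-parse idea without hitting the PubMed server, here is a minimal self-contained sketch. The $sample string, the unwrap_articles() helper and the flat <PMID> child are all made up for illustration; they only mimic the shape described above (a <pre> wrapper whose text content is escaped XML fragments) and are much simpler than a real PubMed record.

```php
<?php
// Hypothetical sample: a <pre> wrapper containing escaped XML fragments
// with no single root element. Real PubMed records are far more nested.
$sample = '<pre>'
    . '&lt;PubmedArticle&gt;&lt;PMID&gt;111&lt;/PMID&gt;&lt;/PubmedArticle&gt;'
    . '&lt;PubmedArticle&gt;&lt;PMID&gt;222&lt;/PMID&gt;&lt;/PubmedArticle&gt;'
    . '</pre>';

// Hypothetical helper: parse the wrapper, then parse its text content
// inside a dummy root element, and collect one value per article.
function unwrap_articles(string $wrapped): array
{
    $outer = simplexml_load_string($wrapped);
    if ($outer === false) {
        throw new RuntimeException('Outer wrapper is not well-formed XML');
    }

    // Casting to string yields the decoded text inside <pre>
    $inner = simplexml_load_string(
        '<dummyContainer>' . (string)$outer . '</dummyContainer>'
    );
    if ($inner === false) {
        throw new RuntimeException('Unwrapped content is not well-formed XML');
    }

    $ids = [];
    foreach ($inner->PubmedArticle as $article) {
        $ids[] = (string)$article->PMID;
    }
    return $ids;
}

$pmids = unwrap_articles($sample);
print_r($pmids);
```

With the sample above, $pmids ends up holding the two strings '111' and '222'. The explicit checks for false matter because both SimpleXML loaders return false on malformed input rather than throwing.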

Try URL-encoding the query-string values in the URL you pass to simplexml_load_file.

Note:

Libxml 2 unescapes the URI, so if you want to pass e.g. b&c as the URI parameter a, you have to call simplexml_load_file(rawurlencode('http://example.com/?a=' . urlencode('b&c'))). Since PHP 5.1.0 you don't need to do this because PHP will do it for you.

(Quoted from the PHP manual page for simplexml_load_file.)
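
As a rough illustration of the encoding steps the note talks about, the sketch below only builds the URL strings and makes no network request. The example.com URL and the parameter a come from the note; the variable names are assumptions of this sketch.

```php
<?php
// The raw value contains '&', which would otherwise start a new parameter
$value = 'b&c';

// Encode just the parameter value (sufficient on PHP 5.1.0 and later,
// according to the note above)
$url = 'http://example.com/?a=' . urlencode($value);
echo $url, "\n";

// http_build_query() applies the same encoding when there are several parameters
$query = http_build_query(['a' => $value]);
echo $query, "\n";

// The extra wrapping the note describes for PHP versions before 5.1.0:
// the whole URI is encoded once more because libxml2 unescapes it
$legacy = rawurlencode($url);
echo $legacy, "\n";
```

Here $url is http://example.com/?a=b%26c, $query is a=b%26c, and $legacy is http%3A%2F%2Fexample.com%2F%3Fa%3Db%2526c. On current PHP versions you would pass $url, not $legacy, to simplexml_load_file.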
