Use DOM and XPath to remove a node from a sitemap file

孤城傲影 2020-12-20 07:41

I am trying to develop a function that removes certain URL nodes from my sitemap file. Here is what I have so far.

$xpath = new DOMXpath($DOMfile);
$elements         


        
1 answer
  • 2020-12-20 07:55

    The XML of a sitemap should look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
    <loc></loc>
    ...
    </url>
    <url>
    <loc></loc>
    ...
    </url>
    ...
    </urlset>
    

    Since the document declares a default namespace, the query is a little more complicated than in my previous answer:

    $xpath = new DOMXpath($DOMfile);
    // Register the sitemap namespace under a short prefix ("sm")
    $xpath->registerNamespace('sm', "http://www.sitemaps.org/schemas/sitemap/0.9");
    // Select every <url> whose <loc> equals $pageUrl
    // (this breaks if $pageUrl itself contains a double quote)
    $elements = $xpath->query('/sm:urlset/sm:url[sm:loc = "'.$pageUrl.'"]');
    
    foreach($elements as $element){
        // A node is removed through its parent (a hint from the manual comments)
        $element->parentNode->removeChild($element);
    }
    echo $DOMfile->saveXML();
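
    To try the namespaced query end to end, here is a minimal self-contained sketch. The inline sample sitemap, the example.com URLs and the xpathLiteral() helper are assumptions made up for illustration, not part of the original question. The helper is there because concatenating $pageUrl straight into the expression breaks as soon as the URL contains a double quote.

    ```php
    <?php
    // Hypothetical sample data: a two-entry sitemap built inline for the demo.
    $xml = '<?xml version="1.0" encoding="UTF-8"?>'
         . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
         . '<url><loc>https://example.com/keep</loc></url>'
         . '<url><loc>https://example.com/remove</loc></url>'
         . '</urlset>';
    $pageUrl = 'https://example.com/remove';

    $DOMfile = new DOMDocument();
    $DOMfile->loadXML($xml);

    $xpath = new DOMXPath($DOMfile);
    $xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    // Hypothetical helper: turn a PHP string into an XPath 1.0 string literal,
    // so quotes inside $pageUrl cannot break the expression.
    function xpathLiteral(string $value): string
    {
        if (strpos($value, '"') === false) {
            return '"' . $value . '"';
        }
        if (strpos($value, "'") === false) {
            return "'" . $value . "'";
        }
        // Both quote types present: split on double quotes and join with concat().
        return 'concat("' . str_replace('"', '", \'"\', "', $value) . '")';
    }

    $elements = $xpath->query(
        '/sm:urlset/sm:url[sm:loc = ' . xpathLiteral($pageUrl) . ']'
    );

    // Copy the matches into an array first, then remove each one via its parent.
    foreach (iterator_to_array($elements) as $element) {
        $element->parentNode->removeChild($element);
    }

    // Print the result; $DOMfile->save($path) would write it to a file instead.
    echo $DOMfile->saveXML();
    ```

    With this sample input, the printed document should contain only the https://example.com/keep entry.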
    

    I'm writing this from memory just before going to bed. If it doesn't work, I'll test it tomorrow morning. (And yes, I'm aware that it could bring some downvotes.)

    If your sitemap has no namespace (it should, but it isn't mandatory, sigh), use this query instead:

    $elements = $xpath->query('/urlset/url[loc = "'.$pageUrl.'"]');
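
    If you cannot know in advance whether a given sitemap declares the namespace, one option is to match on local-name(), which ignores namespaces. This is a different technique from the two queries above, shown only as a sketch: the function name removeSitemapUrl() and the example.com URLs are hypothetical, and the URL comparison is done in PHP (with trim()) rather than inside the XPath expression, so it is slightly more lenient about whitespace than the original queries.

    ```php
    <?php
    // Hypothetical helper: remove every <url> whose <loc> equals $pageUrl,
    // whether or not the document uses the sitemaps.org namespace.
    // Returns the number of <url> elements removed.
    function removeSitemapUrl(DOMDocument $doc, string $pageUrl): int
    {
        $xpath = new DOMXPath($doc);
        // local-name() compares element names without their namespace.
        $locs = $xpath->query(
            '/*[local-name()="urlset"]/*[local-name()="url"]/*[local-name()="loc"]'
        );

        $removed = 0;
        foreach (iterator_to_array($locs) as $loc) {
            // Comparing in PHP avoids quoting $pageUrl inside the XPath string.
            if (trim($loc->textContent) !== $pageUrl) {
                continue;
            }
            $url = $loc->parentNode;
            // Skip a <url> that was already detached by an earlier match.
            if ($url->parentNode !== null) {
                $url->parentNode->removeChild($url);
                $removed++;
            }
        }
        return $removed;
    }

    // Usage on two made-up documents, one with and one without the namespace.
    $withNs = new DOMDocument();
    $withNs->loadXML(
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        . '<url><loc>https://example.com/a</loc></url>'
        . '<url><loc>https://example.com/b</loc></url>'
        . '</urlset>'
    );
    $withoutNs = new DOMDocument();
    $withoutNs->loadXML('<urlset><url><loc>https://example.com/a</loc></url></urlset>');

    $countNs = removeSitemapUrl($withNs, 'https://example.com/a');
    $countPlain = removeSitemapUrl($withoutNs, 'https://example.com/a');
    ```

    The trade-off is that local-name() will also match elements from any other namespace that happen to be called urlset, url or loc, so the explicit registerNamespace() approach is stricter when you control the input.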
    

    There is a concrete, working example here: http://codepad.org/vuGl1MAc
