What is the best practice for parsing remote content with jQuery?

时光说笑 2020-11-27 03:21

Following a jQuery ajax call to retrieve an entire XHTML document, what is the best way to select specific elements from the resulting string? Perhaps there is a library or

10 Answers
  • 2020-11-27 03:38

    Just an idea - tested in FF/Safari - seems to work if you create an iframe to store the document temporarily. Of course, if you are doing this it might be smarter to just use the src property of the iframe to load the document and do whatever you want in the "onload" of it.

      $(function() {
        $.ajax({
          type: 'GET', 
          url: 'result.html',
          dataType: 'html',
          success: function(data) {
            var $frame = $("<iframe src='about:blank'/>").hide();
            $frame.appendTo('body');
            var doc = $frame.get(0).contentWindow.document;
            doc.open();
            doc.write(data);
            doc.close(); // finish parsing before querying the document
            var $title = $("title", doc);
            alert('Title: '+$title.text() );
            $frame.remove();
          }
        });
      });
    

    I had to append the iframe to the body to get it to have a .contentWindow.

  • 2020-11-27 03:39

    How about this: Load XML from string

  • 2020-11-27 03:40
    $.get('yourpage.html',function(data){
        var content = $('<div/>').append(data).find('#yourelement').html();
    });
    

    You can also simply temporarily wrap inside a div. You don't even need to add it to the DOM.

  • 2020-11-27 03:42

    If you wanted to find the value of specifically named fields (i.e. the inputs in a form) something like this would find them for you:

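    // Names of the fields to look up (the full list is elided in the original)
    var fields = ["firstname", "surname", /* ... */ "foo"];
    
    function findFields(form, fields) {
      var $form = $(form);
      fields.forEach(function(field) {
        // Quote the attribute value so unusual field names still match
        var val = $form.find('[name="' + field + '"]').val();
        // ... (use val here; the rest is elided in the original)
      });
    }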
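
    Field names that contain quotes or backslashes can break an attribute selector built by string concatenation. A minimal sketch of a helper that escapes them first; nameSelector is a hypothetical name, not part of jQuery:

```javascript
// Hypothetical helper (not a jQuery API): build a quoted [name="..."] selector.
function nameSelector(field) {
  // Escape backslashes first, then double quotes, so the value stays inside the quotes.
  var escaped = String(field).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return '[name="' + escaped + '"]';
}

console.log(nameSelector('user[email]')); // prints [name="user[email]"]
```

    The result can be passed to find(), e.g. $(form).find(nameSelector(field)).val().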
    
  • 2020-11-27 03:44

    Shamelessly copied and adapted from another of my answers (Simple jQuery ajax example not finding elements in returned HTML), this fetches the HTML of the remote page, then the parseHTML function creates a temporary div element for it and puts the lot inside, runs through it, and returns the requested element. jQuery then alerts the text() inside.

    $(document).ready(function(){
      $('input').click(function(){
        $.ajax({
          type : "POST",
          url : 'ajaxtestload.html',
          dataType : "html",
          success: function(data) {
            alert( data ); // shows whole dom
            var gotcha = parseHTML(data, 'TITLE'); // nodeName property returns uppercase
            if (gotcha) {
              alert($(gotcha).html()); // shows the content of the matched element
            }else{
              alert('Tag not found.');
            }
          },
          error : function() {
            alert("Sorry, The requested property could not be found.");
          }
        });
      });
    });
    
    function parseHTML(html, tagName) {
      var root = document.createElement("div");
      root.innerHTML = html;
      // Get all child nodes of root div
      var allChilds = root.childNodes;
      for (var i = 0; i < allChilds.length; i++) {
        if (allChilds[i].nodeName == tagName) {
          return allChilds[i];
        }
      }
      return false;
    }
    

    To get several items out or a list of script tags, say, I think you'd have to improve the parseHTML function, but hey - proof of concept :-)
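
    One way to make that improvement, as a rough sketch: collect every match instead of returning the first one, and descend into nested children too. findAllByNodeName is a hypothetical name; it only assumes each node exposes nodeName and childNodes, as DOM nodes do. The plain objects below stand in for DOM nodes so the sketch runs outside a browser.

```javascript
// Hypothetical helper: return every descendant whose nodeName matches tagName.
function findAllByNodeName(node, tagName) {
  var wanted = tagName.toUpperCase();
  var found = [];
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (String(child.nodeName).toUpperCase() === wanted) {
      found.push(child);
    }
    // Recurse so matches nested deeper in the tree are collected as well.
    found = found.concat(findAllByNodeName(child, tagName));
  }
  return found;
}

// Stand-in for the temporary root div after innerHTML has been set.
var fakeRoot = { nodeName: 'DIV', childNodes: [
  { nodeName: 'TITLE', childNodes: [] },
  { nodeName: 'SCRIPT', childNodes: [] },
  { nodeName: 'DIV', childNodes: [ { nodeName: 'SCRIPT', childNodes: [] } ] }
]};

console.log(findAllByNodeName(fakeRoot, 'script').length); // prints 2
```

    In the browser, the root div built in parseHTML could be passed in place of fakeRoot.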

  • 2020-11-27 03:47

    This works. I just split up the building blocks for better readability.

    Check the explanation and the inline comments to grasp the workings of this and why it has to be made like this.

    Of course, this can't be used to retrieve cross-domain content. For that, you either have to proxy the calls through a script of your own or think about integrating something like flXHR (Cross-Domain Ajax with Flash).
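
    To tell in advance whether a request would be cross-domain, the two origins (scheme, host, and port) can be compared. A minimal sketch, assuming an environment that provides the standard URL constructor, which the browsers tested in this answer do not; isSameOrigin and the example.com addresses are hypothetical:

```javascript
// Hypothetical helper: true when targetUrl shares scheme, host, and port with pageUrl.
function isSameOrigin(targetUrl, pageUrl) {
  var page = new URL(pageUrl);
  // The second argument resolves relative URLs such as 'test.html' against the page.
  var target = new URL(targetUrl, pageUrl);
  return target.origin === page.origin;
}

console.log(isSameOrigin('test.html', 'http://example.com/call.html')); // prints true
console.log(isSameOrigin('http://other.example/test.html', 'http://example.com/call.html')); // prints false
```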

    call.html

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8" />
        <title>asd</title>
        <script src="jquery.js" type="text/javascript"></script>
        <script src="xmlDoc.js" type="text/javascript"></script>
        <script src="output.js" type="text/javascript"></script>
        <script src="ready.js" type="text/javascript"></script>
      </head>
      <body>
        <div>
          <input type="button" id="getit" value="GetIt" />
        </div>
      </body>
    </html>
    

    jquery.js is jQuery 1.3.2 (uncompressed); test.html is a valid XHTML document.

    xmlDoc.js

    // helper function to create XMLDocument out of a string
    jQuery.createXMLDocument = function( s ) {
      var xmlDoc;
      // Is it IE?
      if ( window.ActiveXObject ) {
        xmlDoc = new ActiveXObject('Microsoft.XMLDOM');
        xmlDoc.async = false;
        // prevent errors as IE tries to resolve the URL in the DOCTYPE
        xmlDoc.resolveExternals = false;
        xmlDoc.validateOnParse = false;
        xmlDoc.loadXML(s);
      } else {
        // non IE. give me DOMParser
        // theoretically this else branch should never be called
        // but just in case.
        xmlDoc = ( new DOMParser() ).parseFromString( s, "text/xml" );
      }
      return xmlDoc;
    };
    

    output.js

    // Output the title of the loaded page
    // And get the script-tags and output either the
    // src attribute or code
    function headerData(data) {
      // give me the head element
      var x = jQuery("head", data).eq(0);
      // output title
      alert(jQuery("title", x).eq(0).text());
      // for every script tag that includes a file, output its src
      jQuery("script[src]", x).each(function(index) {
        alert((index+1)+" "+jQuery.attr(this, 'src'));
      });
      // for every inline script tag, output its code
      jQuery("script:not([src])", x).each(function(index) {
        alert(this.text);
      });
    }
    

    ready.js

    $(document).ready(function() {
      $('#getit').click(function() {
        $.ajax({
          type : "GET",
          url : 'test.html',
          dataType : "xml",
          // override the content type returned by the server to ensure
          // the response gets treated as xml
          beforeSend: function(xhr) {
            // IE doesn't support this so check before using
            if (xhr.overrideMimeType) {
              xhr.overrideMimeType('text/xml');
            }
          },
          success: function(data) {
            headerData(data);
          },
          error : function(xhr, textStatus, errorThrown) {
            // if loading the response as xml failed try it manually
            // in theory this should only happen for IE
            // maybe some
            if (textStatus == 'parsererror') {
              var xmlDoc = jQuery.createXMLDocument(xhr.responseText);
              headerData(xmlDoc);
            } else {
              alert("Failed: " + textStatus + " " + errorThrown);
            }
          }
        });
      });
    });
    

    In Opera the whole thing works without the createXMLDocument and the beforeSend function.

    The extra trickery is needed for Firefox (3.0.11) and IE6 (I can't test IE7, IE8, or other browsers), as they have a problem when the Content-Type returned by the server doesn't indicate that the response is XML. My web server returned Content-Type: text/html; charset=UTF-8 for test.html. In those two browsers, jQuery called the error callback with textStatus set to parsererror. The cause is line 3706 of jquery.js:

    data = xml ? xhr.responseXML : xhr.responseText;
    

    Here data is set to null, because in FF and IE xhr.responseXML is null. Unlike Opera, they don't recognize the returned data as XML, so only xhr.responseText is populated, with the whole XHTML source. Because data is null, line 3708

    if ( xml && data.documentElement.tagName == "parsererror" )
    

    throws an exception, which is caught in line 3584, and status is set to parsererror.

    In FF I can solve the problem by using the overrideMimeType() function before sending the request.

    But IE doesn't support that function on the XMLHttpRequest-object so I have to generate the XMLDocument myself if the error-callback is run and the error is parsererror.
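
    The behavior described above hinges on whether the Content-Type header names an XML media type. A rough sketch of such a check; isXmlContentType is a hypothetical helper, and treating text/xml, application/xml, and any type ending in +xml as XML is an assumption rather than what every browser actually does:

```javascript
// Hypothetical helper: does a Content-Type header value denote an XML media type?
function isXmlContentType(contentType) {
  // Drop parameters such as "; charset=UTF-8" and normalize case.
  var mediaType = String(contentType || '').split(';')[0].trim().toLowerCase();
  return mediaType === 'text/xml' ||
         mediaType === 'application/xml' ||
         /\+xml$/.test(mediaType);
}

console.log(isXmlContentType('text/html; charset=UTF-8')); // prints false
console.log(isXmlContentType('application/xhtml+xml'));    // prints true
```

    In a callback, the header value could be read with xhr.getResponseHeader('Content-Type') to decide whether to parse xhr.responseText manually.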

    example for test.html

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8" />
        <title>Plugins | jQuery Plugins</title>
        <script type="text/javascript" src="jquery.js"></script>
        <script type="text/javascript">var imagePath = '/content/img/so/';</script>
      </head>
      <body>
      </body>
    </html>
    