I'm making an ajax call to fetch content and append this content like this:
$(function(){
    var site = $('input').val();
    $.get('file.php', { site: site }, function(data){
        var mas = $(data).find('a');
        // jQuery's .map() callback receives (index, element), in that order
        mas.map(function(index, elem) {
            var divs = $(this).html();
            $('#result').append('' + divs + '');
        });
    }, 'html');
});
The problem is that when I change 'a' to 'body' I get nothing (no error, just no HTML). I'm assuming body is a tag just like 'a' is? What am I doing wrong?
So this works for me:
mas = $(data).find('a');
But this doesn't:
mas = $(data).find('body');
Parsing the returned HTML through a jQuery object (i.e. $(data)) in order to get the body tag is doomed to fail, I'm afraid.
The reason is that the returned data is a string (try console.log(typeof(data))). Now, according to the jQuery documentation, when creating a jQuery object from a string containing complex HTML markup, tags such as body are likely to get stripped. This happens because, in order to create the object, jQuery hands the markup to the browser's innerHTML mechanism on a newly created element, and an element's content cannot include document-level tags such as html, head, or body.
Relevant quote from the documentation:
If a string is passed as the parameter to $(), jQuery examines the string to see if it looks like HTML.
[...] If the HTML is more complex than a single tag without attributes, as it is in the above example, the actual creation of the elements is handled by the browser's innerHTML mechanism. In most cases, jQuery creates a new element and sets the innerHTML property of the element to the HTML snippet that was passed in. When the parameter has a single tag (with optional closing tag or quick-closing) — $( "<img />" ) or $( "<img>" ), $( "<a></a>" ) or $( "<a>" ) — jQuery creates the element using the native JavaScript createElement() function.
When passing in complex HTML, some browsers may not generate a DOM that exactly replicates the HTML source provided. As mentioned, jQuery uses the browser's .innerHTML property to parse the passed HTML and insert it into the current document. During this process, some browsers filter out certain elements such as <html>, <title>, or <head> elements. As a result, the elements inserted may not be representative of the original string passed.
I ended up with this simple solution:
var body = data.substring(data.indexOf("<body>")+6,data.indexOf("</body>"));
$('body').html(body);
This also works with head or any other tag, as long as the offset is adjusted (+6 is the length of "<body>").
(A solution with XML parsing would be nicer, but with a response that is not valid XML you have to do some "string parsing".)
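One caveat with the substring approach above: it looks for the literal text "<body>", so an opening tag with attributes (for example <body class="home">) is not found. Below is a minimal sketch of a slightly more tolerant variant that still uses only indexOf and substring. The name extractBodyContent is made up for illustration, and the sketch assumes lowercase tag names and no ">" character inside attribute values.

```javascript
// Hypothetical helper (not part of jQuery): returns the markup between
// the opening <body ...> tag and the closing </body> tag, or null.
// Assumes lowercase tags and no ">" inside attribute values.
function extractBodyContent(data) {
  var openStart = data.indexOf('<body');
  if (openStart === -1) return null;

  // End of the opening tag, wherever its attributes stop
  var openEnd = data.indexOf('>', openStart);
  if (openEnd === -1) return null;

  var closeStart = data.indexOf('</body>', openEnd);
  if (closeStart === -1) return null;

  return data.substring(openEnd + 1, closeStart);
}

var sample = '<html><head><title>T</title></head>' +
             '<body class="home"><p>Hello</p></body></html>';
console.log(extractBodyContent(sample)); // <p>Hello</p>
```

In the original snippet this would replace the substring expression, e.g. $('body').html(extractBodyContent(data)), after checking for a null result.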
I experimented a little and have narrowed down the cause to a point, so pending a real answer, which I would be interested in, here is a hack to help understand the issue:
$.get('/', function(d){
    // replace the `HTML` tags with `NOTHTML` tags
    // and the `BODY` tags with `NOTBODY` tags
    // (String.prototype.replace takes only a pattern and a replacement)
    d = d.replace(/(<\/?)html( .+?)?>/gi, '$1NOTHTML$2>');
    d = d.replace(/(<\/?)body( .+?)?>/gi, '$1NOTBODY$2>');
    // select the `notbody` tag and log for testing
    console.log($(d).find('notbody').html());
});
Edit: further experimentation
It seems it is possible if you load the content into an iframe; you can then access the frame content through the DOM object hierarchy:
// get a page using AJAX
$.get('/', function(d){
    // create a temporary `iframe`, make it hidden, and attach to the DOM
    var frame = $('<iframe id="frame" src="/" style="display: none;"></iframe>').appendTo('body');
    // check that the frame has loaded content
    // (.on('load', ...) replaces the .load() event shorthand, which was removed in jQuery 3)
    frame.on('load', function(){
        // grab the HTML from the body, using the raw DOM node (frame[0])
        // and more specifically, its `contentDocument` property
        var html = $('body', frame[0].contentDocument).html();
        // check the HTML
        console.log(html);
        // remove the temporary iframe
        frame.remove();
    });
});
Edit: more research
It seems that contentDocument is the standards-compliant way to get hold of the window.document element of an iframe, but of course IE doesn't really care for standards, so this is how to get a reference to the iframe's window.document.body object in a cross-browser way:
var iframeDoc = iframe.contentDocument || iframe.contentWindow.document;
var iframeBody = iframeDoc.body;
// or for extra caution, to support even more obsolete browsers
// var iframeBody = iframeDoc.getElementsByTagName("body")[0]
I FIGURED OUT SOMETHING WONDERFUL (I think!)
Got your html as a string?
var results = //probably an ajax response
Here's a jQuery object that will work exactly like the elements currently attached to the DOM:
var superConvenient = $($.parseXML(results)).children('html');
Nothing will be stripped from superConvenient! You can do stuff like superConvenient.find('body') or even
superConvenient.find('head > script');
superConvenient works exactly like the jQuery elements everyone is used to!
NOTE
In this case the string results needs to be valid XML because it is fed to jQuery's parseXML method. A common feature of an HTML response is a <!DOCTYPE> declaration, which can invalidate the document in this sense. <!DOCTYPE> declarations may need to be stripped before using this approach! Also watch out for features such as <!--[if IE 8]>...<![endif]-->, tags without closing tags, e.g.:
<ul>
<li>content...
<li>content...
<li>content...
</ul>
... and any other features of HTML that will be interpreted leniently by browsers, but will crash the XML parser.
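Part of the clean-up described in this note can be sketched as a small pre-processing step. The helper name stripXmlHostileMarkup is made up for illustration. It only removes DOCTYPE declarations and comments (which covers conditional comments too), on the assumption that the DOCTYPE has no internal subset containing ">". It cannot repair unclosed tags such as the <li> example, so the XML parser may still reject the result.

```javascript
// Hypothetical pre-processing helper: removes the pieces of an HTML
// response that most commonly trip up an XML parser.
function stripXmlHostileMarkup(html) {
  return html
    // drop <!DOCTYPE ...> declarations (assumes no ">" inside them)
    .replace(/<!DOCTYPE[^>]*>/gi, '')
    // drop all comments, including <!--[if IE 8]>...<![endif]-->
    .replace(/<!--[\s\S]*?-->/g, '')
    .trim();
}

var raw = '<!DOCTYPE html>\n<html><head>' +
          '<!--[if IE 8]><link rel="stylesheet" href="ie8.css"/><![endif]-->' +
          '</head><body><p>x</p></body></html>';
console.log(stripXmlHostileMarkup(raw));
// <html><head></head><body><p>x</p></body></html>
```

The cleaned string could then be passed on, e.g. $.parseXML(stripXmlHostileMarkup(results)), ideally inside a try/catch since parseXML throws on input that is still not well-formed.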
Regex solution that worked for me:
var head = res.match(/<head.*?>.*?<\/head.*?>/s);
var body = res.match(/<body.*?>.*?<\/body.*?>/s);
Detailed explanation: https://regex101.com/r/kFkNeI/1
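A usage note on the patterns above: String.prototype.match returns an array (or null when nothing matches), and element 0 of that array includes the tags themselves. Here is a sketch of a variant that captures only the inner content. The names innerOf and sample are made up for illustration. It uses [\s\S] rather than the s flag, which older JavaScript engines do not support, and \b so that "<head" does not also match "<header>".

```javascript
// Hypothetical helper: returns the content between <tagName ...> and
// </tagName>, or null if the tag is not found.
function innerOf(html, tagName) {
  var re = new RegExp(
    '<' + tagName + '\\b[^>]*>([\\s\\S]*?)<\\/' + tagName + '\\s*>', 'i');
  var m = html.match(re);
  // m[0] is the whole match (tags included); m[1] is the captured content
  return m ? m[1] : null;
}

var sample = '<html><head><title>Demo</title></head>\n' +
             '<body id="main">\n<p>Hi</p>\n</body></html>';

console.log(innerOf(sample, 'head'));                 // <title>Demo</title>
console.log(JSON.stringify(innerOf(sample, 'body'))); // "\n<p>Hi</p>\n"
console.log(innerOf(sample, 'footer'));               // null
```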
Source: https://stackoverflow.com/questions/14423257/find-body-tag-in-an-ajax-html-response