Extracting Metadata from Website

Submitted by 元气小坏坏 on 2019-12-20 05:31:33

Question


I was wondering whether there is a way in JavaScript to process HTML source code and pull out the specific tags that I want.

Sorry if this sounds too easy or simple. I am new to programming.


Answer 1:


Use the DOM. It lets you pull data out of a web page if you know the page's structure.
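As a rough illustration of what "using the DOM" can look like for metadata, here is a minimal sketch. The function name collectMeta is made up for this example, and it assumes the metadata you want lives in <meta> tags that carry a name or property attribute plus a content attribute, which may not match every page.

```javascript
// Collect name/property -> content pairs from the <meta> tags of a document.
// `doc` is assumed to be a Document (for example, the page's `document`).
function collectMeta(doc) {
  var result = {};
  var tags = doc.querySelectorAll('meta');
  for (var i = 0; i < tags.length; i++) {
    // Standard tags use `name`; Open Graph style tags use `property`.
    var key = tags[i].getAttribute('name') || tags[i].getAttribute('property');
    var value = tags[i].getAttribute('content');
    // Skip tags such as <meta charset="..."> that have no key/content pair.
    if (key && value !== null) {
      result[key] = value;
    }
  }
  return result;
}

// In a browser, pass in the current page's document.
if (typeof document !== 'undefined') {
  console.log(collectMeta(document));
}
```

The same pattern works for other tags: change the selector passed to querySelectorAll and read whichever attributes or text you need.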




Answer 2:


If you have the HTML in a string, then you can use:

var str = '<p>Hello</p>'; // your HTML text goes here
var div = document.createElement('div');
div.innerHTML = str;      // note: <html>, <head> and <body> tags are dropped here
var dom = div.firstChild; // dom is the first parsed node (null if nothing was parsed);
                          // you can manipulate it using standard DOM methods
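If the string holds a whole document rather than a fragment, the browser's built-in DOMParser keeps the <head> and <body> structure. The following is only a sketch: the helper name extractTags and the sample string are invented for illustration, and the parser is passed in as an argument so the function does not depend on a global.

```javascript
// Parse a full HTML document string and return the outerHTML of each match.
// `parser` is expected to be a DOMParser instance (a browser built-in).
function extractTags(html, selector, parser) {
  var doc = parser.parseFromString(html, 'text/html');
  var matches = doc.querySelectorAll(selector);
  var out = [];
  for (var i = 0; i < matches.length; i++) {
    out.push(matches[i].outerHTML);
  }
  return out;
}

// Browser usage with a made-up sample document.
if (typeof DOMParser !== 'undefined') {
  var sample = '<html><head><title>Demo</title></head><body><p>Hi</p></body></html>';
  console.log(extractTags(sample, 'title', new DOMParser()));
}
```

Returning strings keeps the example simple; if you want to keep working with the elements themselves, return the matched nodes instead of their outerHTML.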

Alternatively, use jQuery, a library that makes it easier to access and manipulate HTML elements. First, add this to the head of your document:

<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js"></script>

This is a reference to the jQuery library. Then, do:

var foo = $("<div>Your html here</div>");

Or, if your HTML is in a variable (e.g. str), wrap it in a container element so that everything inside it can be searched:

var foo = $("<div>").html(str);

Then, you can manipulate and parse foo in a number of ways. For example, to remove all paragraph elements inside it, you would use

foo.find('p').remove();

Or, to remove the paragraph element with id="bar", use:

foo.find('p#bar').remove();

Once you are done with your modifications, you can get the new HTML text using:

foo.html();

Why is your html in a string? Is it not the html of the current page?



Source: https://stackoverflow.com/questions/6376778/extracting-metadata-from-website
