Question
I'm trying to scrape text from a website but can't seem to extract anything.
The page structure and my code are below.
My code:
const rp = require("request-promise");
const $ = require("cheerio");
const url = "xx";
rp(url)
  .then(function(html) {
    //success!
    let token = "ce-bodytext";
    console.log($(token, response).length);
    console.log($(token, html)).text;
  })
  .catch(function(err) {
    console.log(JSON.stringify(err));
  });
I only need the text, but the tag has no id. I was also hoping that ce-bodytext would extract all the values in order, but all I get is empty output:
{}
How do I extract just the text shown in the image?
Answer 1:
Try this:
let token = ".ce-bodytext>p>strong>font>font";
console.log($(token, html).text());
Answer 2:
ce-bodytext is a class; you forgot to add a . before it:
const token = '.ce-bodytext';
It will at least fix the empty output.
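One more thing worth checking, as the code is posted: the {} may not come from the selector at all. Inside the then callback the variable response is never declared (the parameter is called html), so reading it throws a ReferenceError, the catch handler runs, and JSON.stringify on an Error object yields {} because name, message and stack are not enumerable properties. A minimal sketch that reproduces this without any network request (describeError and report are hypothetical names):

```javascript
// Collect the fields that JSON.stringify(err) leaves out.
function describeError(err) {
  return {
    asJson: JSON.stringify(err), // what the question's catch handler prints
    name: err.name,
    message: err.message,
  };
}

let report;
try {
  // `response` is intentionally undeclared, mirroring the question's callback.
  console.log(response.length);
} catch (err) {
  report = describeError(err);
}

console.log(report.asJson);
console.log(report.name + ": " + report.message);
```

Logging err.message or err.stack in the catch handler, instead of JSON.stringify(err), should make this kind of failure visible.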
Source: https://stackoverflow.com/questions/57053478/scrape-website-using-nodejs-cheerio-deep-nested-element-tags