Puppeteer returning empty object

﹥>﹥吖頭↗ posted on 2021-02-05 07:26:48

Question


When I run the following code in the console of the page I'm trying to scrape, it returns the element:

document.querySelector('#sb-site > div.sticky_footer > div:nth-child(9)')

However, when I run it in my program and log the result, I get '{}':

const inputContent = await page.evaluate(() => {
  return document.querySelector('#sb-site > div.sticky_footer > div:nth-child(9)');
});

Answer 1:


Puppeteer can transfer two types of data between Node.js and the browser context: serializable data (i.e. data supported by JSON.stringify()/JSON.parse()) and handles to JavaScript objects, including DOM elements (JSHandle and ElementHandle). The latter have a slightly more complicated API (see the JSHandle and ElementHandle methods, or the methods that mention them).
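
As a minimal sketch of the handle-based route described above, the helper below gets an ElementHandle with page.$() and reads one value through it. The name readOuterHtml is hypothetical, and page is assumed to be an already-open Puppeteer Page; none of this comes from the asker's code.

```javascript
// Hypothetical helper: `page` is assumed to be an open Puppeteer Page.
async function readOuterHtml(page, selector) {
  // page.$() resolves to an ElementHandle, or null when nothing matches.
  const handle = await page.$(selector);
  if (handle === null) return null;
  try {
    // The callback runs in the browser; only its string result crosses to Node.js.
    return await handle.evaluate((el) => el.outerHTML);
  } finally {
    // Release the handle so the browser can garbage-collect the element.
    await handle.dispose();
  }
}
```

The handle itself never gets serialized; it stays a reference to the element in the page, which is why it survives the trip when a plain return value from page.evaluate() does not.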

page.evaluate() can only transfer serializable data; for a non-serializable return value it yields undefined or an empty object instead. DOM elements are not serializable because they contain circular references and methods.

So if you only need some text or element attributes, do most of the processing in the browser context and return only serializable data.
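
One way to follow that advice, sketched under assumptions: describeElement and scrapeElement are hypothetical names, page is assumed to be an open Puppeteer Page, and the fields returned are just an illustrative choice.

```javascript
// Runs in the browser context when passed to page.evaluate(), so it may use `document`.
// It returns a plain object of strings instead of the element itself.
function describeElement(selector) {
  const el = document.querySelector(selector);
  if (!el) return null; // null is serializable and signals "no match"

  // Copy the attributes into a plain object of name/value strings.
  const attributes = {};
  for (const attr of Array.from(el.attributes)) {
    attributes[attr.name] = attr.value;
  }

  return {
    tag: el.tagName.toLowerCase(),
    text: el.textContent.trim(),
    html: el.outerHTML,
    attributes,
  };
}

// Node.js side: extra arguments to page.evaluate() are passed to the page function.
async function scrapeElement(page, selector) {
  return page.evaluate(describeElement, selector);
}
```

For the question's case this would be called as scrapeElement(page, '#sb-site > div.sticky_footer > div:nth-child(9)'), and the result can be logged or stored like any other plain object.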




Answer 2:


Make sure the page loads completely before scraping.

await page.goto(url, {waitUntil: 'networkidle0'})

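Also, according to the docs, page.evaluate() returns a promise that resolves to the return value of the function you pass in; it does not return a DOM element.

If that function returns a DOM element, logging the resolved value prints {}; otherwise it prints whatever value the promise resolves to.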
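
Putting those two points together, a rough sketch might look like the following. The name loadAndReadText is hypothetical, page is assumed to be an open Puppeteer Page, and the extra page.waitForSelector() call is an addition of this sketch rather than something the answer prescribes.

```javascript
// Hypothetical helper: load the page, wait for the target, then return its text.
async function loadAndReadText(page, url, selector) {
  // Resolve only after the network has been idle, as suggested above.
  await page.goto(url, { waitUntil: 'networkidle0' });

  // Extra guard (an assumption of this sketch): wait until the element exists.
  await page.waitForSelector(selector);

  // page.$eval() runs the callback on the matched element in the browser
  // and resolves to its serializable return value, here a string.
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

Because every step is awaited, the caller receives the resolved string rather than a pending promise or an empty object.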




Answer 3:


In your case you're trying to select a custom DOM element injected into the page, which leads to some strange behavior when using the nth-child() CSS selector. Try targeting the DOM node directly instead. For example, say you were trying to get a similar element at https://wefunder.com/chattanoogafc

You can do:

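const inputContent = await page.evaluate(() => {
  // Index into the footer's child divs directly instead of relying on :nth-child()
  const element = document.querySelectorAll("#sb-site > div.sticky_footer > div")[3].querySelectorAll("*")[0];
  // The attribute value is a string, so it can be returned to Node.js
  return element.getAttribute("company-json");
});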

console.log("test:" + inputContent);

That should return the JSON you want as a string. You can then parse it with JSON.parse(inputContent).
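
Since getAttribute() returns null when the attribute is missing, and the string may not be valid JSON, a defensive parse can help. This is only a sketch; parseJsonAttribute is a hypothetical helper name.

```javascript
// Hypothetical helper: parse an attribute value that is expected to hold JSON.
// Returns null when the value is missing or is not valid JSON.
function parseJsonAttribute(raw) {
  if (typeof raw !== 'string') return null; // e.g. getAttribute() returned null
  try {
    return JSON.parse(raw);
  } catch {
    return null; // malformed JSON in the attribute
  }
}
```

It would be used as const company = parseJsonAttribute(inputContent); followed by a null check before reading any fields.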



Source: https://stackoverflow.com/questions/55017057/puppeteer-returning-empty-object
