cheerio

Extract public posts from Facebook page without API/APP key/token/secret

怎甘沉沦 submitted on 2019-12-10 11:37:51
Question: Just to clarify in advance, I don't have a Facebook account and I have no intention of creating one. Also, what I'm trying to achieve is perfectly legal in my country and the USA. Instead of using the Facebook API to get the latest timeline posts of a Facebook page, I want to send a GET request directly to the page URL (e.g. this page) and extract the posts from the HTML source code. (I'd like to get the text and the creation time of each post.) When I run this in the web console: document

Scraping Google Translate

淺唱寂寞╮ submitted on 2019-12-10 05:58:14
Question: I would like to scrape Google Translate with Node.js and the cheerio library: request("http://translate.google.de/#de/en/hallo%20welt", function(err, resp, body) { if(err) throw err; $ = cheerio.load(body); console.log($('#result_box').find('span').length); }); But it can't find the necessary span elements in the translation box (result_box). In the source code of the website it looks like this: <span id="result_box"> <span class="hps">hello</span> <span class="hps">world</span> </span> So I think I

Select elements with an attribute with cheerio

此生再无相见时 submitted on 2019-12-10 04:21:36
Question: What is the most efficient way to select all DOM elements that have a certain attribute? <input name="mode"> With plain JavaScript I would use document.querySelectorAll("[name='mode']"), or document.querySelectorAll("[name]") if I don't care about the attribute value. Answer 1: OK, I found it in the cheerio documentation. Here is how you do it: $('[name=mode]') cheerio docs: Selectors Answer 2: For some reason, the accepted answer didn't work for me (using cheerio ^1.0.0-rc.2 here). But for the

Can I add more jquery selectors to cheerio? (node.js)

你离开我真会死。 submitted on 2019-12-08 08:54:11
Question: I've been playing around with cheerio and I noticed it doesn't seem to support certain selectors specified in the jQuery reference, specifically ":odd" and ":even". Is there a way to use these by importing the jQuery package into my program? Or is that something that has to be implemented in the cheerio code? Here's my code: //var request = require('request'); var cheerio = require('cheerio'); var jquery = require('./jquery-1.10.2'); var fs = require('fs'); $ = cheerio.load(fs.readFileSync(

Replace the attribute value using cheerio

和自甴很熟 submitted on 2019-12-08 07:39:14
Question: The following code is meant to replace the src value of all the <img> tags, but it does not modify the original document: $.html() prints the original document and not the modified one. $ = cheerio.load(data); $("img").each(function() { var old_src=$(this).attr("src"); var new_src = "/my_cached_image?url=" + encodeURIComponent(old_src); $(this).prop("src", new_src); }); modified_data = $.html(); Answer 1: You have a very small error: "src" on an img is an attribute and not a property. So

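node, express, and cheerio

邮差的信 submitted on 2019-12-07 09:25:53
#Introduction
nodejs runtime environment; http module (request communication); express module (web framework); cheerio module (DOM analysis tool)
##1. node version
###1.1 Checking the version
node -v (lowercase)
node -v
v0.12.4
###1.2 Upgrading the version
node has a module called n (quite a short name...), which is dedicated to managing node.js versions. First, install the n module:
npm install -g n
Second, upgrade node.js to the latest stable version:
n stable
Simple, isn't it?! n can also be followed by a version number, for example:
n v0.10.26
or
n 0.10.26
###1.3 A few common npm commands
npm -v #show the version; checks whether npm is installed correctly
npm install express #install the express module
npm install -g express #install the express module globally
npm list #list installed modules
npm show express #show module details
npm update #upgrade all modules of the project in the current directory
npm update express #upgrade the specified module of the project in the current directory
npm update -g express #upgrade the globally installed express module
npm uninstall express #remove the specified module
##2. The express module and scaffolding tool
2.1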

Access to DOM using node.js

自作多情 submitted on 2019-12-07 05:56:58
Question: I want to access an HTML file and get an element by id using Node.js. This is my HTML file: <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title>Diagram </title> <script> function generatePNG (oViewer) { // some other code reader.onloadend = function() { base64data = reader.result; var image = document.createElement('img'); image.setAttribute("id", "GraphImage"); image.src = base64data; document.body.appendChild(image); } }, "image/png", oImageOptions); return sResult; var sResult =

How do I get the absolute path for '<img src=''>' in node from the a response.body

眉间皱痕 submitted on 2019-12-07 04:06:59
Question: I want to use request-promise to pull the body of a page. Once I have the page, I want to collect all the <img> tags and get an array of the src values of those images. Assume the src attributes on a page have both relative and absolute paths. I want an array of absolute paths for the imgs on the page. I know I can use some string manipulation and the npm path module to build the absolute path, but I wanted to find a better way of doing it. var rp = require('request-promise'), cheerio = require('cheerio'); var options

How do I get the absolute path for '<img src=''>' in node from the a response.body

别说谁变了你拦得住时间么 submitted on 2019-12-05 12:05:16
I want to use request-promise to pull the body of a page. Once I have the page, I want to collect all the <img> tags and get an array of the src values of those images. Assume the src attributes on a page have both relative and absolute paths. I want an array of absolute paths for the imgs on the page. I know I can use some string manipulation and the npm path module to build the absolute path, but I wanted to find a better way of doing it. var rp = require('request-promise'), cheerio = require('cheerio'); var options = { uri: 'http://www.google.com', method: 'GET', resolveWithFullResponse: true }; rp(options) .then
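One way to avoid manual string manipulation here is to let the URL class built into Node.js resolve each src against the page URL. The sketch below is not the answer from the original thread; the helper name toAbsoluteUrls, the sample src values, and the page URL path are hypothetical, and the src list is assumed to have been collected from the page beforehand (for example with cheerio).

```javascript
// Resolve each src against the page URL using Node's built-in WHATWG URL class.
// Already-absolute URLs pass through unchanged; relative ones are resolved.
function toAbsoluteUrls(srcs, pageUrl) {
  return srcs.map((src) => new URL(src, pageUrl).href);
}

// Hypothetical src values, as they might be collected with cheerio via
// $('img').map((i, el) => $(el).attr('src')).get()
const srcs = [
  '/images/logo.png',
  'pics/a.jpg',
  'http://cdn.example.com/b.gif',
  '//static.example.com/c.png',
];

const absolute = toAbsoluteUrls(srcs, 'http://www.google.com/search/index.html');
console.log(absolute);
```

The second argument should be the URL the page was actually fetched from, since path-relative values such as pics/a.jpg are resolved against its directory.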

Access to DOM using node.js

半腔热情 submitted on 2019-12-05 10:09:05
I want to access an HTML file and get an element by id using Node.js. This is my HTML file: <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <title>Diagram </title> <script> function generatePNG (oViewer) { // some other code reader.onloadend = function() { base64data = reader.result; var image = document.createElement('img'); image.setAttribute("id", "GraphImage"); image.src = base64data; document.body.appendChild(image); } }, "image/png", oImageOptions); return sResult; var sResult = generatePNG (oEditor.viewer); }); </script> </head> <body > <div id="diagramContainer"></div> </body> <