Question
I'm using the package RCurl to download some prices from a website in Brazil, but in order to load the data I must first choose a city from a form.
The website is: "http://www.muffatosupermercados.com.br/Home.aspx"
and I want the prices from CURITIBA, id=53.
I'm trying to use the solution provided in this post: "How do I use cookies with RCurl?"
And this is my code:
library("RCurl")
library("XML")
#Set your browsing links
loginurl = "http://www.muffatosupermercados.com.br"
dataurl = "http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"
#Set user account data and agent
pars=list(
id = "53"
)
agent="Mozilla/5.0" #or whatever
#Set RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookies.txt", useragent = agent, followlocation =TRUE, curl=curl)
#Also if you do not need to read the cookies.
#curlSetOpt( cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)
#Post login form
html=postForm(loginurl, .params = pars, curl=curl)
#Go wherever you want
html=getURL(dataurl, curl=curl)
#Parse the returned page and extract the text of every <li class="preco"> node
C1 <- htmlParse(html, asText=TRUE, encoding="UTF-8")
Preco <- trimws(xpathSApply(C1, "//li[@class='preco']", xmlValue))
But when I run the code I only get the page behind the form, not the intended page:
"http://www.muffatosupermercados.com.br/CategoriaProduto.aspx?Page=1&c=2"
I have also tried to play with cookies, but with no luck.
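One possibility worth checking, since the pages end in .aspx: ASP.NET WebForms pages commonly carry hidden inputs such as __VIEWSTATE and __EVENTVALIDATION that have to be posted back together with the chosen value. Whether this site requires them is an assumption, not something verified here. A minimal sketch of collecting hidden inputs with the XML package, run against a made-up HTML snippet (the field name `city` and all values are hypothetical):

```r
library(XML)

# Hypothetical HTML standing in for the city-selection form; the real page's
# field names and values are not known here.
sample_html <- '<html><body><form method="post" action="Home.aspx">
<input type="hidden" name="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
<select name="city"><option value="53">CURITIBA</option></select>
</form></body></html>'

doc <- htmlParse(sample_html, asText = TRUE)

# Collect every hidden input of the form as a named list (name -> value)
hidden_nodes <- getNodeSet(doc, "//form//input[@type='hidden']")
hidden <- lapply(hidden_nodes, xmlGetAttr, "value", default = "")
names(hidden) <- sapply(hidden_nodes, xmlGetAttr, "name")

# Merge the hidden fields with the value chosen in the form
# ("city" is a placeholder name, not the site's actual field)
pars <- c(hidden, list(city = "53"))
str(pars)
```

A list built this way could be passed as `.params` to `postForm()` in place of the single `id` entry, after fetching the form page with the same curl handle.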
Does anyone have an idea on how to submit this form and load the correct page?
Thanks in advance.
Source: https://stackoverflow.com/questions/28519237/rcurl-submit-a-form-and-load-a-page