Question
If we visit this URL in Chrome with DevTools open, we can clearly see a cookie appear (in Chrome DevTools -> 'Application' -> 'Cookies').
If we attempt the same thing using httr::GET(), we expect to see the cookie, but we do not:
library(httr)
r <- GET("https://aps.dac.gov.in/LUS/Public/Reports.aspx")
r$cookies
# [1] domain flag path secure expiration name value
# <0 rows> (or 0-length row.names)
Why is this, and how can we retrieve the cookie (along with the page HTML), preferably using httr and/or rvest? Other suggestions are welcome, as long as they do not use an actual browser, headless or otherwise (including Selenium).
Answer 1:
This happens because the cookie is not generated until the user submits the form. Watching Chrome DevTools -> 'Application' -> 'Cookies' before and after form submission shows the cookie appearing only once the form has been submitted.
Note that this can be reproduced in a Chrome incognito window (it has no access to the cookies from regular Chrome, so it can be tried repeatedly for demonstration purposes).
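Since the cookie only shows up after the form is submitted, one way to get it without a browser is to emulate the submission: GET the page, collect its hidden inputs, and POST them back. The following is only a minimal sketch, not a tested recipe for this particular site. It assumes rvest >= 1.0 (for html_elements()), that the form posts back to the same URL (typical for ASP.NET WebForms but not verified here), and that the server accepts the hidden fields plus whatever visible fields you supply. hidden_fields() and submit_form() are hypothetical helper names, and the field names to pass in `extra` have to be read from the real request in the DevTools 'Network' tab.

```r
library(httr)
library(rvest)

# Return the hidden <input> fields of a parsed page as a named list.
# ASP.NET pages usually keep state in such fields (e.g. __VIEWSTATE),
# and expect them to be posted back unchanged.
hidden_fields <- function(page) {
  inputs <- html_elements(page, "input[type='hidden'][name]")
  values <- html_attr(inputs, "value")
  values[is.na(values)] <- ""  # inputs without a value attribute
  as.list(setNames(values, html_attr(inputs, "name")))
}

# GET the page, then POST the form back to the same URL (an assumption).
# `extra` holds the visible fields (dropdowns, submit button); their names
# are site-specific and are NOT known from the original post.
submit_form <- function(url, extra = list()) {
  first  <- GET(url)
  fields <- hidden_fields(read_html(content(first, "text", encoding = "UTF-8")))
  second <- POST(url, body = c(fields, extra), encode = "form")
  list(
    cookies = cookies(second),  # data frame of cookies seen by this response
    page    = read_html(content(second, "text", encoding = "UTF-8"))
  )
}
```

Calling res <- submit_form("https://aps.dac.gov.in/LUS/Public/Reports.aspx", extra = list(...)) and then inspecting res$cookies and res$page would be the intended usage. If the cookie still does not appear, the POST is probably missing a field that the browser sends, so compare it against the request recorded in DevTools.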
Source: https://stackoverflow.com/questions/58909797/cannot-get-cookie