Flurry scraping using python3 requests.Session()

谁说胖子不能爱 提交于 2019-12-11 06:01:21

问题


This seems really straight forward, but for some reason this isn't connecting to flurry correctly and I unable to scrape the data.

    loginurl = "https://dev.flurry.com/secure/loginPage.do"
    csvurl = "https://dev.flurry.com/eventdata"

    session = requests.Session()
    login = session.post(loginurl, data={'loginEmail': 'user', 'loginPassword': 'pass'})
    data = session.get(csvurl)

Every time I try to use this, I get redirected back to the login screen (loginurl) without fetching the new data. Has anyone been able to connect to flurry like this successfully before?

Any and all help would be greatly appreciated, thanks.


回答1:


There are two more form fields to be populated struts.token.name and the value from struts.token.name i.e token, you also have to post to loginAction.do:

You can do an initial get and parse the values using bs4 then post the data:

from bs4 import BeautifulSoup
import requests 

loginurl = "https://dev.flurry.com/secure/loginAction.do"
csvurl = "https://dev.flurry.com/eventdata"#
data = {'loginEmail': 'user', 'loginPassword': 'pass'}

with requests.Session() as session:
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36"})

    soup = BeautifulSoup(session.get(loginurl).content)
    name = soup.select_one("input[name=struts.token.name]")["value"]
    data["struts.token.name"] = name
    data[name] = soup.select_one("input[name={}]".format(name))["value"]
    login = session.post(loginurl, data=data)


来源:https://stackoverflow.com/questions/38670599/flurry-scraping-using-python3-requests-session

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!