urllib2

Beautiful Soup returning nothing

China☆狼群 posted on 2020-01-05 10:11:21
Question: Hi, I am working on a school project that involves scraping HTML. However, I get None returned when I look for tables. Here is the segment that experiences the issue. If you need more info, I'd be happy to provide it.

from bs4 import BeautifulSoup
import urllib2
import datetime

# This section determines the date of the next Saturday, which goes onto the end of the URL
d = datetime.date.today()
while d.weekday() != 5:
    d += datetime.timedelta(1)
# temporary logic for testing when
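The date logic in the excerpt can be checked on its own. Below is a minimal sketch, written for Python 3; `next_saturday` is a hypothetical helper name that does not appear in the original script. It mirrors the `while d.weekday() != 5` loop, so a date that is already a Saturday is returned unchanged.

```python
import datetime


def next_saturday(start):
    """Return the first Saturday on or after `start` (weekday() == 5)."""
    d = start
    while d.weekday() != 5:
        d += datetime.timedelta(days=1)
    return d


# 2020-01-05 is a Sunday, so the loop advances six days.
print(next_saturday(datetime.date(2020, 1, 5)))  # 2020-01-11
```

Passing the start date in as a parameter, rather than calling `datetime.date.today()` inside the function, makes the "temporary logic for testing" unnecessary: any date can be tried directly.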

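Logging in to Douban and posting with Python

喜欢而已 posted on 2020-01-05 03:35:35
I studied the common methods of urllib, urllib2, and cookielib and used them to log in to Douban. Because the login has a CAPTCHA, my approach is to download the CAPTCHA image into the same directory, look at the image, and type in the code to log in. The content of the post is hard-coded in the script.

[Python] code

# -*- coding:gbk -*-
import sys, time, os, re
import urllib, urllib2, cookielib

loginurl = 'https://www.douban.com/accounts/login'
cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
params = {
    "form_email": "your email",
    "form_password": "your password",
    "source": "index_nav"  # login fails without this
}
# submit the login from the home page
response = opener.open(loginurl, urllib.urlencode(params))
# verification succeeded, redirected to the login page
if response.geturl() == "https://www.douban.com/accounts/login":
    html=response.read(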
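For readers on Python 3, the same cookie-aware setup can be sketched with `http.cookiejar` and `urllib.request`, which replace `cookielib` and `urllib2`. This is only an illustration under assumptions: `build_login` is a hypothetical helper, the form field names are copied from the post and may no longer match the site, and nothing is sent over the network.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "https://www.douban.com/accounts/login"  # URL taken from the post


def build_login(email, password):
    """Return (opener, request) for a cookie-aware form login; nothing is sent."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Field names come from the post; "source" is the one it says is required.
    params = {"form_email": email, "form_password": password, "source": "index_nav"}
    # In Python 3 the request body must be bytes, so the encoded form is encoded again.
    body = urllib.parse.urlencode(params).encode("utf-8")
    request = urllib.request.Request(LOGIN_URL, data=body)
    return opener, request


opener, request = build_login("your email", "your password")
print(request.get_method())  # POST, because a body is attached
# response = opener.open(request)  # this line would perform the real login
```

Reusing the same `opener` for every later request is what keeps the session cookies, which is the reason the post builds one instead of calling `urlopen` directly.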

Logging in to a web site with Python (urllib, urllib2, cookielib): How does one find necessary information for submission?

╄→尐↘猪︶ㄣ posted on 2020-01-04 13:47:19
Question: Preface: I understand that there are many answers to similar questions on Stack Overflow. However, I haven't found anything relating to ASPX logins, nor an exact case such as this. Problem: I need to determine what information is necessary to log in to https://cableone.net/login.aspx in order to scrape information from there. Progress: Thus far I have found the input fields in the source of login.aspx and have scraped together a script in Python with urllib, urllib2, and
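One way to see which fields a login form expects is to list every `<input>` on the page, including hidden ones. ASP.NET Web Forms pages typically carry hidden inputs such as `__VIEWSTATE` and `__EVENTVALIDATION` that have to be posted back along with the credentials. The sketch below uses Python 3's standard `html.parser`; the `InputCollector` class and the sample HTML are invented for illustration and are not the real contents of login.aspx.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the name/value pair of every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are stored as empty strings.
            self.fields[name] = attributes.get("value") or ""


# Hypothetical page source; real field names and values will differ.
SAMPLE_HTML = """
<form method="post" action="login.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTVALIDATION" value="wEWAgK" />
  <input type="text" name="username" />
  <input type="password" name="password" />
  <input type="submit" name="loginButton" value="Log in" />
</form>
"""

collector = InputCollector()
collector.feed(SAMPLE_HTML)
print(collector.fields)
```

The resulting dictionary can be filled in with the user name and password and then URL-encoded as the POST body. Comparing it with what the browser actually sends (visible in the browser's network inspector) is a useful cross-check.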

How to make python urllib2 follow redirect and keep post method

雨燕双飞 posted on 2020-01-04 05:42:25
Question: I am using urllib2 to post data to a form. The problem is that the form replies with a 302 redirect. According to Python's HTTPRedirectHandler, the redirect handler will take the request, convert it from POST to GET, and follow the 301 or 302. I would like to preserve the POST method and the data passed to the opener. I made an unsuccessful attempt at a custom HTTPRedirectHandler by simply adding data=req.get_data() to the new Request. I am sure this has been done before so I thought I would
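A possible approach, sketched here for Python 3, is to override `redirect_request` so that it builds a new request carrying the original body. This is an illustration under assumptions, not a confirmed fix for the asker's case: `PostPreservingRedirectHandler` is a hypothetical name, and in Python 2 the equivalents would be `urllib2.HTTPRedirectHandler` and `req.get_data()` instead of `req.data`.

```python
import urllib.request


class PostPreservingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Re-send the original POST body when a 301 or 302 response arrives."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code in (301, 302) and req.data is not None:
            # A Request with data attached reports its method as POST.
            return urllib.request.Request(
                newurl,
                data=req.data,
                headers=dict(req.headers),
                origin_req_host=req.origin_req_host,
                unverifiable=True,
            )
        # Everything else keeps the standard library behaviour.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


# Installing the subclass replaces the default redirect handler in the opener.
opener = urllib.request.build_opener(PostPreservingRedirectHandler())

# Offline check: call the hook directly instead of contacting a server.
original = urllib.request.Request("http://example.com/form", data=b"a=1")
handler = PostPreservingRedirectHandler()
redirected = handler.redirect_request(
    original, None, 302, "Found", {}, "http://example.com/next"
)
print(redirected.get_method(), redirected.data)
```

Re-posting on a 301 or 302 differs from what browsers commonly do, so it is worth confirming that the target server really expects the body to be sent again.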

Bitbucket API authentication with Python's HTTPBasicAuthHandler

浪子不回头ぞ posted on 2020-01-04 04:54:44
Question: I'm trying to get the list of issues on a private repository using Bitbucket's API. I have confirmed that HTTP Basic authentication works with hurl, but I am unable to authenticate in Python. Adapting the code from this tutorial, I have written the following script.

import cookielib
import urllib2

class API():
    api_url = 'http://api.bitbucket.org/1.0/'

    def __init__(self, username, password):
        self._opener = self._create_opener(username, password)

    def _create_opener(self, username, password):
        cj
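One thing worth ruling out: `HTTPBasicAuthHandler` only sends credentials after the server answers with a 401 status and a `WWW-Authenticate` challenge. If an API replies in some other way to an unauthenticated request, the handler never gets a chance to act. Whether that is what happens here is an assumption. A common workaround is to attach the `Authorization` header up front, sketched below for Python 3; `basic_auth_request` and the example URL are hypothetical.

```python
import base64
import urllib.request


def basic_auth_request(url, username, password):
    """Build a Request that carries HTTP Basic credentials on the first attempt."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder URL and credentials; nothing is sent here.
request = basic_auth_request("https://api.example.com/issues", "user", "secret")
print(request.get_header("Authorization"))
# urllib.request.urlopen(request) would send it with the header already attached.
```

Basic credentials are only base64-encoded, not encrypted, so they should be sent over HTTPS rather than plain HTTP.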

Python urllib2 login to minecraft.net

你。 posted on 2020-01-03 17:24:28
Question: I have a problem. I am writing a simple script to log in to minecraft.net and then list all classic servers. But when I run my script, it just redirects me back to minecraft.net/login. Here is what I have so far:

import urllib2
import urllib
import re

url = "https://www.minecraft.net/login"
page = urllib2.urlopen(url)
data = page.read()
page.close()
authToken = re.search('name="authenticityToken"[\s]+value="(.+)"', data).group(1)
data_dict = {
    "username": "USERNAME",
    "password": "PASSWORD",
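Two details in the excerpt are worth checking, though neither is confirmed as the cause. First, the greedy `(.+)` in the pattern can run past the closing quote when more attributes follow on the same line. Second, plain `urlopen` calls do not keep cookies between requests, so a token tied to a session may be rejected on the following POST. The sketch below covers only the first point, using invented sample HTML; `extract_token` is a hypothetical helper.

```python
import re

# Hypothetical markup with two inputs on one line, as minified pages often have.
SAMPLE_HTML = (
    '<input type="hidden" name="authenticityToken" value="abc123">'
    '<input type="hidden" name="redirect" value="/">'
)

GREEDY = r'name="authenticityToken"\s+value="(.+)"'      # pattern from the post
STRICT = r'name="authenticityToken"\s+value="([^"]+)"'   # stops at the first quote


def extract_token(html, pattern=STRICT):
    """Return the captured token, or None when the pattern does not match."""
    match = re.search(pattern, html)
    return match.group(1) if match else None


print(extract_token(SAMPLE_HTML))          # abc123
print(extract_token(SAMPLE_HTML, GREEDY))  # runs on into the next input tag
```

Returning None instead of calling `.group(1)` on a failed match also avoids an AttributeError when the page layout changes.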

In a Python 2.4 script, I would like to execute an OS system call (`ls -l` or `curl`, for example) and capture the output in a variable. How do I do this?

随声附和 posted on 2020-01-03 05:24:12
Question: I am writing a Python script on a remote server with an old version of Python, 2.4. In the script I want to issue commands like curl -XPUT 'http://somerul/_search' -d file.txt or ls -ltrh and capture the output of these commands in a variable. For the curl command, the output will be JSON that I will parse (please advise if an old JSON parser is available for me to use). How can I make these kinds of system calls in the Python script and capture the output in a variable? Thank
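The `subprocess` module has been part of the standard library since Python 2.4, so `Popen` with `communicate()` is available there; `subprocess.check_output` only arrived in 2.7. The sketch below sticks to `Popen` for that reason. `run_and_capture` is a hypothetical helper, and the demo command simply asks the running interpreter to print a word so that it works on any platform.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command given as an argument list; return (exit_code, stdout, stderr)."""
    process = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    # communicate() reads both pipes to the end and waits for the process to exit.
    stdout, stderr = process.communicate()
    return process.returncode, stdout, stderr


# Stand-in for ["ls", "-ltrh"] or a curl command line.
code, out, err = run_and_capture([sys.executable, "-c", "print('hello')"])
print(code, out)
```

Passing the command as a list avoids shell quoting problems with arguments such as the quoted URL in the curl example. On the JSON side, the standard `json` module was added in Python 2.6, so on 2.4 a third-party parser would be needed; an older `simplejson` release is one candidate, assuming a version that still supports 2.4 can be installed.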

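Crawler principles and data scraping: basic usage of the urllib2 module

China☆狼群 posted on 2020-01-03 04:24:43
Basic usage of the urllib2 library

Web scraping means reading the network resource identified by a URL from the network stream and saving it locally. Python has many libraries that can be used to fetch web pages; we start with urllib2.

urllib2 is a module bundled with Python 2.7 (no download needed; just import it)
urllib2 official documentation: https://docs.python.org/2/library/urllib2.html
urllib2 source: https://hg.python.org/cpython/file/2.7/Lib/urllib2.py
In Python 3.x, urllib2 was changed to urllib.request

urlopen

Let's start with some code:

# urllib2_urlopen.py
# import the urllib2 library
import urllib2
# send a request to the given URL and get back the server's response as a file-like object
response = urllib2.urlopen("http://www.baidu.com")
# the file-like object supports file methods such as read(), which reads the whole content and returns a string
html = response.read()
# print the string
print html

Run the Python code to print the result:

Power@PowerMac ~$: python urllib2_urlopen.py

In fact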
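Since urllib2 became `urllib.request` in Python 3, the same example can be sketched as follows. `fetch` is a hypothetical helper name, and the demo uses a `data:` URL (supported by `urlopen` in Python 3.4 and later) so that it runs without network access.

```python
import urllib.request


def fetch(url, timeout=10):
    """Open `url`, read the whole body, and return it decoded as UTF-8 text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # read() returns bytes in Python 3, so the body is decoded explicitly.
        return response.read().decode("utf-8", errors="replace")


# A data: URL keeps the demo offline; pass "http://www.baidu.com" for a real page.
print(fetch("data:text/plain;charset=utf-8,hello"))
```

The main differences from the Python 2 version are that `read()` yields bytes rather than a str, `print` is a function, and the response can be used as a context manager so the connection is closed automatically.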