urllib

Python Mutagen: add cover photo/album art by url?

╄→尐↘猪︶ㄣ submitted on 2019-12-25 07:09:31
Question: Using mutagen, I am able to add normal meta tags such as title, artist, and genre; however, when I try to add an image via a URL, it doesn't work.

from mutagen.mp4 import MP4
from mutagen.mp4 import MP4Cover
from PIL import Image
import urllib2 as urllib
import io, sys, getopt

# url is defined elsewhere
audio = MP4(url)
# clear previous meta tags
audio.delete()
# get album picture data
cover = "http://cont-sv5-2.pandora.com/images/public/amz/5/2/9/7/095115137925_500W_488H.jpg"
fd = urllib.urlopen
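A sketch of a fix, assuming Python 3 (where urllib2 became urllib.request): download the image bytes first, then hand them to MP4Cover. Note the original snippet also passes the URL to MP4(), which expects a local file path. The helper names and the "song.m4a" path below are my own, not from the question.

```python
import urllib.request

def fetch_image_bytes(url, timeout=10):
    """Download raw image bytes; a browser-like User-Agent helps with picky hosts."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()

def mp4_cover_format(data):
    """Pick mutagen's cover-format constant from the image's magic bytes
    (MP4Cover.FORMAT_PNG == 14, MP4Cover.FORMAT_JPEG == 13)."""
    return 14 if data.startswith(b"\x89PNG") else 13

# Usage (requires mutagen and network access; "song.m4a" is a hypothetical file):
#   from mutagen.mp4 import MP4, MP4Cover
#   audio = MP4("song.m4a")
#   data = fetch_image_bytes(cover_url)
#   audio["covr"] = [MP4Cover(data, imageformat=mp4_cover_format(data))]
#   audio.save()
```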

urlretrieve not working for this site

蓝咒 submitted on 2019-12-25 06:54:50
Question: I'm trying to download an image, however it doesn't seem to work. Is it being blocked by DDoS protection? Here is the code:

urllib.request.urlretrieve("http://archive.is/Xx9t3/scr.png", "test.png")

Basically, download that image as "test.png". I'm using Python 3, hence the urllib.request before urlretrieve.

import urllib.request

I have that as well. Is there any way I can download the image? Thanks!

Answer 1: For reasons that I cannot even imagine, the server requires a well known user agent. So you must pretend
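A sketch of that fix: replace urlretrieve's default Python User-Agent with a browser-style one by building a Request (the exact header value is an assumption; any mainstream browser string should do):

```python
import urllib.request

def build_browser_request(url, user_agent="Mozilla/5.0"):
    """urlretrieve sends Python's default urllib User-Agent, which some
    servers reject; a Request with a browser-style header gets past that."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def download(url, filename):
    """Fetch url with the spoofed header and save the body to filename."""
    with urllib.request.urlopen(build_browser_request(url)) as resp:
        with open(filename, "wb") as out:
            out.write(resp.read())
    return filename

# Usage (network access required):
#   download("http://archive.is/Xx9t3/scr.png", "test.png")
```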

Python: read website source code 100 lines at a time

99封情书 submitted on 2019-12-25 05:24:15
Question: I'm trying to read the source code from a website 100 lines at a time. For example:

self.code = urllib.request.urlopen(uri)
# Get first 100 lines
self.lines = self.getLines()
...
# Get next 100 lines
self.lines = self.getLines()

My getLines code is like this:

def getLines(self):
    res = []
    i = 0
    while i < 100:
        res.append(str(self.code.readline()))
        i += 1
    return res

But the problem is that getLines() always returns the first 100 lines of the code. I've seen some solutions with next() or tell() and
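The usual cause of this symptom is calling urlopen again before each batch, which restarts reading from a fresh response. Keeping one response object and slicing it produces successive chunks; a sketch (function name is mine):

```python
from itertools import islice

def iter_line_chunks(fileobj, size=100):
    """Yield successive lists of up to `size` lines from one open file-like
    object; its read position advances, so each chunk picks up exactly where
    the previous one stopped."""
    while True:
        chunk = list(islice(fileobj, size))
        if not chunk:
            return
        yield chunk

# Usage with a URL (network access required; urlopen yields bytes lines):
#   import urllib.request
#   code = urllib.request.urlopen(uri)
#   chunks = iter_line_chunks(code, 100)
#   first_100 = next(chunks)
#   next_100 = next(chunks)
```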

Python submitting webform using requests

天大地大妈咪最大 submitted on 2019-12-25 02:38:08
Question: I need to crawl this website using Python, but I run into a problem when submitting the form: the response I get is the same page with the form, not the result after submitting the form. I tried the requests library, mechanize, and urllib. The code with requests:

url = "http://www.justiceservices.gov.mt/courtservices/Judgements/search.aspx?func=selected"
payload = {'ctl00$ContentPlaceHolderMain$search_selected_panel$tb_date_from':'', 'ctl00$ContentPlaceHolderMain$search
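Getting the same page back is the classic ASP.NET WebForms symptom: the server rejects a POST unless the hidden __VIEWSTATE / __EVENTVALIDATION fields from the GET response are echoed back in the payload. A hedged sketch of that pattern (the regex extraction is a simplification; a real crawler would use an HTML parser):

```python
import re

ASPNET_FIELDS = ("__VIEWSTATE", "__VIEWSTATEGENERATOR", "__EVENTVALIDATION")

def extract_aspnet_state(html):
    """Pull the hidden WebForms fields out of a page so they can be merged
    into the POST payload."""
    state = {}
    for name in ASPNET_FIELDS:
        m = re.search(r'id="%s" value="([^"]*)"' % name, html)
        if m:
            state[name] = m.group(1)
    return state

# Usage with requests (network access required):
#   import requests
#   with requests.Session() as s:
#       page = s.get(url)
#       payload.update(extract_aspnet_state(page.text))
#       result = s.post(url, data=payload)
```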

Do form parameter names need to be encoded when doing a POST?

女生的网名这么多〃 submitted on 2019-12-25 01:09:40
Question: Quick version: do the names of parameters of "forms" sent using the standard multipart/form-data encoding need to be encoded? Longer version: the upload form on 1fichier.com (a service for uploading large files) uses the following to specify the file parameter to upload:

<input type="file" name="file[]" size="50" title="Select the files to upload" />

The name of the parameter is file[] (notice the brackets). Using LiveHTTPHeaders I see that the parameter is sent like this (i.e. with
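The short answer is no: in multipart/form-data the field name is placed verbatim into the part's Content-Disposition header, so brackets like file[] are sent as-is, with no percent-encoding. A minimal illustration of what one part's header line looks like (helper name is mine):

```python
def multipart_part_header(field_name, filename=None):
    """Build the Content-Disposition line for one multipart/form-data part;
    the field name goes in verbatim -- brackets need no percent-encoding."""
    header = 'Content-Disposition: form-data; name="%s"' % field_name
    if filename is not None:
        header += '; filename="%s"' % filename
    return header

# With requests, passing files={"file[]": open(path, "rb")} produces exactly
# this kind of part header on the wire.
```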

Web scraping remax.com with Python

我们两清 submitted on 2019-12-24 20:44:15
Question: This is similar to a question I asked here, which was answered perfectly. Now that I have something to work with, instead of entering a URL manually to pull the data, I want to write a function that takes just the address and zip code and returns the data I want. The problem is constructing the correct URL. For example:

url = 'https://www.remax.com/realestatehomesforsale/25-montage-way-laguna-beach-ca-92651-gid100012499996.html'

I see
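A hypothetical sketch of the slug-building part, based only on the pattern visible in the example URL: lowercase the address plus zip and hyphen-join it. Note the trailing gid number in the real URL identifies the listing and cannot be derived from the address alone; it has to come from the site's own search.

```python
import re

def remax_slug(address, zipcode):
    """Turn '25 Montage Way, Laguna Beach, CA' + '92651' into the
    hyphen-joined path segment seen in the example URL. The gid suffix
    is NOT derivable here and must be obtained separately."""
    raw = "%s %s" % (address, zipcode)
    return re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
```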

Trying to scrape an image from an image URL (using python urllib) but getting HTML instead

ε祈祈猫儿з submitted on 2019-12-24 14:40:44
Question: I've tried to get the image from the following URL: http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg I can right-click and save-as, but when I try to use urlretrieve like

import urllib
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg'
urllib.urlretrieve(img_url, 'cover.jpg')

I find that it is HTML instead of a .jpg image, but I don't know why. Could you please tell me why my method doesn't work? Is there any option that can mimic the right-click save-as method? Answer 1
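Getting HTML back usually means the host serves a block page unless the request looks like it came from a browser (User-Agent and/or Referer checks against hotlinking). A sketch under that assumption: send browser-style headers and sanity-check the magic bytes of what comes back (header values are assumptions):

```python
import urllib.request

def looks_like_image(data):
    """Cheap magic-byte check for JPEG, PNG, or GIF signatures."""
    return data.startswith((b"\xff\xd8", b"\x89PNG", b"GIF87a", b"GIF89a"))

def fetch_image(url, referer=None):
    """Request the image with browser-style headers; raise if the server
    sent HTML (hotlink protection) instead of image data."""
    headers = {"User-Agent": "Mozilla/5.0"}
    if referer:
        headers["Referer"] = referer
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        data = resp.read()
    if not looks_like_image(data):
        raise ValueError("got non-image content -- likely hotlink protection")
    return data

# Usage (network access required):
#   open("cover.jpg", "wb").write(fetch_image(img_url, referer="http://upic.me/"))
```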

Python web scraping on large html webpages

被刻印的时光 ゝ submitted on 2019-12-24 11:24:31
Question: I am trying to get all the historical information for a particular stock from Yahoo Finance. I am new to Python and web scraping. I want to download all the historical data into a CSV file. The problem is that the code downloads only the first 100 entries of any stock on the website. When a stock is viewed in the browser, we have to scroll to the bottom of the page for more table entries to load. I think the same thing is happening when I download using the library. Some kind of optimization

The python urllib library

删除回忆录丶 submitted on 2019-12-24 11:04:06
Methods in the urllib module

1. urllib.urlopen(url[, data[, proxies]])

Opens a URL and returns a file-like object that supports the usual file-object operations. This example opens Google:

>>> import urllib
>>> f = urllib.urlopen('http://www.google.com.hk/')
>>> firstLine = f.readline() # read the first line of the HTML page
>>> firstLine
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"><head><meta content="/images/google_favicon_128.png" itemprop="image"><title>Google</title><script>(function(){\n'

The object returned by urlopen provides these methods:
- read(), readline(), readlines(), fileno(), close(): used exactly like the corresponding file-object methods
- info(): returns an httplib.HTTPMessage object containing the headers sent by the remote server
- getcode(): returns the HTTP status code; for an HTTP request, 200 means the request completed successfully
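Note that this is the Python 2 API; in Python 3, urlopen moved to urllib.request and the response yields bytes rather than str. A minimal Python 3 equivalent of the session above:

```python
import urllib.request

def first_line(url):
    """Python 3 spelling of the example: urllib.request.urlopen returns a
    response whose read()/readline()/readlines() yield bytes; it also still
    offers info() (the headers) and getcode() (the HTTP status)."""
    with urllib.request.urlopen(url) as f:
        return f.readline()

# Usage (network access required):
#   first_line('http://www.google.com.hk/')
```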

AttributeError when creating ZipFile

送分小仙女□ submitted on 2019-12-24 10:04:21
Question: I get an AttributeError: 'tuple' object has no attribute 'seek' when attempting to create a zipfile.ZipFile from a file path. I have no idea why; the traceback doesn't make any sense in relation to my code. Is this a bug in the zipfile module, or did I not set something up properly? I followed all the documentation as best I could, to no avail. What is wrong with what I am doing, and is there a workaround/fix for it? And could it also be a mistake I am making with urllib in any way,
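Without the full traceback this is a guess, but that exact error is what happens when the (filename, headers) tuple returned by urllib's urlretrieve is passed straight to ZipFile, which then tries to seek() on the tuple. Unpack the filename first, or skip the temp file entirely:

```python
import io
import zipfile

def zip_from_bytes(data):
    """ZipFile accepts any seekable file-like object, so wrap downloaded
    bytes in BytesIO instead of going through a temp file on disk."""
    return zipfile.ZipFile(io.BytesIO(data))

# With urlretrieve, unpack the tuple first (network access required):
#   import urllib.request
#   filename, headers = urllib.request.urlretrieve(url)  # NOT ZipFile(urlretrieve(url))
#   zf = zipfile.ZipFile(filename)
# Or download in memory and avoid the temp file:
#   zf = zip_from_bytes(urllib.request.urlopen(url).read())
```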