python read file from a web URL

I am currently trying to read a txt file from a website.

My script so far is:

webFile = urllib.urlopen(currURL)

This way, I can work with the file. However, when I try to store the file (in webFile), I only get a link to the socket. Another solution I tried was to use read():

webFile = urllib.urlopen(currURL).read()

However, this seems to remove the formatting (\n, \t, etc.).

If I open the file like this:

webFile = urllib.urlopen(currURL)

I can read it line by line:

for line in webFile:
    print line

This should result in:

"this" 
"is" 
"a"
"textfile"

But I get:

't'
'h'
'i'
...

I wish to get the file on my computer, but maintain the format at the same time.

You should use readlines() to read the file as a list of whole lines:

import urllib

response = urllib.urlopen(currURL)
lines = response.readlines()
for line in lines:
    pass  # process each line here

However, I strongly recommend using the requests library instead: http://docs.python-requests.org/en/latest/
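For readers on Python 3, where urllib.urlopen no longer exists, the readlines() approach could look roughly like the sketch below. It assumes the file is UTF-8 text, and read_lines is just an illustrative name, not something from the question.

```python
from urllib.request import urlopen


def read_lines(url, encoding="utf-8"):
    """Return the remote text file as a list of lines, newlines kept."""
    with urlopen(url) as response:
        # In Python 3, readlines() returns bytes, so each line is decoded.
        # The encoding is an assumption; adjust it to match the file.
        return [raw.decode(encoding) for raw in response.readlines()]
```

Calling read_lines(currURL) would then give one list entry per line, with tabs and trailing newlines still in place.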

This is because you are iterating over a string, and iterating over a string yields one character at a time.
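A small illustration of the difference, using a hard-coded string as a stand-in for what read() returns (the sample text is made up to match the question):

```python
# Stand-in for the result of read(): one string, newlines still included.
text = "this\nis\na\ntextfile\n"

# Iterating over a string yields single characters.
chars = [ch for ch in text][:3]

# splitlines() yields one entry per line instead.
lines = text.splitlines()

print(chars)  # ['t', 'h', 'i']
print(lines)  # ['this', 'is', 'a', 'textfile']
```

Note that the string itself still contains the \n characters, so read() has not removed the formatting; it only looks that way when the string is walked character by character.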

Why not save the whole file at once?

import urllib
webf = urllib.urlopen('http://stackoverflow.com/questions/32971752/python-read-file-from-web-site-url')
txt = webf.read()

f = open('destination.txt', 'w+')
f.write(txt)
f.close()

If you really want to loop over the file line by line, use txt = webf.readlines() and iterate over that.
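On Python 3 the same "save the whole file at once" idea might be sketched as follows. This is only one way to do it: save_url is a made-up helper name, and copying in binary mode is my choice here so that the bytes land on disk exactly as received.

```python
import shutil
from urllib.request import urlopen


def save_url(url, destination):
    """Copy the response body to a local file without altering it."""
    # Binary mode ("wb") writes the bytes as received, so newlines and
    # tabs in the remote file are preserved.
    with urlopen(url) as response, open(destination, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
```

For example, save_url(currURL, 'destination.txt') would write the remote file to destination.txt and close both handles when done.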

If you're just trying to save a remote file to your local server as part of a Python script, you could use the PycURL library to download and save it without parsing it. More info here: http://pycurl.sourceforge.net

Alternatively, if you want to read and then write the output, I think you've just got the methods out of sequence. Try the following:

import urllib

# Assign the open file to a variable
webFile = urllib.urlopen(currURL)

# Read the file contents to a variable
file_contents = webFile.read()
print(file_contents)
# Output: the contents of the file

# Then write to a new local file
f = open('local file.txt', 'w')
f.write(file_contents)
f.close()
If neither applies, please update the question to clarify.

Source: https://stackoverflow.com/questions/32971752/python-read-file-from-a-web-url

Tags

python

urllib

readfile