Question
I am working on a project that requires me to create multiple threads to download a large remote file. I have already done this, but I cannot understand why it takes longer to download the file with multiple threads than with a single thread. I ran the elapsed-time test against my XAMPP localhost. I would like to know whether this is normal behaviour, or whether it is because I have not tried downloading from a real server.
Thanks, Kennedy
Answer 1:
Nine women can't combine to make a baby in one month. If you have 10 threads, each one gets only 10% of the bandwidth a single thread would have, and there is additional overhead for context switching, etc.
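The arithmetic behind this can be sketched with a toy model. The function name, the fixed per-thread overhead, and the file and bandwidth figures are illustrative assumptions, not measurements. The model assumes the link is the bottleneck, so n threads each fetch 1/n of the file at 1/n of the bandwidth: the transfer time stays the same and only the overhead grows.

```python
def estimated_download_seconds(size_bytes, bandwidth_bytes_per_s, threads,
                               overhead_per_thread_s=0.05):
    """Toy model (an assumption, not a measurement) of a bandwidth-bound download."""
    # Each thread fetches an equal slice of the file...
    chunk = size_bytes / threads
    # ...but the threads share the same link, so each gets an equal slice of it too.
    per_thread_bandwidth = bandwidth_bytes_per_s / threads
    transfer = chunk / per_thread_bandwidth  # works out to size / bandwidth
    # Hypothetical fixed cost per thread (setup, context switching, and so on).
    return transfer + overhead_per_thread_s * threads


if __name__ == "__main__":
    size = 100 * 1024 * 1024      # assumed 100 MiB file
    bandwidth = 10 * 1024 * 1024  # assumed 10 MiB/s link
    for n in (1, 2, 10):
        seconds = estimated_download_seconds(size, bandwidth, n)
        print(n, "thread(s):", round(seconds, 2), "s")
```

In this model extra threads can only help when something other than the shared link limits a single connection, which is unlikely on a localhost test.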
Answer 2:
Python threading uses something called the GIL (Global Interpreter Lock), which can sometimes degrade a program's execution time.
Without going into a lot of detail here, I invite you to read this and this; they may help you understand your problem. You can also watch the two conference talks here and here.
Hope this helps :)
Answer 3:
Twisted uses non-blocking I/O. That means that if data is not available on a socket right now, the call doesn't block the entire thread, so you can handle many socket connections waiting for I/O in one thread simultaneously. But if you are doing something other than I/O (for example, parsing large amounts of data), you still block the thread.
When you use the stdlib's socket module, it does blocking I/O by default. That means that when you call socket.recv and data is not available at the moment, it blocks the entire thread, so you need one thread per connection to handle concurrent downloads.
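A minimal sketch of the thread-per-connection pattern with blocking sockets follows. The helper names are hypothetical, and socket.socketpair() stands in for real server connections so the example runs without a network.

```python
import socket
import threading


def read_until_closed(sock, results, index):
    """Blocking read loop: recv() parks this thread until data arrives or the peer closes."""
    chunks = []
    while True:
        data = sock.recv(4096)  # blocks only the calling thread
        if not data:            # empty bytes means the peer closed the connection
            break
        chunks.append(data)
    results[index] = b"".join(chunks)


def read_all_with_threads(socks):
    """Start one thread per connection and collect what each one received."""
    results = [None] * len(socks)
    workers = [
        threading.Thread(target=read_until_closed, args=(sock, results, i))
        for i, sock in enumerate(socks)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    # Local socket pairs replace remote servers (an assumption made for the demo).
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (_, writer) in enumerate(pairs):
        writer.sendall(("payload-%d" % i).encode())
        writer.close()
    readers = [reader for reader, _ in pairs]
    print(read_all_with_threads(readers))
    for reader in readers:
        reader.close()
```

Each worker spends most of its time parked inside recv(), which is why this pattern needs as many threads as there are connections.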
These are two approaches to concurrency:
- Fork a new thread for each new connection (threading + socket from the stdlib).
- Multiplex I/O and handle many connections in one thread (Twisted).
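The second approach can be sketched without Twisted by using the stdlib selectors module. This is a substitute chosen to keep the example dependency-free, not Twisted's own API, and the function name is hypothetical. One thread asks the operating system which sockets are readable and only then calls recv on them.

```python
import selectors
import socket


def read_all_multiplexed(socks):
    """Read every socket to EOF from a single thread using readiness notifications."""
    selector = selectors.DefaultSelector()
    buffers = [[] for _ in socks]
    for index, sock in enumerate(socks):
        sock.setblocking(False)  # recv() must never park the only thread
        selector.register(sock, selectors.EVENT_READ, data=index)

    remaining = len(socks)
    while remaining > 0:
        # select() waits until at least one registered socket is readable.
        for key, _events in selector.select():
            try:
                data = key.fileobj.recv(4096)
            except BlockingIOError:
                continue  # spurious wakeup, try again on the next round
            if data:
                buffers[key.data].append(data)
            else:
                # Empty bytes means the peer closed; stop watching this socket.
                selector.unregister(key.fileobj)
                remaining -= 1
    selector.close()
    return [b"".join(chunks) for chunks in buffers]


if __name__ == "__main__":
    # Local socket pairs replace remote servers (an assumption made for the demo).
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (_, writer) in enumerate(pairs):
        writer.sendall(("payload-%d" % i).encode())
        writer.close()
    readers = [reader for reader, _ in pairs]
    print(read_all_multiplexed(readers))
    for reader in readers:
        reader.close()
```

As the answer notes, any slow non-I/O work done inside the loop would stall every connection, because there is only one thread.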
Source: https://stackoverflow.com/questions/4219134/python-urllib2-threading-single-download-thread-faster-than-multiple-download-t