Python threading or multiprocessing for web-crawler?
I've made a simple web crawler in Python. So far, it maintains two sets of URLs: one of URLs that should be visited and one of URLs that have already been visited. While parsing a page, it adds all the links on that page to the should-be-visited set and the page's own URL to the already-visited set, and it keeps going as long as len(should_be_visited) > 0. So far it does everything in a single thread.

Now I want to add parallelism to this application, so I need the same kind of sets of links shared among a few threads or processes. Which is the better fit here: threading or multiprocessing?
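For reference, here is a minimal sketch of the single-threaded version I described (requests and BeautifulSoup are stand-ins for whatever fetching/parsing code is actually used):

```python
import requests
from bs4 import BeautifulSoup

def crawl(start_url):
    should_be_visited = {start_url}
    already_visited = set()

    # Keep crawling while there are unvisited URLs left
    while len(should_be_visited) > 0:
        url = should_be_visited.pop()
        already_visited.add(url)

        # Parse the page and collect every link on it
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        for a in soup.find_all("a", href=True):
            if a["href"] not in already_visited:
                should_be_visited.add(a["href"])

    return already_visited
```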
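And this is roughly the shape I have in mind for the parallel version, sketched here with threads, a thread-safe queue as the should_be_visited frontier, and a lock around already_visited (the worker count, the timeout, and the queue/lock approach are all my assumptions; whether threads or processes are the better choice is exactly what I'm asking):

```python
import queue
import threading

import requests
from bs4 import BeautifulSoup

def worker(should_be_visited, already_visited, visited_lock):
    while True:
        try:
            # Give up once the frontier has stayed empty for a while
            url = should_be_visited.get(timeout=5)
        except queue.Empty:
            return
        # Guard the shared set so two workers don't crawl the same URL
        with visited_lock:
            if url in already_visited:
                continue
            already_visited.add(url)
        # Real code would resolve relative links and handle request errors
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        for a in soup.find_all("a", href=True):
            should_be_visited.put(a["href"])

def crawl_parallel(start_url, num_workers=4):
    should_be_visited = queue.Queue()
    should_be_visited.put(start_url)
    already_visited = set()
    visited_lock = threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(should_be_visited, already_visited, visited_lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return already_visited
```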