I'm not very experienced with subjects such as concurrency and multithreading. In fact, in most of my web-development career I had never needed to touch these subjects.
Regarding your question of why fork() instead of threading: when you use separate processes, you get automatic separation of address spaces. In multithreaded programs, it is very common for threads to communicate using their (naturally) shared memory. This is very efficient, but it is also hard to get all the synchronization between threads right, and this is why some languages are better at multithreading than others: they provide better abstractions to handle the common cases of communication between threads.
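To make the synchronization point concrete, here is a minimal sketch in Python (my choice of language, not something the question specified). Several threads increment one shared counter, and a lock protects the read-modify-write. The function name `count_with_lock` and its parameters are made up for illustration.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # "counter += 1" is a read-modify-write, not an atomic step.
            # Holding the lock ensures no other thread interleaves with it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())  # 4 threads * 10_000 increments each
```

Without the lock, the final count is no longer guaranteed. That is exactly the kind of bug that is easy to introduce and hard to reproduce.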
With separate processes, you don't have these problems to the same extent. Typically, you set up communication between processes to follow some form of message-passing pattern, which is easier to get right. (Well, you can use shared memory between processes too, but that's not as common as message passing.) On Unix systems fork() has typically been very cheap, so traditional design of concurrent programs in Unix uses processes, and pipes to communicate between them, but on systems where process creation is an expensive operation, threads are often regarded as the better approach.
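As a rough illustration of the fork-plus-pipes style, here is a small Python sketch (Unix only, since it relies on `os.fork`). The parent sends a short message to a child over one pipe and reads the reply over another. The helper name `ask_child` and the "uppercase the message" job are invented for the example, and it assumes the message is short enough to arrive in a single read.

```python
import os


def ask_child(message: bytes) -> bytes:
    """Fork a child, send it a message over a pipe, and read back its reply."""
    to_child_r, to_child_w = os.pipe()    # parent writes, child reads
    to_parent_r, to_parent_w = os.pipe()  # child writes, parent reads

    pid = os.fork()
    if pid == 0:
        # Child process: it works on its own copy of the address space.
        try:
            os.close(to_child_w)
            os.close(to_parent_r)
            data = os.read(to_child_r, 1024)
            os.write(to_parent_w, data.upper())
        finally:
            # Leave immediately, without running the parent's cleanup code.
            os._exit(0)

    # Parent process: close the pipe ends it does not use.
    os.close(to_child_r)
    os.close(to_parent_w)
    os.write(to_child_w, message)
    os.close(to_child_w)
    reply = os.read(to_parent_r, 1024)
    os.close(to_parent_r)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return reply


if __name__ == "__main__":
    print(ask_child(b"hello from the parent"))
```

Nothing is shared here except the pipes, so there is nothing to lock. The only coordination is the order of reads and writes.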
No language is better than another; it's all about the concepts. Doing things simultaneously with processes usually consumes more resources than with threads (which can be seen as lightweight processes). Some languages come with easy-to-use libraries: Java threads are easy to use, while POSIX threads (C on Unix) are a bit more complicated.
Re: why some languages are better for concurrency than others: it all depends on the tools that the language offers to the programmer. Some languages, like C++, give you low-level access to the system threads. Java has all kinds of libraries that offer constructs for concurrent programming, kind of like design patterns (see latch, barrier, et al). Some languages make it easier than others to deal with threads. Some languages keep you from sharing state between threads, which is a major source of bugs.
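For a feel of what such a construct looks like, here is a sketch using Python's `threading.Barrier`. It is only an analogue of the Java barrier mentioned above, not the Java API itself. Every worker finishes phase 1, waits at the barrier, and only then starts phase 2. The function name `run_two_phases` and the log format are made up for the example.

```python
import threading


def run_two_phases(num_workers=3):
    """Run workers through two phases, with a barrier between the phases."""
    barrier = threading.Barrier(num_workers)
    log = []
    log_lock = threading.Lock()

    def worker(worker_id):
        with log_lock:
            log.append(("phase1", worker_id))
        # Blocks until all num_workers threads have reached this point.
        barrier.wait()
        with log_lock:
            log.append(("phase2", worker_id))

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for phase, worker_id in run_two_phases():
        print(phase, worker_id)
```

The order of workers within a phase can vary from run to run, but no phase 2 entry can appear before every phase 1 entry has been logged.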
And then some languages have different underlying thread models than others. Python's thread model, as I understand it, uses a single system thread and handles all the context switching itself, which is not as clean because switching is only as granular as a single Python instruction.
As an analogy, it's like asking why some languages are better at handling regular expressions, or searching, or doing complex math, when in the end it's all just moving bits around.
Edit: frunsi is correct, Python threads are system threads (apparently this is a common misconception). The problem I was referring to was with the GIL, or global interpreter lock, which controls thread execution. Only a single thread can run in the Python interpreter at once, and context is only switched between instructions. My knowledge of Python multithreading mainly comes from this paper: www.dabeaz.com/python/GIL.pdf . Maybe a little off topic, but a good reference nonetheless.
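The "Python threads are system threads" part can be checked with a small sketch, assuming Python 3.8 or later on a platform that supports `threading.get_native_id()`. Each thread records the ID the operating system assigned to it. The helper name `native_thread_ids` is made up for the example.

```python
import threading


def native_thread_ids(num_threads=3):
    """Collect the OS-level thread ID of each Python thread."""
    ids = []
    ids_lock = threading.Lock()
    barrier = threading.Barrier(num_threads)

    def worker():
        with ids_lock:
            ids.append(threading.get_native_id())
        # Keep every thread alive until all have reported, so the OS
        # cannot hand a finished thread's ID to a later one.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("main thread:", threading.get_native_id())
    print("worker threads:", native_thread_ids())
```

Each worker reports a different OS thread ID, and none of them matches the main thread's. This does not show the GIL itself, only that the threads it schedules are real system threads.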