This question has been mystifying me for years, and considering this site's name, this is the place to ask it.
Why do we, programmers, still have this StackOverflow problem?
I've never personally encountered a stack overflow that wasn't caused by infinite recursion. In these cases, a dynamic stack size wouldn't help; it would just take a little longer to run out of memory.
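For illustration, here is a minimal sketch in Go, a language whose goroutine stacks already grow on demand: unbounded recursion still ends in a stack overflow, the growable stack merely postpones it until the runtime's maximum stack size (1 GB by default on 64-bit) is reached.

```go
package main

// recurse never terminates; the Go runtime keeps growing the goroutine's
// stack until the maximum stack size is hit, then aborts with something
// like "fatal error: stack overflow".
func recurse(n int) int {
	return recurse(n+1) + 1
}

func main() {
	recurse(0)
}
```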
I am going to summarize the arguments in the answers so far, because I find no single answer that covers this topic well enough.
Not everyone needs it.
Dynamic stack implementation turns out not to be as straightforward as it seems.
There are some languages or runtime libraries that already have the dynamic stack feature or something similar to it.
I would like to see more examples here.
I hope I didn't forget any important pieces of information on this subject. Making this a community wiki so that anyone can add new information.
Why in every major language does the thread stack memory have to be statically allocated on thread creation?
Stack size and allocation are not necessarily related to the language you are using; they are more a question of processor and architecture.
Stack segments are limited to 4 GB on current Intel processors.
The following link is a good read that may give you some of the answers you seek.
http://www.intel.com/Assets/PDF/manual/253665.pdf - Chapter 6.2
Old language implementations have a static stack size, so most new popular languages (which just copied the old ones, and broke/fixed whatever they felt like) have the same issue.
There is no logical reason to have a static stack size unless you are in a formal-methods setting. Why introduce faults where the code is otherwise correct? Erlang, for example, doesn't do this, because it handles faults, like any sane partial programming language should.
I think we will see this restriction removed in a few years.
There is simply no fundamental technical reason for fixed-size stacks. They exist for historical reasons and because the programmers of compilers and VMs are lazy and don't optimize what is already good enough right now.
But Go, the Google language, already starts with a different approach: it allocates the stack in small 4 KB pieces. There are also many "stackless" programming-language extensions, like Stackless Python, that do the same.
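As a rough sketch of what that buys you, assuming a reasonably recent Go toolchain (which has since replaced the 4 KB segments with contiguous stacks that are copied when they need to grow), a goroutine can recurse far deeper than its initial few-KB stack without any stack size having been chosen at creation time:

```go
package main

import "fmt"

// deep recurses to the given depth; a million frames needs tens of MB of
// stack, far more than the few KB a goroutine starts with, so the runtime
// has to grow the stack along the way.
func deep(n int) int {
	if n == 0 {
		return 0
	}
	return deep(n-1) + 1
}

func main() {
	done := make(chan int)
	go func() { done <- deep(1000000) }()
	fmt.Println("recursion depth reached:", <-done)
}
```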
The reason for this is quite simple: the more threads you have, the more address space is wasted. For programs that are slower with 64-bit pointers (and therefore stay 32-bit), it is a serious problem. In practice you can't really have more than a few hundred threads: with a typical 8 MB default stack reservation, a couple of hundred threads already exhaust the 2-3 GB of user address space a 32-bit process gets. This is not good if you write a server that might want to serve 60,000 clients with a thread for each one (wait for the 100-core/CPU systems in the near future).
On 64-bit systems it's not so serious, but it still requires more resources. For example, TLB entries for pages matter a great deal for good performance. If you can satisfy 4,000 normal thread stacks with one single TLB entry (given a page size of 16 MB and 4 KB of active stack space per thread), you can see the difference. Don't waste 1,020 KB of a 1 MB stack reservation on space you almost never use.
Fine-grained multithreading will be a very important technique in the future.
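To make the thread-per-client point concrete, here is a rough sketch in Go (the 60,000 figure is just the number from the server example above, not a benchmark): because each goroutine starts with only a few KB of stack, spawning tens of thousands of them is cheap, whereas the same number of OS threads would each reserve a fixed 1 MB or more.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const clients = 60000 // one lightweight "thread" per simulated client
	var wg sync.WaitGroup
	wg.Add(clients)
	for i := 0; i < clients; i++ {
		go func(id int) {
			defer wg.Done()
			_ = id * 2 // stand-in for per-client work
		}(i)
	}
	wg.Wait()
	fmt.Println("handled", clients, "clients, one goroutine each")
}
```

With one OS thread per client instead, 60,000 fixed 1 MB stacks would reserve on the order of 60 GB of address space for stacks alone, which is exactly the waste described above.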