In a typical C program, the linux kernel provides 84K - ~100K of memory. How does the kernel allocate more memory for the stack when the process uses the given memory.
I
The stack for a given process has a limited, fixed size. The reason you can't add more memory as you (theoretically) describe is because the stack must be contiguous, and it grows toward the heap. So, when the stack reaches the heap, no extension is possible.
The stack size for a userland program is not determined by the kernel. The kernel stack size is a configuration option for the kernel (usually 4k or 8k).
Edit: if you already know this, and were merely talking about the allocation of physical pages for a process, then you have the procedure down already. But there's no need to keep track of the "stack size" like this: the virtual pages in the stack with no pagetable entries are just normal overcommitted virtual pages. Physical memory will be granted on their first access. But the kernel does not have to overcommit memory, and thus a stack will probably have complete physical realization when the executable is first loaded.
There are a number of different methods used, depending on the OS (linux realtime vs. normal) and the language runtime system underneath:
1) dynamic, by page fault
typically preallocate a few real pages to higher addresses and assign the initial sp to that. The stack grows downward, the heap grows upward. If a page fault happens somewhat below the stack bottom, the missing intermediate pages are allocated and mapped. Effectively increasing the stack from the top towards the bottom automatically. There is typically a maximum up to which such automatic allocation is performed, which can or can not be specified in the environment (ulimit), exe-header, or dynamically adjusted by the program via a system call (rlimit). Especially this adjustability varies heavily between different OSes. There is also typically a limit to "how far away" from the stack bottom a page fault is considered to be ok and an automatic grow to happen. Notice that not all systems' stack grows downward: under HPUX it (used?) to grow upward so I am not sure what a linux on the PA-Risc does (can someone comment on this).
2) fixed size
other OSes (and especially in embedded and mobile environments) either have fixed sizes by definition, or specified in the exe header, or specified when a program/thread is created. Especially in embedded real time controllers, this is often a configuration parameter, and individual control tasks get fix stacks (to avoid runaway threads taking the memory of higher prio control tasks). Of course also in this case, the memory might be allocated only virtually, untill really needed.
3) pagewise, spaghetti and similar
such mechanisms tend to be forgotten, but are still in use in some run time systems (I know of Lisp/Scheme and Smalltalk systems). These allocate and increase the stack dynamically as-required. However, not as a single contigious segment, but instead as a linked chain of multi-page chunks. It requires different function entry/exit code to be generated by the compiler(s), in order to handle segment boundaries. Therefore such schemes are typically implemented by a language support system and not the OS itself (used to be earlier times - sigh). The reason is that when you have many (say 1000s of) threads in an interactive environment, preallocating say 1Mb would simply fill your virtual address space and you could not support a system where the thread needs of an individual thread is unknown before (which is typically the case in a dynamic environment, where the use might enter eval-code into a separate workspace). So dynamic allocation as in scheme 1 above is not possible, because there are would be other threads with their own stacks in the way. The stack is made up of smaller segments (say 8-64k) which are allocated and deallocated from a pool and linked into a chain of stack segments. Such a scheme may also be requried for high performance support of things like continuations, coroutines etc.
Modern unixes/linuxes and (I guess, but not 100% certain) windows use scheme 1) for the main thread of your exe, and 2) for additional (p-)threads, which need a fix stack size given by the thread creator initially. Most embedded systems and controllers use fixed (but configurable) preallocation (even physically preallocated in many cases).
edit: typo
The stack can only be used up to a certain length, because it has a fixed storage capacity in memory. If your question asks in what direction does the stack being used up? the answer is downwards. It is filled down in memory towards the heap. The heap is a dynamic component of memory by which it can actually grow from the bottom up, based on your need of data storage.