I have to write a not-so-large program in C++, using boost::thread.
The problem at hand is to process a large (maybe thousands or tens of thousands. Hundreds and millon
I agree with everyone suggesting a thread pool: You schedule tasks with the pool, and the pool assigns threads to do the tasks.
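To make the idea concrete, here is a minimal sketch of such a pool. It is not a definitive implementation: it uses C++11 `std::thread` rather than `boost::thread` (the two interfaces are very close, so porting should be straightforward), and the names `ThreadPool`, `schedule`, and `sum_of_squares` are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers pulling tasks from one queue.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t worker_count) {
        assert(worker_count > 0 && "a pool needs at least one worker");
        for (std::size_t i = 0; i < worker_count; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    // Lets the workers finish every queued task, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Queue a task; whichever worker becomes free first will run it.
    void schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: threads use the members above
};

// Hypothetical usage: sum the squares 1..n, one task per number.
long long sum_of_squares(int n, std::size_t worker_count) {
    std::atomic<long long> total{0};
    {
        ThreadPool pool(worker_count);
        for (int i = 1; i <= n; ++i) {
            pool.schedule([&total, i] { total += static_cast<long long>(i) * i; });
        }
    }  // the pool's destructor drains the queue and joins the workers
    return total.load();
}
```

In your case each task would process one item instead of squaring a number. The number of threads stays fixed no matter how many items you schedule, which is the whole point of a pool.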
If you're CPU-bound, keep adding threads as long as CPU usage stays below 100%; once every core is busy, extra threads only add scheduling overhead. If you're I/O-bound, disk thrashing might at some point prevent more threads from improving speed. Where that point lies is something you'll have to measure yourself.
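As a starting point for that tuning, you can ask the runtime how many hardware threads the machine has and adjust from there. This is only a sketch: `default_worker_count` is a name invented here, and it uses `std::thread::hardware_concurrency()` (boost::thread offers a function of the same name). The result is a hint, not a guarantee, and it may be 0 when the value cannot be determined.

```cpp
#include <thread>

// Hypothetical helper: initial pool size for CPU-bound work.
// hardware_concurrency() may return 0 if the count is unknown,
// so fall back to a single worker in that case.
unsigned default_worker_count() {
    const unsigned detected = std::thread::hardware_concurrency();
    return detected == 0 ? 1u : detected;
}
```

For I/O-bound work this number is only a baseline; try a few multiples of it and keep whichever measures fastest.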
Have you seen Intel's Threading Building Blocks? Note that I cannot comment on whether this is what you need. I only made a small toy project with it on Windows, and that was a few years ago. (It was somewhat similar to yours, BTW: it recursively traversed a folder hierarchy and counted the lines in the source code files it found.)