I recently benchmarked the .NET 4 garbage collector, allocating intensively from several threads. When the allocated values were recorded in an array, I observed no scalability
Not so sure what this is about and exactly what you saw on your machine. There are however two distinct versions of the CLR on your machine. Mscorwks.dll and mscorsvc.dll. The former is the one you get when you run your program on a work station, the latter on one of the server versions of Windows (like Windows 2003 or 2008).
The work station version is kind to your local PC, it doesn't gobble all machine resources. You can still read your email while a GC is going on. The server version is optimized to scale on server level hardware. Lots of RAM (GC doesn't kick in that quick) and lots of CPU cores (garbage gets collected on more than one core). Your quoted article probably talks about the server version.
You can select the server version on your workstation, use the
element in your .config file.