I need to do multiplication on matrices. I'm looking for a library that can do it fast. I'm using the Visual C++ 2008 compiler and I have a Core i7 860 so if the library i
One option is to implement this yourself, perhaps using std::valarray, because an implementation of it may be parallelised using OpenMP: I believe gcc has such a version, and MSVC++ probably does too.
Otherwise, try the following trick: transpose the second matrix before multiplying. Then you have:
AB[i,j] = Sum(k) A[i,k] B^t [j,k]
where both A and B^t are scanned along contiguous memory in the inner loop over k. If you have 8 hardware threads (the i7 860 has 4 cores with Hyper-Threading), you can fairly easily divide the set of [i,j] indices into 8 and give each thread 1/8 of the total job. To make it even faster you can use vector multiply instructions (SSE on this CPU); most compilers provide intrinsic functions for these. The result won't be as fast as a tuned library, but it should be OK.
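To make the transposition trick concrete, here is a minimal sketch, not a tuned implementation. The `Matrix` struct and the `transpose` and `multiply` functions are hypothetical names invented for this example; it assumes row-major storage of doubles and splits the work by rows of the result rather than by individual [i,j] pairs. The `#pragma omp parallel for` line only has an effect when OpenMP is enabled (`/openmp` for MSVC++, `-fopenmp` for gcc); without it the code simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a dense row-major matrix in one contiguous buffer.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Returns B^t, so that a column of B becomes a contiguous row.
Matrix transpose(const Matrix& m) {
    Matrix t(m.cols, m.rows);
    for (std::size_t i = 0; i < m.rows; ++i)
        for (std::size_t j = 0; j < m.cols; ++j)
            t.data[j * m.rows + i] = m.data[i * m.cols + j];
    return t;
}

// Computes AB[i,j] = Sum(k) A[i,k] * B^t[j,k].
Matrix multiply(const Matrix& a, const Matrix& b) {
    assert(a.cols == b.rows);
    const Matrix bt = transpose(b);
    Matrix c(a.rows, b.cols);
    const int n = static_cast<int>(a.rows);  // signed index, as OpenMP 2.0 requires
    const std::size_t m = b.cols;
    const std::size_t depth = a.cols;

    // Each thread gets a share of the rows of the result.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        for (std::size_t j = 0; j < m; ++j) {
            double sum = 0.0;
            // Both operands are read sequentially here.
            for (std::size_t k = 0; k < depth; ++k)
                sum += a.data[row * depth + k] * bt.data[j * depth + k];
            c.data[row * m + j] = sum;
        }
    }
    return c;
}
```

The transpose costs one extra pass over B, which is small compared with the multiplication itself for anything but tiny matrices. The inner loop over k is the place where the vector instructions would go.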
If you're doing longer calculations such as polynomial evaluation, a threaded evaluator which also has thread support (gak, two kinds of threads) will do a good job even though it won't do low-level tuning. If you really want to do stuff fast, you have to use a properly tuned library like ATLAS, but then, you probably wouldn't be running Windows if you were serious about HPC.