I am new to programming in general so please keep that in mind when you answer my question.
I have a program that takes a large 3D array (1 billion elements) and sums up
Before you go multithreaded, you should run a profiler against your code. Where to find a good (and possibly free) C++ profiler is probably a separate question.
This will help you identify any bits of your code that are taking up significant portions of computation time. A tweak here and there after some profiling can sometimes make massive differences to performance.
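If you don't have a profiler to hand yet, timing individual pieces by hand gives a rough first impression. The sketch below is not a replacement for a real profiler, and it is not taken from your code. It assumes the 3D array is stored as one flat `std::vector<double>` with the last index contiguous. The names `sum_contiguous`, `sum_strided` and `time_ms` are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a flat nx*ny*nz array, walking memory in order
// (assumes element (i, j, k) lives at index (i * ny + j) * nz + k).
double sum_contiguous(const std::vector<double>& a,
                      std::size_t nx, std::size_t ny, std::size_t nz) {
    assert(a.size() == nx * ny * nz);
    double total = 0.0;
    for (std::size_t i = 0; i < nx; ++i)
        for (std::size_t j = 0; j < ny; ++j)
            for (std::size_t k = 0; k < nz; ++k)
                total += a[(i * ny + j) * nz + k];
    return total;
}

// Same sum, but with the loops nested the other way round, so consecutive
// accesses jump through memory instead of walking it in order.
double sum_strided(const std::vector<double>& a,
                   std::size_t nx, std::size_t ny, std::size_t nz) {
    assert(a.size() == nx * ny * nz);
    double total = 0.0;
    for (std::size_t k = 0; k < nz; ++k)
        for (std::size_t j = 0; j < ny; ++j)
            for (std::size_t i = 0; i < nx; ++i)
                total += a[(i * ny + j) * nz + k];
    return total;
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To try it, fill a vector of a size that fits in your memory, call something like `time_ms([&] { result = sum_contiguous(a, nx, ny, nz); })` for each version, and compare the two timings. Build with optimisation turned on (for example `-O2`), and use the result afterwards (print it, say) so the compiler cannot drop the loop. Whether the loop order matters in your program is exactly the kind of thing a proper profiler will confirm or rule out.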