I am new to programming in general so please keep that in mind when you answer my question.
I have a program that takes a large 3D array (1 billion elements) and sums up
If, and this is a big IF, it is coded appropriately, you will most definitely see a speedup. As one of my professors always noted, people often take an algorithm, thread it, and end up with something slower than what they started with, usually because of inefficient synchronization. So if you feel like delving into threading (I honestly wouldn't suggest it if you are new to programming), have a go.
In your particular case the synchronization could be quite straightforward. You could assign each thread its own region of the large 3-D matrix, so that each thread is guaranteed sole access to a specific area of the input and output matrices. Since no two threads ever write to the same element, there is no real need to 'protect' the data from concurrent access.
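To make the idea of giving each thread its own region concrete, here is a minimal C++ sketch. It is only an illustration under assumptions: the question does not show what exactly is being summed, so the per-element work here (summing along the last axis for each (i, j) pair) is a placeholder, and the names `sum_slab` and `parallel_sum` are hypothetical. The array is assumed to be stored flat in row-major order, and each thread gets a contiguous slab of the first index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical placeholder operation: for each (i, j), sum the values along k.
// Only rows [i_begin, i_end) are processed, so each thread writes to output
// cells that no other thread touches.
void sum_slab(const std::vector<double>& in, std::vector<double>& out,
              std::size_t ny, std::size_t nz,
              std::size_t i_begin, std::size_t i_end) {
    for (std::size_t i = i_begin; i < i_end; ++i) {
        for (std::size_t j = 0; j < ny; ++j) {
            double total = 0.0;
            for (std::size_t k = 0; k < nz; ++k) {
                total += in[(i * ny + j) * nz + k];  // row-major flat index
            }
            out[i * ny + j] = total;
        }
    }
}

// Splits the first dimension into contiguous slabs, one per thread.
// No mutex is needed because the slabs do not overlap.
std::vector<double> parallel_sum(const std::vector<double>& in,
                                 std::size_t nx, std::size_t ny, std::size_t nz,
                                 unsigned num_threads) {
    std::vector<double> out(nx * ny, 0.0);
    num_threads = std::max(1u, num_threads);  // guard against a count of 0
    const std::size_t chunk = (nx + num_threads - 1) / num_threads;  // ceiling division

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(nx, t * chunk);
        const std::size_t end = std::min(nx, begin + chunk);
        if (begin == end) break;  // more threads requested than slabs available
        workers.emplace_back(sum_slab, std::cref(in), std::ref(out),
                             ny, nz, begin, end);
    }
    for (auto& w : workers) w.join();  // wait for every slab to finish
    return out;
}
```

You would call it as something like `parallel_sum(data, nx, ny, nz, std::thread::hardware_concurrency())`. The only synchronization point is the final `join`, which is why this kind of split tends to be the easy case.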
In summary, in this specific simple case threading may be quite easy, but in general, poorly done synchronization can make the program slower rather than faster. It really all depends.