In C, I have a task where I must do multiplication, inversion, transposition, addition, etc. with huge matrices allocated as 2-dimensional arrays, (arrays of
What you have so far is the theoretical background on the issue, and it leaves plenty of room for guessing what you will actually get in a real run. It is said that the option does not always improve performance, because the result depends on a variety of factors, for instance how the loop is implemented, how heavy its body is, and others.
Every program is different, so if you want to find the better-performing solution, it is a good idea to simply run both variants, measure their execution times, and compare.
Look at the approach in the answer below to get an idea of how to measure time. In short, you wrap your code in a loop so that the program runs for several seconds. Since you are optimizing the loops themselves, it is also a good idea to write a shell script that runs your app many times.
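To make the "wrap it in a loop and time it" idea concrete, here is a minimal sketch in C. It is only an illustration under assumptions: the matrix size `N`, the two variants `add_rows_first` and `add_cols_first`, and the helpers `fill_inputs`, `checksum` and `time_repeated` are all hypothetical names made up for this example, and matrix addition simply stands in for whatever loop you are actually tuning. It uses the standard `clock()` function, which reports CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define N 256  /* hypothetical size, kept small so static storage is enough */

static double a[N][N], b[N][N], c[N][N];

/* Fill the inputs with simple, predictable values. */
static void fill_inputs(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = (double)(i + j);
            b[i][j] = (double)(i - j);
        }
}

/* Variant A (hypothetical): rows in the outer loop. */
static void add_rows_first(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = a[i][j] + b[i][j];
}

/* Variant B (hypothetical): columns in the outer loop. */
static void add_cols_first(void)
{
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            c[i][j] = a[i][j] + b[i][j];
}

/* Sum of the result; using the result keeps the compiler from
   discarding the work and lets you check both variants agree. */
static double checksum(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += c[i][j];
    return s;
}

/* Run fn `reps` times and return the elapsed CPU time in seconds. */
static double time_repeated(void (*fn)(void), int reps)
{
    assert(reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

From `main` you would call `fill_inputs()` once, then something like `time_repeated(add_rows_first, 2000)` and `time_repeated(add_cols_first, 2000)`, and print both numbers together with `checksum()`. Pick the repetition count so that each measurement takes a few seconds, and build with the same compiler flags you intend to use for the real program, since the result can change a lot between optimization levels.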