I have the following code doing Sin/Cos function using a pre-calculated memory table. in the following example the table has 1024*128 items covering all the Sin/Cos values f
If you have to compute it so many times,
One thing you could try would be to use the fact that cos(x) = sin(x + pi/2). And make the sine table one quarter larger, so you can reuse it as the cosine table starting one quarter in. Not sure if C# allows you to get a pointer to the middle of the table, as C would. But even if not, the decreased cache usage might be worth more than the added time for the offset into the sine table.
That, is, expressed with C:
double* _CosineDoubleTable = &_SineDoubleTable[TABLESIZE / 4];
No guarantee that it'll do a lot of good, but depending on your processor, integer math is often faster than floating point math. That being the case, I'd probably rearrange the first three lines to first calculate an integer, then reduce its range (if necessary). Of course, as BlueRaja pointed out, using C++ will almost certainly help as well. Using assembly language probably won't do much good though -- for a table lookup like this, a C++ compiler can usually produce pretty good code.
If possible, I'd also look very hard at your accuracy requirements -- without knowing what you're doing with the values, it's hard to say, but for a lot of purposes, your table size and the precision you're storing are far beyond what's necessary or even close to useful.
Finally, I'd note that it's worth at least looking into whether this whole strategy is worthwhile at all. At one time, there was no question that using tables to avoid complex calculations was a solid strategy. Processors have sped up a lot faster than memory though -- to the point that such a table lookup is often a net loss nowadays. In fact, nearly the only way the table stands a chance at all is if it's small enough to fit in the processor cache.
That's going to be pretty damn fast as it is.
If you really need to squeeze every imaginable drop of performance out of this code, you may want to consider writing this part of it (including the outer loop that loops billions of times) in a C++ dll (or even ASM). Make sure your compiler is set to allow the largest possible instruction set available to you.
[Edit] I missed how large the tables are - this could very well slow down your code significantly due to cache misses. Have you tried benchmarking it against Math.Cos(), or other methods of approximating trig functions (you can get very good approximations with a few simple multiplications using Taylor Series)
I'm assuming Taylor expansions are no use for you. So if you want to use a table: You only need one table half as big.
cos(x) = sin(pi/2-x).
sin(pi + x) = -sin(x)
You can make your's code non-branching. Convert first to int format.
int index = (int)(Value * FACTOR);
index %= TABLE_SIZE; // one instuction (mask)
index = (index >= 0) ? index :TABLE_SIZE-index; // one instruction isel
double sineValue = _SineDoubleTable[index];
Compare with Math.Sin anyway. Profile Profile Priofile. (Cache miss might slow down your code in real examples.)
You could try to use unsafe code to eliminate array bounds checking.
But even a unsafe, optimized version does not seem to come anywhere near Math.Sin.
Results based on 1'000'000'000 iterations with random values:
(1) 00:00:57.3382769 // original version
(2) 00:00:31.9445928 // optimized version
(3) 00:00:21.3566399 // Math.Sin
Code:
static double SinOriginal(double Value)
{
Value %= PI2;
if (Value < 0) Value += PI2;
int index = (int)(Value * FACTOR);
return _SineDoubleTable[index];
}
static unsafe double SinOptimized(double* SineDoubleTable, double Value)
{
int index = (int)(Value * FACTOR) % TABLE_SIZE;
return (index < 0) ? SineDoubleTable[index + TABLE_SIZE]
: SineDoubleTable[index];
}
Test program:
InitializeTrigonometricTables();
Random random = new Random();
SinOriginal(random.NextDouble());
var sw = System.Diagnostics.Stopwatch.StartNew();
for (long i = 0; i < 1000000000L; i++)
{
SinOriginal(random.NextDouble());
}
sw.Stop();
Console.WriteLine("(1) {0} // original version", sw.Elapsed);
fixed (double* SineDoubleTable = _SineDoubleTable)
{
SinOptimized(SineDoubleTable, random.NextDouble());
sw = System.Diagnostics.Stopwatch.StartNew();
for (long i = 0; i < 1000000000L; i++)
{
SinOptimized(SineDoubleTable, random.NextDouble());
}
sw.Stop();
Console.WriteLine("(2) {0} // optimized version", sw.Elapsed);
}
Math.Sin(random.NextDouble());
sw = System.Diagnostics.Stopwatch.StartNew();
for (long i = 0; i < 1000000000L; i++)
{
Math.Sin(random.NextDouble());
}
sw.Stop();
Console.WriteLine("(3) {0} // Math.Sin", sw.Elapsed);