I saw that Fast Minimum Storage Ray/Triangle Intersection by Moller and Trumbore is frequently recommended.
The thing is, I don\'t mind pre-computing and storing any amo
One suggestion could be to implement the octree (http://en.wikipedia.org/wiki/Octree) algorithm to partition your 3D Space into very fine blocks. The finer the partitioning the more memory required, but the better accuracy the tree gets.
You still need to check ray/triangle intersections, but the idea is that the tree can tell you when you can skip the ray/triangle intersection, because the ray is guaranteed not to hit the triangle.
However, if you start moving your triangle around, you need to update the Octree, and then I'm not sure it's going to save you anything.