I am trying to implement AES-256 in CTR mode using NVIDIA CUDA. I have successfully written the CPU code for key expansion, and now I need to implement the actual AES-256 algorithm.
The T tables are a straightforward description of the AES round transformation in matrix form. To build them, see the original Rijndael NIST proposal, section 5.2.1.
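As a rough sketch of what that section describes, here is one way the four encryption tables could be generated on the host in C. The names (`sbox`, `Te0` to `Te3`, `aes_build_tables`) are my own choices for illustration, not anything from the proposal, and I am assuming the common big-endian packing where `Te0[x]` holds the column (02·S[x], S[x], S[x], 03·S[x]) with the first byte in the most significant position. Other implementations may pack the bytes differently, so check against whatever your round function expects.

```c
#include <stdint.h>

/* Illustrative names only; nothing here is taken from the proposal text. */
uint8_t  sbox[256];
uint32_t Te0[256], Te1[256], Te2[256], Te3[256];

/* Multiply by x (0x02) in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t xtime(uint8_t a) {
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

/* General GF(2^8) multiplication by shift-and-add. */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = xtime(a);
        b >>= 1;
    }
    return r;
}

/* Multiplicative inverse as a^254 (slow but simple); 0 maps to 0. */
static uint8_t gf_inv(uint8_t a) {
    uint8_t r = 1;
    if (a == 0) return 0;
    for (int i = 0; i < 254; i++) r = gf_mul(r, a);
    return r;
}

static uint8_t rotl8(uint8_t v, int n) {
    return (uint8_t)((v << n) | (v >> (8 - n)));
}

static uint32_t ror32(uint32_t w, int n) {
    return (w >> n) | (w << (32 - n));
}

/* Fill the S-box and the four T tables. Call once before encrypting. */
void aes_build_tables(void) {
    for (int i = 0; i < 256; i++) {
        /* S-box: field inverse followed by the AES affine transform. */
        uint8_t b = gf_inv((uint8_t)i);
        uint8_t s = (uint8_t)(b ^ rotl8(b, 1) ^ rotl8(b, 2) ^
                              rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63);
        sbox[i] = s;

        /* One MixColumns column applied to S[x]: (02, 01, 01, 03). */
        uint8_t s2 = xtime(s);
        uint8_t s3 = (uint8_t)(s2 ^ s);
        uint32_t w = ((uint32_t)s2 << 24) | ((uint32_t)s << 16) |
                     ((uint32_t)s << 8)   |  (uint32_t)s3;

        /* The other three tables are byte rotations of the first. */
        Te0[i] = w;
        Te1[i] = ror32(w, 8);
        Te2[i] = ror32(w, 16);
        Te3[i] = ror32(w, 24);
    }
}
```

With this packing, `Te0[0]` comes out as `0xc66363a5`, which is a quick way to check the result against a published table. Since the tables are built on the CPU, they can be copied to the device alongside the expanded key.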
In case anyone is still interested, these lookup tables can be found in the standard library of the Go programming language: http://golang.org/src/crypto/aes/const.go#L80
There are also instructions on how to generate the tables in the test files of the same package.