I'm implementing a cryptographic library with optimized assembly for the ARM Cortex-M0 architecture, including several units that use a multiply-accumulate macro, which uses 32x3