Interesting issue we came across here this week.
We are working in C on a Harvard Architecture embedded platform, which has 16-bit data addresses and 32-bit code add
You could try a segment hack (and it really is only a hack), so that you can use the fast 16-bit comparison without any risk. Create segments at each n*0x10000 boundary with a size of 4 (or even smaller), so that a real function can never exist there.
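To illustrate why the reserved segments make the 16-bit check safe, here is a minimal sketch in plain C that models code addresses as `uint32_t` values. The names `is_null_fast`, `in_guard_segment` and `GUARD_SIZE` are made up for this example, and the guard size is assumed to be in bytes; how you actually reserve the segments depends on your linker.

```c
#include <stdint.h>
#include <stdbool.h>

#define GUARD_SIZE 4u  /* assumed size of each reserved segment, in bytes */

/* Fast check: look only at the low 16 bits of a 32-bit code address.
   Without guards, 0x10000, 0x20000, ... would wrongly look like NULL. */
static bool is_null_fast(uint32_t code_addr)
{
    return (uint16_t)code_addr == 0u;
}

/* True if the address lies inside a reserved segment at some n*0x10000. */
static bool in_guard_segment(uint32_t code_addr)
{
    return (code_addr & 0xFFFFu) < GUARD_SIZE;
}
```

Any address for which `is_null_fast` gives a false positive has its low 16 bits equal to zero, so it falls inside a guard segment. If the linker never places a function in those segments, the only function pointer that passes the fast check is the real null pointer.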
Whether this is a good or a really bad solution depends on your embedded device's memory space. It could work if you have 1 MB of normal flash that will never change. It will be painful if you have 64 MB of NAND flash.
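One way to get a feel for the difference is to count how many reserved segments each case needs. This is only a rough sketch: it assumes the code space starts at address 0 and is contiguous, and `guard_segment_count` is a made-up helper name.

```c
#include <stdint.h>

#define BOUNDARY 0x10000u  /* one boundary per 16-bit wrap-around */

/* Number of n*0x10000 boundaries inside a code space of flash_bytes,
   assuming it starts at address 0 (the boundary at 0 is counted too). */
static uint32_t guard_segment_count(uint32_t flash_bytes)
{
    return (flash_bytes + BOUNDARY - 1u) / BOUNDARY;
}
```

With these assumptions, 1 MB (`0x100000`) needs 16 segments, while 64 MB (`0x4000000`) needs 1024 of them, each one an extra entry to maintain in the memory layout.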