I have been measuring clock cycle count on the cortex m4 and would now like to do it on the cortex m7. The board I use is STM32F746ZG.
For the m4 everything worked with
Looking at the docs again, I'm now incredibly suspicious of a typo or copy-paste error in the ARM TRM. 0xe0000fb0 is given as the address of ITM_LAR, DWT_LAR and FP_LSR (and equivalently for *_LSR). Since all the other ITM registers are in page 0xe0000000, it looks an awful lot like whoever was responsible for that part of the Cortex-M7 documentation took the Cortex-M4 register definitions, added the new LAR and LSR to the ITM page, then copied them to the DWT and FPB pages updating the names but overlooking to update the addresses.
I'd bet my dinner that you're unwittingly unlocking ITM_LAR (or the real FP_LAR), and DWT_LAR is actually at 0xe0001fb0.
EDIT by dwelch
Somebody owes somebody a dinner.
hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));
PUT32(0xE000EDFC,0x01000000);
hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));
PUT32(0xE0001000,0x40000001);
hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));
PUT32(0xE0001FB0,0xC5ACCE55);
PUT32(0xE0001000,0x40000001);
hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));
output
00000000
00000000
00000000
00000000
00000003
40000000
00000000
00000000
00000003
40000000
00000000
00000000
00000001
40000001
0000774F
0000B311
The table in the TRM is funny looking and as the other documentation shows you add the 0xFB0 and 0xFB4 to the base, the rest of the DWT for the Cortex-M7 is 0xE0001xxx and indeed it appears that the LAR and LSR are ate 0xE0001FB0 and 0xE0001FB4.