Address canonical form and pointer arithmetic

浪尽此生 提交于 2019-11-27 04:42:39

The canonical address rules mean there is a giant hole in the 64-bit virtual address space. 2^47-1 is not contiguous with the next valid address above it, so a single mmap won't include any of the unusable range of 64-bit addresses.

+----------+
| 2^64-1   |   0xffffffffffffffff
| ...      |
| 2^64-2^47|   0xffff800000000000
+----------+
|          |
| unusable |
|          |
+----------+
| 2^47-1   |   0x00007fffffffffff
| ...      |
| 0        |   0x0000000000000000
+----------+

In other words:

Is there a guarantee by the OS that you will never be allocated memory whose address range does not vary by the 47th bit?

Yes. The 48-bit address space supported by current hardware is an implementation detail. The canonical-address rules ensure that future systems can support more virtual address bits without breaking backwards compatibility to any significant degree. You'd just need a compat flag to have the OS not give the process any memory regions with high bits not all the same. Future hardware won't need to support any kind of flag to ignore high address bits or not, because junk in the high bits is currently an error.


Fun fact: Linux defaults to mapping the stack at the top of the lower range of valid addresses.

e.g.

$ gdb /bin/ls
...
(gdb) b _start
Function "_start" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_start) pending.
(gdb) r
Starting program: /bin/ls

Breakpoint 1, 0x00007ffff7dd9cd0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) p $rsp
$1 = (void *) 0x7fffffffd850
(gdb) exit

$ calc
2^47-1
              0x7fffffffffff
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!