Linux on arm64: sendto causes “Unhandled fault: alignment fault (0x96000021)” when sending data from mmapped coherent DMA buffer

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-01 13:27:47

That driver needs updating. ARCH_HAS_DMA_MMAP_COHERENT hasn't been defined by anything other than PowerPC for a long time, and even that looks like a forgotten leftover.

There has been a generic dma_mmap_coherent() implementation since 3.6, so that can, and should, be used unconditionally. With the current code, the #ifdef means you always take the other path, and pgprot_noncached() then makes the userspace mapping of the buffer Strongly-ordered (Device nGnRnE in AArch64 terms). That's generally a bad idea: userspace code will assume it's always operating on Normal memory (unless explicitly crafted not to) and that it can safely do things like unaligned or exclusive accesses, both of which are liable to go badly wrong on Device-type memory.

I'm not even going to ask what kind of craziness ends up with the kernel copying data back out of a userspace mapping of a kernel buffer*, but suffice to say the kernel, via copy_{to,from,in}_user(), also assumes userspace addresses are mapped as Normal memory and thus safe for unaligned accesses. Frankly I'm a little surprised this doesn't blow up similarly on 32-bit ARM, so I guess your data happens to always be at least 4-byte aligned. That would also explain why reading words (with 32-bit accesses) is fine, if only 64-bit doubleword accesses can potentially be misaligned.
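To make the alignment argument concrete, here is a small userspace sketch of what "naturally aligned" means for a given access width. The helper names are made up for illustration and are not part of any kernel or libc API. It only shows why a buffer offset that is 4-byte aligned is safe for 32-bit accesses but can still be misaligned for a 64-bit access, which is the case that would fault on Device-type memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when an access of `width` bytes at `addr` is naturally
 * aligned. `width` must be a power of two (1, 2, 4, 8, ...).
 */
static bool is_naturally_aligned(uintptr_t addr, size_t width)
{
	return (addr & (uintptr_t)(width - 1)) == 0;
}

/*
 * Hypothetical helper: the widest power-of-two access (up to 8 bytes)
 * that is naturally aligned at `addr`.
 */
static size_t widest_aligned_access(uintptr_t addr)
{
	size_t width = 8;

	while (width > 1 && !is_naturally_aligned(addr, width))
		width /= 2;
	return width;
}
```

For an address such as 0x1004, is_naturally_aligned() accepts a width of 4 but rejects a width of 8, so a copy routine that uses 64-bit loads there would be performing an unaligned access.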

In short, just use dma_mmap_coherent(), and get rid of the open-coded poor equivalent. That will give userspace a Normal non-cacheable mapping (or a cacheable one for a hardware-coherent device) which will work as expected. It's also not broken in terms of assuming a dma_addr_t is a physical address, as your driver code seems to do - that's another thing that's liable to come around and bite you in the bum sooner or later (ZynqMP has a System MMU, so you can presumably update to a 4.9 kernel, wire up some Stream IDs, add them to the DT, and watch that assumption go bang in new and exciting ways).
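As a rough sketch of what that might look like, an mmap handler built on dma_mmap_coherent() could be as small as the following. The struct and function names (mydrv_buf, mydrv_mmap) are hypothetical, and the sketch assumes the buffer was allocated with dma_alloc_coherent() and that open() stored the per-device state in filp->private_data. It has not been built against any particular kernel version.

```c
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/mm.h>

/* Hypothetical per-device state; names are illustrative only. */
struct mydrv_buf {
	struct device *dev;	/* device that was passed to dma_alloc_coherent() */
	void *cpu_addr;		/* kernel virtual address returned by dma_alloc_coherent() */
	dma_addr_t dma_handle;	/* device-side address; not assumed to be physical */
	size_t size;		/* size of the allocation in bytes */
};

static int mydrv_mmap(struct file *filp, struct vm_area_struct *vma)
{
	struct mydrv_buf *buf = filp->private_data;
	size_t len = vma->vm_end - vma->vm_start;

	/* Refuse requests that are larger than the buffer. */
	if (len > buf->size)
		return -EINVAL;

	/*
	 * Let the DMA API choose the page protection and resolve the
	 * backing pages, instead of open-coding pgprot_noncached() and
	 * remap_pfn_range() on a value derived from the dma_addr_t.
	 */
	return dma_mmap_coherent(buf->dev, vma, buf->cpu_addr,
				 buf->dma_handle, buf->size);
}
```

The handler never converts dma_handle into a physical address or PFN itself, so it should keep working if the device later ends up behind an IOMMU.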

* Although it does occur to me that there was some circumstance under which copying from the very end of a page may sometimes over-read into the next page, which could trigger this unwittingly if the following page happened to be a Device/Strongly-ordered mapping, which led to this patch in 4.5. Linus' response to such memory layouts was "...and nobody sane actually does that..."
