Question
I want all read and write requests to a PCIe device to go through the CPU caches. However, it does not work as I expected.
These are my assumptions about write-back MMIO regions:
- Writes to the PCIe device happen only on cache write-back.
- The size of TLP payloads is cache block size (64B).
However, the captured TLPs do not follow my assumptions.
- Writes to the PCIe device happen on every write to the MMIO region.
- The size of TLP payloads is 1B.
I write 8 bytes of 0xff to the MMIO region with the following user-space program and device driver.
Part of User Program
    struct pcie_ioctl ioctl_control;
    ioctl_control.bar_select = BAR_ID;
    ioctl_control.num_bytes_to_write = atoi(argv[1]);
    if (ioctl(fd, IOCTL_WRITE_0xFF, &ioctl_control) < 0) {
        printf("ioctl failed\n");
    }
Part of Device Driver
    case IOCTL_WRITE_0xFF:
    {
        int i;
        char *buff;
        struct pci_cdev_struct *pci_cdev = pci_get_drvdata(fpga_pcie_dev.pci_device);
        /* Fetch the request (BAR index and byte count) from user space. */
        if (copy_from_user(&ioctl_control, (void __user *)arg, sizeof(ioctl_control)))
            return -EFAULT;
        buff = kmalloc(sizeof(char) * ioctl_control.num_bytes_to_write, GFP_KERNEL);
        if (!buff)
            return -ENOMEM;
        for (i = 0; i < ioctl_control.num_bytes_to_write; i++) {
            buff[i] = 0xff;
        }
        /* Plain memcpy() into the BAR mapping; this is the write captured in the waveforms below. */
        memcpy(pci_cdev->bar[ioctl_control.bar_select], buff, ioctl_control.num_bytes_to_write);
        kfree(buff);
        break;
    }
I modified the MTRRs to make the corresponding MMIO region write-back. The MMIO region starts at 0x0c7300000 and its length is 0x100000 (1MB). The following are the cat /proc/mtrr results for the different policies. Please note that I gave each region its own non-overlapping entry.
Uncacheable
reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: uncachable
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable
Write-combining
reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: write-combining
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable
Write-back
reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size= 64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size= 32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size= 16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size= 1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size= 1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size= 1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: write-back
reg09: base=0x0c7400000 ( 3188MB), size= 1MB, count=1: uncachable
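To double-check which MTRR entry covers the BAR without reading the listing by eye, the lines above can be parsed programmatically. The following is a minimal sketch, assuming the line layout shown in the listings above (sizes suffixed with KB or MB); the names mtrr_entry, parse_mtrr_line and mtrr_covers are hypothetical helpers, not part of any kernel or library API.

```c
#include <stdio.h>
#include <string.h>

/* One parsed /proc/mtrr line (hypothetical helper type). */
struct mtrr_entry {
    int reg;                  /* register index, e.g. 8 for "reg08" */
    unsigned long long base;  /* physical base address */
    unsigned long long size;  /* size in bytes */
    char type[32];            /* e.g. "uncachable", "write-back" */
};

/* Parse one line such as
 *   "reg08: base=0x0c7300000 ( 3187MB), size= 1MB, count=1: write-back"
 * Returns 1 on success, 0 if the line does not match the assumed layout. */
static int parse_mtrr_line(const char *line, struct mtrr_entry *e)
{
    unsigned long long size;
    char unit;
    int n = sscanf(line,
                   "reg%d: base=0x%llx (%*dMB), size=%llu%cB, count=%*d: %31s",
                   &e->reg, &e->base, &size, &unit, e->type);

    if (n != 5 || (unit != 'K' && unit != 'M'))
        return 0;
    e->size = size << (unit == 'M' ? 20 : 10);
    return 1;
}

/* Return nonzero if [start, start + len) lies entirely inside the entry. */
static int mtrr_covers(const struct mtrr_entry *e,
                       unsigned long long start, unsigned long long len)
{
    return start >= e->base && start + len <= e->base + e->size;
}

/* Usage idea: read /proc/mtrr line by line with fgets(), call
 * parse_mtrr_line() on each line, and print e.type for the entry where
 * mtrr_covers(&e, 0x0c7300000ULL, 0x100000ULL) is true. */
```

For the write-back listing above, reg08 should be the only entry that covers the 1MB region at 0x0c7300000.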
The following are waveform captures for an 8B write with the different policies. I used an integrated logic analyzer (ILA) to capture these waveforms. Please watch pcie_endpoint_litepcietlpdepacketizer_tlp_req_payload_dat when pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid is set. You can count the number of packets by counting the assertions of pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid in these waveforms.
- Uncacheable: link -> correct, 1B x 8 packets
- Write-combining: link -> correct, 8B x 1 packet
- Write-back: link -> unexpected, 1B x 8 packets
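The uncacheable and write-combining results above fit a simple counting model, sketched below. It is an idealized illustration, not a description of what the hardware guarantees: it assumes that every store to uncacheable memory becomes one write TLP of the store's size, and that write-combining merges contiguous stores within a 64-byte aligned buffer into a single TLP. Real processors may split a partially filled write-combining buffer into several transactions. The names mem_type and model_write_tlps are hypothetical.

```c
#include <stddef.h>

/* Memory types covered by this toy model (hypothetical names). */
enum mem_type { MT_UNCACHEABLE, MT_WRITE_COMBINING };

/* Idealized count of memory-write TLPs for `len` contiguous bytes at
 * `addr`, issued by the CPU as stores of `store_size` bytes each.
 * Assumptions (not hardware guarantees):
 *   - uncacheable: one TLP per store;
 *   - write-combining: one TLP per 64-byte aligned buffer touched. */
static size_t model_write_tlps(unsigned long long addr, size_t len,
                               size_t store_size, enum mem_type mt)
{
    unsigned long long first, last;

    if (len == 0 || store_size == 0)
        return 0;
    if (mt == MT_UNCACHEABLE)
        return (len + store_size - 1) / store_size;

    first = addr / 64;
    last = (addr + len - 1) / 64;
    return (size_t)(last - first + 1);
}
```

With 8 bytes written as 1-byte stores at 0x0c7300000, the model gives 8 TLPs for uncacheable and 1 TLP for write-combining, matching the first two captures. The write-back capture behaves like the uncacheable case rather than producing one 64B TLP.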
The system configuration is as follows.
- CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
- OS: Linux kernel 4.15.0-38
- PCIe Device: Xilinx FPGA KC705 programmed with litepcie
Related Links
- Generating a 64-byte read PCIe TLP from an x86 CPU
- How to Implement a 64B PCIe* Burst Transfer on Intel® Architecture
- Write Combining Buffer Out of Order Writes and PCIe
- Do Ryzen support write-back caching for Memory Mapped IO (through PCIe interface)?
- MTRR (Memory Type Range Register) control
- PATting Linux
- Down to the TLP: How PCI express devices talk (Part I)
Answer 1:
In short, it seems that mapping an MMIO region as write-back does not work by design.
Please post an answer if anyone finds that it is possible.
I found John McCalpin's articles and answers. First, mapping an MMIO region as write-back is not possible. Second, a workaround is possible on some processors.
Mapping an MMIO region as write-back is not possible
Quote from this link
FYI: The WB type will not work with memory-mapped IO. You can program the bits to set up the mapping as WB, but the system will crash as soon as it gets a transaction that it does not know how to handle. It is theoretically possible to use WP or WT to get cached reads from MMIO, but coherence has to be handled in software.
Quote from this link
Only when I set both PAT and MTRR to WB does the kernel crash
A workaround is possible on some processors
Notes on Cached Access to Memory-Mapped IO Regions, John McCalpin
There is one set of mappings that can be made to work on at least some x86-64 processors, and it is based on mapping the MMIO space twice. Map the MMIO range with a set of attributes that allow write-combining stores (but only uncached reads). Map the MMIO range a second time with a set of attributes that allow cache-line reads (but only uncached, non-write-combined stores).
Source: https://stackoverflow.com/questions/53311131/mapping-mmio-region-write-back-does-not-work