Question
I have 4 GPUs hung off the same PCIe switch (PLX PEX 8747) on a Haswell-based system. I want to send the same data to each GPU. Is it possible for the PCIe switch to replicate the data to N targets, rather than do N separate transfers? In effect, is it possible to broadcast data to N GPUs over the PCIe bus?
I was wondering how SLI / Crossfire handles this. I can imagine large amounts of data being identical for each GPU in a given scene being rendered. I remember reading somewhere that the old NVIDIA 890 Ultra SLI system included this broadcast mechanism in its switch for SLI.
http://www.nvidia.com/docs/IO/52280/NVIDIA_Broadcast_PWShort_TB.pdf
Is this possible with newer PCIe switches?
Update: It appears the PCIe standard supports multi-cast, as outlined by the answer below. I found some info on this at
www.pcisig.com/developers/main/training_materials/get_document?doc_id=31337695e3bc0310ea570c9df49e507b9d3eb4a5
Yes, I specifically wanted a CUDA or OpenCL interface to transfer the data to N devices. It seems a shame the API doesn't support this yet.
Answer 1:
The PCI-e SIG ratified a scheme for switch-level multicast over PCI-e about 5 years ago, and I believe it is fully described in the PCI-e 3.0 standard. However, I don't believe any of the GPU/accelerator vendors support multicast yet, and there certainly isn't any CUDA-level API support for such a feature as of CUDA 5.5.
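Without a broadcast call, the fallback is the N separate transfers the question mentions. Here is a minimal sketch of that, using only standard CUDA runtime calls. The helper name `copy_to_all_gpus` and its return convention are made up for illustration and are not part of CUDA, and error handling is kept deliberately thin.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a CUDA API): issue one host-to-device copy per GPU.
// On success, dev_ptrs[d] holds a device buffer on GPU d containing the data.
// Returns the number of GPUs that received the data, or -1 on any CUDA error.
// Sketch only: buffers and streams are not cleaned up on the error paths.
int copy_to_all_gpus(const void* host_src, size_t bytes,
                     std::vector<void*>& dev_ptrs) {
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess) return -1;

    dev_ptrs.assign(n, nullptr);
    std::vector<cudaStream_t> streams(n);

    // Queue one asynchronous copy per device, each on its own stream.
    for (int d = 0; d < n; ++d) {
        if (cudaSetDevice(d) != cudaSuccess) return -1;
        if (cudaStreamCreate(&streams[d]) != cudaSuccess) return -1;
        if (cudaMalloc(&dev_ptrs[d], bytes) != cudaSuccess) return -1;
        if (cudaMemcpyAsync(dev_ptrs[d], host_src, bytes,
                            cudaMemcpyHostToDevice, streams[d]) != cudaSuccess)
            return -1;
    }

    // Wait for every copy to finish, then release the streams.
    for (int d = 0; d < n; ++d) {
        if (cudaSetDevice(d) != cudaSuccess) return -1;
        if (cudaStreamSynchronize(streams[d]) != cudaSuccess) return -1;
        cudaStreamDestroy(streams[d]);
    }
    return n;
}
```

For the copies to be truly asynchronous, the host buffer should be page-locked (for example, allocated with `cudaMallocHost`). Even then, each copy is still its own transfer from the host, so this does not give you the bandwidth saving that switch-level multicast would.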
Source: https://stackoverflow.com/questions/19795486/sending-the-same-data-to-n-gpus