How to offload particular threads of a single app to particular Xeon Phi cores?
Question: Suppose I have a single C/C++ app running on the host. There are a few threads running on the host CPU and 50 threads running on the Xeon Phi cores. How can I make sure that each of these 50 threads runs on its own Xeon Phi core and is never evicted from that core's cache (given that the code is small enough)? Could someone please outline a very general idea of how to do this, and which tool/API would be most suitable for C/C++ code? What is the fastest way to exchange data between the host thread-aggregator