You can get quite far with Linux by removing the 'disturbance' that other processes cause to the realtime process. I played with the same thing on Windows, which is a much bigger horror to get right, but it shows the direction. So, a kind of checklist:
- Most important (strange but true): the hardware. Don't go for a laptop: it will be optimized to do strange things during SMM interrupts, and there is nothing you can do about that.
- The drivers: Linux (and Windows) has good drivers and bad drivers, and which ones you get depends on the hardware. There is only one way to find out: benchmarking.
Isolate from rest of system, disable all sharing:
- Isolate one CPU (man cpuset). Create two CPU sets, one for normal processes, and one for your realtime process.
- Reduce the realtime part of your code to the minimum. Communicate with the other parts of the system through a large buffer. Reduce IO to the bare minimum (since IO has bad guarantees).
- Make the process have the highest (soft) realtime priority.
- Disable HyperThreading (you don't want to share)
- pre-allocate the memory you need, and mlock() the memory.
- Isolate the devices you use. Start by allocating a dedicated IRQ to the device (move the other devices to another IRQ, or remove other devices/drivers).
- Isolate the IO you use.
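A few of the steps above (pinning to the isolated CPU, realtime priority, locking memory) can be done from inside the process itself. This is only a rough sketch, in Python for brevity; a real realtime process would more likely make the same system calls from C. Assumptions: the function names are mine, the MCL_* constants are the x86/ARM Linux values, and I use mlockall() to lock everything rather than mlock() on individual buffers. The scheduler and memory steps normally need root or the matching capabilities, so the sketch reports failures instead of crashing.

```python
import ctypes
import os

# Assumption: these are the <sys/mman.h> values on x86 and ARM Linux;
# a few other architectures use different numbers.
MCL_CURRENT = 1
MCL_FUTURE = 2


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU."""
    os.sched_setaffinity(0, {cpu})


def make_realtime():
    """Switch to SCHED_FIFO at the highest priority the kernel offers."""
    priority = os.sched_get_priority_max(os.SCHED_FIFO)
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))


def lock_memory():
    """Lock current and future pages into RAM so they cannot be paged out."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def setup_realtime(cpu):
    """Try each step and report which ones the system allowed."""
    steps = (
        ("affinity", lambda: pin_to_cpu(cpu)),
        ("scheduler", make_realtime),
        ("mlockall", lock_memory),
    )
    results = {}
    for name, step in steps:
        try:
            step()
            results[name] = "ok"
        except OSError as exc:
            results[name] = f"failed: {exc}"
    return results
```

You would call something like setup_realtime(3) at startup, where CPU 3 is just a placeholder for whichever CPU you isolated. Note that the affinity call only keeps your process on that CPU; keeping everything else off it is still the job of the cpuset.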
Reduce activity of rest of system:
- only start processes you really really need.
- remove hardware you don't need like disks and other hardware.
- disable swapping.
- don't use Linux kernel modules or load them up front. The init of modules is unpredictable.
- preferably remove the user also :)
Make it stable and reproducible:
- disable all energy savings. You want the same performance all of the time.
- review all BIOS settings, and remove all 'eventing' and 'sharing' from them. So no fancy SpeedStep, thermal management etc. Choose low latency, and don't choose things with 'burst' in the name, since those generally buy more throughput at the cost of worse latency.
- review Linux driver settings, and lower latencies (if applicable).
- use a recent kernel, since each release behaves a bit more like a realtime kernel.
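One energy-saving setting you can check from software is the CPU frequency governor. A small sketch, assuming the usual cpufreq layout under /sys/devices/system/cpu (the function name is made up, and machines without cpufreq support simply have no such files, in which case nothing is reported):

```python
import glob
import os


def non_performance_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU not using 'performance'."""
    offenders = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        # Path looks like .../cpu3/cpufreq/scaling_governor
        cpu = path.split(os.sep)[-3]
        with open(path) as f:
            governor = f.read().strip()
        if governor != "performance":
            offenders[cpu] = governor
    return offenders
```

An empty result means every CPU that exposes a governor is set to 'performance'. It says nothing about the BIOS settings, which you still have to review by hand.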
And then benchmark: stress-test the machine and leave it running for days while recording the maximum latencies.
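To give an idea of what 'recording max. latencies' means, here is a minimal sketch of a wake-up latency measurement: sleep until a periodic deadline and record how late each wake-up was. The function name and the default period are my own choices, and Python adds its own jitter on top, so treat the numbers as a rough indication only; a dedicated tool such as cyclictest measures the same thing far more precisely.

```python
import time


def measure_wakeup_latency(period_ns=1_000_000, iterations=1000):
    """Sleep until periodic deadlines and return lateness statistics in ns."""
    max_late = 0
    total_late = 0
    deadline = time.monotonic_ns() + period_ns
    for _ in range(iterations):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # How far past the intended wake-up time did we actually wake up?
        late = time.monotonic_ns() - deadline
        max_late = max(max_late, late)
        total_late += late
        # Absolute deadlines, so lateness does not accumulate as drift.
        deadline += period_ns
    return {"max_ns": max_late, "avg_ns": total_late / iterations}
```

Run it in a loop for days while something else stresses the machine, and keep the largest max_ns you have ever seen. That worst case is the number that matters, not the average.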
So: good luck :)