In C/C++ under Linux, I need to allocate a large (several gigabyte) block of memory in order to store real-time data from a sensor connected to the Ethernet port, streaming in at roughly 110 MB/s.
Well, under Linux you can use mlock()/mlockall() to keep an address range in physical memory and prevent it from being swapped out. The process using mlock needs a couple of privileges to do so; "man mlock" has the details. I am not sure about the maximum mlock'able block (it might differ from what appears to be free), so a binary search could help: lock a range, and if that fails, reduce the size of the area and try again.
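A minimal sketch of that shrink-and-retry approach (the 4 GiB request is just an example; mlock of large ranges usually needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Try to allocate and mlock() a buffer, halving the request until it fits.
 * Returns the locked buffer and stores the final size in *actual. */
static void *alloc_locked(size_t requested, size_t *actual)
{
    size_t size = requested;
    while (size > 0) {
        void *buf = malloc(size);
        if (buf && mlock(buf, size) == 0) {
            *actual = size;
            return buf;            /* locked into physical RAM */
        }
        free(buf);
        size /= 2;                 /* shrink and retry */
    }
    *actual = 0;
    return NULL;
}

int main(void)
{
    size_t got = 0;
    void *buf = alloc_locked((size_t)4 << 30, &got);   /* ask for 4 GiB */
    if (!buf) {
        perror("mlock");
        return 1;
    }
    printf("locked %zu bytes\n", got);
    /* ... stream sensor data into buf ... */
    munlock(buf, got);
    free(buf);
    return 0;
}
```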
On the other hand, 110 MB/s is not really a problem for a solid-state drive. A 60 GB SSD with a 280 MB/s write speed costs about $200 at any corner shop. Just copy the sensor data into a small write buffer and stream that to the SSD, as in the sketch below.
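A rough sketch of that buffering approach; standard input stands in for the sensor socket, and the output path on the SSD is only an assumption:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (4 << 20)   /* 4 MiB write buffer; tune to taste */

int main(void)
{
    char *buf = malloc(CHUNK);
    /* /mnt/ssd/capture.bin is a placeholder path on the SSD. */
    int fd = open("/mnt/ssd/capture.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (!buf || fd < 0) {
        perror("setup");
        return 1;
    }
    for (;;) {
        /* stdin stands in for however the sensor data actually arrives */
        ssize_t n = read(STDIN_FILENO, buf, CHUNK);
        if (n <= 0)
            break;
        if (write(fd, buf, (size_t)n) != n) {   /* stream straight to the SSD */
            perror("write");
            break;
        }
    }
    close(fd);
    free(buf);
    return 0;
}
```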
If you malloc the needed amount of memory and write to it at that speed, you'll still get a performance hit due to all the page faults (i.e. mapping each page of virtual memory to physical memory, which may also include swapping out memory of other processes). To avoid that, you could memset the entire allocated buffer to 0 before you start reading from the sensor, so that all the needed virtual memory is mapped to physical memory.
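A minimal sketch of that pre-faulting step (the 4 GiB figure is an assumption):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t size = (size_t)4 << 30;        /* e.g. 4 GiB; adjust to your needs */
    char *buf = malloc(size);
    if (!buf) {
        perror("malloc");
        return 1;
    }
    /* Touch every page up front so the page faults happen now,
     * not while the sensor is streaming at full rate. */
    memset(buf, 0, size);

    /* ... start reading from the sensor into buf ... */
    free(buf);
    return 0;
}
```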
If you only use the available physical memory, you should see no swapping at all. Using more would cause memory of other processes to be swapped to disk; if those processes are idle, that shouldn't pose any problem. If they're active (i.e. using their memory once in a while), some swapping will occur, probably at a much lower rate than the hard drive's bandwidth. The more memory you use, the more active processes' memory gets swapped out and the more disk activity occurs; at that point, the maximum amount of memory you can use with decent performance is pretty much a matter of trial and error.
By using more than the physical memory available, you'll definitely cause swapping at the rate of memory writes, and there's no way to avoid that.
If the computer system is dedicated to receiving data from your sensor, you can simply disable swap. Then allocate as big a buffer as you can, leaving only enough memory in the system for essential tools.
What is the best way to determine how much memory to allocate?
Due to how virtual memory is used, and to non-swappable kernel memory, it is nearly impossible to determine how much of the installed memory an application can actually use.
The best I can come up with is to allow the user to configure how much memory to use for buffering, as sketched below.
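A small sketch of that compromise: take a user-supplied size, otherwise fall back to a fraction of MemAvailable from /proc/meminfo (present on newer kernels; the 80% fraction is an arbitrary assumption):

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse MemAvailable (kB) from /proc/meminfo; returns 0 on failure. */
static size_t mem_available_bytes(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    size_t kb = 0;

    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "MemAvailable: %zu kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb * 1024;
}

int main(int argc, char **argv)
{
    /* Prefer an explicit size from the user (in MiB); otherwise take
     * 80% of MemAvailable as a rough, configurable default. */
    size_t size = (argc > 1) ? (size_t)atoll(argv[1]) << 20
                             : mem_available_bytes() / 10 * 8;
    printf("buffer size: %zu bytes\n", size);
    return 0;
}
```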
Am I limited to just allocating a slightly smaller block than the reported free memory,
Unfortunately, the reported free memory is not really "free physical memory."
or can I interface more directly with the linux virtual memory manager?
That can be done with a custom device driver, allocating memory directly in kernel space and providing access to it via mmap(). It's generally not recommended, yet it works in specialized cases such as yours.
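The kernel-side driver is out of scope here, but the user-space half would look roughly like this, with /dev/sensorbuf as a hypothetical device node exposing the kernel-allocated buffer:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t size = (size_t)4 << 30;               /* must match the driver's buffer */
    int fd = open("/dev/sensorbuf", O_RDWR);     /* hypothetical custom device */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Map the kernel-allocated buffer into this process's address space. */
    void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* ... use buf ... */
    munmap(buf, size);
    close(fd);
    return 0;
}
```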
However, I also need to make sure that there will be no disk-swapping
At the pace of Linux kernel development, knowledge becomes obsolete quite fast, so take what I'm saying here with a grain of salt. You can try to play with the following:
SysV shared memory. It is generally not swapped. See man shmget.
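A sketch of that route (SHM_LOCK is added here only as an extra safeguard against swapping and may need elevated privileges; on older kernels the segment size is also capped by the kernel.shmmax sysctl, which may need raising for multi-gigabyte segments):

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    size_t size = (size_t)4 << 30;                       /* e.g. 4 GiB */
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0) {
        perror("shmget");
        return 1;
    }
    void *buf = shmat(id, NULL, 0);                      /* attach the segment */
    if (buf == (void *)-1) {
        perror("shmat");
        return 1;
    }
    shmctl(id, SHM_LOCK, NULL);        /* optionally lock it into RAM as well */

    /* ... stream sensor data into buf ... */

    shmdt(buf);
    shmctl(id, IPC_RMID, NULL);        /* mark for removal once detached */
    return 0;
}
```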
tmpfs - an in-memory file system. The memory was pinned to RAM at least in early 2.6 kernels and thus was not swappable. To use it as memory, create a file on tmpfs, write() something into the file (to force the memory to actually be allocated), and then mmap() the file.
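A sketch of that tmpfs variant, assuming /dev/shm is a tmpfs mount (as it usually is) and a 4 GiB buffer; the file name is a placeholder:

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t size = (size_t)4 << 30;                        /* e.g. 4 GiB */
    int fd = open("/dev/shm/sensorbuf", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* write() zeros into the file so the tmpfs pages are actually
     * allocated, rather than merely extending the file with ftruncate(). */
    char *zeros = calloc(1, 1 << 20);
    for (size_t written = 0; written < size; written += 1 << 20)
        if (write(fd, zeros, 1 << 20) != (ssize_t)(1 << 20)) {
            perror("write");
            return 1;
        }
    free(zeros);

    void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* ... use buf as ordinary memory ... */
    munmap(buf, size);
    close(fd);
    return 0;
}
```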