I want to parse /proc/net/tcp/
, but is it safe?
How should I open and read files from /proc/
and not be afraid, that some other process (o
When you read from a /proc file, the kernel is calling a function which has been registered in advance to be the "read" function for that proc file. See the __proc_file_read
function in fs/proc/generic.c .
Therefore, the safety of the proc read is only as safe as the function the kernel calls to satisfy the read request. If that function properly locks all data it touches and returns to you in a buffer, then it is completely safe to read using that function. Since proc files like the one used for satisfying read requests to /proc/net/tcp have been around for a while and have undergone scrupulous review, they are about as safe as you could ask for. In fact, many common Linux utilities rely on reading from the proc filesystem and formatting the output in a different way. (Off the top of my head, I think 'ps' and 'netstat' do this).
As always, you don't have to take my word for it; you can look at the source to calm your fears. The following documentation from proc_net_tcp.txt tells you where the "read" functions for /proc/net/tcp live, so you can look at the actual code that is run when you read from that proc file and verify for yourself that there are no locking hazards.
This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
Note that these interfaces are deprecated in favor of tcp_diag. These /proc interfaces provide information about currently active TCP connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.
Although the files in /proc
appear as regular files in userspace, they are not really files but rather entities that support the standard file operations from userspace (open
, read
, close
). Note that this is quite different than having an ordinary file on disk that is being changed by the kernel.
All the kernel does is print its internal state into its own memory using a sprintf
-like function, and that memory is copied into userspace whenever you issue a read(2)
system call.
The kernel handles these calls in an entirely different way than for regular files, which could mean that the entire snapshot of the data you will read could be ready at the time you open(2)
it, while the kernel makes sure that concurrent calls are consistent and atomic. I haven't read that anywhere, but it doesn't really make sense to be otherwise.
My advice is to take a look at the implementation of a proc file in your particular Unix flavour. This is really an implementation issue (as is the format and the contents of the output) that is not governed by a standard.
The simplest example would be the implementation of the uptime proc file in Linux. Note how the entire buffer is produced in the callback function supplied to single_open
.
/proc is a virtual file system : in fact, it just gives a convenient view of the kernel internals. It's definitely safe to read it (that's why it's here) but it's risky on the long term, as the internal of these virtual files may evolve with newer version of kernel.
EDIT
More information available in proc documentation in Linux kernel doc, chapter 1.4 Networking I can't find if the information how the information evolve over time. I thought it was frozen on open, but can't have a definite answer.
EDIT2
According to Sco doc (not linux, but I'm pretty sure all flavours of *nix behave like that)
Although process state and consequently the contents of /proc files can change from instant to instant, a single read(2) of a /proc file is guaranteed to return a ``sane'' representation of state, that is, the read will be an atomic snapshot of the state of the process. No such guarantee applies to successive reads applied to a /proc file for a running process. In addition, atomicity is specifically not guaranteed for any I/O applied to the as (address-space) file; the contents of any process's address space might be concurrently modified by an LWP of that process or any other process in the system.
I have the source for Linux 2.6.27.8 handy since I'm doing driver development at the moment on an embedded ARM target.
The file ...linux-2.6.27.8-lpc32xx/net/ipv4/raw.c
at line 934 contains, for example
seq_printf(seq, "%4d: %08X:%04X %08X:%04X"
" %02X %08X:%08X %02X:%08lX %08X %5d %8d %lu %d %p %d\n",
i, src, srcp, dest, destp, sp->sk_state,
atomic_read(&sp->sk_wmem_alloc),
atomic_read(&sp->sk_rmem_alloc),
0, 0L, 0, sock_i_uid(sp), 0, sock_i_ino(sp),
atomic_read(&sp->sk_refcnt), sp, atomic_read(&sp->sk_drops));
which outputs
[wally@zenetfedora ~]$ cat /proc/net/tcp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 017AA8C0:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 15160 1 f552de00 299
1: 00000000:C775 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 13237 1 f552ca00 299
...
in function raw_sock_seq_show()
which is part of a hierarchy of procfs handling functions. The text is not generated until a read()
request is made of the /proc/net/tcp
file, a reasonable mechanism since procfs reads are surely much less common than updating the information.
Some drivers (such as mine) implement the proc_read function with a single sprintf()
. The extra complication in the core drivers implementation is to handle potentially very long output which may not fit in the intermediate, kernel-space buffer during a single read.
I tested that with a program using a 64K read buffer but it results in a kernel space buffer of 3072 bytes in my system for proc_read to return data. Multiple calls with advancing pointers are needed to get more than that much text returned. I don't know what the right way to make the returned data consistent when more than one i/o is needed. Certainly each entry in /proc/net/tcp
is self-consistent. There is some likelihood that lines side-by-side are snapshot at different times.
In general, no. (So most of the answers here are wrong.) It might be safe, depending on what property you want. But it's easy to end up with bugs in your code if you assume too much about the consistency of a file in /proc
. For example, see this bug which came from assuming that /proc/mounts was a consistent snapshot.
For example:
/proc/uptime
is totally atomic, as someone mentioned in another answer -- but only since Linux 2.6.30, which is less than two years old. So even this tiny, trivial file was subject to a race condition until then, and still is in most enterprise kernels. See fs/proc/uptime.c for the current source, or the commit that made it atomic. On a pre-2.6.30 kernel, you can open
the file, read
a bit of it, then if you later come back and read
again, the piece you get will be inconsistent with the first piece. (I just demonstrated this -- try it yourself for fun.)
/proc/mounts
is atomic within a single read
system call. So if you read
the whole file all at once, you get a single consistent snapshot of the mount points on the system. However, if you use several read
system calls -- and if the file is big, this is exactly what will happen if you use normal I/O libraries and don't pay special attention to this issue -- you will be subject to a race condition. Not only will you not get a consistent snapshot, but mount points which were present before you started and never stopped being present might go missing in what you see. To see that it's atomic for one read()
, look at m_start() in fs/namespace.c and see it grab a semaphore that guards the list of mountpoints, which it keeps until m_stop()
, which is called when the read()
is done. To see what can go wrong, see this bug from last year (same one I linked above) in otherwise high-quality software that blithely read /proc/mounts
.
/proc/net/tcp
, which is the one you're actually asking about, is even less consistent than that. It's atomic only within each row of the table. To see this, look at listening_get_next() in net/ipv4/tcp_ipv4.c and established_get_next()
just below in the same file, and see the locks they take out on each entry in turn. I don't have repro code handy to demonstrate the lack of consistency from row to row, but there are no locks there (or anything else) that would make it consistent. Which makes sense if you think about it -- networking is often a super-busy part of the system, so it's not worth the overhead to present a consistent view in this diagnostic tool.
The other piece that keeps /proc/net/tcp
atomic within each row is the buffering in seq_read()
, which you can read in fs/seq_file.c. This ensures that once you read()
part of one row, the text of the whole row is kept in a buffer so that the next read()
will get the rest of that row before starting a new one. The same mechanism is used in /proc/mounts
to keep each row atomic even if you do multiple read()
calls, and it's also the mechanism that /proc/uptime
in newer kernels uses to stay atomic. That mechanism does not buffer the whole file, because the kernel is cautious about memory use.
Most files in /proc
will be at least as consistent as /proc/net/tcp
, with each row a consistent picture of one entry in whatever information they're providing, because most of them use the same seq_file
abstraction. As the /proc/uptime
example illustrates, though, some files were still being migrated to use seq_file
as recently as 2009; I bet there are still some that use older mechanisms and don't have even that level of atomicity. These caveats are rarely documented. For a given file, your only guarantee is to read the source.
In the case of /proc/net/tcp
, you can read it and parse each line without fear. But if you try to draw any conclusions from multiple lines at once -- beware, other processes and the kernel are changing it while you read it, and you are probably creating a bug.
Short of unknown bugs, there are no race conditions in /proc
that would lead to reading corrupted data or a mix of old and new data. In this sense, it's safe. However there's still the race condition that much of the data you read from /proc
is potentially-outdated as soon as it's generated, and even moreso by the time you get to reading/processing it. For instance processes can die at any time and a new process can be assigned the same pid; the only process ids you can ever use without race conditions are your own child processes'. Same goes for network information (open ports, etc.) and really most of the information in /proc
. I would consider it bad and dangerous practice to rely on any data in /proc
being accurate, except data about your own process and potentially its child processes. Of course it may still be useful to present other information from /proc
to the user/admin for informative/logging/etc. purposes.