Question
I am working on a system that is as close to real-time as possible on Linux, and I need to send about 600-800 bytes in a TCP packet as soon as I receive a specific packet.
For the best possible latency, I want this packet to be sent directly from the kernel, instead of the received packet going all the way up to userspace and the application and then making its way back.
If I were on Windows, I would have written an NDIS filter in which I would cache the packet to be sent along with the matching parameters, so that it would check each received packet and, on a match, fire the pre-cached packet onto the network without passing the received packet up to the higher layers.
So my question is: what is the closest analogue of an NDIS filter on Linux?
I have read about netfilter and perhaps that is what I would use, but I do not know if it is the best way possible.
What else could I do to achieve lowest-possible latencies?
My current purely userspace code gives me about 80-100 microseconds on an Intel Xeon 3.7 GHz processor running Ubuntu 10.04 on a 2.6.3x kernel.
Answer 1:
You can use the iptables target NFLOG to copy packets out to userspace, or NFQUEUE to allow userspace to mangle them. This interaction happens over netlink, but you can use libraries such as libnetfilter_log and libnetfilter_queue, which wrap around it.
Answer 2:
There is a similar mechanism in the Linux kernel called BPF (Berkeley Packet Filter). Your application registers a BPF filter with the kernel, and packets matching the filter are captured and delivered to the descriptor the filter was registered on.
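Below is some example code I found on the internet (https://gist.github.com/939154). Basically, you have to open a descriptor, bind a BPF filter to it, and then select() on that descriptor to receive packets:
/* BIOCSETF and struct bpf_program come from the BSD-style <net/bpf.h> interface */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
#include <net/ethernet.h>
#include <netinet/in.h>

int set_filter(int fd)
{
    struct bpf_program fcode = {0};
    /* dump ssh packets only: IPv4, TCP, first fragment, src or dst port 22 */
    struct bpf_insn insns[] = {
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),                    /* load EtherType */
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),  /* not IPv4: reject */
        BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),                    /* load IP protocol */
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),    /* not TCP: reject */
        BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),                    /* load flags/fragment offset */
        BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),        /* non-first fragment: reject */
        BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),                   /* X = IP header length */
        BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),                    /* load TCP source port */
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 22, 2, 0),             /* source port 22: accept */
        BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),                    /* load TCP destination port */
        BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 22, 0, 1),             /* destination port 22: accept */
        BPF_STMT(BPF_RET+BPF_K, (u_int)-1),                    /* accept the whole packet */
        BPF_STMT(BPF_RET+BPF_K, 0),                            /* reject */
    };
    /* Set the filter */
    fcode.bf_len = sizeof(insns) / sizeof(struct bpf_insn);
    fcode.bf_insns = &insns[0];
    if (ioctl(fd, BIOCSETF, &fcode) < 0)
        return -1;
    return 0;
}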
The bpf_insn array looks terrible. However, you do not need to write it by hand; you can use tcpdump to generate it automatically.
For example:
sudo tcpdump -i eth0 -dd 'tcp[13]=18'
{ 0x28, 0, 0, 0x0000000c },
{ 0x15, 0, 8, 0x00000800 },
{ 0x30, 0, 0, 0x00000017 },
{ 0x15, 0, 6, 0x00000006 },
{ 0x28, 0, 0, 0x00000014 },
{ 0x45, 4, 0, 0x00001fff },
{ 0xb1, 0, 0, 0x0000000e },
{ 0x50, 0, 0, 0x0000001b },
{ 0x15, 0, 1, 0x00000012 },
{ 0x6, 0, 0, 0x00000060 },
{ 0x6, 0, 0, 0x00000000 },
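Each row of the -dd output is a { code, jt, jf, k } initializer, so it can be pasted straight into an instruction array. The expression tcp[13]=18 tests the TCP flags byte for 0x12, i.e. SYN and ACK set and nothing else. The sketch below assumes the Linux struct sock_filter type, which has the same four fields; the names synack_insns and make_synack_prog are hypothetical.

```c
#include <linux/filter.h>   /* struct sock_filter, struct sock_fprog */

/* Rows pasted verbatim from the tcpdump -dd output above. */
static struct sock_filter synack_insns[] = {
    { 0x28, 0, 0, 0x0000000c },  /* ldh  [12]           EtherType            */
    { 0x15, 0, 8, 0x00000800 },  /* jeq  #0x800         IPv4, else reject    */
    { 0x30, 0, 0, 0x00000017 },  /* ldb  [23]           IP protocol          */
    { 0x15, 0, 6, 0x00000006 },  /* jeq  #6             TCP, else reject     */
    { 0x28, 0, 0, 0x00000014 },  /* ldh  [20]           flags/frag offset    */
    { 0x45, 4, 0, 0x00001fff },  /* jset #0x1fff        later fragment: reject */
    { 0xb1, 0, 0, 0x0000000e },  /* ldxb 4*([14]&0xf)   X = IP header length */
    { 0x50, 0, 0, 0x0000001b },  /* ldb  [x + 27]       TCP flags byte       */
    { 0x15, 0, 1, 0x00000012 },  /* jeq  #0x12          SYN+ACK, else reject */
    { 0x6,  0, 0, 0x00000060 },  /* ret  #96            accept, up to 96 bytes */
    { 0x6,  0, 0, 0x00000000 },  /* ret  #0             reject               */
};

/* Hypothetical helper: wrap the array in the descriptor that
 * setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, ...) expects. */
static struct sock_fprog make_synack_prog(void)
{
    struct sock_fprog prog = {
        .len = sizeof(synack_insns) / sizeof(synack_insns[0]),
        .filter = synack_insns,
    };
    return prog;
}
```

The accept instruction returns 0x60 because tcpdump compiled the filter with a 96-byte snapshot length; running tcpdump with -s 0, or editing that value, changes how much of each matching packet is kept.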
Source: https://stackoverflow.com/questions/12829150/what-is-the-analogue-of-an-ndis-filter-in-linux