Question
I'm having a lot of trouble sending netlink messages from a kernel module to a userspace daemon. They fail at random. On the kernel side, genlmsg_unicast fails with EAGAIN, while on the user side, nl_recvmsgs_default (a function from libnl) fails with NLE_NOMEM, which is caused by the recvmsg syscall failing with ENOBUFS.
The netlink messages are small; the maximum payload size is about 300 bytes.
Here is the code for sending a message from the kernel:
int send_to_daemon(void *msg, int len, int command, int seq, u32 pid)
{
    struct sk_buff *skb;
    void *msg_head;
    int res, payload;

    payload = GENL_HDRLEN + nla_total_size(len) + 36;
    skb = genlmsg_new(payload, GFP_KERNEL);
    if (!skb)
        return -ENOMEM;
    msg_head = genlmsg_put(skb, pid, seq, &psvfs_gnl_family, 0, command);
    if (!msg_head)
        goto fail;
    if (nla_put(skb, PSVFS_A_MSG, len, msg))
        goto fail;
    genlmsg_end(skb, msg_head);
    /* genlmsg_unicast() consumes skb, even on failure */
    res = genlmsg_unicast(&init_net, skb, pid);
    return res;
fail:
    nlmsg_free(skb);
    return -EMSGSIZE;
}
I absolutely have no idea why this is happening and my project just won't work because of that! I really hope someone could help me with that.
Answer 1:
I wonder if you are running on a 64-bit machine. If so, I suspect that using an int as the type of payload may be the root of some issues, since genlmsg_new() expects a size_t, which is 64 bits wide on x86_64.
Secondly, I don't think you need to add GENL_HDRLEN to payload, as genlmsg_new() already takes care of it (it uses genlmsg_total_size(), which calls genlmsg_msg_size(), which finally does the addition). And why the + 36, by the way? It looks neither portable nor explicit about what it is there for.
Hard to tell more without having a look at the rest of the code.
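To make the size arithmetic concrete, here is a rough userspace sketch that mirrors what I understand the kernel's inline size helpers to compute, using only the constants exported in the uapi headers. The sketch_* function names are made up for illustration and are not kernel API, and this only covers the netlink message size, not whatever extra bookkeeping the skb allocation adds on top.

```c
#include <linux/netlink.h>    /* NLMSG_ALIGN, NLMSG_HDRLEN, NLA_ALIGN, NLA_HDRLEN */
#include <linux/genetlink.h>  /* GENL_HDRLEN */

/* Hypothetical mirror of nla_total_size(): attribute header + payload, padded. */
int sketch_nla_total_size(int payload)
{
    return NLA_ALIGN(NLA_HDRLEN + payload);
}

/* Hypothetical mirror of genlmsg_total_size(): genetlink header + payload, padded. */
int sketch_genlmsg_total_size(int payload)
{
    return NLMSG_ALIGN(GENL_HDRLEN + payload);
}

/* Hypothetical mirror of nlmsg_total_size(): netlink header + payload, padded. */
int sketch_nlmsg_total_size(int payload)
{
    return NLMSG_ALIGN(NLMSG_HDRLEN + payload);
}

/* Message bytes that genlmsg_new(payload, ...) ends up requesting
 * (assumption: it is nlmsg_new(genlmsg_total_size(payload), ...)). */
int sketch_genlmsg_new_size(int payload)
{
    return sketch_nlmsg_total_size(sketch_genlmsg_total_size(payload));
}
```

For a 300-byte attribute, sketch_nla_total_size(300) is 304, and passing just that to genlmsg_new() would request 324 bytes. The formula from the question (GENL_HDRLEN + 304 + 36 = 344) requests 364 bytes instead, so it over-allocates by 40 bytes rather than under-allocating.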
Answer 2:
I was having a similar problem receiving ENOBUFS via recvmsg from a netlink socket. I found that my problem was the kernel socket buffer filling before userspace could drain it.
From the netlink(7) man page:
However, reliable transmissions from kernel to user are impossible in
any case. The kernel can't send a netlink message if the socket
buffer is full: the message will be dropped and the kernel and the
user-space process will no longer have the same view of kernel state.
It is up to the application to detect when this happens (via the
ENOBUFS error returned by recvmsg(2)) and resynchronize.
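As a minimal sketch of the detection step the man page describes, assuming you read the socket with plain recv() rather than through libnl: the wrapper below separates ENOBUFS (messages were dropped, so you need to resynchronize) from other errors. The names recv_checked and rx_status are hypothetical, and what "resynchronize" means is up to your protocol.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes for the wrapper below. */
enum rx_status { RX_OK, RX_OVERRUN, RX_ERROR };

/* Receive one datagram from fd into buf (capacity cap).
 * On success, stores the byte count in *out_len and returns RX_OK.
 * Returns RX_OVERRUN when recv() fails with ENOBUFS, which on a netlink
 * socket means the kernel dropped at least one message because the
 * receive buffer was full. Any other failure returns RX_ERROR. */
enum rx_status recv_checked(int fd, void *buf, size_t cap, ssize_t *out_len)
{
    ssize_t n;

    /* Retry if a signal interrupted the call. */
    do {
        n = recv(fd, buf, cap, 0);
    } while (n < 0 && errno == EINTR);

    if (n >= 0) {
        *out_len = n;
        return RX_OK;
    }

    *out_len = 0;
    return errno == ENOBUFS ? RX_OVERRUN : RX_ERROR;
}
```

The socket stays usable after an RX_OVERRUN, so the caller can ask the kernel side for a fresh copy of its state and then keep reading.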
I addressed this problem by increasing the size of the socket receive buffer (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, ...), or nl_socket_set_buffer_size() if you are using libnl).
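For the raw-socket case, a minimal sketch of enlarging the receive buffer could look like the following. The helper name grow_rcvbuf is hypothetical. It reads the value back because the kernel may adjust the request: on Linux the value is doubled for bookkeeping overhead and capped by net.core.rmem_max.

```c
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of `bytes` on fd.
 * Returns the effective size reported by the kernel, or -1 on error. */
int grow_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t optlen = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;

    /* Read the value back: it may differ from what was requested. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &optlen) < 0)
        return -1;

    return actual;
}
```

A call such as grow_rcvbuf(fd, 1 << 20) right after creating the netlink socket would be typical. A bigger buffer only makes overruns less likely, so the reader still has to handle ENOBUFS.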
Source: https://stackoverflow.com/questions/8478231/netlink-sending-from-kernel-to-user-eagain-and-enobufs