Can eBPF modify the return value or parameters of a syscall?

╄→гoц情女王★ 提交于 2019-11-30 21:33:12

I believe that attaching eBPF to kprobes/kretprobes gives you read access to function arguments and return values, but that you cannot tamper with them. I am NOT 100% sure; good places to ask for confirmation would be the IO Visor project mailing list or IRC channel (#iovisor at irc.oftc.net).

As an alternative solution, I know you can at least change the return value of a syscall with strace, with the -e option. Quoting the manual page:

-e inject=set[:error=errno|:retval=value][:signal=sig][:when=expr]
       Perform syscall tampering for the specified set of syscalls.

Also, there was a presentation on this, and fault injection, at Fosdem 2017, if it is of any interest to you. Here is one example command from the slides:

strace -P precious.txt -efault=unlink:retval=0 unlink precious.txt

Edit: As stated by Ben, eBPF on kprobes and tracepoints is definitively read only, for tracing and monitoring use cases. I also got confirmation about this on IRC.

Ben Walther

Within kernel probes (kprobes), the eBPF virtual machine has read-only access to the syscall parameters and return value.

However the eBPF program will have a return code of it's own. It is possible to apply a seccomp profile that traps BPF (NOT eBPF; thanks @qeole) return codes and interrupt the system call during execution.

The allowed runtime modifications are:

  • SECCOMP_RET_KILL: Immediate kill with SIGSYS
  • SECCOMP_RET_TRAP: Send a catchable SIGSYS, giving a chance to emulate the syscall
  • SECCOMP_RET_ERRNO: Force errno value
  • SECCOMP_RET_TRACE: Yield decision to ptracer or set errno to -ENOSYS
  • SECCOMP_RET_ALLOW: Allow

https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt

The SECCOMP_RET_TRACE method enables modifying the system call performed, arguments, or return value. This is architecture dependent and modification of mandatory external references may cause an ENOSYS error.

It does so by passing execution up to a waiting userspace ptrace, which has the ability to modify the traced process memory, registers, and file descriptors.

The tracer needs to call ptrace and then waitpid. An example:

ptrace(PTRACE_SETOPTIONS, tracee_pid, 0, PTRACE_O_TRACESECCOMP);
waitpid(tracee_pid, &status, 0);

http://man7.org/linux/man-pages/man2/ptrace.2.html

When waitpid returns, depending on the contents of status, one can retrieve the seccomp return value using the PTRACE_GETEVENTMSG ptrace operation. This will retrieve the seccomp SECCOMP_RET_DATA value, which is a 16-bit field set by the BPF program. Example:

ptrace(PTRACE_GETEVENTMSG, tracee_pid, 0, &data);

Syscall arguments can be modified in memory before continuing operation. You can perform a single syscall entry or exit with the PTRACE_SYSCALL step. Syscall return values can be modified in userspace before resuming execution; the underlying program won't be able to see that the syscall return values have been modified.

An example implementation: Filter and Modify System Calls with seccomp and ptrace

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!