问题
For example, to monitor all mkdir
calls made, the best I could come up with was:
#!/bin/sh
set -eux
d=debug/tracing
mkdir -p debug
if ! mountpoint -q debug; then
mount -t debugfs nodev debug
fi
# Stop tracing.
echo 0 > "${d}/tracing_on"
# Clear previous traces.
echo > "${d}/trace"
# Enable tracing mkdir
echo sys_enter_mkdir > "${d}/set_event"
# Set tracer type.
echo function > "${d}/current_tracer"
# Filter only sys_mkdir as a workaround.
echo SyS_mkdir > "${d}/set_ftrace_filter"
# Start tracing.
echo 1 > "${d}/tracing_on"
# Generate two mkdir calls.
rm -rf /tmp/a
rm -rf /tmp/b
mkdir /tmp/a
mkdir /tmp/b
# View the trace.
cat "${d}/trace"
# Stop tracing.
echo 0 > "${d}/tracing_on"
umount debug
And then after running with sudo
it gives:
# tracer: function
#
# entries-in-buffer/entries-written: 4/4 #P:16
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
mkdir-31254 [015] .... 2010985.576760: sys_mkdir(pathname: 7ffc54b32c77, mode: 1ff)
mkdir-31254 [015] .... 2010985.576763: SyS_mkdir <-tracesys_phase2
mkdir-31255 [007] .... 2010985.578363: sys_mkdir(pathname: 7fff02d90c77, mode: 1ff)
mkdir-31255 [007] .... 2010985.578365: SyS_mkdir <-tracesys_phase2
My problem with this is that it output two lines for each syscall:
sys_mkdir
which is the event that I wantSyS_mkdir
which is the filtered function workaround, which I don't want to see
If I instead try to do:
echo > "${d}/set_ftrace_filter"
or don't touch that file at at all, then it shows a ton of functions and makes it hard to fint the syscall at all.
If there a nicer way to disable regular functions, and keep just syscall events?
I could use just SyS_mkdir
and disable the syscall event I guess, but it feels cleaner if I could use the more specific event? Also:
- the event shows arguments, which is nicer.
- syscall function names change across kernel versions. E.g., it is already
__x64_sys_mkdir
instead ofSyS_mkdir
on Linux v4.18.
Related:
- How to trace the write system call in the Linux kernel?
Tested on Ubuntu 18.04, Linux kernel 4.15.
回答1:
Use the nop
tracer
As proposed by sruffell, all we have to do is to use the nop
tracer instead of function
, and that will disable the function traces, but not the events.
Run with sudo
:
#!/bin/sh
set -eux
d=debug/tracing
mkdir -p debug
if ! mountpoint -q debug; then
mount -t debugfs nodev debug
fi
# Stop tracing.
echo 0 > "${d}/tracing_on"
# Clear previous traces.
echo > "${d}/trace"
# Find the tracer name.
cat "${d}/available_tracers"
# Disable tracing functions, show only system call events.
echo nop > "${d}/current_tracer"
# Find the event name with.
grep mkdir "${d}/available_events"
# Enable tracing mkdir.
# Both statements below seem to do the exact same thing,
# just with different interfaces.
# https://www.kernel.org/doc/html/v4.18/trace/events.html
echo sys_enter_mkdir > "${d}/set_event"
# echo 1 > "${d}/events/syscalls/sys_enter_mkdir/enable"
# Start tracing.
echo 1 > "${d}/tracing_on"
# Generate two mkdir calls by two different processes.
rm -rf /tmp/a /tmp/b
mkdir /tmp/a
mkdir /tmp/b
# View the trace.
cat "${d}/trace"
# Stop tracing.
echo 0 > "${d}/tracing_on"
umount debug
This gives the desired output:
mkdir-26064 [007] .... 2014370.909743: sys_mkdir(pathname: 7fffbd461c77, mode: 1ff)
mkdir-26065 [014] .... 2014370.911615: sys_mkdir(pathname: 7ffea53bac77, mode: 1ff)
Alternative not best solution
This also works, but definitely less nice, replace:
echo SyS_mkdir > "${d}/set_ftrace_filter"
with:
echo '*' > "${d}/set_ftrace_notrace"
This turns off all functions as mentioned in the docs: https://www.kernel.org/doc/html/v4.18/trace/ftrace.html#the-file-system
set_ftrace_notrace:
This has an effect opposite to that of set_ftrace_filter. Any function that is added here will not be traced. If a function exists in both set_ftrace_filter and set_ftrace_notrace, the function will not be traced.
回答2:
In addition, it's worth mention another concise way to gain such info. One can do something like:
stap -e 'probe syscall.mkdir { printf("%s[%d] -> %s(%s)\n", execname(), pid(), name, argstr) }'
The output:
systemd-journal[318] -> mkdir("/var/log/journal/c8d2562a041649cdbfd1ac5e24dbe0db", 0755)
systemd-journal[318] -> mkdir("/var/log/journal/c8d2562a041649cdbfd1ac5e24dbe0db", 0755)
mkdir[4870] -> mkdir("wtf", 0777)
...
Another way:
stap -e 'probe kernel.function("sys_mkdir") { printf("%s[%d] (%s)\n", execname(), pid(), $$parms) }'
The output:
systemd-journal[318] (pathname=0x55b74f7ab8b0 mode=0x1ed)
systemd-journal[318] (pathname=0x55b74f7ab8b0 mode=0x1ed)
mkdir[8532] (pathname=0x7ffcf30af761 mode=0x1ff)
...
You can customize the output as you like.
P.S. Systemtap is based on kprobes. Architecture doc will help to understand its internals.
More about the SystemTap.
来源:https://stackoverflow.com/questions/52764544/how-to-trace-just-system-call-events-with-ftrace-without-showing-any-other-funct