My understanding of system calls is that in Linux the system call mechanism (int 0x80
or whatever) is documented and guaranteed to be stable across different kernel
On IA-32 there are two ways to make a system call:
Pure int/iret based system call takes 211 CPU cycles (and even much more on modern processors). Sysenter/sysexit takes 46 CPU ticks. As you can see execution of only a pair of instructions used for system call introduces significant overhead. But any system call implementation involves some work on the kernel side (setup of kernel context, dispatching of the call and its arguments etc.). More or less realistic highly optimized system call will take ~250 and ~100 CPU cycles for int/iret and sysenter/sysexit based system calls respectively. In Linux and Windows it will take ~500 ticks.
In the same time, function call (based on call/ret) have a cost of 2-4 tics + 1 for each argument.
As you can see, overhead introduced by function call is negligible in comparision to the system call cost.
On other hand, if you embed raw system calls in your application, you will make it highly hardware dependent. For example, what if your application with embedded sysenter/sysexit based raw system call will be executed on old PC without these instructions support? In addition your application will be sensitive for system call call convention used by OS.
Such libraries like ntdll.dll and glibc are commonly used, because they provide well-known and hardware independent interface for the system services and hides details of the communication with kernel behind the scene.
Linux and Windows have approximately the same cost of system calls if use the same way of crossing the user/kernel space border (difference will be negligible). Both trying to use fastest way possible on each particular machine. All modern Windows versions starting at least from Windows XP are prepared for sysenter/sysexit. Some old and/or specific versions of Linux can still use int/iret based calls. x64 versions of OSes relies to syscall/sysret instructions which works like the sysenter/sysexit and available as part of AMD64 instructions set.