Rather than hand-coding seccomp filters as shown in the example
below, you may prefer to employ the libseccomp library, which
provides a front-end for generating seccomp filters.
The Seccomp field of the /proc/[pid]/status file provides a
method of viewing the seccomp mode of a process; see proc(5).
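As a minimal sketch of reading that field, the helper below scans the status file for the "Seccomp:" line. The function name seccomp_mode_of() is hypothetical and used only for illustration. The mapping of values (0 for disabled, 1 for strict mode, 2 for filter mode) follows proc(5).

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: return the value of the "Seccomp:" field of
   /proc/[pid]/status (0 = disabled, 1 = strict, 2 = filter).
   A 'pid' of 0 means the calling process. Returns -1 if the file
   cannot be opened or the field is not present. */
int
seccomp_mode_of(pid_t pid)
{
    char path[64];
    char line[256];
    int mode = -1;
    FILE *fp;

    if (pid == 0)
        snprintf(path, sizeof(path), "/proc/self/status");
    else
        snprintf(path, sizeof(path), "/proc/%ld/status", (long) pid);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* The field looks like "Seccomp:<TAB>2"; the space in the
       format matches any run of white space, including the tab. */
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }

    fclose(fp);
    return mode;
}
```

A caller might use seccomp_mode_of(0) to check its own mode after installing a filter, treating a return value of -1 as "unknown" rather than "disabled".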
seccomp() provides a superset of the functionality provided by
the prctl(2) PR_SET_SECCOMP operation (which does not support
flags).
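The relationship between the two interfaces can be sketched as follows. The names seccomp_raw(), install_with_prctl(), and install_with_seccomp() are hypothetical helpers, not part of any library. The sketch assumes kernel headers that define SYS_seccomp, and it invokes the system call through syscall(2) on the assumption that the C library provides no seccomp() wrapper. Both install paths first set the no_new_privs attribute, which an unprivileged caller needs before adding a filter.

```c
#define _GNU_SOURCE
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the raw system call. */
int
seccomp_raw(unsigned int operation, unsigned int flags, void *args)
{
    return (int) syscall(SYS_seccomp, operation, flags, args);
}

/* Install a filter using prctl(2). This interface has no
   argument through which flags could be passed. */
int
install_with_prctl(struct sock_fprog *prog)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, prog);
}

/* Install the same filter using seccomp(2). The 'flags' argument
   (for example, SECCOMP_FILTER_FLAG_TSYNC) is what the prctl(2)
   operation cannot express; pass 0 for equivalent behavior. */
int
install_with_seccomp(struct sock_fprog *prog, unsigned int flags)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;
    return seccomp_raw(SECCOMP_SET_MODE_FILTER, flags, prog);
}
```

With flags set to 0, the two install functions are intended to have the same effect. Installing a filter is irreversible for the calling thread, so such calls are best tried in a child process first.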
Since Linux 4.4, the ptrace(2) PTRACE_SECCOMP_GET_FILTER
operation can be used to dump a process's seccomp filters.
Architecture support for seccomp BPF
Architecture support for seccomp BPF filtering is available on
the following architectures:
* x86-64, i386, x32 (since Linux 3.5)
* ARM (since Linux 3.8)
* s390 (since Linux 3.8)
* MIPS (since Linux 3.16)
* ARM-64 (since Linux 3.19)
* PowerPC (since Linux 4.3)
* Tile (since Linux 4.3)
* PA-RISC (since Linux 4.6)
Caveats
There are various subtleties to consider when applying seccomp
filters to a program, including the following:
* Some traditional system calls have user-space implementations
in the vdso(7) on many architectures. Notable examples
include clock_gettime(2), gettimeofday(2), and time(2). On
such architectures, seccomp filtering for these system calls
will have no effect. (However, there are cases where the
vdso(7) implementations may fall back to invoking the true
system call, in which case seccomp filters would see the
system call.)
* Seccomp filtering is based on system call numbers. However,
applications typically do not directly invoke system calls,
but instead call wrapper functions in the C library which in
turn invoke the system calls. Consequently, one must be aware
of the following:
• The glibc wrappers for some traditional system calls may
actually employ system calls with different names in the
kernel. For example, the exit(2) wrapper function actually
employs the exit_group(2) system call, and the fork(2)
wrapper function actually calls clone(2).
• The behavior of wrapper functions may vary across
architectures, according to the range of system calls
provided on those architectures. In other words, the same
wrapper function may invoke different system calls on
different architectures.
• Finally, the behavior of wrapper functions can change
across glibc versions. For example, in older versions, the
glibc wrapper function for open(2) invoked the system call
of the same name, but starting in glibc 2.26, the
implementation switched to calling openat(2) on all
architectures.
The consequence of the above points is that it may be necessary
to filter for a system call other than the one that might be
expected.
Various manual pages in Section 2 provide helpful details about
the differences between wrapper functions and the underlying
system calls in subsections entitled C library/kernel
differences.
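One way to cope with these differences is to treat a wrapper function as a family of possible system calls rather than as a single number. The sketch below does this for the open(2) wrapper, using the open(2)/openat(2) example given above. The names open_family and is_open_family() are hypothetical, and the list is illustrative rather than exhaustive. The conditional reflects the assumption that some architectures (ARM-64, for example) do not define __NR_open at all.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* System call numbers that a call to the open() wrapper may end up
   using, depending on architecture and glibc version.
   Illustrative only; a real filter may need further entries. */
static const long open_family[] = {
#ifdef __NR_open
    __NR_open,      /* Not defined on every architecture */
#endif
    __NR_openat,    /* Used by glibc 2.26 and later */
};

/* Hypothetical helper: return 1 if 'nr' is one of the system calls
   that the open() wrapper might invoke, 0 otherwise. */
int
is_open_family(long nr)
{
    size_t count = sizeof(open_family) / sizeof(open_family[0]);

    for (size_t i = 0; i < count; i++) {
        if (open_family[i] == nr)
            return 1;
    }
    return 0;
}
```

A filter generator could iterate over such a table and emit one comparison per entry, so that a policy written in terms of "opening a file" covers whichever system call the wrapper actually uses.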
Furthermore, note that applying seccomp filters risks introducing
bugs into an application, when the filters cause unexpected
failures for legitimate operations that the application might
need to perform. Such bugs may not easily be discovered when
testing the seccomp filters if the bugs occur in rarely used
application code paths.
Seccomp-specific BPF details
Note the following BPF details specific to seccomp filters:
* The BPF_H and BPF_B size modifiers are not supported: all
operations must load and store (4-byte) words (BPF_W).
* To access the contents of the seccomp_data buffer, use the
BPF_ABS addressing mode modifier.
* The BPF_LEN addressing mode modifier yields an immediate mode
operand whose value is the size of the seccomp_data buffer.
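The following sketch shows these rules in a small filter. Each load uses BPF_W together with BPF_ABS and an offset into the seccomp_data structure. The function build_deny_filter() and its parameters are hypothetical. The filter kills the caller if the architecture does not match 't_arch', makes system call 'nr' fail with EPERM, and allows everything else. It only builds the instruction array and does not install it. A production filter would likely need further checks (for example, for the x32 ABI on x86-64).

```c
#include <errno.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>

/* Hypothetical helper: fill 'f' (room for at least 7 instructions)
   with a filter for architecture 't_arch' that denies system call
   'nr'. Returns the number of instructions written. */
size_t
build_deny_filter(struct sock_filter *f, unsigned int t_arch,
                  unsigned int nr)
{
    size_t n = 0;

    /* [0] Load the 4-byte 'arch' field: BPF_W size, BPF_ABS mode */
    f[n++] = (struct sock_filter) BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, arch));
    /* [1] If it matches, skip the kill instruction */
    f[n++] = (struct sock_filter) BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                 t_arch, 1, 0);
    /* [2] Unexpected architecture: kill */
    f[n++] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_KILL);
    /* [3] Load the 4-byte system call number */
    f[n++] = (struct sock_filter) BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr));
    /* [4] If it is not 'nr', jump over [5] to the allow */
    f[n++] = (struct sock_filter) BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                 nr, 0, 1);
    /* [5] Matching system call: fail with EPERM */
    f[n++] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA));
    /* [6] Everything else: allow */
    f[n++] = (struct sock_filter) BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ALLOW);

    return n;
}
```

The resulting array and its length can be placed in a struct sock_fprog and passed to seccomp(). Replacing BPF_ABS with BPF_LEN in a load instruction would instead place the size of the seccomp_data buffer in the accumulator.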