Prior to Linux 4.4, all bpf() commands require the caller to have
the CAP_SYS_ADMIN capability. From Linux 4.4 onwards, an
unprivileged user may create limited programs of type
BPF_PROG_TYPE_SOCKET_FILTER and associated maps. However, they
may not store kernel pointers within the maps and are presently
limited to the following helper functions:
* get_random
* get_smp_processor_id
* tail_call
* ktime_get_ns
Unprivileged access may be blocked by writing the value 1 to the
file /proc/sys/kernel/unprivileged_bpf_disabled.
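
As an illustration, a program could check this setting before attempting
unprivileged bpf() calls. The following is a minimal sketch, not a
definitive implementation: the function names read_sysctl_int() and
unprivileged_bpf_blocked() are hypothetical, and treating any nonzero
value as "blocked" is an assumption of this example.

```c
#include <stdio.h>

/* Read a single integer from a sysctl-style file.
 * Returns 0 on success and stores the value in *out; returns -1 if
 * the file cannot be opened or does not start with an integer. */
int read_sysctl_int(const char *path, int *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Returns 1 if unprivileged bpf() access is blocked, 0 if it is
 * allowed, and -1 if the state cannot be determined.
 * Assumption: any nonzero value means "blocked". */
int unprivileged_bpf_blocked(const char *path)
{
    int value;
    if (read_sysctl_int(path, &value) != 0)
        return -1;
    return value != 0;
}
```

A caller would typically pass
"/proc/sys/kernel/unprivileged_bpf_disabled" as the path and fall back
to a privileged code path (or report an error) when the result is 1.
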
eBPF objects (maps and programs) can be shared between processes.
For example, after fork(2), the child inherits file descriptors
referring to the same eBPF objects. In addition, file
descriptors referring to eBPF objects can be transferred over
UNIX domain sockets. File descriptors referring to eBPF objects
can be duplicated in the usual way, using dup(2) and similar
calls. An eBPF object is deallocated only after all file
descriptors referring to the object have been closed.
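
The transfer over a UNIX domain socket mentioned above uses the ordinary
SCM_RIGHTS ancillary-data mechanism, which works for any file
descriptor. The sketch below assumes that the descriptor being passed
was obtained from bpf(); nothing in the code is specific to eBPF, and
the helper names send_fd() and recv_fd() are hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send file descriptor fd over the connected UNIX domain socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'x';   /* one byte of ordinary data accompanies the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {             /* union ensures suitable alignment for cmsghdr */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a file descriptor from sock.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The descriptor returned by recv_fd() is a new descriptor in the
receiving process that refers to the same underlying object, so the
object stays allocated until both sides have closed their descriptors.
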
eBPF programs can be written in a restricted C that is compiled
(using the clang compiler) into eBPF bytecode. Various features
are omitted from this restricted C, such as loops, global
variables, variadic functions, floating-point numbers, and
passing structures as function arguments. Some examples can be
found in the samples/bpf/*_kern.c files in the kernel source
tree.
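
To give a flavor of this restricted C, here is a minimal sketch of a
socket filter; it is illustrative only and is not taken from the kernel
samples. The function name truncate_filter and the 64-byte limit are
hypothetical, and the SEC() macro is defined locally on the assumption
that no helper header providing it is in use. The section name "socket"
is the one commonly used by loaders for socket filter programs.

```c
#include <linux/bpf.h>

/* Assumption: SEC() places a symbol in a named ELF section so that a
 * loader can find it; helper headers usually provide an equivalent. */
#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

/* Hypothetical socket filter: a socket filter returns the number of
 * packet bytes to keep (0 drops the packet). This one keeps at most
 * the first 64 bytes. Note the absence of loops, global variables,
 * and floating-point arithmetic. */
SEC("socket")
int truncate_filter(struct __sk_buff *skb)
{
    __u32 len = skb->len;

    return len < 64 ? len : 64;
}
```

Such a file is typically compiled with an invocation along the lines of
"clang -O2 -target bpf -c filter.c -o filter.o"; the exact flags and
include paths depend on the toolchain and distribution.
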
The kernel contains a just-in-time (JIT) compiler that translates
eBPF bytecode into native machine code for better performance.
In kernels before Linux 4.15, the JIT compiler is disabled by
default, but its operation can be controlled by writing one of
the following integer strings to the file
/proc/sys/net/core/bpf_jit_enable:
0 Disable JIT compilation (default).
1 Normal compilation.
2 Debugging mode. The generated opcodes are dumped in
hexadecimal into the kernel log. These opcodes can then be
disassembled using the program tools/net/bpf_jit_disasm.c
provided in the kernel source tree.
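
The three values above can be interpreted programmatically. The
following is a small sketch under the assumption that the caller has
already read the contents of /proc/sys/net/core/bpf_jit_enable into a
string; the enum and function names are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

enum jit_mode {
    JIT_UNKNOWN  = -1,  /* unparsable or out-of-range value */
    JIT_DISABLED = 0,   /* "0": JIT compilation disabled */
    JIT_ENABLED  = 1,   /* "1": normal compilation */
    JIT_DEBUG    = 2    /* "2": debugging mode */
};

/* Interpret the text read from the bpf_jit_enable file. */
enum jit_mode parse_jit_mode(const char *text)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || errno != 0 || v < 0 || v > 2)
        return JIT_UNKNOWN;
    return (enum jit_mode)v;
}

/* Return a short human-readable description of a mode. */
const char *jit_mode_description(enum jit_mode m)
{
    switch (m) {
    case JIT_DISABLED: return "JIT compilation disabled";
    case JIT_ENABLED:  return "normal JIT compilation";
    case JIT_DEBUG:    return "JIT compilation with opcode dump to the kernel log";
    default:           return "unrecognized value";
    }
}
```

Changing the setting requires writing one of these strings to the file
with sufficient privilege; the sketch only covers reading it.
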
Since Linux 4.15, the kernel may be configured with the
CONFIG_BPF_JIT_ALWAYS_ON option. In this case, the JIT compiler
is always enabled, and bpf_jit_enable is initialized to 1 and
is immutable. (This kernel configuration option was provided as
a mitigation for one of the Spectre attacks against the BPF
interpreter.)
The JIT compiler for eBPF is currently available for the
following architectures:
* x86-64 (since Linux 3.18; cBPF since Linux 3.0);
* ARM32 (since Linux 3.18; cBPF since Linux 3.4);
* SPARC 32 (since Linux 3.18; cBPF since Linux 3.5);
* ARM-64 (since Linux 3.18);
* s390 (since Linux 4.1; cBPF since Linux 3.7);
* PowerPC 64 (since Linux 4.8; cBPF since Linux 3.1);
* SPARC 64 (since Linux 4.12);
* x86-32 (since Linux 4.18);
* MIPS 64 (since Linux 4.18; cBPF since Linux 3.16);
* riscv (since Linux 5.1).