|
NAME | SYNOPSIS | DESCRIPTION | PARAMETERS | EXAMPLES | FURTHER READING | SEE ALSO | AUTHORS | COLOPHON |
|
|
|
BPF classifie...actions in tc(8) Linux BPF classifie...actions in tc(8)
BPF - BPF programmable classifier and actions for ingress/egress
queueing disciplines
eBPF classifier (filter) or action:
tc filter ... bpf [ object-file OBJ_FILE ] [ section CLS_NAME ] [
export UDS_FILE ] [ verbose ] [ direct-action | da ] [ skip_hw |
skip_sw ] [ police POLICE_SPEC ] [ action ACTION_SPEC ] [ classid
CLASSID ]
tc action ... bpf [ object-file OBJ_FILE ] [ section CLS_NAME ] [
export UDS_FILE ] [ verbose ]
cBPF classifier (filter) or action:
tc filter ... bpf [ bytecode-file BPF_FILE | bytecode BPF_BYTECODE
] [ police POLICE_SPEC ] [ action ACTION_SPEC ] [ classid CLASSID
]
tc action ... bpf [ bytecode-file BPF_FILE | bytecode BPF_BYTECODE
]
Extended Berkeley Packet Filter ( eBPF ) and classic Berkeley
Packet Filter (originally known as BPF, for better distinction
referred to as cBPF here) are both available as a fully
programmable and highly efficient classifier and actions. They
both offer a minimal instruction set for implementing small
programs which can safely be loaded into the kernel and thus
executed in a tiny virtual machine from kernel space. An in-kernel
verifier guarantees that a specified program always terminates and
neither crashes nor leaks data from the kernel.
In Linux, it's generally considered that eBPF is the successor of
cBPF. The kernel internally transforms cBPF expressions into eBPF
expressions and executes the latter. Execution of them can be
performed in an interpreter or at setup time, they can be just-in-
time compiled (JIT'ed) to run as native machine code.
Currently, the eBPF JIT compiler is available for the following
architectures:
* x86_64 (since Linux 3.18)
* arm64 (since Linux 3.18)
* s390 (since Linux 4.1)
* ppc64 (since Linux 4.8)
* sparc64 (since Linux 4.12)
* mips64 (since Linux 4.13)
* arm32 (since Linux 4.14)
* x86_32 (since Linux 4.18)
Whereas the following architectures have cBPF, but did not (yet)
switch to eBPF JIT support:
* ppc32
* sparc32
* mips32
eBPF's instruction set has similar underlying principles as the
cBPF instruction set, it however is modelled closer to the
underlying architecture to better mimic native instruction sets
with the aim to achieve a better run-time performance. It is
designed to be JIT'ed with a one to one mapping, which can also
open up the possibility for compilers to generate optimized eBPF
code through an eBPF backend that performs almost as fast as
natively compiled code. Given that LLVM provides such an eBPF
backend, eBPF programs can therefore easily be programmed in a
subset of the C language. Other than that, eBPF infrastructure
also comes with a construct called "maps". eBPF maps are key/value
stores that are shared between multiple eBPF programs, but also
between eBPF programs and user space applications.
For the traffic control subsystem, classifier and actions that can
be attached to ingress and egress qdiscs can be written in eBPF or
cBPF. The advantage over other classifier and actions is that
eBPF/cBPF provides the generic framework, while users can
implement their highly specialized use cases efficiently. This
means that the classifier or action written that way will not
suffer from feature bloat, and can therefore execute its task
highly efficient. It allows for non-linear classification and even
merging the action part into the classification. Combined with
efficient eBPF map data structures, user space can push new
policies like classids into the kernel without reloading a
classifier, or it can gather statistics that are pushed into one
map and use another one for dynamically load balancing traffic
based on the determined load, just to provide a few examples.
object-file
points to an object file that has an executable and linkable
format (ELF) and contains eBPF opcodes and eBPF map definitions.
The LLVM compiler infrastructure with clang(1) as a C language
front end is one project that supports emitting eBPF object files
that can be passed to the eBPF classifier (more details in the
EXAMPLES section). This option is mandatory when an eBPF
classifier or action is to be loaded.
section
is the name of the ELF section from the object file, where the
eBPF classifier or action resides. By default the section name for
the classifier is called "classifier", and for the action
"action". Given that a single object file can contain multiple
classifier and actions, the corresponding section name needs to be
specified, if it differs from the defaults.
export
points to a Unix domain socket file. In case the eBPF object file
also contains a section named "maps" with eBPF map specifications,
then the map file descriptors can be handed off via the Unix
domain socket to an eBPF "agent" herding all descriptors after tc
lifetime. This can be some third party application implementing
the IPC counterpart for the import, that uses them for calling
into bpf(2) system call to read out or update eBPF map data from
user space, for example, for monitoring purposes or to push down
new policies.
verbose
if set, it will dump the eBPF verifier output, even if loading the
eBPF program was successful. By default, only on error, the
verifier log is being emitted to the user.
direct-action | da
instructs eBPF classifier to not invoke external TC actions,
instead use the TC actions return codes (TC_ACT_OK, TC_ACT_SHOT
etc.) for classifiers.
skip_hw | skip_sw
hardware offload control flags. By default TC will try to offload
filters to hardware if possible. skip_hw explicitly disables the
attempt to offload. skip_sw forces the offload and disables
running the eBPF program in the kernel. If hardware offload is
not possible and this flag was set kernel will report an error and
filter will not be installed at all.
police
is an optional parameter for an eBPF/cBPF classifier that
specifies a police in tc(1) which is attached to the classifier,
for example, on an ingress qdisc.
action
is an optional parameter for an eBPF/cBPF classifier that
specifies a subsequent action in tc(1) which is attached to a
classifier.
classid
flowid
provides the default traffic control class identifier for this
eBPF/cBPF classifier. The default class identifier can also be
overwritten by the return code of the eBPF/cBPF program. A default
return code of -1 specifies the here provided default class
identifier to be used. A return code of the eBPF/cBPF program of 0
implies that no match took place, and a return code other than
these two will override the default classid. This allows for
efficient, non-linear classification with only a single eBPF/cBPF
program as opposed to having multiple individual programs for
various class identifiers which would need to reparse packet
contents.
bytecode
is being used for loading cBPF classifier and actions only. The
cBPF bytecode is directly passed as a text string in the form of
's,c t f k,c t f k,c t f k,...' , where s denotes the number of
subsequent 4-tuples. One such 4-tuple consists of c t f k
decimals, where c represents the cBPF opcode, t the jump true
offset target, f the jump false offset target and k the immediate
constant/literal. There are various tools that generate code in
this loadable format, for example, bpf_asm that ships with the
Linux kernel source tree under tools/net/ , so it is certainly not
expected to hack this by hand. The bytecode or bytecode-file
option is mandatory when a cBPF classifier or action is to be
loaded.
bytecode-file
also being used to load a cBPF classifier or action. It's
effectively the same as bytecode only that the cBPF bytecode is
not passed directly via command line, but rather resides in a text
file.
eBPF TOOLING
A full blown example including eBPF agent code can be found inside
the iproute2 source package under: examples/bpf/
As prerequisites, the kernel needs to have the eBPF system call
namely bpf(2) enabled and ships with cls_bpf and act_bpf kernel
modules for the traffic control subsystem. To enable eBPF/eBPF JIT
support, depending which of the two the given architecture
supports:
echo 1 > /proc/sys/net/core/bpf_jit_enable
A given restricted C file can be compiled via LLVM as:
clang -O2 -emit-llvm -c bpf.c -o - | llc -march=bpf
-filetype=obj -o bpf.o
The compiler invocation might still simplify in future, so for
now, it's quite handy to alias this construct in one way or
another, for example:
__bcc() {
clang -O2 -emit-llvm -c $1 -o - | \
llc -march=bpf -filetype=obj -o "`basename $1 .c`.o"
}
alias bcc=__bcc
A minimal, stand-alone unit, which matches on all traffic with the
default classid (return code of -1) looks like:
#include <linux/bpf.h>
#ifndef __section
# define __section(x) __attribute__((section(x), used))
#endif
__section("classifier") int cls_main(struct __sk_buff *skb)
{
return -1;
}
char __license[] __section("license") = "GPL";
More examples can be found further below in subsection eBPF
PROGRAMMING as focus here will be on tooling.
There can be various other sections, for example, also for
actions. Thus, an object file in eBPF can contain multiple
entrance points. Always a specific entrance point, however, must
be specified when configuring with tc. A license must be part of
the restricted C code and the license string syntax is the same as
with Linux kernel modules. The kernel reserves its right that
some eBPF helper functions can be restricted to GPL compatible
licenses only, and thus may reject a program from loading into the
kernel when such a license mismatch occurs.
The resulting object file from the compilation can be inspected
with the usual set of tools that also operate on normal object
files, for example objdump(1) for inspecting ELF section headers:
objdump -h bpf.o
[...]
3 classifier 000007f8 0000000000000000 0000000000000000 00000040 2**3
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
4 action-mark 00000088 0000000000000000 0000000000000000 00000838 2**3
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
5 action-rand 00000098 0000000000000000 0000000000000000 000008c0 2**3
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
6 maps 00000030 0000000000000000 0000000000000000 00000958 2**2
CONTENTS, ALLOC, LOAD, DATA
7 license 00000004 0000000000000000 0000000000000000 00000988 2**0
CONTENTS, ALLOC, LOAD, DATA
[...]
Adding an eBPF classifier from an object file that contains a
classifier in the default ELF section is trivial (note that
instead of "object-file" also shortcuts such as "obj" can be
used):
bcc bpf.c
tc filter add dev em1 parent 1: bpf obj bpf.o flowid 1:1
In case the classifier resides in ELF section "mycls", then that
same command needs to be invoked as:
tc filter add dev em1 parent 1: bpf obj bpf.o sec mycls flowid
1:1
Dumping the classifier configuration will tell the location of the
classifier, in other words that it's from object file "bpf.o"
under section "mycls":
tc filter show dev em1
filter parent 1: protocol all pref 49152 bpf
filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid
1:1 bpf.o:[mycls]
The same program can also be installed on ingress qdisc side as
opposed to egress ...
tc qdisc add dev em1 handle ffff: ingress
tc filter add dev em1 parent ffff: bpf obj bpf.o sec mycls
flowid ffff:1
... and again dumped from there:
tc filter show dev em1 parent ffff:
filter protocol all pref 49152 bpf
filter protocol all pref 49152 bpf handle 0x1 flowid ffff:1
bpf.o:[mycls]
Attaching a classifier and action on ingress has the restriction
that it doesn't have an actual underlying queueing discipline.
What ingress can do is to classify, mangle, redirect or drop
packets. When queueing is required on ingress side, then ingress
must redirect packets to the ifb device, otherwise policing can be
used. Moreover, ingress can be used to have an early drop point of
unwanted packets before they hit upper layers of the networking
stack, perform network accounting with eBPF maps that could be
shared with egress, or have an early mangle and/or redirection
point to different networking devices.
Multiple eBPF actions and classifier can be placed into a single
object file within various sections. In that case, non-default
section names must be provided, which is the case for both actions
in this example:
tc filter add dev em1 parent 1: bpf obj bpf.o flowid 1:1 \
action bpf obj bpf.o sec action-mark
\
action bpf obj bpf.o sec action-rand
ok
The advantage of this is that the classifier and the two actions
can then share eBPF maps with each other, if implemented in the
programs.
In order to access eBPF maps from user space beyond tc(8) setup
lifetime, the ownership can be transferred to an eBPF agent via
Unix domain sockets. There are two possibilities for implementing
this:
1) implementation of an own eBPF agent that takes care of setting
up the Unix domain socket and implementing the protocol that tc(8)
dictates. A code example of this can be found inside the iproute2
source package under: examples/bpf/
2) use tc exec for transferring the eBPF map file descriptors
through a Unix domain socket, and spawning an application such as
sh(1) . This approach's advantage is that tc will place the file
descriptors into the environment and thus make them available just
like stdin, stdout, stderr file descriptors, meaning, in case user
applications run from within this fd-owner shell, they can
terminate and restart without losing eBPF maps file descriptors.
Example invocation with the previous classifier and action
mixture:
tc exec bpf imp /tmp/bpf
tc filter add dev em1 parent 1: bpf obj bpf.o exp /tmp/bpf
flowid 1:1 \
action bpf obj bpf.o sec action-mark
\
action bpf obj bpf.o sec action-rand
ok
Assuming that eBPF maps are shared with classifier and actions,
it's enough to export them once, for example, from within the
classifier or action command. tc will setup all eBPF map file
descriptors at the time when the object file is first parsed.
When a shell has been spawned, the environment will have a couple
of eBPF related variables. BPF_NUM_MAPS provides the total number
of maps that have been transferred over the Unix domain socket.
BPF_MAP<X>'s value is the file descriptor number that can be
accessed in eBPF agent applications, in other words, it can
directly be used as the file descriptor value for the bpf(2)
system call to retrieve or alter eBPF map values. <X> denotes the
identifier of the eBPF map. It corresponds to the id member of
struct bpf_elf_map from the tc eBPF map specification.
The environment in this example looks as follows:
sh# env | grep BPF
BPF_NUM_MAPS=3
BPF_MAP1=6
BPF_MAP0=5
BPF_MAP2=7
sh# ls -la /proc/self/fd
[...]
lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map
sh# my_bpf_agent
eBPF agents are very useful in that they can prepopulate eBPF maps
from user space, monitor statistics via maps and based on that
feedback, for example, rewrite classids in eBPF map values during
runtime. Given that eBPF agents are implemented as normal
applications, they can also dynamically receive traffic control
policies from external controllers and thus push them down into
eBPF maps to dynamically adapt to network conditions. Moreover,
eBPF maps can also be shared with other eBPF program types (e.g.
tracing), thus very powerful combination can therefore be
implemented.
eBPF PROGRAMMING
eBPF classifier and actions are being implemented in restricted C
syntax (in future, there could additionally be new language
frontends supported).
The header file linux/bpf.h provides eBPF helper functions that
can be called from an eBPF program. This man page will only
provide two minimal, stand-alone examples, have a look at
examples/bpf from the iproute2 source package for a fully fledged
flow dissector example to better demonstrate some of the
possibilities with eBPF.
Supported 32 bit classifier return codes from the C program and
their meanings:
0 , denotes a mismatch
-1 , denotes the default classid configured from the command
line
else , everything else will override the default classid to
provide a facility for non-linear matching
Supported 32 bit action return codes from the C program and their
meanings ( linux/pkt_cls.h ):
TC_ACT_OK (0) , will terminate the packet processing pipeline
and allows the packet to proceed
TC_ACT_SHOT (2) , will terminate the packet processing
pipeline and drops the packet
TC_ACT_UNSPEC (-1) , will use the default action configured
from tc (similarly as returning -1 from a classifier)
TC_ACT_PIPE (3) , will iterate to the next action, if
available
TC_ACT_RECLASSIFY (1) , will terminate the packet processing
pipeline and start classification from the beginning
else , everything else is an unspecified return code
Both classifier and action return codes are supported in eBPF and
cBPF programs.
To demonstrate restricted C syntax, a minimal toy classifier
example is provided, which assumes that egress packets, for
instance originating from a container, have previously been marked
in interval [0, 255]. The program keeps statistics on different
marks for user space and maps the classid to the root qdisc with
the marking itself as the minor handle:
#include <stdint.h>
#include <asm/types.h>
#include <linux/bpf.h>
#include <linux/pkt_sched.h>
#include "helpers.h"
struct tuple {
long packets;
long bytes;
};
#define BPF_MAP_ID_STATS 1 /* agent's map identifier */
#define BPF_MAX_MARK 256
struct bpf_elf_map __section("maps") map_stats = {
.type = BPF_MAP_TYPE_ARRAY,
.id = BPF_MAP_ID_STATS,
.size_key = sizeof(uint32_t),
.size_value = sizeof(struct tuple),
.max_elem = BPF_MAX_MARK,
.pinning = PIN_GLOBAL_NS,
};
static inline void cls_update_stats(const struct __sk_buff *skb,
uint32_t mark)
{
struct tuple *tu;
tu = bpf_map_lookup_elem(&map_stats, &mark);
if (likely(tu)) {
__sync_fetch_and_add(&tu->packets, 1);
__sync_fetch_and_add(&tu->bytes, skb->len);
}
}
__section("cls") int cls_main(struct __sk_buff *skb)
{
uint32_t mark = skb->mark;
if (unlikely(mark >= BPF_MAX_MARK))
return 0;
cls_update_stats(skb, mark);
return TC_H_MAKE(TC_H_ROOT, mark);
}
char __license[] __section("license") = "GPL";
Another small example is a port redirector which demuxes
destination port 80 into the interval [8080, 8087] steered by RSS,
that can then be attached to ingress qdisc. The exercise of adding
the egress counterpart and IPv6 support is left to the reader:
#include <asm/types.h>
#include <asm/byteorder.h>
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/in.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include "helpers.h"
static inline void set_tcp_dport(struct __sk_buff *skb, int nh_off,
__u16 old_port, __u16 new_port)
{
bpf_l4_csum_replace(skb, nh_off + offsetof(struct tcphdr, check),
old_port, new_port, sizeof(new_port));
bpf_skb_store_bytes(skb, nh_off + offsetof(struct tcphdr, dest),
&new_port, sizeof(new_port), 0);
}
static inline int lb_do_ipv4(struct __sk_buff *skb, int nh_off)
{
__u16 dport, dport_new = 8080, off;
__u8 ip_proto, ip_vl;
ip_proto = load_byte(skb, nh_off +
offsetof(struct iphdr, protocol));
if (ip_proto != IPPROTO_TCP)
return 0;
ip_vl = load_byte(skb, nh_off);
if (likely(ip_vl == 0x45))
nh_off += sizeof(struct iphdr);
else
nh_off += (ip_vl & 0xF) << 2;
dport = load_half(skb, nh_off + offsetof(struct tcphdr, dest));
if (dport != 80)
return 0;
off = skb->queue_mapping & 7;
set_tcp_dport(skb, nh_off - BPF_LL_OFF, __constant_htons(80),
__cpu_to_be16(dport_new + off));
return -1;
}
__section("lb") int lb_main(struct __sk_buff *skb)
{
int ret = 0, nh_off = BPF_LL_OFF + ETH_HLEN;
if (likely(skb->protocol == __constant_htons(ETH_P_IP)))
ret = lb_do_ipv4(skb, nh_off);
return ret;
}
char __license[] __section("license") = "GPL";
The related helper header file helpers.h in both examples was:
/* Misc helper macros. */
#define __section(x) __attribute__((section(x), used))
#define offsetof(x, y) __builtin_offsetof(x, y)
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
/* Object pinning settings */
#define PIN_NONE 0
#define PIN_OBJECT_NS 1
#define PIN_GLOBAL_NS 2
/* ELF map definition */
struct bpf_elf_map {
__u32 type;
__u32 size_key;
__u32 size_value;
__u32 max_elem;
__u32 flags;
__u32 id;
__u32 pinning;
__u32 inner_id;
__u32 inner_idx;
};
/* Some used BPF function calls. */
static int (*bpf_skb_store_bytes)(void *ctx, int off, void *from,
int len, int flags) =
(void *) BPF_FUNC_skb_store_bytes;
static int (*bpf_l4_csum_replace)(void *ctx, int off, int from,
int to, int flags) =
(void *) BPF_FUNC_l4_csum_replace;
static void *(*bpf_map_lookup_elem)(void *map, void *key) =
(void *) BPF_FUNC_map_lookup_elem;
/* Some used BPF intrinsics. */
unsigned long long load_byte(void *skb, unsigned long long off)
asm ("llvm.bpf.load.byte");
unsigned long long load_half(void *skb, unsigned long long off)
asm ("llvm.bpf.load.half");
Best practice, we recommend to only have a single eBPF classifier
loaded in tc and perform all necessary matching and mangling from
there instead of a list of individual classifier and separate
actions. Just a single classifier tailored for a given use-case
will be most efficient to run.
eBPF DEBUGGING
Both tc filter and action commands for bpf support an optional
verbose parameter that can be used to inspect the eBPF verifier
log. It is dumped by default in case of an error.
In case the eBPF/cBPF JIT compiler has been enabled, it can also
be instructed to emit a debug output of the resulting opcode image
into the kernel log, which can be read via dmesg(1) :
echo 2 > /proc/sys/net/core/bpf_jit_enable
The Linux kernel source tree ships additionally under tools/net/ a
small helper called bpf_jit_disasm that reads out the opcode image
dump from the kernel log and dumps the resulting disassembly:
bpf_jit_disasm -o
Other than that, the Linux kernel also contains an extensive
eBPF/cBPF test suite module called test_bpf . Upon ...
modprobe test_bpf
... it performs a diversity of test cases and dumps the results
into the kernel log that can be inspected with dmesg(1) . The
results can differ depending on whether the JIT compiler is
enabled or not. In case of failed test cases, the module will fail
to load. In such cases, we urge you to file a bug report to the
related JIT authors, Linux kernel and networking mailing lists.
cBPF
Although we generally recommend switching to implementing eBPF
classifier and actions, for the sake of completeness, a few words
on how to program in cBPF will be lost here.
Likewise, the bpf_jit_enable switch can be enabled as mentioned
already. Tooling such as bpf_jit_disasm is also independent
whether eBPF or cBPF code is being loaded.
Unlike in eBPF, classifier and action are not implemented in
restricted C, but rather in a minimal assembler-like language or
with the help of other tooling.
The raw interface with tc takes opcodes directly. For example, the
most minimal classifier matching on every packet resulting in the
default classid of 1:1 looks like:
tc filter add dev em1 parent 1: bpf bytecode '1,6 0 0
4294967295,' flowid 1:1
The first decimal of the bytecode sequence denotes the number of
subsequent 4-tuples of cBPF opcodes. As mentioned, such a 4-tuple
consists of c t f k decimals, where c represents the cBPF opcode,
t the jump true offset target, f the jump false offset target and
k the immediate constant/literal. Here, this denotes an
unconditional return from the program with immediate value of -1.
Thus, for egress classification, Willem de Bruijn implemented a
minimal stand-alone helper tool under the GNU General Public
License version 2 for iptables(8) BPF extension, which abuses the
libpcap internal classic BPF compiler, his code derived here for
usage with tc(8) :
#include <pcap.h>
#include <stdio.h>
int main(int argc, char **argv)
{
struct bpf_program prog;
struct bpf_insn *ins;
int i, ret, dlt = DLT_RAW;
if (argc < 2 || argc > 3)
return 1;
if (argc == 3) {
dlt = pcap_datalink_name_to_val(argv[1]);
if (dlt == -1)
return 1;
}
ret = pcap_compile_nopcap(-1, dlt, &prog, argv[argc - 1],
1, PCAP_NETMASK_UNKNOWN);
if (ret)
return 1;
printf("%d,", prog.bf_len);
ins = prog.bf_insns;
for (i = 0; i < prog.bf_len - 1; ++ins, ++i)
printf("%u %u %u %u,", ins->code,
ins->jt, ins->jf, ins->k);
printf("%u %u %u %u",
ins->code, ins->jt, ins->jf, ins->k);
pcap_freecode(&prog);
return 0;
}
Given this small helper, any tcpdump(8) filter expression can be
abused as a classifier where a match will result in the default
classid:
bpftool EN10MB 'tcp[tcpflags] & tcp-syn != 0' > /var/bpf/tcp-
syn
tc filter add dev em1 parent 1: bpf bytecode-file
/var/bpf/tcp-syn flowid 1:1
Basically, such a minimal generator is equivalent to:
tcpdump -iem1 -ddd 'tcp[tcpflags] & tcp-syn != 0' | tr '\n'
',' > /var/bpf/tcp-syn
Since libpcap does not support all Linux' specific cBPF extensions
in its compiler, the Linux kernel also ships under tools/net/ a
minimal BPF assembler called bpf_asm for providing full control.
For detailed syntax and semantics on implementing such programs by
hand, see references under FURTHER READING .
Trivial toy example in bpf_asm for classifying IPv4/TCP packets,
saved in a text file called foobar :
ldh [12]
jne #0x800, drop
ldb [23]
jneq #6, drop
ret #-1
drop: ret #0
Similarly, such a classifier can be loaded as:
bpf_asm foobar > /var/bpf/tcp-syn
tc filter add dev em1 parent 1: bpf bytecode-file
/var/bpf/tcp-syn flowid 1:1
For BPF classifiers, the Linux kernel provides additionally under
tools/net/ a small BPF debugger called bpf_dbg , which can be used
to test a classifier against pcap files, single-step or add
various breakpoints into the classifier program and dump register
contents during runtime.
Implementing an action in classic BPF is rather limited in the
sense that packet mangling is not supported. Therefore, it's
generally recommended to make the switch to eBPF, whenever
possible.
Further and more technical details about the BPF architecture can
be found in the Linux kernel source tree under
Documentation/networking/filter.txt .
Further details on eBPF tc(8) examples can be found in the
iproute2 source tree under examples/bpf/ .
tc(8), tc-ematch(8) bpf(2) bpf(4)
Manpage written by Daniel Borkmann.
Please report corrections or improvements to the Linux kernel
networking mailing list: <netdev@vger.kernel.org>
This page is part of the iproute2 (utilities for controlling
TCP/IP networking and traffic) project. Information about the
project can be found at
⟨http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2⟩.
If you have a bug report for this manual page, send it to
netdev@vger.kernel.org, shemminger@osdl.org. This page was
obtained from the project's upstream Git repository
⟨https://git.kernel.org/pub/scm/network/iproute2/iproute2.git⟩ on
2025-08-11. (At that time, the date of the most recent commit
that was found in the repository was 2025-08-08.) If you discover
any rendering problems in this HTML version of the page, or you
believe there is a better or more up-to-date source for the page,
or you have corrections or improvements to the information in this
COLOPHON (which is not part of the original manual page), send a
mail to man-pages@man7.org
iproute2 18 May 2015BPF classifie...actions in tc(8)
Pages that refer to this page: bpf(2), bpf-helpers(7), tc(8), tc-actions(8)