OpenFlow actions and instructions with Open vSwitch extensions
OUTPUT ACTIONS
These actions send a packet to a physical port or a controller. A
packet that never encounters an output action on its trip through
the Open vSwitch pipeline is effectively dropped. Because actions
are executed in order, a packet modification action that is not
eventually followed by an output action will not have an
externally visible effect.
The output action
Syntax:
port
output:port
output:field
output(port=port, max_len=nbytes)
Outputs the packet to an OpenFlow port most commonly specified as
port. Alternatively, the output port may be read from field, a
field or subfield in the syntax described under ``Field
Specifications'' above. Either way, if the port is the packet's
input port, the packet is not output.
The port may be one of the following standard OpenFlow ports:
local
Outputs the packet on the ``local port'' that
corresponds to the network device that has the same
name as the bridge, unless the packet was received
on the local port. OpenFlow switch implementations
are not required to have a local port, but Open
vSwitch bridges always do.
in_port
Outputs the packet on the port on which it was
received. This is the only standard way to output
the packet to the input port (but see ``Output to
the Input Port'', below).
The port may also be one of the following additional OpenFlow
ports, unless max_len is specified:
normal
Subjects the packet to the device's normal L2/L3
processing. This action is not implemented by all
OpenFlow switches, and each switch implements it
differently. The section ``The OVS Normal
Pipeline'' below documents the OVS implementation.
flood
Outputs the packet on all switch physical ports,
except the port on which it was received and any
ports on which flooding is disabled. Flooding can
be disabled automatically on a port by Open vSwitch
when IEEE 802.1D spanning tree (STP) or rapid
spanning tree (RSTP) is enabled, or by a controller
using an OpenFlow OFPT_PORT_MOD request to set the
port's OFPPC_NO_FLOOD flag (ovs-ofctl mod-port
provides a command-line interface to set this
flag).
all
Outputs the packet on all switch physical ports
except the port on which it was received.
controller
Sends the packet and its metadata to an OpenFlow
controller or controllers encapsulated in an
OpenFlow ``packet-in'' message. The separate
controller action, described below, provides more
options for output to a controller.
Open vSwitch rejects output to other standard OpenFlow ports,
including none, unset, and port numbers reserved for future use
as standard ports, with the error OFPBAC_BAD_OUT_PORT.
With max_len, the packet is truncated to at most nbytes bytes
before being output. In this case, the output port may not be a
patch port. Truncation is just for the single output action, so
that later actions in the OpenFlow pipeline work with the
complete packet. The truncation feature is meant for use in
monitoring applications, e.g. for mirroring packets to a
collector.
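The scope of truncation can be illustrated with a small Python
model. This is only a sketch of the semantics described above, not
Open vSwitch code; the function name run_outputs and the
(port, max_len) tuple encoding are made up for the example.

```python
def run_outputs(packet, actions):
    """Model a list of output actions applied to one packet.

    Each action is a hypothetical (port, max_len) pair, where
    max_len=None stands for an output without truncation.
    Returns a list of (port, bytes_sent) pairs.
    """
    sent = []
    for port, max_len in actions:
        # Truncation applies to this one output only; the original
        # packet is left intact for the actions that follow.
        data = packet if max_len is None else packet[:max_len]
        sent.append((port, data))
    return sent


packet = bytes(range(100))
sent = run_outputs(packet, [(5, 64), (6, None)])
```

Here the copy sent to port 5 is cut to 64 bytes, while the later
output to port 6 still carries all 100 bytes.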
When an output action specifies the number of a port that does
not currently exist (and is not in the range for standard ports),
the OpenFlow specification allows but does not require OVS to
reject the action. All versions of Open vSwitch treat such an
action as a no-op. If a port with the number is created later,
then the action will be honored at that point. (OpenFlow requires
OVS to reject output to a port number that will never be valid,
with OFPBAC_BAD_OUT_PORT, but this situation does not arise when
OVS is a software switch, since the user can add or renumber
ports at any time.)
A controller can suppress output to a port by setting its
OFPPC_NO_FWD flag using an OpenFlow OFPT_PORT_MOD request
(ovs-ofctl mod-port provides a command-line interface to set this
flag). When output is disabled, output actions (and other actions
that output to the port) are allowed but have no effect.
Open vSwitch allows output to a port that does not exist,
although OpenFlow allows switches to reject such actions.
Output to the Input Port
OpenFlow requires a switch to ignore attempts to send a packet
out its ingress port in the most straightforward way. For
example, output:234 has no effect if the packet has ingress port
234. The rationale is that dropping these packets makes it harder
to loop the network. Sometimes this behavior can even be
convenient, e.g. it is often the desired behavior in a flow that
forwards a packet to several ports (``floods'' the packet).
Sometimes one really needs to send a packet out its ingress port
(``hairpin''). In this case, use in_port to explicitly output the
packet to its input port, e.g.:
$ ovs-ofctl add-flow br0 in_port=2,actions=in_port
This also works in some circumstances where the flow doesn't
match on the input port. For example, if you know that your
switch has five ports numbered 2 through 6, then the following
will send every received packet out every port, even its ingress
port:
$ ovs-ofctl add-flow br0 actions=2,3,4,5,6,in_port
or, equivalently:
$ ovs-ofctl add-flow br0 actions=all,in_port
Sometimes, in complicated flow tables with multiple levels of
resubmit actions, a flow needs to output to a particular port
that may or may not be the ingress port. It's difficult to take
advantage of output to in_port in this situation. To help, Open
vSwitch provides, as an OpenFlow extension, the ability to modify
the in_port field. Whatever value is currently in the in_port
field is both the port to which output will be dropped and the
destination for in_port. This means that the following adds flows
that reliably output to port 2 or to ports 2 through 6,
respectively:
$ ovs-ofctl add-flow br0 "in_port=2,actions=load:0->in_port,2"
$ ovs-ofctl add-flow br0 "actions=load:0->in_port,2,3,4,5,6"
If in_port is important for matching or other reasons, one may
save and restore it on the stack:
$ ovs-ofctl add-flow br0 actions="push:in_port,\
load:0->in_port,\
2,3,4,5,6,\
pop:in_port"
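The interaction between the in_port field and output suppression
in the examples above can be modeled in a few lines of Python. This
is a sketch under stated assumptions, not the Open vSwitch
implementation: the function outputs and its action encoding
(integers for ports, "in_port", "all", "push", "pop", and
("load", value) tuples) are invented for illustration.

```python
def outputs(in_port, actions, all_ports=()):
    """Return the ports that receive a copy of the packet.

    Hypothetical action encoding:
      int              output to that port number
      "in_port"        output to the current in_port value
      "all"            output to every port except in_port
      ("load", value)  overwrite the in_port field
      "push" / "pop"   save / restore in_port on a stack
    """
    stack, out = [], []
    for act in actions:
        if act == "in_port":
            out.append(in_port)
        elif act == "all":
            out.extend(p for p in all_ports if p != in_port)
        elif act == "push":
            stack.append(in_port)
        elif act == "pop":
            in_port = stack.pop()
        elif isinstance(act, tuple) and act[0] == "load":
            in_port = act[1]
        elif act != in_port:
            out.append(act)
        # Otherwise: plain output to the ingress port is dropped.
    return out


# A packet from port 3 with actions=2,3,4,5,6,in_port
hairpin = outputs(3, [2, 3, 4, 5, 6, "in_port"])
```

In this model, outputs(2, [2]) yields no output, while
outputs(2, [("load", 0), 2]) outputs to port 2, which mirrors the
load:0->in_port flows shown above.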
Conformance:
All versions of OpenFlow and Open vSwitch support output to a
literal port. Output to a register is an OpenFlow extension
introduced in Open vSwitch 1.3. Output with truncation is an
OpenFlow extension introduced in Open vSwitch 2.6.
The OVS Normal Pipeline
This section documents how Open vSwitch implements output to the
normal port. The OpenFlow specification places no requirements on
how this port works, so all of this documentation is specific to
Open vSwitch.
Open vSwitch uses the Open_vSwitch database, detailed in
ovs-vswitchd.conf.db(5), to determine the details of the normal
pipeline.
The normal pipeline executes the following ingress stages for
each packet. Each stage either accepts the packet, in which case
the packet goes on to the next stage, or drops the packet, which
terminates the pipeline. The result of the ingress stages is a
set of output ports, which is the empty set if some ingress stage
drops the packet:
1. Input port lookup: Maps the OpenFlow in_port field's
value to the corresponding Port and Interface records
in the database.
The in_port is normally the OpenFlow port that the
packet was received on. If set_field or another
action changes the in_port, the updated value is
honored. Accept the packet if the lookup succeeds,
which it normally will. If the lookup fails, for
example because in_port was changed to an unknown
value, drop the packet.
2. Drop malformed packet: If the packet is malformed
enough that it contains only part of an 802.1Q header,
then drop the packet with an error.
3. Drop packets sent to a port reserved for mirroring: If
the packet was received on a port that is configured
as the output port for a mirror (that is, it is the
output_port in some Mirror record), then drop the
packet.
4. VLAN input processing: This stage determines what VLAN
the packet is in. It also verifies that this VLAN is
valid for the port; if not, drop the packet. How the
VLAN is determined and which ones are valid vary based
on the vlan-mode in the input port's Port record:
trunk
The packet is in the VLAN specified in its
802.1Q header, or in VLAN 0 if there is no
802.1Q header. The trunks column in the Port
record lists the valid VLANs; if it is empty,
all VLANs are valid.
access
The packet is in the VLAN specified in the tag
column of its Port record. The packet must not
have an 802.1Q header with a nonzero VLAN ID;
if it does, drop the packet.
native-tagged
native-untagged
Same as trunk except that the VLAN of a packet
without an 802.1Q header is not necessarily zero;
instead, it is taken from the tag column.
dot1q-tunnel
The packet is in the VLAN specified in the tag
column of its Port record, which is a QinQ
service VLAN with the Ethertype specified by the
Port's other_config:qinq-ethtype. If the packet
has an 802.1Q header, then it specifies the
customer VLAN. The cvlans column specifies the
valid customer VLANs; if it is empty, all
customer VLANs are valid.
5. Drop reserved multicast addresses: If the packet is
addressed to a reserved Ethernet multicast address and
the Bridge record does not have
other_config:forward-bpdu set to true, drop the packet.
6. LACP bond admissibility: This step applies only if the
input port is a member of a bond (a Port with more
than one Interface) and that bond is configured to use
LACP. Otherwise, skip to the next step.
The behavior here depends on the state of LACP
negotiation:
• If LACP has been negotiated with the peer,
accept the packet if the bond member is enabled
(i.e. carrier is up and it hasn't been
administratively disabled). Otherwise, drop the
packet.
• If LACP negotiation is incomplete, then drop
the packet. There is one exception: if fallback
to active-backup mode is enabled, continue with
the next step, pretending that the active-backup
balancing mode is in use.
7. Non-LACP bond admissibility: This step applies if the
input port is a member of a bond without LACP
configured, or if a LACP bond falls back to
active-backup as described in the previous step. If
neither of these applies, skip to the next step.
If the packet is an Ethernet multicast or broadcast,
and not received on the bond's active member, drop the
packet.
The remaining behavior depends on the bond's balancing
mode:
L4 (aka TCP balancing)
Drop the packet (this balancing mode is only
supported with LACP).
Active-backup
Accept the packet only if it was received on
the active member.
SLB (Source Load Balancing)
Drop the packet if the bridge has not learned
the packet's source address (in its VLAN) on
the port that received it. Otherwise, accept
the packet unless it is a gratuitous ARP.
Otherwise, accept the packet if the MAC entry
we found is ARP-locked. Otherwise, drop the
packet. (See the ``SLB Bonding'' section in the
OVS bonding document for more information and a
rationale.)
8. Learn source MAC: If the source Ethernet address is
not a multicast address, then insert a mapping from the
packet's source Ethernet address and VLAN to the input
port in the bridge's MAC learning table. (This is
skipped if the packet's VLAN is listed in the
flood_vlans column of the switch's Bridge record, since
there is no use for MAC learning when all packets are
flooded.)
When learning happens on a non-bond port, if the
packet is a gratuitous ARP, the entry is marked as
ARP-locked. The lock expires after 5 seconds. (See the
``SLB Bonding'' section in the OVS bonding document
for more information and a rationale.)
9. IP multicast path: If multicast snooping is enabled on
the bridge, and the packet is an Ethernet multicast
but not an Ethernet broadcast, and the packet is an IP
packet, then the packet takes a special processing
path. This path is not yet documented here.
10. Output port set: Search the MAC learning table for the
port corresponding to the packet's Ethernet
destination and VLAN. If the search finds an entry,
the output port set is just the learned port.
Otherwise (including the case where the packet is an
Ethernet multicast or in flood_vlans), the output port
set is all of the ports in the bridge that belong to
the packet's VLAN, except for any ports that were
disabled for flooding via OpenFlow or that are
configured in a Mirror record as a mirror destination
port.
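As an illustration of ingress stage 4 (VLAN input processing), the
per-mode rules can be written as a small Python function. This is a
sketch of the rules as documented here, not the Open vSwitch code.
The function name vlan_input and its parameters are hypothetical,
and the handling of a dot1q-tunnel packet that has no 802.1Q header
is an assumption, since the text above does not spell it out.

```python
def vlan_input(mode, pkt_vid, tag=None, trunks=(), cvlans=()):
    """Return the packet's VLAN, or None if the packet is dropped.

    mode     vlan-mode of the input port's Port record
    pkt_vid  VLAN ID from the 802.1Q header, or None if absent
    tag, trunks, cvlans  the Port record columns of the same names
    """
    if mode == "access":
        # An 802.1Q header with a nonzero VLAN ID is not allowed.
        return None if pkt_vid else tag
    if mode == "dot1q-tunnel":
        # A header, if present, names the customer VLAN.
        if pkt_vid is not None and cvlans and pkt_vid not in cvlans:
            return None
        return tag  # the QinQ service VLAN (assumed if untagged too)
    if mode == "trunk":
        vlan = 0 if pkt_vid is None else pkt_vid
    elif mode in ("native-tagged", "native-untagged"):
        vlan = tag if pkt_vid is None else pkt_vid
    else:
        raise ValueError("unknown vlan-mode: %s" % mode)
    # An empty trunks column means that all VLANs are valid.
    return vlan if not trunks or vlan in trunks else None


trunk_vlan = vlan_input("trunk", 10, trunks=(10, 20))
```

For example, vlan_input("access", 7, tag=5) returns None (a drop)
because the packet carries a nonzero VLAN ID on an access port.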
The following egress stages execute once for each element in the
set of output ports. They execute (conceptually) in parallel, so
that a decision or action taken for a given output port has no
effect on those for another one:
1. Drop loopback: If the output port is the same as the
input port, drop the packet.
2. VLAN output processing: This stage adjusts the packet
to represent the VLAN in the correct way for the
output port. Its behavior varies based on the
vlan-mode in the output port's Port record:
trunk
native-tagged
native-untagged
If the packet is in VLAN 0 (for native-untagged,
if the packet is in the native VLAN), drops any
802.1Q header. Otherwise, ensures that there is
an 802.1Q header designating the VLAN.
access
Remove any 802.1Q header that was present.
dot1q-tunnel
Ensures that the packet has an outer 802.1Q
header with the QinQ Ethertype and the specified
configured tag, and an inner 802.1Q header with
the packet's VLAN.
3. VLAN priority tag processing: If VLAN output
processing discarded the 802.1Q headers, but priority
tags are enabled with other_config:priority-tags in
the output port's Port record, then a priority-only
tag is added (perhaps only if the priority would be
nonzero, depending on the configuration).
4. Bond member choice: If the output port is a bond, the
code chooses a particular member. This step is skipped
for non-bonded ports.
If the bond is configured to use LACP, but LACP
negotiation is incomplete, then normally the packet is
dropped. The exception is that if fallback to
active-backup mode is enabled, the egress pipeline
continues choosing a bond member as if active-backup
mode were in use.
For active-backup mode, the output member is the
active member. Other modes hash appropriate header
fields and use the hash value to choose one of the
enabled members.
5. Output: The pipeline sends the packet to the output
port.
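Egress stage 2 (VLAN output processing) can be sketched in the same
spirit. The Python function below is an illustrative model only;
the name vlan_output and the list-of-VLAN-IDs return value are
invented for the example. Treating VLAN 0 as untagged on a
native-untagged port is an assumption, since the description above
can be read either way, and priority tags (stage 3) are not modeled.

```python
def vlan_output(mode, vlan, tag=None):
    """Return the 802.1Q VLAN IDs on the egress packet, outermost first.

    mode  vlan-mode of the output port's Port record
    vlan  the VLAN that the packet is in
    tag   the tag column of the output port's Port record
    """
    if mode == "access":
        return []                # any 802.1Q header is removed
    if mode == "dot1q-tunnel":
        return [tag, vlan]       # outer service tag, inner packet VLAN
    if mode not in ("trunk", "native-tagged", "native-untagged"):
        raise ValueError("unknown vlan-mode: %s" % mode)
    # Assumption: VLAN 0 is sent untagged in all three modes, and
    # native-untagged also sends the native VLAN untagged.
    untagged = vlan == 0 or (mode == "native-untagged" and vlan == tag)
    return [] if untagged else [vlan]


headers = vlan_output("native-untagged", 5, tag=5)
```

With these rules a packet in the native VLAN leaves a
native-untagged port without a header but leaves a native-tagged
port with one.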
The controller action
Syntax:
controller
controller:max_len
controller(key[=value], ...)
Sends the packet and its metadata to an OpenFlow controller or
controllers encapsulated in an OpenFlow ``packet-in'' message.
The supported options are:
max_len=max_len
Limit to max_len the number of bytes of the packet
to send in the ``packet-in.'' A max_len of 0
prevents any of the packet from being sent (thus,
only metadata is included). By default, the entire
packet is sent, equivalent to a max_len of 65535.
reason=reason
Specify reason as the reason for sending the
message in the ``packet-in.'' The supported reasons
are no_match, action, invalid_ttl, action_set,
group, and packet_out. The default reason is
action.
id=controller_id
Specify controller_id, a 16-bit integer, as the
connection ID of the OpenFlow controller or
controllers to which the ``packet-in'' message
should be sent. The default is zero. Zero is also
the default connection ID for each controller
connection, and a given controller connection will
only have a nonzero connection ID if its controller
uses the NXT_SET_CONTROLLER_ID Open vSwitch
extension to OpenFlow.
userdata=hh...
Supplies the bytes represented as hex digits hh as
additional data to the controller in the
``packet-in'' message. Pairs of hex digits may be
separated by periods for readability.
pause
Causes the switch to freeze the packet's trip
through Open vSwitch flow tables and serialize
that state into the packet-in message as a
``continuation,'' an additional property in the
NXT_PACKET_IN2 message. The controller can later
send the continuation back to the switch in an
NXT_RESUME message, which will restart the packet's
traversal from the point where it was interrupted.
This permits an OpenFlow controller to interpose on
a packet midway through processing in Open vSwitch.
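To show how the options combine in the controller(key[=value], ...)
form, here is a small Python helper that assembles an action string
from the options listed above. It is a sketch for illustration, not
part of Open vSwitch; the function name controller_action and its
keyword arguments are hypothetical, and the 0 to 65535 bound on
max_len is inferred from the stated default.

```python
REASONS = {"no_match", "action", "invalid_ttl",
           "action_set", "group", "packet_out"}


def controller_action(max_len=None, reason=None, controller_id=None,
                      userdata=None, pause=False):
    """Build a controller action string from the documented options."""
    opts = []
    if max_len is not None:
        if not 0 <= max_len <= 65535:
            raise ValueError("max_len out of range")
        opts.append("max_len=%d" % max_len)
    if reason is not None:
        if reason not in REASONS:
            raise ValueError("unsupported reason: %s" % reason)
        opts.append("reason=%s" % reason)
    if controller_id is not None:
        if not 0 <= controller_id <= 0xffff:
            raise ValueError("id must fit in 16 bits")
        opts.append("id=%d" % controller_id)
    if userdata:
        # Pairs of hex digits, separated by periods for readability.
        opts.append("userdata=" + ".".join("%02x" % b for b in userdata))
    if pause:
        opts.append("pause")
    return "controller(%s)" % ",".join(opts) if opts else "controller"


action = controller_action(max_len=128, reason="no_match")
```

The resulting string, here controller(max_len=128,reason=no_match),
is meant to be placed in the actions= part of a flow.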
Conformance:
All versions of OpenFlow and Open vSwitch support the controller
action and its max_len option. The userdata and pause options
require the Open vSwitch NXAST_CONTROLLER2 extension action added
in Open vSwitch 2.6. In the absence of these options, the reason
(other than reason=action) and controller_id (other than
controller_id=0) options require the Open vSwitch
NXAST_CONTROLLER extension action added in Open vSwitch 1.6.
The enqueue action
Syntax:
enqueue(port,queue)
enqueue:port:queue
Enqueues the packet on the specified queue within port port.
port must be an OpenFlow port number or name as described under
``Port Specifications'' above. port may be in_port or local, but
the other standard OpenFlow ports are not allowed.
queue must be a number between 0 and 4294967294 (0xfffffffe),
inclusive. The number of actually supported queues depends on the
switch. Some OpenFlow implementations do not support queuing at
all. In Open vSwitch, the supported queues vary depending on the
operating system, datapath, and hardware in use. Use the QoS and
Queue tables in the Open vSwitch database to configure queuing on
individual OpenFlow ports (see ovs-vswitchd.conf.db(5) for more
information).
Conformance:
Only OpenFlow 1.0 supports enqueue. OpenFlow 1.1 added the
set_queue action to use in its place along with output.
Open vSwitch translates enqueue to a sequence of three actions in
OpenFlow 1.1 or later: set_queue:queue, output:port, pop_queue.
This is equivalent in behavior as long as the flow table does not
otherwise use set_queue, but it relies on the pop_queue Open
vSwitch extension action.
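The translation described under Conformance is mechanical, as the
following Python sketch shows. The helper name translate_enqueue is
made up for illustration; it only builds the action strings and
does not talk to a switch.

```python
def translate_enqueue(port, queue):
    """Return the OpenFlow 1.1+ action sequence for enqueue(port,queue)."""
    if not 0 <= queue <= 0xfffffffe:
        raise ValueError("queue must be between 0 and 4294967294")
    # set_queue selects the queue, output sends the packet, and the
    # pop_queue extension restores the previous queue setting.
    return ["set_queue:%d" % queue, "output:%s" % port, "pop_queue"]


actions = ",".join(translate_enqueue(3, 7))
```

For enqueue(3,7) this yields set_queue:7,output:3,pop_queue.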
The bundle and bundle_load actions
Syntax:
bundle(fields, basis, algorithm, ofport, members:port...)
bundle_load(fields, basis, algorithm, ofport, dst, members:port...)
These actions choose a port (a ``member'') from a comma-separated
OpenFlow port list. After selecting the port, bundle outputs to
it, whereas bundle_load writes its port number to dst, which must
be a 16-bit or wider field or subfield in the syntax described
under ``Field Specifications'' above.
These actions hash a set of fields using basis as a universal
hash parameter, then apply the bundle link selection algorithm to
choose a port.
fields must be one of the following. For the options with
``symmetric'' in the name, reversing source and destination
addresses yields the same hash:
eth_src
Ethernet source address.
nw_src
IPv4 or IPv6 source address.
nw_dst
IPv4 or IPv6 destination address.
symmetric_l4
Ethernet source and destination, Ethernet type,
VLAN ID or IDs (if any), IPv4 or IPv6 source and
destination, IP protocol, TCP or SCTP (but not UDP)
source and destination.
symmetric_l3l4
IPv4 or IPv6 source and destination, IP protocol,
TCP or SCTP (but not UDP) source and destination.
symmetric_l3l4+udp
Like symmetric_l3l4, but includes UDP ports.
algorithm must be one of the following:
active_backup
Chooses the first live port listed in members.
hrw (Highest Random Weight)
Computes the following, considering only the live
ports in members:
for i in [1,n_members]:
    weights[i] = hash(flow, i)
member = { i such that weights[i] >= weights[j] for all j != i }
This algorithm is specified by RFC 2992.
The algorithms take port liveness into account when selecting
members. The definition of whether a port is live is subject to
change. It currently takes into account carrier status and link
monitoring protocols such as BFD and CFM. If none of the members
is live, bundle does not output the packet and bundle_load stores
OFPP_NONE (65535) in the output field.
Example: bundle(eth_src,0,hrw,ofport,members:4,8) uses an
Ethernet source hash with basis 0, to select between OpenFlow
ports 4 and 8 using the Highest Random Weight algorithm.
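The hrw pseudocode above can be turned into a runnable Python
sketch. This is not the hash that Open vSwitch uses: SHA-256 stands
in for the unspecified hash function, the weight is keyed on the
port number rather than the member index, and the names hrw_select,
flow_key, and live are all assumptions made for the example.

```python
import hashlib


def hrw_select(flow_key, members, live, basis=0):
    """Return the live member with the highest weight, or None.

    flow_key  bytes of the hashed fields (e.g. the Ethernet source)
    members   OpenFlow port numbers listed in the action
    live      set of ports currently considered live
    """
    def weight(port):
        data = basis.to_bytes(4, "big") + port.to_bytes(4, "big") + flow_key
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    candidates = [p for p in members if p in live]
    # With no live member, nothing is selected (bundle would not
    # output; bundle_load would store OFPP_NONE).
    return max(candidates, key=weight, default=None)


eth_src = bytes.fromhex("001122334455")
choice = hrw_select(eth_src, [4, 8], live={4, 8})
```

A useful property of this scheme is that a flow only moves to
another member when its own chosen member stops being live; the
failure of any other member leaves its choice unchanged.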
Conformance:
Open vSwitch 1.2 introduced the bundle and bundle_load OpenFlow
extension actions.
The group action
Syntax:
group:group
Outputs the packet to the OpenFlow group group, which must be a
number in the range 0 to 4294967040 (0xffffff00). The group must
exist or Open vSwitch will refuse to add the flow. When a group
is deleted, Open vSwitch also deletes all of the flows that
output to it.
Groups contain action sets, whose semantics are described above
in the section ``Action Sets''. The semantics of action sets can
be surprising to users who expect action list semantics, since
action sets reorder and sometimes ignore actions.
A group action usually executes the action set or sets in one or
more group buckets. Open vSwitch saves the packet and metadata
before it executes each bucket, and then restores it afterward.
Thus, when a group executes more than one bucket, each bucket
executes on the same packet and metadata. Moreover, regardless of
the number of buckets executed, the packet and metadata are the
same before and after executing the group.
Sometimes saving and restoring the packet and metadata can be
undesirable. In these situations, workarounds are possible. For
example, consider a pipeline design in which a select group
bucket is to communicate to a later stage of processing a value
based on which bucket was selected. An obvious design would be
for the bucket to communicate the value via set_field on a
register. This does not work because registers are part of the
metadata that group saves and restores. The following alternative
bucket designs do work:
• Recursively invoke the rest of the pipeline with
resubmit.
• Use resubmit into a table that uses push to put the
value on the stack for the caller to pop off. This
works because group preserves only packet data and
metadata, not the stack.
(This design requires indirection through resubmit
because action sets may not contain push or pop
actions.)
An exit action within a group bucket terminates only execution of
that bucket, not other buckets or the overall pipeline.
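The save-and-restore behavior, and why the stack workaround works,
can be seen in a toy Python model. This is a deliberate
simplification rather than Open vSwitch code: buckets are plain
Python callables, the name run_group is hypothetical, and the
bucket pushes directly onto the stack, whereas a real bucket would
have to do so indirectly through resubmit.

```python
import copy


def run_group(state, stack, buckets):
    """Run each bucket against the same packet state.

    state    dict standing in for the packet and its metadata
    stack    list standing in for the OVS stack (not saved)
    buckets  callables taking (state, stack)
    """
    for bucket in buckets:
        saved = copy.deepcopy(state)   # save packet and metadata
        bucket(state, stack)
        state.clear()                  # restore them afterward
        state.update(saved)
    return state


def bucket(state, stack):
    state["reg0"] = 7     # lost when the group restores metadata
    stack.append(7)       # survives, because the stack is not restored


state, stack = {"reg0": 0}, []
run_group(state, stack, [bucket])
```

After the group runs, state["reg0"] is back to 0 while the stack
holds 7, which a later stage could pop.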
Conformance:
OpenFlow 1.1 introduced group. Open vSwitch 2.6 and later also
supports group as an extension to OpenFlow 1.0.