Manual:Layer2 misconfiguration

From MikroTik Wiki
[[Category:Manual]]
[[Category:Bridging and switching]]
[[Category:Case Studies]]

Revision as of 14:56, 17 April 2018

Applies to RouterOS: v6.41 +

Introduction

Certain configurations are known to have major design flaws and should be avoided whenever possible. A misconfigured Layer2 network can cause hard-to-detect network errors, random performance drops, unreachable network segments, malfunctioning network services or a complete network failure. This page describes some common, and some less common, configurations that will cause issues in your network.

Bridges on a single switch chip

Consider the following scenario: you have a device with a built-in switch chip and you need to isolate certain ports from each other. For this reason you have created multiple bridges and enabled hardware offloading on them. Since each bridge is a separate Layer2 domain, Layer2 frames are not forwarded between the bridges, so the ports in one bridge are isolated from the ports in another bridge.

Configuration

/interface bridge
add name=bridge1
add name=bridge2
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=ether2
add bridge=bridge2 interface=ether3
add bridge=bridge2 interface=ether4

Problem

After a simple performance test you might notice that one bridge is capable of forwarding traffic at wire-speed, while the second, third and any further bridge cannot forward as much data as the first bridge. Another symptom might be high latency for packets that need to be routed. After a quick inspection you might notice that the CPU is always at full load. This is because hardware offloading is active on only one bridge, and traffic on the other bridges is forwarded by the CPU. By checking the hardware offloading status you will notice that only the ports of one bridge have the H flag:

[admin@MikroTik] > /interface bridge port print
Flags: X - disabled, I - inactive, D - dynamic, H - hw-offload 
 #     INTERFACE                                 BRIDGE                                 HW
 0   H ether1                                    bridge1                                yes
 1   H ether2                                    bridge1                                yes
 2     ether3                                    bridge2                                yes
 3     ether4                                    bridge2                                yes

Only one bridge has the hardware offloading flag because the device does not support port isolation. If port isolation is not supported, then only one bridge is able to offload traffic to the switch chip.

Solution

Not all devices support port isolation. Currently only CRS1xx/CRS2xx series devices support it, and only 7 isolated and hardware offloaded bridges are supported at the same time. Other devices have to use the CPU to forward packets on the other bridges. This is usually a hardware limitation and a different device might be required. The bridge split horizon parameter is a software feature that disables hardware offloading, and bridge filter rules require all packets to be forwarded to the CPU, which also requires hardware offloading to be disabled. You can control which bridge is hardware offloaded by setting hw=yes on its ports and hw=no on the ports of the other bridges, for example:

/interface bridge port set [find where bridge=bridge1] hw=no
/interface bridge port set [find where bridge=bridge2] hw=yes

Sometimes it is possible to restructure a network topology to use VLANs, which is the proper way to isolate Layer2 networks.
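
To confirm that the CPU, and not the switch chip, is forwarding traffic on a bridge, the CPU load can be inspected while a throughput test is running. The following is a minimal sketch that assumes the bridge1/bridge2 setup from above; the exact process names shown by the profiler depend on the device and the RouterOS version.

```routeros
# Overall CPU load of the device
/system resource print
# Per-process CPU usage; forwarding done in software typically shows up here
/tool profile
# Ports with the H flag are offloaded to the switch chip
/interface bridge port print
```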

Packet flow with hardware offloading and MAC learning

Consider the following scenario: you have set up a bridge and enabled hardware offloading in order to maximize the throughput of your device, so the device is working as a switch. However, you want to use a packet analyser, sniff some of the packets that are being forwarded over the bridge, or use Firewall rules for statistics.

Configuration

/interface bridge
add name=bridge
/interface bridge port
add bridge=bridge hw=yes interface=ether1
add bridge=bridge hw=yes interface=ether2

Problem

When hardware offloading is enabled, all packets are processed by the built-in switch chip. All MikroTik devices with a built-in switch chip are capable of MAC learning, which is what makes a switch a smart switch: it does not flood traffic to ports that are not supposed to receive it. Because of MAC learning the switch chip knows behind which port a certain MAC address is located and sends packets destined to that address directly to that port instead of flooding them to all ports. If the destination MAC address is not known, the packet is flooded to all ports; broadcast packets are always flooded to all ports.

Devices with a switch chip have a port called the switch-cpu port, on which packets destined to the CPU are received. Because of this behaviour, packets destined to a learned MAC address are not sent to the CPU and are not visible with /tool sniffer. This can be misleading, since the traffic is not visible while the rx-bytes/tx-bytes counters keep increasing. The behaviour is similar to FastPath.
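
Which MAC addresses have been learned, and behind which port, can be inspected in the host tables. A minimal sketch, assuming the bridge from the configuration above and a device whose switch chip exposes a host table (not every switch chip does):

```routeros
# MAC addresses known to the bridge
/interface bridge host print
# MAC addresses learned by the switch chip itself
/interface ethernet switch host print
```

Traffic towards an address listed in these tables is forwarded directly to the listed port, which is why it does not reach the CPU.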

Solution

Packets with a destination MAC address that has been learned are not sent to the CPU, since they are not flooded to all ports. If you do need to send certain packets to the CPU, for example for a packet analyser or for Firewall, you can copy or redirect the packets to the CPU by using ACL rules. Below is an example of how to send a copy of packets that are destined to 4C:5E:0C:4D:12:4B:

/interface ethernet switch rule
add copy-to-cpu=yes dst-mac-address=4C:5E:0C:4D:12:4B/FF:FF:FF:FF:FF:FF ports=ether1 switch=switch1

Note: A packet that is sent to the CPU must be processed by the CPU, which increases the CPU load.
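
Where the packets have to be handled by the CPU instead of merely copied, the same kind of rule can redirect them. This is a sketch under the same assumptions as the rule above: switch1, ether1 and the MAC address are taken from that example, and support for ACL rules depends on the switch chip.

```routeros
/interface ethernet switch rule
# Redirect instead of copy: the destination of matching packets is changed to the CPU
add redirect-to-cpu=yes dst-mac-address=4C:5E:0C:4D:12:4B/FF:FF:FF:FF:FF:FF ports=ether1 switch=switch1
# The matching packets should now be visible to the sniffer
/tool sniffer quick interface=ether1
```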


LAG interfaces and load balancing

Consider the following scenario: you have created a LAG interface to increase the total bandwidth between 2 network nodes, usually switches. To make sure that the LAG interface is working properly, you have attached two servers that transfer data; the well known network performance measurement tool Iperf (https://en.wikipedia.org/wiki/Iperf) is most commonly used to test such setups. For example, you might have made a LAG interface out of two Gigabit Ethernet ports, which gives you a 2Gbps interface, while the servers are connected using a 10Gbps interface, for example, SFP+.

[Figure: LACP topology]

Configuration

The following configuration is relevant to SW1 and SW2:

/interface bonding
add mode=802.3ad name=bond1 slaves=ether1,ether2
/interface bridge port
add bridge=bridge interface=bond1
add bridge=bridge interface=sfp-sfpplus1

Problem

After initial tests you immediately notice that your network throughput never exceeds 1Gbps, even though the CPU load is low on the servers as well as on the network nodes (switches in this case). The reason is that LACP (802.3ad) uses a transmit hash policy to determine whether traffic can be balanced over multiple LAG members. A LAG interface does not create a 2Gbps interface, but rather an interface that can balance traffic over multiple slave interfaces whenever that is possible.

For each packet a transmit hash is generated, which determines through which LAG member the packet will be sent. This is needed to avoid packets arriving out of order. The transmit hash policy can usually be chosen between Layer2 (MAC), Layer3 (IP) and Layer4 (Port); in RouterOS it is selected with the transmit-hash-policy parameter. In this case the transmit hash is always the same, since you are sending packets to the same destination MAC address and the same IP address, and Iperf uses the same port as well. The same transmit hash is generated for all packets and load balancing between LAG members is not possible.

Note that packets will not always get balanced over LAG members even when the destination is different, because the standardized transmit hash policy can generate the same transmit hash for different destinations. For example, 192.168.0.1/192.168.0.2 will get balanced, but 192.168.0.2/192.168.0.4 will NOT get balanced if the layer-2-and-3 transmit hash policy is used and the destination MAC address is the same.

Solution

Choose the proper transmit hash policy and test your network's throughput properly. The simplest way to test such setups is to use multiple destinations: instead of sending data to just one server, send data to multiple servers. This generates different transmit hashes for packets to different destinations and makes load balancing across LAG members possible. For some setups you might want to change the bonding interface mode to increase the total throughput; for UDP traffic the balance-rr mode might be sufficient, but it can cause issues for TCP traffic. You can read more about selecting the right mode for your setup here.
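
A minimal sketch of changing the hash policy and checking how traffic is distributed, assuming the bond1 interface with ether1 and ether2 from the configuration above. Whether layer-3-and-4 is appropriate depends on the traffic and on the device at the other end of the LAG, which applies its own policy for the opposite direction.

```routeros
# Include IP addresses and ports in the transmit hash
/interface bonding set bond1 transmit-hash-policy=layer-3-and-4
# Watch both LAG members while several parallel streams are running
/interface monitor-traffic ether1,ether2
```

If only one member carries traffic during the test, all streams still produce the same transmit hash.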

VLAN interface on a slave interface

Consider the following scenario: you have created a bridge and you want a DHCP Server to give out IP addresses only to traffic tagged with a certain VLAN. For this reason you have created a VLAN interface, specified a VLAN ID and created a DHCP Server on it, but for some reason it is not working properly.

Configuration

/interface bridge
add name=bridge
/interface bridge port
add interface=ether1 bridge=bridge
add interface=ether2 bridge=bridge
/interface vlan
add name=VLAN99 interface=ether1 vlan-id=99
/ip pool
add name=VLAN99_POOL range=192.168.99.100-192.168.99.200
/ip address add address=192.168.99.1/24 interface=VLAN99
/ip dhcp-server
add interface=VLAN99 address-pool=VLAN99_POOL disabled=no
/ip dhcp-server network
add address=192.168.99.0/24 gateway=192.168.99.1 dns-server=192.168.99.1

Problem

When you add an interface to a bridge, the bridge becomes the master interface and all bridge ports become slave ports. This means that all traffic received on a bridge port is captured by the bridge interface and forwarded to the CPU through the bridge interface instead of the physical interface. As a result, a VLAN interface that is created on a slave interface will never capture any traffic, since the traffic is immediately forwarded to the master interface before any packet processing is done. The usual side effect is that some DHCP clients receive IP addresses and some don't.

Solution

Change the interface on which the VLAN interface listens for traffic to the master interface:

/interface vlan set VLAN99 interface=bridge
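
Whether an interface is a slave can be checked before a VLAN interface is attached to it. A minimal sketch, assuming the interface names from the configuration above:

```routeros
# Bridge ports are listed with the S (slave) flag
/interface print
# After moving VLAN99 to the bridge, leases should start to appear here
/ip dhcp-server lease print
```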

VLAN on a bridge in a bridge

Consider the following scenario: you have a set of interfaces (they don't have to be physical interfaces) and you want all of them to be in the same Layer2 segment. The solution is to add them to a single bridge, but you also require that all traffic from one port is tagged into a certain VLAN. This can be done by creating a VLAN interface on top of the bridge interface and by creating a separate bridge that contains this newly created VLAN interface and the interface that is supposed to send out tagged traffic. The network diagram can be found below:

[Figure: VLAN on bridge in bridge topology]

Configuration

/interface bridge
add name=bridge1
add name=bridge2
/interface vlan
add interface=bridge1 name=VLAN vlan-id=99
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=ether2
add bridge=bridge2 interface=VLAN
add bridge=bridge2 interface=ether3

Problem

Packets coming from ether3 will be sent out tagged and traffic won't be flooded through ether1 and ether2, but if another port is added to bridge2, then traffic will be flooded. A similar issue arises when traffic needs to be sent from ether1 to ether3, since MAC learning is only possible between bridge ports and not between interfaces that are created on top of the bridge interface. As a result, unicast traffic will be flooded to ether2 and ether3. If a device behind ether3 is using (R)STP, then ether1 and ether2 will send out tagged BPDUs. Because of the broken MAC learning functionality and broken (R)STP, this configuration must be avoided. It is also known that in some setups this kind of configuration can prevent you from connecting to the device by using MAC telnet.

Solution

Use bridge VLAN filtering. The proper way to tag traffic is to assign a VLAN ID whenever traffic enters a bridge. This behaviour can easily be achieved by specifying a PVID value for a bridge port and specifying which ports are tagged (trunk) ports and which are untagged (access) ports. Below is an example of how such a setup should have been configured:

/interface bridge
add name=bridge vlan-filtering=yes
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2
add bridge=bridge interface=ether3 pvid=99
/interface bridge vlan
add bridge=bridge tagged=ether1 untagged=ether3 vlan-ids=99

Warning: Enabling vlan-filtering filters out traffic destined to the CPU. Before enabling VLAN filtering, make sure that you have set up a Management port.
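
One possible way to keep management access is sketched below, under the assumption that the device is managed over VLAN 99 arriving tagged on ether1, and continuing the configuration above. The interface name MGMT and the address 192.168.99.1/24 are hypothetical.

```routeros
# Let the bridge itself (the CPU port) be a tagged member of VLAN 99
/interface bridge vlan
set [find vlan-ids=99] tagged=bridge,ether1
# Management interface and address on top of the bridge
/interface vlan
add interface=bridge name=MGMT vlan-id=99
/ip address
add address=192.168.99.1/24 interface=MGMT
```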


VLAN in bridge with a physical interface

This case is very similar to VLAN on a bridge in a bridge. There are multiple possible scenarios where this could have been used; the most popular use case is when you want to send out tagged traffic through a physical interface. In such a setup you want one interface to receive only certain tagged traffic and to send out this traffic as tagged through a physical interface (a simplified trunk/access port setup) by using just VLAN interfaces and a bridge.

Configuration

/interface vlan
add interface=ether1 name=VLAN99 vlan-id=99
/interface bridge
add name=bridge
/interface bridge port
add interface=ether2 bridge=bridge
add interface=VLAN99 bridge=bridge

Problem

This setup will work in most cases, but it violates the IEEE 802.1W standard when (R)STP is used. If this is the only device in your Layer2 domain, then this should not cause problems, but problems can arise when switches from other vendors are present. The reason is that (R)STP is enabled on a bridge interface by default, and BPDUs coming from ether1 will be sent out tagged, since everything sent into ether1 will be sent out through ether2 as tagged traffic. Not all switches can understand tagged BPDUs.

Take precautions with this configuration in more complex networks where there are multiple network topologies for certain (groups of) VLANs; this is relevant to MSTP and PVSTP(+) with mixed vendor devices. In a ring-like topology with multiple network topologies for certain VLANs, one port of the switch will be blocked, but in MSTP and PVSTP(+) a path can be opened for a certain VLAN. In such a situation, devices that don't support PVSTP(+) may untag the BPDUs and forward them. As a result, the switch will receive its own packet, trigger a loop detection and block a port. This can happen with other protocols as well, but (R)STP is the most common case.

If a switch is using a BPDU guard function, then this type of configuration can trigger it and cause a port to be blocked by STP. It has been reported that this type of configuration can prevent traffic from being forwarded over certain bridge ports over time when using 6.41 or later.

Solution

To avoid compatibility issues you should use bridge VLAN filtering. Below is an example of how the same traffic tagging effect can be achieved with a bridge VLAN filtering configuration:

/interface bridge
add name=bridge vlan-filtering=yes
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2 pvid=99
/interface bridge vlan
add bridge=bridge tagged=ether1 untagged=ether2 vlan-ids=99

Warning: Enabling vlan-filtering filters out traffic destined to the CPU. Before enabling VLAN filtering, make sure that you have set up a Management port.


Bridged VLAN on physical interfaces

This case is very similar to VLAN on a bridge in a bridge. Consider the following scenario: you have a couple of switches in your network, you are using VLANs to isolate certain Layer2 domains, and these switches are connected to a router that assigns addresses and routes the traffic to the world. For redundancy you connect all switches directly to the router and enable RSTP. To be able to set up a DHCP Server, you decide to create a VLAN interface for each VLAN on each physical interface that is connected to a switch and to add these VLAN interfaces to a bridge. The network diagram can be found below:

[Figure: Bridged VLANs topology]

Configuration

Only the router part is relevant to this case; the switch configuration doesn't really matter as long as the ports are switched. The router configuration can be found below:

/interface bridge
add name=bridge10
add name=bridge20
/interface vlan
add interface=ether1 name=ether1_v10 vlan-id=10
add interface=ether1 name=ether1_v20 vlan-id=20
add interface=ether2 name=ether2_v10 vlan-id=10
add interface=ether2 name=ether2_v20 vlan-id=20
/interface bridge port
add bridge=bridge10 interface=ether1_v10
add bridge=bridge10 interface=ether2_v10
add bridge=bridge20 interface=ether1_v20
add bridge=bridge20 interface=ether2_v20

Problem

You might notice that the network has some weird delays or is even unresponsive. You might also notice that a loop is detected (packet received with own MAC address) and that some traffic is being generated out of nowhere. The problem occurs because a broadcast packet coming from either one of the VLAN interfaces created on the router is sent out through the physical interface, forwarded through a switch and received back on a different physical interface. In this case, broadcast packets sent out of ether1_v10 are received on ether2 and captured by ether2_v10, which is bridged with ether1_v10, so the packets are forwarded along the same path again (loop).

(R)STP might not always detect this loop, since (R)STP is not aware of any VLANs: a loop does not exist with untagged traffic, but exists with tagged traffic. In this scenario the loop is quite obvious, but in more complex setups the network design flaw is not always easy to detect. Sometimes this flaw can go unnoticed for a very long time if your network does not use broadcast traffic. Usually the Neighbor Discovery Protocol broadcasts packets from the VLAN interface and triggers a loop detection in such a setup. Sometimes it is useful to capture the packet that triggered the loop detection, which can be done by using the sniffer and analysing the packet capture file:

/tool sniffer
set filter-mac-address=4C:5E:0C:4D:12:44/FF:FF:FF:FF:FF:FF \
filter-interface=ether1 filter-direction=rx file-name=loop_packet.pcap

Or a more convenient way using logging:

/interface bridge filter
add action=log chain=forward src-mac-address=4C:5E:0C:4D:12:44/FF:FF:FF:FF:FF:FF
add action=log chain=input src-mac-address=4C:5E:0C:4D:12:44/FF:FF:FF:FF:FF:FF
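
A prefix makes the logged packets easier to find afterwards. This is a sketch assuming the same MAC address as above; the prefix text LOOP is arbitrary.

```routeros
/interface bridge filter
add action=log chain=forward log-prefix="LOOP" src-mac-address=4C:5E:0C:4D:12:44/FF:FF:FF:FF:FF:FF
# Show only the log entries that carry the prefix
/log print where message~"LOOP"
```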

Solution

A partial solution is to use Multiple Spanning Tree Protocol across the whole network, but bridge VLAN filtering is required in order to make all bridges compatible with IEEE 802.1W and IEEE 802.1Q.

/interface bridge
add name=bridge vlan-filtering=yes
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2
/interface bridge vlan
add bridge=bridge tagged=ether1,ether2,bridge vlan-ids=10,20
/interface vlan
add name=vlan10 interface=bridge vlan-id=10
add name=vlan20 interface=bridge vlan-id=20

Even though rewriting your configuration to use bridge VLAN filtering will fix loop occurrence because of broadcast traffic that is coming from a VLAN interface, there still might exist loops with tagged unknown unicast or broadcast traffic. To make sure that loops don't exist with tagged and untagged traffic you should consider implementing MSTP in your network instead of (R)STP.
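
A rough sketch of what enabling MSTP on the bridge above could look like. The region name, the revision and the instance identifiers are assumptions chosen for illustration; the region settings and the VLAN mapping have to match on every bridge in the MSTP region.

```routeros
/interface bridge
set bridge protocol-mode=mstp region-name=Region1 region-revision=1
/interface bridge msti
# One spanning tree instance per VLAN
add bridge=bridge identifier=1 vlan-mapping=10
add bridge=bridge identifier=2 vlan-mapping=20
```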

Warning: Enabling vlan-filtering filters out traffic destined to the CPU. Before enabling VLAN filtering, make sure that you have set up a Management port.


Bridge VLAN filtering on non-CRS3xx

Consider the following scenario: you found out about the new bridge VLAN filtering feature and decided to change the configuration on your device. You have a very simple trunk/access port setup and you like the concept of bridge VLAN filtering.

Configuration

/interface bridge
add name=bridge vlan-filtering=yes
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2 pvid=20
add bridge=bridge interface=ether3 pvid=30
add bridge=bridge interface=ether4 pvid=40
/interface bridge vlan
add bridge=bridge tagged=ether1 untagged=ether2 vlan-ids=20
add bridge=bridge tagged=ether1 untagged=ether3 vlan-ids=30
add bridge=bridge tagged=ether1 untagged=ether4 vlan-ids=40

Problem

For example, you use this configuration on a CRS1xx/CRS2xx series device and notice that the CPU usage is very high. When running a performance test to check the network's throughput, you notice that the total throughput is only a fraction of the wire-speed performance that it should easily reach. The cause of the problem is that not all devices support bridge VLAN filtering at the hardware level. All devices can be configured with bridge VLAN filtering, but only a few of them are able to offload the traffic to the switch chip. If an improper configuration method is used on a device with a built-in switch chip, then the CPU is used to forward the traffic.

Solution

Before using bridge VLAN filtering, check whether your device supports it at the hardware level; a compatibility table can be found in the Bridge Hardware Offloading section. Each type of device currently requires a different configuration method. Below is a list of which configuration should be used on a device in order to benefit from hardware offloading:

MTU on master interface

Consider the following scenario: you have created a bridge, added a few interfaces to it and created a VLAN interface on top of the bridge interface, but you need to increase the MTU size on the VLAN interface in order to receive larger packets.

Configuration

/interface bridge
add name=bridge
/interface bridge port
add bridge=bridge interface=ether1
add bridge=bridge interface=ether2
/interface vlan
add interface=bridge name=VLAN99 vlan-id=99

Problem

As soon as you try to increase the MTU size on the VLAN interface, you receive an error that RouterOS Could not set MTU. This can happen when you try to set an MTU larger than the L2MTU. In this case you need to increase the L2MTU size on all slave interfaces, which will update the L2MTU size on the bridge interface. After this has been done, you will be able to set a larger MTU on the VLAN interface. The same principle applies to bonding interfaces. You can increase the MTU on interfaces like VLAN, MPLS, VPLS, Bonding and other interfaces only when all physical slave interfaces have a proper L2MTU set.

Solution

Increase the L2MTU on slave interfaces before changing the MTU on a master interface.

/interface ethernet
set ether1,ether2 l2mtu=9018
/interface vlan
set VLAN99 mtu=9000
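
The values that limit the MTU can be checked before changing anything. This is a sketch assuming the interfaces from the configuration above; note that the 4 byte VLAN tag counts towards the L2MTU, so a VLAN MTU of 9000 needs an L2MTU of at least 9004 on the underlying interfaces.

```routeros
# L2MTU is the current limit, MAX-L2MTU the highest value the hardware accepts
/interface print
# The bridge L2MTU follows the smallest L2MTU of its ports
/interface bridge print
```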

MTU inconsistency

Consider the following scenario: you have multiple devices in your network, most of them are used as a switch/bridge, and there are certain endpoints that are supposed to receive and process traffic. To decrease the overhead in your network, you have decided to increase the MTU size, so you set a larger MTU size on both endpoints, but you start to notice that some packets are being dropped.

[Figure: MTU inconsistency setup]

Configuration

In this case both endpoints can be any type of device; we will assume that they are both Linux servers that are supposed to transfer a large amount of data. In such a scenario you would probably have set something similar to this on ServerA and ServerB:

ip link set eth1 mtu 9000

And on your Switch you have probably set something similar to this:

/interface bridge
add name=bridge
/interface bridge port
add interface=ether1 bridge=bridge
add interface=ether2 bridge=bridge

Problem

This is a very simplified problem, but in larger networks it might not be easy to detect. For instance, ping might be working, since a generic ping packet is 70 bytes long (14 bytes for the Ethernet header, 20 bytes for the IPv4 header, 8 bytes for the ICMP header, 28 bytes for the ICMP payload), but data transfer might not work properly. The reason why some packets might not get forwarded is that MikroTik devices running RouterOS have the MTU set to 1500 by default and the L2MTU set to something around 1580 bytes (depends on the device), and the Ethernet interface silently drops anything that does not fit into the L2MTU size. Note that the L2MTU parameter is not relevant to x86 or CHR devices.

For a device that is only supposed to forward packets, there is no need to increase the MTU size; it is only required to increase the L2MTU size. RouterOS will not allow you to set an MTU size that is larger than the L2MTU size. If the packet has to be received on the interface and processed by the device rather than just forwarded, for example in case of routing, then both the L2MTU and the MTU size must be increased. You can, however, leave the MTU size on the interface at the default value if you are using only IP traffic (which supports packet fragmentation) and don't mind that packets are being fragmented. You can use the ping utility to make sure that all devices are able to forward jumbo frames:

/ping 192.168.88.1 size=9000 do-not-fragment

Remember that the L2MTU and MTU size need to be larger than or equal to the ping packet size on the device from which and to which you are sending a ping packet, since ping (ICMP) is IP traffic that is sent out from an interface over Layer3.
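
To find out roughly where large packets start being dropped, the ping can be repeated with several sizes. This is a minimal sketch assuming the target address from the example above; the list of sizes is arbitrary.

```routeros
# Try increasing packet sizes; the first size that times out marks the limit
:foreach s in={1500;4000;9000} do={
    :put "size=$s"
    /ping 192.168.88.1 size=$s do-not-fragment count=3
}
```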

Solution

Increase the L2MTU size on your Switch:

/interface ethernet
set ether1,ether2 l2mtu=9000

In case your traffic is encapsulated (VLAN, VPN, MPLS, VPLS or other), you might need to consider setting an even larger L2MTU size. In this scenario it is not needed to increase the MTU size for the reason described above.

Note: Full frame MTU is not the same as L2MTU. The L2MTU size does not include the Ethernet header (14 bytes) and the CRC checksum (FCS) field. The FCS field is stripped by the Ethernet driver, and RouterOS never shows these extra 4 bytes for any packet. For example, if you set MTU and L2MTU to 9000, then the full frame MTU is 9014 bytes long. This can also be observed when sniffing packets with /tool sniffer quick.

