ECMP load balancing with masquerade: Difference between revisions

From MikroTik Wiki
Megis (talk | contribs)
 
(12 intermediate revisions by 3 users not shown)
Latest revision as of 12:46, 5 April 2016

Introduction

This example is an improved (different) version of the round-robin load balancing example. It adds persistent user sessions, i.e. a particular user uses the same source IP address for all outgoing connections. Consider the following network layout:

File:LoadBalancing.jpg

Quick Start for the Impatient

Configuration export from the gateway router:

/ ip address
add address=192.168.0.1/24 network=192.168.0.0 broadcast=192.168.0.255 interface=Local 
add address=10.111.0.2/24 network=10.111.0.0 broadcast=10.111.0.255 interface=wlan1
add address=10.112.0.2/24 network=10.112.0.0 broadcast=10.112.0.255 interface=wlan2

/ ip route
add dst-address=0.0.0.0/0 gateway=10.111.0.1,10.112.0.1 check-gateway=ping 

/ ip firewall nat 
add chain=srcnat out-interface=wlan1 action=masquerade
add chain=srcnat out-interface=wlan2 action=masquerade

/ ip firewall mangle
add chain=input in-interface=wlan1 action=mark-connection new-connection-mark=wlan1_conn
add chain=input in-interface=wlan2 action=mark-connection new-connection-mark=wlan2_conn   
add chain=output connection-mark=wlan1_conn action=mark-routing new-routing-mark=to_wlan1     
add chain=output connection-mark=wlan2_conn action=mark-routing new-routing-mark=to_wlan2     

/ ip route
add dst-address=0.0.0.0/0 gateway=10.111.0.1 routing-mark=to_wlan1 
add dst-address=0.0.0.0/0 gateway=10.112.0.1 routing-mark=to_wlan2

Explanation

First we give a code snippet and then explain what it actually does.

IP Addresses

/ ip address 
add address=192.168.0.1/24 network=192.168.0.0 broadcast=192.168.0.255 interface=Local
add address=10.111.0.2/24 network=10.111.0.0 broadcast=10.111.0.255 interface=wlan1
add address=10.112.0.2/24 network=10.112.0.0 broadcast=10.112.0.255 interface=wlan2 

The router has two upstream (WAN) interfaces with the addresses of 10.111.0.2/24 and 10.112.0.2/24. The LAN interface has the name "Local" and IP address of 192.168.0.1/24.

NAT

/ ip firewall nat 
add chain=srcnat out-interface=wlan1 action=masquerade
add chain=srcnat out-interface=wlan2 action=masquerade

Because the routing decision has already been made, we only need rules that fix the source address of outgoing packets. A packet that leaves via wlan1 is NATed to 10.111.0.2; a packet that leaves via wlan2 is NATed to 10.112.0.2.

Routing

/ ip route 
add dst-address=0.0.0.0/0 gateway=10.111.0.1,10.112.0.1 check-gateway=ping 

This is a typical ECMP (Equal Cost Multi-Path) default route with check-gateway. ECMP provides "persistent per-connection load balancing", also described as "per-src-dst-address combination load balancing". As soon as one of the gateways becomes unreachable, check-gateway removes it from the gateway list, which gives you a "failover" effect.
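
To see how the router currently treats the two gateways, something like the following can be run from the terminal. This is only a sketch, assuming RouterOS v6 command syntax and the gateway addresses used in this example; the output format differs between versions.

```
# Ping each gateway by hand; check-gateway=ping depends on the same reachability
/ping 10.111.0.1 count=3
/ping 10.112.0.1 count=3
# Show the default route(s) in detail, including the state of each gateway
/ ip route print detail where dst-address=0.0.0.0/0
```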


You can also use links of unequal bandwidth, for example one link of 2 Mbps and another of 10 Mbps. List a gateway several times to give it more weight; the following command balances the load 1:5:

/ ip route 
add dst-address=0.0.0.0/0 gateway=10.111.0.1,10.112.0.1,10.112.0.1,10.112.0.1,10.112.0.1,10.112.0.1 check-gateway=ping 


Connections to the router itself

/ ip firewall mangle
add chain=input in-interface=wlan1 action=mark-connection new-connection-mark=wlan1_conn
add chain=input in-interface=wlan2 action=mark-connection new-connection-mark=wlan2_conn
add chain=output connection-mark=wlan1_conn action=mark-routing new-routing-mark=to_wlan1     
add chain=output connection-mark=wlan2_conn action=mark-routing new-routing-mark=to_wlan2     
/ ip route
add dst-address=0.0.0.0/0 gateway=10.111.0.1 routing-mark=to_wlan1 
add dst-address=0.0.0.0/0 gateway=10.112.0.1 routing-mark=to_wlan2

All multi-gateway setups share a common problem: reaching the router from the public network via one, the other, or both gateways. The explanation is simple: packets sent by the router itself use the same routing decision as packets that go through the router. So a reply to a packet that was received via wlan1 might be sent out and masqueraded via wlan2.

To avoid that, we need to policy route those connections.

Note: Additional mangle rules in the forward chain might be needed if connections from the Internet to the router are being NATted to the local network.
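
A minimal sketch of such forward-chain rules is shown below. It assumes that a dst-nat rule towards a LAN host already exists and that the to_wlan1 and to_wlan2 routes from this section are in place. The connection mark names wlan1_fwd and wlan2_fwd are made up for this illustration.

```
/ ip firewall mangle
# Mark new connections that are forwarded in from each WAN (for example after dst-nat)
add chain=forward in-interface=wlan1 connection-state=new action=mark-connection new-connection-mark=wlan1_fwd
add chain=forward in-interface=wlan2 connection-state=new action=mark-connection new-connection-mark=wlan2_fwd
# Send replies from the LAN host back out through the WAN the connection arrived on
add chain=prerouting in-interface=Local connection-mark=wlan1_fwd action=mark-routing new-routing-mark=to_wlan1
add chain=prerouting in-interface=Local connection-mark=wlan2_fwd action=mark-routing new-routing-mark=to_wlan2
```

The idea mirrors the input/output rules above: the connection is marked where it enters, and the routing mark is applied to the replies before the routing decision is made.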


Known Issues

DNS issues

ISP-specific DNS servers might have a custom configuration that treats requests from the ISP's own network differently from requests coming from other networks. So if a connection is made via the other gateway, those sites will not be accessible.

To avoid that, we suggest using 3rd-party (public) DNS servers. If you need an ISP-specific resource, create a static DNS entry for it and policy route that traffic to the specific gateway.
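
One possible way to set this up is sketched below, assuming RouterOS v6 syntax. The resolver addresses, the host name portal.isp1.example and the address 203.0.113.10 are placeholders chosen for illustration, and the to_wlan1 routing mark is assumed to have a matching route as in "Connections to the router itself".

```
# Use public resolvers instead of the ISP ones (addresses are an assumption; pick your own)
/ ip dns set servers=8.8.8.8,8.8.4.4
# Hypothetical ISP-only resource, pinned with a static entry
/ ip dns static add name=portal.isp1.example address=203.0.113.10
# Policy route LAN traffic for that address through the first gateway
/ ip firewall mangle
add chain=prerouting in-interface=Local dst-address=203.0.113.10 action=mark-routing new-routing-mark=to_wlan1
```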

Routing table flushing

Every time something triggers a flush of the routing table, the ECMP cache is flushed as well. Connections are then assigned to gateways again and may or may not end up on the same gateway (with 2 gateways there is a 50% chance that traffic will start to flow via the other gateway).


If you have a fully routed network (client addresses can be routed via all available gateways), a change of gateway has no ill effect. If you use masquerade, however, a change of gateway changes the packet's source address and the connection is dropped.


A routing table flush can be caused by 2 things:

1) a routing table change (dynamic routing protocol update, manual changes by the user)

2) the periodic flush: every 10 minutes the routing table is flushed for security reasons (to avoid possible DoS attacks)

So even if the routing table does not change at all, connections may jump to the other gateway every 10 minutes.