NetworkPro on Quality of Service
- 1 Theory
- 1.1 TCP control for QoS
- 1.2 The TCP Window
- 1.3 Control is Outbound
- 1.4 QoS Packet Flow & Double Control
- 1.5 HTB Queue Tree VS Simple Queue
- 1.6 Guaranteeing Bandwidth with the HTB "Priority" Feature
- 1.7 Guaranteeing Priority with the Limit-At setting
- 1.8 Queue Type
- 1.9 Queue Size
- 1.10 Lends VS Borrows
- 1.11 Adjusting the Max-Limit
- 2 QoS Example
- 3 Test Setups
- 4 Heads-up on Layer 2 devices, hidden buffers and performance degradation
- 5 More Information
- 6 Questions and Contradictions
Theory
Let's begin with some theory that I've accumulated over the years.
It should save your life when dealing with Bandwidth Management and QoS.
Please register on the MikroTik Wiki and help us out by editing and supplying articles.
TCP control for QoS
TCP rate control (wikipedia.org/wiki/Bandwidth_management) - artificially adjusting the TCP window size as well as controlling the rate at which ACKs are returned to the sender.
RouterOS HTB can effectively adjust the TCP window by dropping the packets that exceed the set bandwidth limit.
More general information on TCP and the ACK packet: wikipedia.org/wiki/ACK_(TCP)
The TCP Window
A TCP window is the amount of data a sender can send on a particular TCP connection before it must receive an acknowledgment (ACK packet) from the receiver confirming that some of that data has arrived.
For example if a pair of hosts are talking over a TCP connection that has a TCP window size of 64 KB (kilobytes), the sender can only send 64 KB of data and then it must stop and wait for an acknowledgment from the receiver that some or all of the data has been received. If the receiver acknowledges that all the data has been received then the sender is free to send another 64 KB. If the sender gets back an acknowledgment from the receiver that it received the first 32 KB (which could happen if the second 32 KB was still in transit or it could happen if the second 32 KB got lost or dropped, shaped), then the sender could only send another 32 KB since it can't have more than 64 KB of unacknowledged data.
The primary reason for the window is congestion control. The whole network connection, which consists of the hosts at both ends, the routers in between and the actual physical connections themselves (be they fiber, copper, satellite or whatever), will have a bottleneck somewhere that can only handle data so fast. Unless that bottleneck is the sending speed of the transmitting host itself, transmitting too fast will overwhelm the bottleneck, resulting in lost data. The TCP window throttles the transmission speed down to a level where congestion and data loss do not occur.
Control is Outbound
Inbound traffic for the router is traffic that hits the router's interfaces, no matter from which side - Internet or local. It will be received by the interface no matter what, even malformed packets, and you cannot do anything about that. Outbound traffic for the router is traffic that goes out of the router's interfaces, regardless of direction - into your network or out of it. This is where you can set up queues, prioritize and limit!
HTB allows us to work with the Outgoing traffic (the traffic that is leaving the router via any interface).
Example 1: A client sends out 10Mbps of UDP traffic - this traffic will reach the router's local interface, but in one of the HTBs (global-in, global-total, global-out or the outgoing interface) it will be shaped to (let's say) 1Mbps. So only 1Mbps will leave the router. But in the next second the client can send 10Mbps once again, and we will shape it again.
Example 2: A client sends out 10Mbps of TCP traffic - this traffic will reach the router's local interface, but in one of the HTBs (global-in, global-total, global-out or the outgoing interface) it will be shaped to (let's say) 1Mbps. So only 1Mbps will leave the router. The source gets ACK replies only for 1Mbps out of the 10Mbps, so in the next second the source will send only a little more than 1Mbps of TCP traffic (the TCP window adjusts to our shaped bandwidth).
There are 4 ways we can look at a flow:
1) client upload that router receives on the local interface
2) client upload that router sends out to the Internet
3) client download that router receives on the public interface
4) client download that router sends out to the customer
1) and 3) - is Inbound traffic
2) and 4) - is Outbound traffic
HTB can control 1) when it is sent out as 2), and 3) when it is sent out as 4).
QoS Packet Flow & Double Control
Working with packets for bandwidth management is done in this order:
1. Mangle chain prerouting
2. HTB global-in
3. Mangle chain forward
4. Mangle chain postrouting
5. HTB global-out
6. HTB out interface
So, in one router, you can do:
a) in #1+#2 - first marking & shaping, in #3+#5 - second marking & shaping
b) in #1+#2 - first marking & shaping, in #3+#6 - second marking & shaping
c) in #1+#2 - first marking & shaping, in #4+#5 - second marking & shaping
d) in #1+#2 - first marking & shaping, in #4+#6 - second marking & shaping
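To make option a) concrete, here is a minimal sketch. The mark names, the 192.168.88.0/24 subnet and the rates are made-up placeholders for illustration, not values from the article:

/ip firewall mangle
# 1. prerouting: first marking (everything from the example client subnet)
add chain=prerouting src-address=192.168.88.0/24 action=mark-packet new-packet-mark=first-pass passthrough=no
# 3. forward: second marking, overwriting the first mark
add chain=forward src-address=192.168.88.0/24 action=mark-packet new-packet-mark=second-pass passthrough=no
/queue tree
# 2. global-in: first shaping, sees the prerouting mark
add name=shape-first parent=global-in packet-mark=first-pass max-limit=10M
# 5. global-out: second shaping, sees the forward mark
add name=shape-second parent=global-out packet-mark=second-pass max-limit=2M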
In his presentation http://mum.mikrotik.com/presentations/CZ09/QoS_Megis.pdf
Janis Megis says that creating separate priorities for each client is suicide - there is no hardware that can handle a small queue tree for every user (if you have 1000 of them). So he offers the next best thing, which is as close as possible to the desired behavior.
The main Idea of the setup is to have two separate QoS steps:
1) In the first step we prioritize traffic, we are making sure that traffic with higher priority has more chance to get to the customers than traffic with lower priority.
Example: We have total of 100Mbps available, but clients at this particular moment would like to receive 10Mbps of Priority=1 traffic, 20Mbps of Priority=4 and 150Mbps of Priority=8. Of course after our prioritization and limitation 80Mbps of priority=8 will be dropped. And only 100Mbps total will get to the next step.
2) The next step is per-user limitation. We already have only the higher-priority traffic, but now we must make sure that no single user overuses it, so we use PCQ with limits.
This way we get virtually the same behavior as "per user prioritization".
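As an illustration of step 2), a PCQ queue type can split whatever bandwidth survives step 1) between users and cap each of them. The names and per-user rates below are placeholders, not values from the article:

/queue type
# downstream: one PCQ sub-queue per destination address (per user), 2M each
add name=pcq-per-user-down kind=pcq pcq-classifier=dst-address pcq-rate=2M
# upstream: one PCQ sub-queue per source address (per user), 1M each
add name=pcq-per-user-up kind=pcq pcq-classifier=src-address pcq-rate=1M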
So the plan is to mark by traffic type in prerouting and limit by traffic type in global-in.
Then remark traffic by IP addresses in forward and limit in global-out.
1) you need to mark all traffic that will be managed by one particular queue in the same place (prerouting)
2) if you use global-total/in/out or Simple Queues, or if you use a Queue Tree and you do not mark connections first, you must mark upload and download for every type of traffic separately
4) if you do not use a simple PCQ, you must have a parent queue that has a max-limit and (let's say) parent=global-in; all other queues use parent=<parent>
5) you need 2 sets of those queues - one for upload, one for download
Create packet marks in the mangle chain "prerouting" for traffic prioritization in the global-in queue.
Limitation for the traffic marked in the mangle chain "forward" can be placed in the global-out or interface queues.
If queues are placed on the interfaces:
queues on the public interface will capture only the clients' upload
queues on the local interface will capture only the clients' download
If queues are placed in global-out:
download and upload will be limited together (separate marks are needed)
Double Control is achieved with Queue Tree
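Putting the pieces together, here is a rough sketch of the download direction only (upload needs a mirrored set of marks and queues, as noted in the list above). The interface names ether1-wan/ether1-local, the marks, ports and rates are assumptions for illustration, and the per-user limit reuses the pcq-per-user-down queue type sketched earlier:

/ip firewall mangle
# step 1: mark download by traffic type as it arrives from the Internet
add chain=prerouting in-interface=ether1-wan protocol=udp dst-port=53 action=mark-packet new-packet-mark=down-prio passthrough=no
add chain=prerouting in-interface=ether1-wan action=mark-packet new-packet-mark=down-other passthrough=no
# step 2: re-mark the surviving download in forward for the per-user limit
add chain=forward out-interface=ether1-local action=mark-packet new-packet-mark=down-users passthrough=no
/queue tree
# step 1: prioritization in global-in under one parent with a max-limit
add name=q-down-total parent=global-in max-limit=100M
add name=q-down-prio parent=q-down-total packet-mark=down-prio priority=1 max-limit=100M
add name=q-down-other parent=q-down-total packet-mark=down-other priority=8 max-limit=100M
# step 2: per-user limitation in global-out with PCQ
add name=q-down-users parent=global-out packet-mark=down-users queue=pcq-per-user-down max-limit=100M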
HTB Queue Tree VS Simple Queue
Queue Tree is a unidirectional queue in one of the HTBs. It is also the way to add a queue on a separate interface (parent=ether1/pppoe1..) - this way it is possible to simplify the mangle configuration - you don't need separate marks per outgoing interface.
Also it is possible to have double queuing, for example: prioritization of traffic in global-in or global-out, and limitation per client on the outgoing interface.
Queue Tree is not ordered - all traffic passes through it together, whereas with Simple Queues traffic is evaluated by each Simple Queue, one by one, from top to bottom. If the traffic matches a Simple Queue, it is managed by it; otherwise it is passed down.
Each simple queue creates 3 separate queues:
One in global-in ("direct" part)
One in global-out ("reverse" part)
One in global-total ("total" part)
Simple queues are ordered - similar to firewall rules
further down = longer packet processing
further down = smaller chance to get traffic
so it’s necessary to reduce the number of queues
In the case of Simple Queues, the order only determines which queue 'catches' the traffic (like mangle rules), while "priority" is the HTB feature described below.
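A tiny illustration of that ordering with two Simple Queues (addresses and limits are placeholders; on newer RouterOS versions the parameter is called target instead of target-addresses):

/queue simple
# more specific queue first, otherwise the catch-all below would grab its traffic
add name=server target-addresses=192.168.88.10/32 max-limit=5M/5M
# broader catch-all for the rest of the LAN (max-limit is upload/download)
add name=whole-lan target-addresses=192.168.88.0/24 max-limit=2M/10M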
Guaranteeing Bandwidth with the HTB "Priority" Feature
See the official article http://wiki.mikrotik.com/wiki/Manual:HTB
We already know that limit-at (CIR) for all queues will be given out no matter what.
Priority is responsible for distributing the parent queue's remaining traffic to the child queues so that they are able to reach their max-limit.
Queue with higher priority will reach its max-limit before the queue with lower priority. 8 is the lowest priority, 1 is the highest.
Make a note that priority only works:
- for leaf queues - priority in inner queues has no meaning.
- if max-limit is specified (not 0)
"Priority" feature (1..8): prioritize one child queue over another child queue. It does not work on parent queues (if a queue has at least one child). One is the highest, eight is the lowest priority. A child queue with higher priority will have a chance to reach its limit-at before a child with lower priority (confirmed), and after that the child queue with higher priority will have a chance to reach its max-limit before the child with lower priority.
Guaranteeing Priority with the Limit-At setting
Priority traffic will have better performance within its limit-at than in between limit-at and max-limit. For me this currently means, although nobody anywhere talks about it, that a leaf with a higher "priority" setting is given more transmission time when the bandwidth it is trying to consume is within its limit-at.
A queue tree with all default-small leaf queues will degrade performance by 10ms on a leaf queue that has more traffic than what is configured in its limit-at. If traffic is within this limit-at, performance is OK - no additional delays could be noticed. No information about this could be seen in any of the columns inside WinBox - Bytes, Packets, Dropped, Lends, Borrows, Queued Bytes, Queued Packets. Changing to queue type "default" (50 packets) did not increase the delay and did not change it noticeably.
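A minimal queue tree sketch of the priority and limit-at interplay described above (the packet marks are assumed to already exist from mangle, and all numbers are placeholders):

/queue tree
# the parent only groups the children and holds the total max-limit
add name=down-all parent=global-out max-limit=10M
# the children's limit-ats (1M+2M+1M) stay below the parent's max-limit
add name=voip parent=down-all packet-mark=voip limit-at=1M max-limit=10M priority=1
add name=web parent=down-all packet-mark=web limit-at=2M max-limit=10M priority=4
add name=other parent=down-all packet-mark=other limit-at=1M max-limit=10M priority=8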
Queue Type
Queue type applies only to child queues. It doesn't matter what queue type you set for the parent. The parent is only used to set the max-limit and to group the leaf queues.
My tests show that PCQ by dst-port, for example, gives better results than default and default-small. SFQ, the wireless default, does as well. So much so, in fact, that I would never use a FIFO queue again. I wonder why it is the default on Ethernet - I guess because FIFO uses as few resources as possible.
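Queue types like these only need to be defined once and can then be referenced from the leaf queues with queue=... (the names and the per-flow rate are placeholders):

/queue type
# SFQ - gives every flow a fair share; a good general-purpose replacement for a FIFO
add name=sfq-fair kind=sfq
# PCQ classified by destination port, with no per-flow rate limit (pcq-rate=0)
add name=pcq-dst-port kind=pcq pcq-classifier=dst-port pcq-rate=0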
Lends VS Borrows
- a Lend happens when the packet is treated as limit-at
- a Borrow happens when the packet is between limit-at and max-limit
Adjusting the Max-Limit
Without setting the max-limit properly, the HTB Queue Tree will not drop enough low-priority packets, so bandwidth control would be lost. In order to have control, we must set the max-limit to a lower value - 99.99% down to 85% of the tested throughput of the bottleneck.
Also, to keep control, our limit-ats combined must not be more than the max-limit.
I know it bugs you to set only 80% of your available bandwidth in these fields but this is required. There will be a bottleneck somewhere in the system and QoS can only work if the bottleneck is in your router where it has some control. The goal is to force the bottleneck to be in your router as opposed to some random location out on the wire over which you have no control.
The situation can become confusing because most ISPs offer only "Best effort" service, which means they don't actually guarantee any level of service to you. Fortunately there is usually a minimum level that you receive on a consistent basis, and you must set your QoS limits below this minimum. The problem is finding this minimum. For this reason, start with 80% of your measured speed and try things for a couple of days. If the performance is acceptable you can start to inch your levels up. But I warn you that if you go even 5% higher than you should, your QoS will totally stop working (simply set too high) or randomly stop working (too high only when your ISP is slow). This can lead to a lot of confusion on your part, so take my advice: get it working first by setting these speeds conservatively and then optimize later.
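As a made-up numeric example: if a bandwidth test against your uplink consistently shows about 10Mbps, start the parent queue at roughly 80% of that and only inch it up once QoS is proven to work:

/queue tree
add name=upload-parent parent=global-out max-limit=8M comment="~80% of the measured 10M bottleneck"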
QoS Example
Upload QoS for ADSL, tested, and it seems to work well enough. It can easily be adapted and extended for all needs - up/down, ADSL or dedicated fiber.
/ip firewall mangle
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_1_Up dst-port=80,443 packet-size=0-666 protocol=tcp tcp-flags=syn comment=QoS
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_1_Up dst-port=80,443 packet-size=0-123 protocol=tcp tcp-flags=ack
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_1_Up dst-port=53,123 protocol=udp
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_2_Up dst-port=80,443 connection-bytes=0-1000000 protocol=tcp
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_2_Up dst-port=110,995,143,993,25,20,21 packet-size=0-666 protocol=tcp tcp-flags=syn
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_2_Up dst-port=110,995,143,993,25,20,21 packet-size=0-123 protocol=tcp tcp-flags=ack
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_2_Up src-port=8291 protocol=tcp comment=WinBox
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_3_Up packet-size=0-666 protocol=tcp tcp-flags=syn
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_3_Up packet-size=0-123 protocol=tcp tcp-flags=ack
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_4_Up dst-port=110,995,143,993,25,20,21 protocol=tcp
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_4_Up dst-port=80,443 connection-bytes=1000000-0 protocol=tcp
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_8_Up p2p=all-p2p
add action=mark-packet chain=postrouting out-interface=ADSL1 passthrough=no new-packet-mark=QoS_7_Up
/queue tree
add max-limit=666K name=QoS_ADSL1_Up parent=ADSL1
add name=QoS_1 packet-mark=QoS_1_Up parent=QoS_ADSL1_Up priority=1
add name=QoS_2 packet-mark=QoS_2_Up parent=QoS_ADSL1_Up priority=2
add name=QoS_3 packet-mark=QoS_3_Up parent=QoS_ADSL1_Up priority=3
add name=QoS_7 packet-mark=QoS_7_Up parent=QoS_ADSL1_Up priority=7
add name=QoS_8 packet-mark=QoS_8_Up parent=QoS_ADSL1_Up priority=8
add name=QoS_4 packet-mark=QoS_4_Up parent=QoS_ADSL1_Up priority=4
Test Setups
To study packet queues one can use a couple of RouterOS machines: one with the packet queues and the other generating traffic with Bandwidth Test. If a sniffer is needed, the built-in one is used.
So far, testing with VMware has shown wrong results with TCP. I have achieved correct behaviour with live RouterBOARDs as well as with Oracle VM VirtualBox.
This screenshot shows the easiest test scenario.
First, addresses are assigned to ether1 - 192.168.56.55/24 and 192.168.56.56/32 through 192.168.56.59/32.
Then packets are marked in prerouting by their dst address
After that a Queue Tree is built as shown, with one leaf per packet mark.
So when generating traffic with the bandwidth test and with ping tools to one of the different IPs, you will be simulating your different priority traffic.
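A sketch of that test configuration might look like this - the priorities and limits are guesses, since the exact values are only visible in the screenshot:

/ip address
add address=192.168.56.55/24 interface=ether1
add address=192.168.56.56/32 interface=ether1
add address=192.168.56.57/32 interface=ether1
add address=192.168.56.58/32 interface=ether1
add address=192.168.56.59/32 interface=ether1
/ip firewall mangle
# one packet mark per destination address
add chain=prerouting dst-address=192.168.56.56 action=mark-packet new-packet-mark=test-p1 passthrough=no
add chain=prerouting dst-address=192.168.56.57 action=mark-packet new-packet-mark=test-p4 passthrough=no
add chain=prerouting dst-address=192.168.56.58 action=mark-packet new-packet-mark=test-p8 passthrough=no
/queue tree
# one leaf per packet mark under a common limited parent
add name=test-parent parent=global-in max-limit=5M
add name=test-leaf1 parent=test-parent packet-mark=test-p1 priority=1 max-limit=5M
add name=test-leaf4 parent=test-parent packet-mark=test-p4 priority=4 max-limit=5M
add name=test-leaf8 parent=test-parent packet-mark=test-p8 priority=8 max-limit=5M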
Then it's up to one's needs and imagination to experiment.
In the picture you can see an example laboratory setup - a diagram I made for my MSc thesis presentation.
Heads-up on Layer 2 devices, hidden buffers and performance degradation
Make sure Flow Control is turned off on everything that passes Ethernet frames - switches, ports, devices.
Avoid using QoS or priority on cheap switches due to limitations in the level of control over the queues on those devices.
Questions and Contradictions
If you have found a contradiction or if you have a question - please ask in the forum. http://forum.mikrotik.com/