Joined: 22 Mar 2010
Posted: Fri Sep 06, 2013 5:55 am
Post subject: Automatic multipath routes
Hi, just wanted to share an idea. At work we have several connections to the Internet, and some of them are a bunch of poorly performing ADSL lines. "Poor" means they drop packets, packets arrive out of sequence very often, links go down, and latency grows suddenly, all in an apparently random manner. Sometimes they do well all day; sometimes they don't, and all this during production hours. Fortunately, we have other links for the *really* important stuff, and we use these ADSL lines to serve the "masses". Oh, I forgot to say: the ISP is a headache, and so are the modems. Hardware problems were ruled out with the arrival of new equipment, an isolation transformer (to completely eliminate any potential difference between neutral and ground) and other stuff.
So far we have been using a script that does its best to determine whether each link is up and then builds the multipath route with the "ip" command to load-balance the traffic.
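For readers who have not built a multipath route by hand, here is a minimal sketch of what assembling such an "ip" command can look like. The function name build_multipath_cmd and the addresses are made up for illustration; this is not code from the old script or from ampr. It only prints the command, so it is safe to run without root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an "ip route replace" command for a multipath route.
# $1 is the route specification; every further argument is "<via> <dev> <weight>".
build_multipath_cmd() {
    local cmd="ip route replace $1"
    shift
    local hop via dev weight
    for hop in "$@"; do
        # Split one hop description into its three fields.
        read -r via dev weight <<< "$hop"
        cmd+=" nexthop via $via dev $dev weight $weight"
    done
    printf '%s\n' "$cmd"
}

# Two gateways reachable through eth1, the first one preferred 2:1.
build_multipath_cmd "default" "10.0.0.1 eth1 2" "10.0.0.2 eth1 1"
```

Each "nexthop" group adds one path, and "weight" sets how large a share of the traffic that path should get relative to the others.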
All in all, the script works, but it is too slow to make decisions, sometimes it makes the wrong decision, and sometimes it doesn't fully understand the situation. Thus, load balancing is always a step behind the actual situation.
Moreover, we are making some changes to the whole networking architecture in a few weeks, and the script, which has hardcoded parameters, will soon be completely obsolete.
So I investigated but could not find a viable solution that doesn't involve complicated routing protocols. Then I decided to make my own script (in fact, two script files).
This post is merely to ask the community whether this set of scripts could be useful enough to make an ebuild out of them and ask people to test them. By the way, you can download the project from "https://github.com/diegommm/ampr"; it's published under GPLv2 and its name is "Automatic Multi Path Route".
It only needs "ping", "ip", "sleep" and "bash". Tried it on:
* (2x) Ubuntu 12.04 amd64 (both up to date). Software involved: iputils-sss20101006, iproute2-ss111117, coreutils-8.13, bash-4.2.25.
* Fresh Gentoo Linux amd64 (installed a couple of days ago). Software involved: iputils-s20121221, iproute2-ss130221, coreutils-8.20, bash-4.2.45.
* Old-old Debian x86 (2006). Software involved: (to be confirmed).
Works like a charm, and is generic enough to maintain multiple multipath routes, which can be in the same or different tables (one of the requirements of the new networking architecture). It detects the following issues in the connections:
* If ping fails, it throws.
* If it gets <X> consecutive ICMP messages saying something not good (network/host unreachable, administratively prohibited, frag needed, etc.) it throws.
* If ping doesn't give any response in a certain period of time, it throws.
* If it receives <Y> consecutive replies which are out of sequence, it throws.
* If the ping time grows beyond a threshold, it considers the connection to have high latency, and may decide to drop the weight of that path to a smaller weight specified beforehand.
* If none of the above happens for <Z> consecutive replies, it considers the connection "acceptable", and will make sure that path is up and has the higher weight.
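The checks above can be sketched as a small counter-based classifier. This is only an illustration under assumptions: the function name classify_ping, the variable names and the threshold values are invented here, and the "no response within a period of time" check is left out because ping prints nothing in that case. It is not the actual "pingmon" code.

```shell
#!/usr/bin/env bash
# Illustrative thresholds (assumed values, not ampr defaults).
MAX_ERRORS=3        # <X>: consecutive ICMP error messages before "down"
MAX_OUT_OF_SEQ=3    # <Y>: consecutive out-of-sequence replies before "down"
GOOD_REPLIES=5      # <Z>: consecutive clean replies before "acceptable"
LATENCY_MS=500      # ping time above this counts as high latency

# Reads ping output on stdin and prints a verdict each time the state changes.
classify_ping() {
    # Replies carry "icmp_seq=" (some iputils releases print "icmp_req=") and "time=".
    local re_reply='icmp_(seq|req)=([0-9]+).*time=([0-9]+)'
    # ICMP errors such as "From 10.0.0.3 icmp_seq=4 Destination Host Unreachable".
    local re_error='^From '
    local line seq ms new state=""
    local last_seq=0 errors=0 out_of_seq=0 good=0
    while IFS= read -r line; do
        new=""
        if [[ $line =~ $re_reply ]]; then
            seq=$((10#${BASH_REMATCH[2]}))
            ms=$((10#${BASH_REMATCH[3]}))   # integer part of the round-trip time
            errors=0
            if (( seq < last_seq )); then
                out_of_seq=$((out_of_seq + 1)); good=0
                if (( out_of_seq >= MAX_OUT_OF_SEQ )); then new="down"; fi
            else
                last_seq=$seq; out_of_seq=0
                if (( ms > LATENCY_MS )); then
                    good=0; new="high-latency"
                else
                    good=$((good + 1))
                    if (( good >= GOOD_REPLIES )); then new="acceptable"; fi
                fi
            fi
        elif [[ $line =~ $re_error ]]; then
            errors=$((errors + 1)); good=0
            if (( errors >= MAX_ERRORS )); then new="down"; fi
        fi
        if [[ -n $new && $new != "$state" ]]; then
            state=$new
            printf '%s\n' "$state"
        fi
    done
}

# Possible usage (not run here): ping -i 1 192.168.1.1 | classify_ping
```

Counting consecutive events rather than single ones is what keeps one stray late packet from flapping the route.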
The script runs in a daemon-like fashion, and as the ADSLs failed, grew in latency, and recovered, I could see what changes *would* have been made to the configured multipath routes. "Would" because it also supports a "PRETEND" mode, in which nothing is actually done to the routing tables but the problems detected and the action it would take are written to stdout. Thinking it deserves some respect, I took the liberty of adding a "TROUBLESHOOT" option that produces a description of the system to attach to bug reports (utility versions and so on).
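A pretend mode like this is usually a thin wrapper around the command that changes state. The sketch below assumes a shell variable named PRETEND and a wrapper called run_ip; both names are guesses for illustration and may not match what ampr does internally.

```shell
#!/usr/bin/env bash
# Default to pretend mode in this sketch so that running it changes nothing.
PRETEND=${PRETEND:-1}

# Hypothetical wrapper: print the "ip" command when PRETEND is non-zero,
# execute it otherwise.
run_ip() {
    if [[ ${PRETEND:-0} != 0 ]]; then
        printf 'ip %s\n' "$*"
    else
        ip "$@"
    fi
}

run_ip route replace default table 3 nexthop via 192.168.1.2 dev eth0 weight 2
```

Routing every change through one function means the detection logic runs exactly the same way in both modes; only the last step differs.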
What you need to get it working is an INI file, which can define a "[general]" section where you can override default values (even though they are safe enough for many situations, I think), and then define as many multipath routes as you want, each with its own list of "possible" paths. Here's an example taken from the example configuration file:
## Sleep for this number of seconds after completion of each group of operations.
## Whether to initialize or not the routes to each remote host
## Default weight for normal paths
## Default weight for high latency paths
## Default options for "pingmon" function
PINGMONOPTS=-C 5 -F 1 -k 0.5 -K 3 -q 0 -U 10 -x 120 -X 6
## If non zero, will run the whole program but will write to stdout the 'ip' commands instead of actually executing them.
## When non zero, will run interactively and ask where to save a troubleshooting report to attach to a bug report.
## For each new route you add a section. The header section constitutes the next part of "ip route replace" command.
## In the section add the lines that describe the path. Each line is a white-space separated list of the following
# fields (only the first three are mandatory), in order:
# * <RHOST>: the host to be monitored by "pingmon".
# * <DEV>: argument for "dev" in "nexthop".
# * <VIA>: argument for "via" in "nexthop".
# * <NORMAL_WEIGHT>[,<HLAT_WEIGHT>]: argument for "weight"
# in "nexthop". The <HLAT_WEIGHT> is optional.
# * <PINGMONOPTS>: the rest of the words found on the line will be passed to the "pingmon" function
192.168.1.1 eth1 10.0.0.1
192.168.1.2 eth1 10.0.0.2
[default scope global src 10.0.0.3 table 3]
126.96.36.199 eth0 192.168.1.2
## Router 192.168.1.3 has big latency
188.8.131.52 eth0 192.168.1.3 2,1 -i 3
The example is a dummy, but I guess the idea is there... you will probably not need to override the default values (thus, no "[general]" section) and will only need to add a couple of lines for the actual work. Note that after a path is deleted from the route, only the routes cached through that entry's router are flushed.
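To make the path line format concrete, here is a rough sketch of how one such line could be split into its fields. The function name parse_path_line, the two DEFAULT_* variables and their values are assumptions for illustration, and the sketch expects the weight field to be present whenever "pingmon" options follow it.

```shell
#!/usr/bin/env bash
# Assumed defaults (illustrative values only).
DEFAULT_WEIGHT=2
DEFAULT_HLAT_WEIGHT=1

# Hypothetical parser for one path line:
#   <RHOST> <DEV> <VIA> [<NORMAL_WEIGHT>[,<HLAT_WEIGHT>] [<PINGMONOPTS>...]]
parse_path_line() {
    local rhost dev via weights opts normal hlat
    # The last variable ("opts") receives the rest of the line unchanged.
    read -r rhost dev via weights opts <<< "$1"
    weights=${weights:-$DEFAULT_WEIGHT}
    normal=${weights%%,*}
    if [[ $weights == *,* ]]; then
        hlat=${weights#*,}
    else
        hlat=$DEFAULT_HLAT_WEIGHT
    fi
    printf 'rhost=%s dev=%s via=%s weight=%s hlat_weight=%s pingmonopts=%s\n' \
        "$rhost" "$dev" "$via" "$normal" "$hlat" "$opts"
}

parse_path_line "192.168.1.1 eth1 10.0.0.1"
parse_path_line "192.0.2.1 eth0 192.168.1.3 3,1 -i 3"
```

The first call falls back to the defaults for both weights; the second takes 3 as the normal weight, 1 as the high-latency weight, and keeps "-i 3" as extra options for the monitor.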
EDIT: the Debian host I tested has the following software: iputils-sss20071127, iproute2-ss080725, coreutils-6.10, bash-3.2.39