Commit 937e75d8 authored by Ondřej Zajíček

Add the Babel routing protocol (RFC 6126)

This patch implements the IPv6 subset of the Babel routing protocol.
Based on the patch from Toke Hoiland-Jorgensen, with some heavy
modifications and bugfixes.

Thanks to Toke Hoiland-Jorgensen for the original patch.
parent a7baa098
......@@ -206,6 +206,9 @@ fi
AC_SUBST(iproutedir)
all_protocols="$proto_bfd bgp ospf pipe $proto_radv rip static"
if test "$ip" = ipv6 ; then
all_protocols="$all_protocols babel"
fi
all_protocols=`echo $all_protocols | sed 's/ /,/g'`
if test "$with_protocols" = all ; then
......
......@@ -1380,6 +1380,102 @@ corresponding protocol sections.
<chapt>Protocols
<sect>Babel
<sect1>Introduction
<p>The Babel protocol (RFC6126) is a loop-avoiding distance-vector routing
protocol that is robust and efficient both in ordinary wired networks and in
wireless mesh networks. Babel is conceptually very simple in its operation and
"just works" in its default configuration, though some configuration is possible
and in some cases desirable.
<p>While the Babel protocol is dual stack (i.e., can carry both IPv4 and IPv6
routes over the same IPv6 transport), BIRD presently implements only the IPv6
subset of the protocol. No Babel extensions are implemented, but the BIRD
implementation can coexist with implementations using the extensions (and will
just ignore extension messages).
<p>The Babel protocol implementation in BIRD is currently in alpha stage.
<sect1>Configuration
<p>Babel supports no global configuration options apart from those common to all
other protocols, but supports the following per-interface configuration options:
<code>
protocol babel [<name>] {
interface <interface pattern> {
type <wired|wireless>;
rxcost <number>;
hello interval <number>;
update interval <number>;
port <number>;
tx class|dscp <number>;
tx priority <number>;
rx buffer <number>;
tx length <number>;
check link <switch>;
};
}
</code>
<descrip>
<tag>type wired|wireless </tag>
This option specifies the interface type: wired or wireless. Wired
interfaces are considered more reliable, so the default hello
interval is higher and a neighbor is considered unreachable after only
a small number of "hello" packets are lost. On wireless interfaces,
hello packets are sent more often, and the ETX link quality estimation
technique is used to compute the metrics of routes discovered over this
interface. This technique gradually degrades the metric of routes
when packets are lost, in contrast to the binary up/down mechanism
of wired links. Default: <cf/wired/.
<tag>rxcost <m/num/</tag>
This specifies the RX cost of the interface. The route metrics will be
computed from this value with a mechanism determined by the interface
<cf/type/. Default: 96 for wired interfaces, 256 for wireless.
<tag>hello interval <m/num/</tag>
Interval at which periodic "hello" messages are sent on this interface,
in seconds. Default: 4 seconds.
<tag>update interval <m/num/</tag>
Interval at which periodic (full) updates are sent. Default: 4 times the
hello interval.
<tag>port <m/number/</tag>
This option selects a UDP port to operate on. The default is to operate
on port 6696 as specified in the Babel RFC.
<tag>tx class|dscp|priority <m/number/</tag>
These options specify the ToS/DiffServ/Traffic class/Priority of the
outgoing Babel packets. See <ref id="dsc-prio" name="tx class"> common
option for a detailed description.
<tag>rx buffer <m/number/</tag>
This option specifies the size of buffers used for packet processing.
The buffer size should be larger than the maximum size of received packets.
The default value is the interface MTU, and the value will be clamped to a
minimum of 512 bytes + IP packet overhead.
<tag>tx length <m/number/</tag>
This option specifies the maximum length of generated Babel packets. To
avoid IP fragmentation, it should not exceed the interface MTU value.
The default value is the interface MTU value, and the value will be
clamped to a minimum of 512 bytes + IP packet overhead.
<tag>check link <m/switch/</tag>
If set, the hardware link state (as reported by OS) is taken into
consideration. When the link disappears (e.g. an ethernet cable is
unplugged), neighbors are immediately considered unreachable and all
routes received from them are withdrawn. It is possible that some
hardware drivers or platforms do not implement this feature. Default:
yes.
</descrip>
<sect><label id="sect-bfd">BFD
<sect1>Introduction
......
......@@ -57,6 +57,9 @@ Reply codes of BIRD command-line interface
1020 Show BFD sessions
1021 Show RIP interface
1022 Show RIP neighbors
1023 Show Babel interfaces
1024 Show Babel neighbors
1025 Show Babel entries
8000 Reply too long
8001 Route not found
......
......@@ -60,6 +60,7 @@
#define NORET __attribute__((noreturn))
#define UNUSED __attribute__((unused))
#define PACKED __attribute__((packed))
/* Microsecond time */
......
......@@ -25,5 +25,6 @@ u32 u32_log2(u32 v);
static inline u32 u32_hash(u32 v) { return v * 2902958171u; }
#endif
static inline u8 u32_popcount(u32 v) { return __builtin_popcount(v); }
#endif
......@@ -26,6 +26,7 @@
#define IP6_OSPF_ALL_ROUTERS ipa_build6(0xFF020000, 0, 0, 5)
#define IP6_OSPF_DES_ROUTERS ipa_build6(0xFF020000, 0, 0, 6)
#define IP6_RIP_ROUTERS ipa_build6(0xFF020000, 0, 0, 9)
#define IP6_BABEL_ROUTERS ipa_build6(0xFF020000, 0, 0, 0x00010006)
#define IP4_NONE _MI4(0)
#define IP6_NONE _MI6(0,0,0,0)
......
......@@ -124,6 +124,7 @@ static char * number(char * str, long num, int base, int size, int precision,
* width is automatically replaced by standard IP address width which
* depends on whether we use IPv4 or IPv6; |%#I| gives hexadecimal format),
* |%R| for Router / Network ID (u32 value printed as IPv4 address)
* |%lR| for 64bit Router / Network ID (u64 value printed as eight :-separated octets)
* and |%m| resp. |%M| for error messages (uses strerror() to translate @errno code to
* message text). On the other hand, it doesn't support floating
* point numbers.
......@@ -137,9 +138,10 @@ int bvsnprintf(char *buf, int size, const char *fmt, va_list args)
unsigned long num;
int i, base;
u32 x;
u64 X;
char *str, *start;
const char *s;
char ipbuf[STD_ADDRESS_P_LENGTH+1];
char ipbuf[MAX(STD_ADDRESS_P_LENGTH,ROUTER_ID_64_LENGTH)+1];
struct iface *iface;
int flags; /* flags to number() */
......@@ -309,12 +311,27 @@ int bvsnprintf(char *buf, int size, const char *fmt, va_list args)
/* Router/Network ID - essentially IPv4 address in u32 value */
case 'R':
x = va_arg(args, u32);
bsprintf(ipbuf, "%d.%d.%d.%d",
((x >> 24) & 0xff),
((x >> 16) & 0xff),
((x >> 8) & 0xff),
(x & 0xff));
if(qualifier == 'l') {
X = va_arg(args, u64);
bsprintf(ipbuf, "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x",
((X >> 56) & 0xff),
((X >> 48) & 0xff),
((X >> 40) & 0xff),
((X >> 32) & 0xff),
((X >> 24) & 0xff),
((X >> 16) & 0xff),
((X >> 8) & 0xff),
(X & 0xff));
}
else
{
x = va_arg(args, u32);
bsprintf(ipbuf, "%d.%d.%d.%d",
((x >> 24) & 0xff),
((x >> 16) & 0xff),
((x >> 8) & 0xff),
(x & 0xff));
}
s = ipbuf;
goto str;
......
......@@ -30,4 +30,6 @@ static inline char *xbasename(const char *str)
return s ? s+1 : (char *) str;
}
#define ROUTER_ID_64_LENGTH 23
#endif
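The value 23 follows from the `%lR` output format: eight two-digit hexadecimal octets plus seven colon separators. A minimal standalone sketch of that formatting is shown here, assuming plain `snprintf` in place of BIRD's `bsprintf`; the names `SKETCH_ROUTER_ID_64_LENGTH` and `sketch_format_router_id` are hypothetical and not part of the patch.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical mirror of ROUTER_ID_64_LENGTH: 8 * 2 hex digits + 7 colons */
#define SKETCH_ROUTER_ID_64_LENGTH 23

/* Format a 64-bit router ID as eight colon-separated octets.
 * The buffer needs one extra byte for the terminating NUL. */
static void
sketch_format_router_id(char buf[SKETCH_ROUTER_ID_64_LENGTH + 1], uint64_t id)
{
  /* Each octet is cast to unsigned so the vararg type matches %02x */
  snprintf(buf, SKETCH_ROUTER_ID_64_LENGTH + 1,
           "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x",
           (unsigned) ((id >> 56) & 0xff),
           (unsigned) ((id >> 48) & 0xff),
           (unsigned) ((id >> 40) & 0xff),
           (unsigned) ((id >> 32) & 0xff),
           (unsigned) ((id >> 24) & 0xff),
           (unsigned) ((id >> 16) & 0xff),
           (unsigned) ((id >> 8) & 0xff),
           (unsigned) (id & 0xff));
}
```

The explicit casts are a choice made for this sketch, since standard `printf` expects an `unsigned int` argument for `%x`.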
......@@ -919,6 +919,9 @@ protos_build(void)
proto_build(&proto_bfd);
bfd_init_all();
#endif
#ifdef CONFIG_BABEL
proto_build(&proto_babel);
#endif
proto_pool = rp_new(&root_pool, "Protocols");
proto_flush_event = ev_new(proto_pool);
......
......@@ -76,7 +76,7 @@ void protos_dump_all(void);
extern struct protocol
proto_device, proto_radv, proto_rip, proto_static,
proto_ospf, proto_pipe, proto_bgp, proto_bfd;
proto_ospf, proto_pipe, proto_bgp, proto_bfd, proto_babel;
/*
* Routing Protocol Instance
......
......@@ -219,6 +219,12 @@ typedef struct rte {
struct {
u8 suppressed; /* Used for deterministic MED comparison */
} bgp;
#endif
#ifdef CONFIG_BABEL
struct {
u16 metric; /* Babel metric */
u64 router_id; /* Babel router id */
} babel;
#endif
struct { /* Routes generated by krt sync (both temporary and inherited ones) */
s8 src; /* Alleged route source (see krt.h) */
......@@ -374,6 +380,7 @@ typedef struct rta {
#define RTS_OSPF_EXT2 10 /* OSPF external route type 2 */
#define RTS_BGP 11 /* BGP route */
#define RTS_PIPE 12 /* Inter-table wormhole */
#define RTS_BABEL 13 /* Babel route */
#define RTC_UNICAST 0
#define RTC_BROADCAST 1
......@@ -422,7 +429,8 @@ typedef struct eattr {
#define EAP_RIP 2 /* RIP */
#define EAP_OSPF 3 /* OSPF */
#define EAP_KRT 4 /* Kernel route attributes */
#define EAP_MAX 5
#define EAP_BABEL 5 /* Babel attributes */
#define EAP_MAX 6
#define EA_CODE(proto,id) (((proto) << 8) | (id))
#define EA_PROTO(ea) ((ea) >> 8)
......@@ -547,6 +555,7 @@ extern struct protocol *attr_class_to_protocol[EAP_MAX];
#define DEF_PREF_DIRECT 240 /* Directly connected */
#define DEF_PREF_STATIC 200 /* Static route */
#define DEF_PREF_OSPF 150 /* OSPF intra-area, inter-area and type 1 external routes */
#define DEF_PREF_BABEL 130 /* Babel */
#define DEF_PREF_RIP 120 /* RIP */
#define DEF_PREF_BGP 100 /* BGP */
#define DEF_PREF_PIPE 70 /* Routes piped from other tables */
......
H Protocols
C babel
C bfd
C bgp
C ospf
......
S babel.c
S packets.c
source=babel.c packets.c
root-rel=../../
dir-name=proto/babel
include ../../Rules
/*
* BIRD -- The Babel protocol
*
* Copyright (c) 2015--2016 Toke Hoiland-Jorgensen
*
* Can be freely distributed and used under the terms of the GNU GPL.
*
* This file contains the main routines for handling and sending TLVs, as
* well as timers and interaction with the nest.
*/
/**
* DOC: The Babel protocol
*
* Babel (RFC6126) is a loop-avoiding distance-vector routing protocol that is
* robust and efficient both in ordinary wired networks and in wireless mesh
* networks.
*
* The Babel protocol keeps state for each neighbour in a &babel_neighbor
* struct, tracking received Hello and I Heard You (IHU) messages. A
* &babel_interface struct keeps hello and update times for each interface, and
* a separate hello seqno is maintained for each interface.
*
* For each prefix, Babel keeps track of both the possible routes (with next hop
* and router IDs), as well as the feasibility distance for each prefix and
* router id. The prefix itself is tracked in a &babel_entry struct, while the
* possible routes for the prefix are tracked as &babel_route entries and the
* feasibility distance is maintained through &babel_source structures.
*
* The main route selection is done in babel_select_route(). This is called when
* an entry is updated by receiving updates from the network or when modified by
* internal timers. It performs feasibility checks on the available routes for
* the prefix and selects the one with the lowest metric to be announced to the
* core.
*/
#include <stdlib.h>
#include "babel.h"
#define OUR_ROUTE(r) (r->neigh == NULL)
/*
* Is one number greater than or equal to another mod 2^16? This is based on the
* definition of serial number space in RFC 1982. Note that arguments are of
* uint type to avoid integer promotion to signed integer.
*/
static inline int ge_mod64k(uint a, uint b)
{ return (u16)(a - b) < 0x8000; }
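To illustrate how the modular comparison behaves around wrap-around, here is a standalone sketch of the same expression. The name `sketch_ge_mod64k` is hypothetical, and `uint16_t` stands in for BIRD's `u16` type.

```c
#include <stdint.h>

/* Standalone copy of the comparison for experimentation.
 * The difference is truncated to 16 bits; a result below 0x8000 means
 * that a is equal to b or less than half the sequence space ahead of it. */
static inline int
sketch_ge_mod64k(unsigned int a, unsigned int b)
{
  return (uint16_t) (a - b) < 0x8000;
}
```

With this definition, seqno 0 counts as newer than seqno 65535, because the difference wraps to 1, while 65535 does not count as newer than 0.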
static void babel_dump_entry(struct babel_entry *e);
static void babel_dump_route(struct babel_route *r);
static void babel_select_route(struct babel_entry *e);
static void babel_send_route_request(struct babel_entry *e, struct babel_neighbor *n);
static void babel_send_wildcard_request(struct babel_iface *ifa);
static int babel_cache_seqno_request(struct babel_proto *p, ip_addr prefix, u8 plen,
u64 router_id, u16 seqno);
static void babel_trigger_iface_update(struct babel_iface *ifa);
static void babel_trigger_update(struct babel_proto *p);
static void babel_send_seqno_request(struct babel_entry *e);
static inline void babel_kick_timer(struct babel_proto *p);
static inline void babel_iface_kick_timer(struct babel_iface *ifa);
/*
* Functions to maintain data structures
*/
static void
babel_init_entry(struct fib_node *n)
{
struct babel_entry *e = (void *) n;
e->proto = NULL;
e->selected_in = NULL;
e->selected_out = NULL;
e->updated = now;
init_list(&e->sources);
init_list(&e->routes);
}
static inline struct babel_entry *
babel_find_entry(struct babel_proto *p, ip_addr prefix, u8 plen)
{
return fib_find(&p->rtable, &prefix, plen);
}
static struct babel_entry *
babel_get_entry(struct babel_proto *p, ip_addr prefix, u8 plen)
{
struct babel_entry *e = fib_get(&p->rtable, &prefix, plen);
e->proto = p;
return e;
}
static struct babel_source *
babel_find_source(struct babel_entry *e, u64 router_id)
{
struct babel_source *s;
WALK_LIST(s, e->sources)
if (s->router_id == router_id)
return s;
return NULL;
}
static struct babel_source *
babel_get_source(struct babel_entry *e, u64 router_id)
{
struct babel_proto *p = e->proto;
struct babel_source *s = babel_find_source(e, router_id);
if (s)
return s;
s = sl_alloc(p->source_slab);
s->router_id = router_id;
s->expires = now + BABEL_GARBAGE_INTERVAL;
s->seqno = 0;
s->metric = BABEL_INFINITY;
add_tail(&e->sources, NODE s);
return s;
}
static void
babel_expire_sources(struct babel_entry *e)
{
struct babel_proto *p = e->proto;
struct babel_source *n, *nx;
WALK_LIST_DELSAFE(n, nx, e->sources)
{
if (n->expires && n->expires <= now)
{
rem_node(NODE n);
sl_free(p->source_slab, n);
}
}
}
static struct babel_route *
babel_find_route(struct babel_entry *e, struct babel_neighbor *n)
{
struct babel_route *r;
WALK_LIST(r, e->routes)
if (r->neigh == n)
return r;
return NULL;
}
static struct babel_route *
babel_get_route(struct babel_entry *e, struct babel_neighbor *nbr)
{
struct babel_proto *p = e->proto;
struct babel_route *r = babel_find_route(e, nbr);
if (r)
return r;
r = sl_alloc(p->route_slab);
memset(r, 0, sizeof(*r));
r->e = e;
add_tail(&e->routes, NODE r);
if (nbr)
{
r->neigh = nbr;
r->expires = now + BABEL_GARBAGE_INTERVAL;
add_tail(&nbr->routes, NODE &r->neigh_route);
}
return r;
}
static void
babel_flush_route(struct babel_route *r)
{
struct babel_proto *p = r->e->proto;
DBG("Babel: Flush route %I/%d router_id %lR neigh %I\n",
r->e->n.prefix, r->e->n.pxlen, r->router_id, r->neigh ? r->neigh->addr : IPA_NONE);
rem_node(NODE r);
if (r->neigh)
rem_node(&r->neigh_route);
if (r->e->selected_in == r)
r->e->selected_in = NULL;
if (r->e->selected_out == r)
r->e->selected_out = NULL;
sl_free(p->route_slab, r);
}
static void
babel_expire_route(struct babel_route *r)
{
struct babel_proto *p = r->e->proto;
struct babel_entry *e = r->e;
TRACE(D_EVENTS, "Route expiry timer for %I/%d router-id %lR fired",
e->n.prefix, e->n.pxlen, r->router_id);
if (r->metric < BABEL_INFINITY)
{
r->metric = BABEL_INFINITY;
r->expires = now + r->expiry_interval;
}
else
{
babel_flush_route(r);
}
}
static void
babel_refresh_route(struct babel_route *r)
{
if (!OUR_ROUTE(r) && (r == r->e->selected_in))
babel_send_route_request(r->e, r->neigh);
r->refresh_time = 0;
}
static void
babel_expire_routes(struct babel_proto *p)
{
struct babel_entry *e;
struct babel_route *r, *rx;
struct fib_iterator fit;
FIB_ITERATE_INIT(&fit, &p->rtable);
loop:
FIB_ITERATE_START(&p->rtable, &fit, n)
{
e = (struct babel_entry *) n;
int changed = 0;
WALK_LIST_DELSAFE(r, rx, e->routes)
{
if (r->refresh_time && r->refresh_time <= now)
babel_refresh_route(r);
if (r->expires && r->expires <= now)
{
babel_expire_route(r);
changed = 1;
}
}
if (changed)
{
/*
* We have to restart the iteration because there may be a cascade of
* synchronous events babel_select_route() -> nest table change ->
* babel_rt_notify() -> p->rtable change, invalidating hidden variables.
*/
FIB_ITERATE_PUT(&fit, n);
babel_select_route(e);
goto loop;
}
babel_expire_sources(e);
/* Remove empty entries */
if (EMPTY_LIST(e->sources) && EMPTY_LIST(e->routes))
{
FIB_ITERATE_PUT(&fit, n);
fib_delete(&p->rtable, e);
goto loop;
}
}
FIB_ITERATE_END(n);
}
static struct babel_neighbor *
babel_find_neighbor(struct babel_iface *ifa, ip_addr addr)
{
struct babel_neighbor *nbr;
WALK_LIST(nbr, ifa->neigh_list)
if (ipa_equal(nbr->addr, addr))
return nbr;
return NULL;
}
static struct babel_neighbor *
babel_get_neighbor(struct babel_iface *ifa, ip_addr addr)
{
struct babel_neighbor *nbr = babel_find_neighbor(ifa, addr);
if (nbr)
return nbr;
nbr = mb_allocz(ifa->pool, sizeof(struct babel_neighbor));
nbr->ifa = ifa;
nbr->addr = addr;
nbr->txcost = BABEL_INFINITY;
init_list(&nbr->routes);
add_tail(&ifa->neigh_list, NODE nbr);
return nbr;
}
static void
babel_flush_neighbor(struct babel_neighbor *nbr)
{
struct babel_proto *p = nbr->ifa->proto;
node *n;
TRACE(D_EVENTS, "Flushing neighbor %I", nbr->addr);
WALK_LIST_FIRST(n, nbr->routes)
{
struct babel_route *r = SKIP_BACK(struct babel_route, neigh_route, n);
struct babel_entry *e = r->e;
int selected = (r == e->selected_in);
babel_flush_route(r);
if (selected)
babel_select_route(e);
}
rem_node(NODE nbr);
mb_free(nbr);
}
static void
babel_expire_ihu(struct babel_neighbor *nbr)
{
nbr->txcost = BABEL_INFINITY;
}
static void
babel_expire_hello(struct babel_neighbor *nbr)
{
nbr->hello_map <<= 1;
if (nbr->hello_cnt < 16)
nbr->hello_cnt++;
if (!nbr->hello_map)
babel_flush_neighbor(nbr);
}
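The hello history update above can be sketched in isolation as follows. The struct and function names are hypothetical; the sketch assumes a 16-bit history map, matching the `u16 map` used in babel_compute_rxcost(), and returns a flag instead of flushing the neighbor directly.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hello fields of a neighbor */
struct sketch_hello_history {
  uint16_t map;   /* one bit per expected hello, newest in bit 0 */
  uint8_t cnt;    /* number of hellos tracked, capped at 16 */
};

/* Record a missed hello. Returns 1 when no received hello remains in
 * the history, which is the point where the neighbor would be flushed. */
static int
sketch_miss_hello(struct sketch_hello_history *h)
{
  h->map <<= 1;          /* shift in a zero bit for the missed hello */
  if (h->cnt < 16)
    h->cnt++;
  return h->map == 0;
}
```

A neighbor whose only received hello sits in bit 0 therefore survives fifteen consecutive misses and is flushed on the sixteenth.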
static void
babel_expire_neighbors(struct babel_proto *p)
{
struct babel_iface *ifa;
struct babel_neighbor *nbr, *nbx;
WALK_LIST(ifa, p->interfaces)
{
WALK_LIST_DELSAFE(nbr, nbx, ifa->neigh_list)
{
if (nbr->ihu_expiry && nbr->ihu_expiry <= now)
babel_expire_ihu(nbr);
if (nbr->hello_expiry && nbr->hello_expiry <= now)
babel_expire_hello(nbr);
}
}
}
/*
* Best route selection
*/
/*
* From the RFC (section 3.5.1):
*
* a route advertisement carrying the quintuple (prefix, plen, router-id, seqno,
* metric) is feasible if one of the following conditions holds:
*
* - metric is infinite; or
*
* - no entry exists in the source table indexed by (id, prefix, plen); or
*
* - an entry (prefix, plen, router-id, seqno', metric') exists in the source
* table, and either
* - seqno' < seqno or
* - seqno = seqno' and metric < metric'.
*/
static inline int
babel_is_feasible(struct babel_source *s, u16 seqno, u16 metric)
{
return !s ||
(metric == BABEL_INFINITY) ||
(seqno > s->seqno) ||
((seqno == s->seqno) && (metric < s->metric));
}
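The feasibility condition can be exercised on its own with a small sketch. The names `sketch_source`, `sketch_is_feasible`, and `SKETCH_INFINITY` are hypothetical, and the value 0xFFFF for the infinite metric is an assumption, since the definition of BABEL_INFINITY is not shown in this part of the diff. Like the function above, the sketch compares sequence numbers with a plain `>` rather than modular arithmetic.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_INFINITY 0xFFFF  /* assumed value of the infinite metric */

/* Hypothetical stand-in for the feasibility distance kept per source */
struct sketch_source {
  uint16_t seqno;
  uint16_t metric;
};

/* Feasible if: no source entry exists, the update is a retraction,
 * the seqno is newer, or the seqno is equal and the metric is
 * strictly better than the stored one. */
static int
sketch_is_feasible(const struct sketch_source *s, uint16_t seqno, uint16_t metric)
{
  return !s ||
    (metric == SKETCH_INFINITY) ||
    (seqno > s->seqno) ||
    ((seqno == s->seqno) && (metric < s->metric));
}
```

Note that an update with the same seqno and the same metric as the stored feasibility distance is not feasible; the metric must be strictly lower.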
static u16
babel_compute_rxcost(struct babel_neighbor *n)
{
struct babel_iface *ifa = n->ifa;
u8 cnt, missed;
u16 map=n->hello_map;
if (!map) return BABEL_INFINITY;
cnt = u32_popcount(map); // number of bits set
missed = n->hello_cnt-cnt;
if (ifa->cf->type == BABEL_IFACE_TYPE_WIRELESS)
{
/* ETX - Appendix A.2.2 in the RFC.
beta = prob. of successful transmission.
rxcost = BABEL_RXCOST_WIRELESS/beta
Since: beta = 1-missed/n->hello_cnt = cnt/n->hello_cnt
Then: rxcost = BABEL_RXCOST_WIRELESS * n->hello_cnt / cnt
*/
if (!cnt) return BABEL_INFINITY;
return BABEL_RXCOST_WIRELESS * n->hello_cnt / cnt;
}
else
{
/* k-out-of-j selection - Appendix A.2.1 in the RFC. */
DBG("Babel: Missed %d hellos from %I\n", missed, n->addr);
/* Link is bad if more than half the expected hellos were lost */
return (missed > n->hello_cnt/2) ? BABEL_INFINITY : ifa->cf->rxcost;
}
}
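A standalone sketch of the two rxcost rules may help when checking the arithmetic. All names here are hypothetical. The constant 256 for the nominal wireless cost is an assumption taken from the documented default for wireless interfaces, and 0xFFFF is assumed for the infinite metric. A portable bit-counting loop replaces u32_popcount().

```c
#include <stdint.h>

#define SKETCH_INFINITY        0xFFFF  /* assumed infinite metric */
#define SKETCH_RXCOST_WIRELESS 256     /* assumed nominal wireless rxcost */

/* Count the bits set in a 16-bit hello history map */
static unsigned
sketch_popcount16(uint16_t v)
{
  unsigned c = 0;
  while (v)
  {
    c += v & 1;
    v >>= 1;
  }
  return c;
}

/* ETX: nominal cost divided by the fraction of hellos received */
static uint16_t
sketch_etx_rxcost(uint16_t hello_map, uint8_t hello_cnt)
{
  unsigned cnt = sketch_popcount16(hello_map);
  if (!cnt)
    return SKETCH_INFINITY;
  return (uint16_t) (SKETCH_RXCOST_WIRELESS * hello_cnt / cnt);
}

/* Wired links: configured rxcost unless more than half of the
 * expected hellos were missed */
static uint16_t
sketch_wired_rxcost(uint16_t hello_map, uint8_t hello_cnt, uint16_t rxcost)
{
  unsigned cnt = sketch_popcount16(hello_map);
  if (!cnt)
    return SKETCH_INFINITY;
  unsigned missed = hello_cnt - cnt;
  return (missed > hello_cnt / 2u) ? SKETCH_INFINITY : rxcost;
}
```

For example, receiving 8 of 16 expected hellos doubles the wireless rxcost to 512, while a wired link with the same history still reports its configured rxcost, because exactly half missed is not more than half.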
static u16
babel_compute_cost(struct babel_neighbor *n)
{
struct babel_iface *ifa = n->ifa;
u16 rxcost = babel_compute_rxcost(n);
if (rxcost == BABEL_INFINITY) return rxcost;
else if (ifa->cf->type == BABEL_IFACE_TYPE_WIRELESS)
{
/* ETX - Appendix A.2.2 in the RFC */
return (MAX(n->txcost, BABEL_RXCOST_WIRELESS) * rxcost)/BABEL_RXCOST_WIRELESS;
}
else
{
/* k-out-of-j selection - Appendix A.2.1 in the RFC. */
return n->txcost;
}
}
/* Simple additive metric - Appendix A.3.1 in the RFC */
static u16
babel_compute_metric(struct babel_neighbor *n, uint metric)
{
metric += babel_compute_cost(n);
return MIN(metric, BABEL_INFINITY);
}
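The ETX link cost and the saturating additive metric can be sketched together as below. The names are hypothetical, and the constants 0xFFFF and 256 are assumptions, as their definitions are not shown in this part of the diff. Unlike babel_compute_cost() above, the sketch clamps the ETX product to infinity before narrowing it to 16 bits; that clamp is an addition made for this sketch, not something the patch does.

```c
#include <stdint.h>

#define SKETCH_INFINITY        0xFFFFu  /* assumed infinite metric */
#define SKETCH_RXCOST_WIRELESS 256u     /* assumed nominal wireless cost */

/* ETX link cost: max(txcost, nominal) * rxcost / nominal,
 * computed in 32 bits and clamped to infinity (sketch-only clamp) */
static uint16_t
sketch_etx_cost(uint16_t txcost, uint16_t rxcost)
{
  if (rxcost == SKETCH_INFINITY)
    return SKETCH_INFINITY;

  uint32_t tx = (txcost > SKETCH_RXCOST_WIRELESS) ? txcost : SKETCH_RXCOST_WIRELESS;
  uint32_t cost = tx * rxcost / SKETCH_RXCOST_WIRELESS;
  return (uint16_t) (cost < SKETCH_INFINITY ? cost : SKETCH_INFINITY);
}

/* Additive metric: advertised metric plus link cost, saturating at infinity */
static uint16_t
sketch_add_metric(uint32_t metric, uint16_t link_cost)
{
  metric += link_cost;
  return (uint16_t) (metric < SKETCH_INFINITY ? metric : SKETCH_INFINITY);
}
```

Taking the metric as a wider unsigned type, as babel_compute_metric() does with `uint`, lets the sum exceed 0xFFFF before the saturation check instead of wrapping.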
/**
* babel_announce_rte - announce selected route to the core
* @p: Babel protocol instance
* @e: Babel route entry to announce
*
* This function announces a Babel entry to the core if it has a selected