TCP vulnerability puts BGP at risk

Rumors have been floating around for days, as the referrer log for this site shows large numbers of people looking for "BGP hack" and "BGP MD5". But the cat is out of the bag now:

NISCC Vulnerability Advisory 236929

Please see the page linked above for detailed information. The short version is that TCP sequence numbers turn out much easier to guess than assumed until now, which makes long-lived TCP sessions vulnerable to reset attacks. Since BGP sessions can remain for days, weeks or even months, and other pertinent information is relatively easy to find, BGP is the protocol most affected by this vulnerability.

Fortunately, the BGP TCP MD5 option protects against exactly this problem. Enable it if at all possible. Most, if not all, routers support it. The option is enabled on Cisco routers as follows:

!
router bgp 12345
 neighbor 192.168.0.1 password use-upto-80-characters
!

However, this will break any running BGP sessions so coordinate the change closely with the remote AS.

Since this mechanism operates at the TCP level, host-based routers such as Zebra or Quagga running on BSD or Linux typically don't support this option. However, there is some rudimentary support in both OSes, see the SANS advisory.

Note: The "BGP TTL hack" or GTSM (see below or above) also offers protection against the TCP vulnerability, without adding the MD5 crypto overhead. And good anti-spoofing filters do the same, but the problem there is that the other AS also needs to implement them, something that can't be assumed.

It seems the actual risks aren't as bad as the reports seem to indicate at first glance. I'll post a more detailed analysis later, but from discussions on NANOG it seems the only new aspect is that previously people didn't realize that the RST packet could have any sequence number that falls inside the receive window on the potential victim, which is often around 16k. This means the attacker only has to guess the first 18 bits of the sequence number rather than the full 32 bits. However, she also needs to guess both port numbers, which makes the number of possible combinations an attacker must try around a billion, which amounts to a DoS attack of 10000 packets per second for more than a day.

Permalink - posted 2004-04-20

BGP TTL "hack"

At the NANOG 26 meeting in october 2002, Dave Meyer presented a very simple proposal to protect BGP sessions against attacks: set the TTL to 255 on outgoing packets, and check whether the TTL in received packets is equal to 255. Since routers always lower the Time To Live (or Hop Limit in IPv6) when forwarding a packet, and routers discard packets with a TTL of 0, there is no way for anyone who isn't attached to the subnet in question to inject packets with a TTL of 255 into a subnet. RFC3682 was published in february and describes the details of the "Generalized TTL Security Mechanism (GTSM)".

Cisco has now included GTSM into IOS release 12.3(7)T, as explained in the feature guide. It seems there are some interesting caveats. First of all, Cisco states that enabling the feature using the neighbor ... ttl-security command will only enable the check for incoming packets and not change any behavior as to outgoing packets. So this must mean they always use a TTL of 255 for outgoing packets now. However, older IOS versions set the TTL for BGP packets to 1 (in the absence of any ebgp-multihop settings). If this is the case, then detecting whether a neighbor is directly connected won't be very reliable right now. Then again, Cisco says the feature must be configured on both ends of an eBGP session (no support for iBGP as of yet) which seems to contradict this.

Another thing is that they look for a TTL of 254 or higher. This suggests that for incoming packets with a TTL of 255, they first decrease the TTL and then go on to process the TCP segment. Again, this is not how things work in older IOS versions, as it's perfectly possible to set the TTL to 0 and still interact with a Cisco router on the local subnet. So unless something changed in this regard as well, accepting a TTL of 254 means that there can still be a router in between!

Note that this mechanism only offers protection against attacks on port 179 of a router from "far away". Anyone on the local subnets still gets to do whatever they please and the content of the BGP sessions isn't protected any better than before.

Permalink - posted 2004-04-08

BGP on Cisco 2500

It has been a while since the last news posting. My apologies for that. Here is something to hold you over until I can find some real news:

When perusing the HTTP referrer log, I noticed that a lot of people are finding this site in search of "bgp+2500" or something similar. So... is it possible to run BGP on a Cisco 2500 router?

The short answer is "yes". The IOS images for the 2500 support BGP, including BGP for IPv6. (They do not, however, support OSPF for IPv6 even though OSPF for IPv4 is supported and the "ipv6 router ospf" command may exist. Same thing for IS-IS: the command exists, but the protocol isn't present in any of the 2500 images.)

The slightly longer answer is that a 2500 is of limited use for inter-domain routing. Actually way back when I got started with BGP I used a 2514 with 16 MB RAM and it could hold the entire 35000 or so entry global routing table. From two upstreams even, if I remember correctly. However, in the mean time the global routing table has gotten four times as big, and the 2500's memory limit is still 16 MB. The fact that the 2500 series sports a 68030 CPU doesn't really help either. All of this means that you can only run BGP on a 2500 if you don't send it more than a few thousand routes. You also shouldn't send it a full feed and have the 2500 filter out the unwanted routes, as this will tax the CPU too much and make for many-minute convergence times.

Note that even 35000 routes wouldn't work anymore today, as modern IOS images need a lot more memory for their internal house keeping. On bigger routers it gets even worse because unlike the 2500, those can't run their software from flash, so it must be copied to RAM. Additionally, the switching path of choice is CEF these days, which takes a lot of memory. (Without CEF you'll be using fast switching which uses just as much memory but only when needed, so it's not only the CPU that melts down but you also run out of memory when a slammer-like worm hits.)

So you may be able to get away with a full table on a Cisco with 128 MB RAM (or you may not), but 256 MB gives you much more elbow room. Unfortunately, Cisco still makes boxes that can run BGP but won't take enough memory to do so properly. A good example are the 3550 series multilayer switches.

Permalink - posted 2004-03-31

Apple Safari IPv6 hack

IPv6-enabled operating systems such as Windows, Linux and FreeBSD all come with a web browser that also supports IPv6 and prefers IPv6 when both IPv4 is available. Things are slightly different with Safari, Apple's browser application. Initially, Safari only supported literal IPv6 addresses and some corner case DNS names. In the current version, Safari will do IPv6 if no IPv4 address is available, but it won't prefer IPv6 over IPv4 or fall back to IPv6 when IPv4 doesn't work.

However, Nicholas Humfrey has come up with a trick. By enabling Safari's debug mode and switch off one of the two HTTP loaders that are normally used, Safari will prefer IPv6 when it's available. See the Mac OS X hints article that Nick posted for the details.

Permalink - posted 2004-01-28

Clearing the DF bit

As I wrote a few weeks ago in an article under the name "no ip unreachables", path MTU discovery doesn't work all that well across the internet in practice. Since then, I've noticed that people end up on this site looking for ways to clear the don't fragment bit in the IP header. So here is an example of how to do this on a Cisco router:

! route-map nodf permit 10 set ip df 0 ! interface FastEthernet2/0 ip policy route-map nodf !

Note that the "ip policy route-map nodf" command must be applied on the interface receiving the packets for which the DF bit must be cleared, and not the interface with the reduced MTU itself, where the packets are subsequently transmitted. See a page at Cisco for additional strategies.

Permalink - posted 2004-01-12

New miminum allocation size at RIPE

The RIPE NCC has changed its policy regarding the initial allocation that new LIRs receive. The rule that efficient use for at least a /22 must be demonstrated is now off the table, and the minimum allocation is now a /21 rather than a /20. See the announcement. RIPE also maintains a list of minimum allocation and assignment sizes for their address blocks (linked from the announcement), but this is pretty much useless because filtering on allocation size is too restrictive while filtering on assignment size is isn't restrictive enough for many address blocks. So be very careful when implementing prefix length filtering.

Without the jargon, please!

Right. Most of us get our IP addresses from our ISPs, and ISPs usually have one or more blocks of IP address space of their own. Having their own address space is important for ISPs because this allows them to be independent from their ISPs by allowing them to change ISPs without having to change addresses. (Obviously this is useful to end-users as well, but this changed policy applies to ISPs.) Until now, ISPs that wanted to get address space of their own needed to show that they and/or their customers would start using 1024 addresses (a /22) immediately. In this case, they would get a block of 4096 addresses (a /20). The advantage of having such a large block is that everyone in the world is prepared to store a pointer to it in their routers, making the addresses globally usable without limitations.

Since some networks only accept routing information for the smallest address blocks that RIPE and the other Regional Internet Registries (ARIN, APNIC and LACNIC) give out to ISPs. Smaller address blocks aren't entirely useless, but they may not be globally reachable without having to depend on the ISP the addresses came from, which of course limits ISP independence.

Since RIPE is now giving out blocks of 2048 addresses (/21) from some of their address blocks, networks are expected (and pretty much forced) to accept these blocks. This is good news for small ISPs that want their own independent block: they no longer have to jump through hoops trying to show they need 1024 addresses, or make do with only semi-independent addresses.

Note that the other RIRs haven't changed their policies (or at least there are no announcements to be found). ARIN's policy for instance, is even more restrictive than the old RIPE policy: multihomed networks must show efficient use of a /21 to get a /20, single homed ISPs must even show efficient use of a full /20 to get a /20. So for now the good news only applies to ISPs in the RIPE region, which is roughly Europe, the Middle East, Africa north of the Sahara and the former Soviet Union. For more info, see the RIR policy comparison matrix.

Permalink - posted 2004-01-10

older posts - newer posts