Last week, there was a large route leak that involved Swiss hosting company Safe Host and China Telecom. The route leak made internet traffic for European telecoms operators KPN, Swisscom and Bouygues Telecom, among others, flow through Safe Host and China Telecom against the wishes of the telecom operators involved. See this Ars Technica story for more details.
In this post, I'm going to explain how the interaction between the technical and business aspects of internet routing has made this issue so difficult to fix. At the end I'll briefly describe a proposal that I think can actually fix it.
The trouble is, this happens all the time. The first big one was the AS 7007 incident in 1997. Then there was the big YouTube - Pakistan incident in 2008, with many more after that. Most route leaks happen by accident, but sometimes attackers manipulate BGP to reroute traffic on purpose, for instance, to steal cryptocurrency.
Even when there's no malicious intent, the effect of a routing leak can be severe: typically, the unintended path that now carries the traffic isn't prepared for the deluge, so there are severe slowdowns or the destination becomes entirely unreachable if BGP selects a leaked path.
BGP is the internet's routing protocol. Every device connected to the internet has an IP address, and BGP allows remote networks to discover the path to reach these IP addresses. There are currently over 60,000 networks (called "autonomous systems") participating in BGP. An autonomous system (AS) can be as small as a handful of servers somewhere in a datacenter, with two routers running BGP, or as large as a worldwide network connecting 10% or more of all internet users through hundreds or thousands of routers running BGP.
Left to its own devices, BGP will calculate the shortest path between any two places connected to the internet. Unfortunately, while this straightforward shortest path routing makes technical sense, it doesn't work from a business perspective. This misalignment between the technical and the business aspects of internet routing makes it so hard to protect against route leaks, be they accidental or with malicious intent.
In order to understand that, let's first have a look at how different networks interconnect and pay each other in the phone network. There are many telecom operators running networks spanning different geographical scopes. So for long distance calls, there are usually at least two and sometimes three or four phone companies involved: the local phone company that serves the user that places the call, then maybe one or more long distance carriers, and finally the phone company that serves the user that receives the call. So the user making the call pays their phone company. That phone company then pays the next carrier, and so on, so all the carriers involved get paid. This model works without too much trouble with shortest path routing: as you get more traffic, you get paid more.
With the internet, things are different. Rather than charge for "calls", ISPs typically charge users for a certain amount of bandwidth. Smaller internet service providers buy service from larger ones, similar to how end users buy service from their ISP. This is what we call "transit": a service where you get to send packets to all destinations connected to the internet, and you receive packets that anyone connected to the internet sends to you. As with the phone network, those remote destinations that you want to communicate with are usually served by a different ISP. And as with the phone network, at some point these networks come together and interconnect. But the big difference is that there's no money changing hands here. This is called "settlement-free" or "sender keeps all" interconnection. But usually we simply call it "peering". Each peering partner keeps the money they charge their customer, so they both get paid.
An important consequence of the sender keeps all model is that there are now traffic flows that make money: to or from a paying customer, and traffic flows that don't make money: from a non-paying peer to another non-paying peer. (Or even worse, traffic flows that cost money: from a non-paying peer to a transit provider.)
In the academic literature, this model is called "valley-free": you get to go up the customer-transit hierarchy, but once you traverse a peering connection or start to go down the hierarchy, you don't get to go back up again or traverse another peering connection.
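To make the valley-free rule concrete, here is a minimal sketch in Python. It assumes we already know the type of every hop along a path, which real routers do not. The step labels `up` (customer to transit provider), `peer` and `down` (transit provider to customer) and the function name are mine, purely for illustration.

```python
def is_valley_free(steps):
    """Check a path given as a list of hop types in travel order.

    Each hop is 'up' (customer to transit provider), 'peer'
    (across a peering connection) or 'down' (transit provider to
    customer). These labels are illustrative, not BGP attributes.
    """
    past_the_top = False  # True once we have peered or started going down
    for step in steps:
        if step == "up":
            if past_the_top:
                return False  # going back up again: a valley
        elif step == "peer":
            if past_the_top:
                return False  # a second peering hop, or peering after going down
            past_the_top = True
        elif step == "down":
            past_the_top = True
        else:
            raise ValueError(f"unknown hop type: {step!r}")
    return True


print(is_valley_free(["up", "up", "peer", "down"]))  # a normal path
print(is_valley_free(["up", "down", "up"]))          # a leak through a customer
```

The second call describes what happens when a customer passes routes from one transit provider on to another.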
So to avoid going out of business, we need to make sure our BGP routers only invite peers and transit providers to send traffic destined for our customers. This is done with a filter that only "advertises" our customers' and our own IP addresses to these networks. So our peers and our transit providers will only send us packets with those IP addresses as their destination, and that bandwidth will be paid for by our customers.
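As a rough sketch of such an export filter, assume every route is tagged with the kind of neighbor we learned it from. The tags (`self`, `customer`, `peer`, `transit`) and function names are hypothetical, and the prefixes are documentation ranges. Real routers express this with vendor-specific policy configuration rather than Python.

```python
KNOWN_SOURCES = ("self", "customer", "peer", "transit")


def should_export(learned_from, send_to):
    """Decide whether a route may be advertised to a neighbor.

    learned_from: where the route came from ('self', 'customer',
                  'peer' or 'transit').
    send_to:      the kind of neighbor we are advertising to
                  ('customer', 'peer' or 'transit').
    """
    if learned_from not in KNOWN_SOURCES:
        raise ValueError(f"unknown route source: {learned_from!r}")
    if send_to == "customer":
        return True  # paying customers get to see everything
    # Peers and transit providers only hear about our own and
    # our customers' addresses.
    return learned_from in ("self", "customer")


def export_filter(routes, send_to):
    """Return the prefixes from (prefix, learned_from) pairs that may be sent."""
    return [prefix for prefix, learned_from in routes
            if should_export(learned_from, send_to)]


routes = [
    ("192.0.2.0/24", "self"),
    ("198.51.100.0/24", "customer"),
    ("203.0.113.0/24", "peer"),
]
print(export_filter(routes, "peer"))
print(export_filter(routes, "customer"))
```

Leaving out that last `return` line, so that everything goes to everyone, is the kind of mistake that produces a route leak.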
Our customers get to see all the IP addresses that BGP carries world-wide. Usually we learn most of those from our transit providers. But the very largest ISPs in the world don't have any transit providers, so they learn all IP addresses that are actively used throughout the internet through peering. These networks are the tier 1 networks.
I'm sure many network engineers creating BGP filters didn't realize they were actually defining business relationships. When we look at the filters at two ends of a connection, we see four possible relationships:
| We send IP address ranges: | They send IP address ranges: | They are our: | We are their: |
|---|---|---|---|
| The entire internet | Their own and their customers' | Customer | Transit provider |
| Our own and our customers' | Their own and their customers' | Peer | Peer |
| Our own and our customers' | The entire internet | Transit provider | Customer |
| The entire internet | The entire internet | Mutual backup | Mutual backup |
Last but not least, to make the economics work and also to let BGP do its job properly, it's necessary that we assign a higher preference to customers than to peers, a higher preference to peers than to transit providers, and a higher preference to transit providers than to mutual backup connections (which are rare).
The higher preference for peering means that sometimes leaked routes with very long paths are still preferred over shorter, more direct paths.
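A small sketch shows why. It assumes a simplified route selection that looks only at the preference for the neighbor type first and the AS path length second. Real BGP has more tie-breakers. The preference numbers are arbitrary illustrative values, and the AS numbers come from the range reserved for documentation.

```python
# Illustrative preference values: higher wins. Only the ordering matters.
PREFERENCE = {"customer": 300, "peer": 200, "transit": 100, "backup": 50}


def best_route(routes):
    """Pick the best of a list of (neighbor_type, as_path) candidates.

    The neighbor-type preference is compared first. The AS path
    length only breaks ties between routes with equal preference.
    """
    return max(routes, key=lambda r: (PREFERENCE[r[0]], -len(r[1])))


candidates = [
    ("transit", [64500, 64501]),                    # short, legitimate path
    ("peer", [64502, 64503, 64504, 64505, 64501]),  # long, leaked path
]
print(best_route(candidates))
```

The long path learned from the peer wins, because the path length is never even compared.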
RFC 7908, Problem Definition and Classification of BGP Route Leaks, lists six types of route leaks. Types 5 and 6 can be fixed with RPKI (to be discussed shortly), so I'll skip these here. Types 1 - 4 are all failures to properly implement the filters that encode the business relationships between autonomous systems. Above, I explained what limitations an AS needs to set up to properly advertise their own (and their customers') IP addresses to other networks. But what if they fail to do this?
Transit providers should have filters that make sure their customers only advertise IP address ranges that belong to them. The AS 7007 and YouTube - Pakistan incidents are examples of what can happen when they fail to do so. Fortunately, the RPKI mechanism can protect against this. The Resource Public Key Infrastructure lets the legitimate holder of a block of IP addresses generate a Route Origin Authorization (ROA) that authorizes an autonomous system to advertise the address block. Network operators can download the address holder's certificates and ROAs from repositories, generate filters based on those, and sync the filters to their routers. So once RPKI is fully deployed (which will take a while, but we're on our way), incorrect advertisements will be filtered out.
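The origin check that such a filter performs can be sketched as follows. This is a simplified illustration, not a real validator: it assumes the ROAs have already been cryptographically verified and reduced to (prefix, maximum length, origin AS) tuples, and the function name is made up. The three outcomes follow the valid / invalid / not found states of RFC 6811.

```python
import ipaddress


def validate_origin(prefix, origin_as, roas):
    """Classify a BGP announcement against a list of ROAs.

    roas is a list of (prefix, max_length, asn) tuples that are
    assumed to have been verified already.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_as in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version != roa_net.version:
            continue  # never compare IPv4 against IPv6
        if not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the route
        covered = True
        if roa_as == origin_as and route.prefixlen <= max_length:
            return "valid"
    # Covered by at least one ROA, but none of them matched.
    return "invalid" if covered else "not-found"


roas = [("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, roas))     # authorized origin
print(validate_origin("192.0.2.0/24", 64501, roas))     # wrong origin AS
print(validate_origin("198.51.100.0/24", 64500, roas))  # no ROA at all
```

Note that the only AS number the check looks at is the origin. The rest of the AS path never enters into it.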
Unfortunately, RPKI as it exists today won't do anything against the Safe Host / China Telecom route leak from last week. It seems that in this case, Safe Host and/or China Telecom propagated addresses they had learned through transit or peering to other peers, in violation of the valley-free model. As RPKI only looks at the origin AS, it won't stop this. There's also another effort to improve the security of BGP: BGPsec. But BGPsec is only concerned with protecting the BGP AS path, which lists all the ASes in a BGP update. As these route leaks typically correctly show the incorrect AS path, BGPsec doesn't provide any benefits here. And BGPsec is a resource intensive protocol, with no implementations on the horizon.
The reason this type of route leak is hard to stop is that someone receiving a BGP update doesn't know the business relationships between the ASes in the AS path in the update, making it impossible to filter out updates that contain incorrect paths.
There is some work going on in the Internet Engineering Task Force (IETF) to make this easier. The Route Leak Prevention using Roles in Update and Open messages proposal wants to explicitly assign a role to a BGP connection. That's not a bad idea, but as it doesn't let others detect and correct mistakes made earlier in the path, it's probably not going to be very helpful against route leaks.
Another proposal is ASPA. This adds transit provider - customer relationship information to RPKI. This does let others detect earlier mistakes, but I'm not sure how it's going to work in practice. ASPA returns an "invalid" result when there is no transit provider - customer relationship on record: it can't tell the difference between an actual invalid path and a legitimate path for which a provider - customer relationship record is missing. Experience has shown that in such situations, it's extremely hard to get a new mechanism off the ground. ASPA also needs to have a view of the RPKI certificates and the BGP path information in the same place, requiring routers to do a lot more work.
So finally, here it is: my proposal to get rid of these route leaks. It's based on the notion that all ASes in the AS path must have a business or trust relationship with either the source or the destination. So if I want to send a packet to KPN, and Safe Host in Switzerland is in the AS path, that's legitimate if Safe Host is my transit provider (or a transit provider of my transit provider) or if Safe Host is a transit provider (or a transit provider of a transit provider) of KPN. If not, then a path with Safe Host in it is not a legitimate path between me and KPN, so I'll filter it out.
We implement this by extending RPKI ROAs so that in addition to the origin AS, they also list all possible transit ASes. RPKI validators are configured with the list of transit ASes that serve the local AS. So the validator adds the local list to the list of trusted ASes in each ROA, and thus creates a filter that routers can use to reject BGP AS paths with unauthorized AS numbers in them.
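Here is a minimal sketch of the resulting check. It assumes the extended ROA has already been validated and boiled down to an origin AS plus a set of transit ASes. The function name, parameter names and AS numbers (from the documentation range) are all made up for illustration. No such ROA format exists today.

```python
def path_allowed(as_path, roa_origin, roa_transit_ases, local_transit_ases):
    """Check an AS path against an extended ROA plus the local transit list.

    as_path:            AS numbers as received, origin AS last.
    roa_origin:         origin AS authorized by the (hypothetical) extended ROA.
    roa_transit_ases:   transit ASes listed in that ROA.
    local_transit_ases: transit ASes serving the local AS, as
                        configured on the validator.
    """
    if not as_path or as_path[-1] != roa_origin:
        return False  # wrong origin: the check RPKI already does today
    trusted = {roa_origin} | set(roa_transit_ases) | set(local_transit_ases)
    # Every AS in the path must be vouched for by one of the two ends.
    return all(asn in trusted for asn in as_path)


roa_transit = {64501}    # the destination's transit provider
local_transit = {64510}  # our own transit provider

print(path_allowed([64510, 64501, 64500], 64500, roa_transit, local_transit))
print(path_allowed([64510, 64511, 64501, 64500], 64500, roa_transit, local_transit))
```

In the second path AS 64511 is listed by neither end, so the path is rejected, even though its origin is correct.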
As always, coming up with something that can work isn't the hard part. Getting it deployed is invariably the difficult part. I think in this case the deployment scenario is looking good: it's to the advantage of both the origin AS and the validating AS to participate, because that way they can immediately start filtering out route leaks that would affect the communication between them, even if nobody else implements the system.
However, the ROAs are managed by the Regional Internet Registries (the RIPE NCC in Europe), so they will have to implement the update to the ROA format. And the current protocol to send RPKI filters to BGP routers (RFC 8210) will have to be extended to support sending AS path filters.
So who is with me?