BGP Routing Error Takes Down Cloudflare's 1.1.1.1 DNS Resolver Globally
AI-created, human-edited.
On July 14th, 2025, something extraordinary happened that left millions of internet users scrambling. For exactly 62 minutes, Cloudflare's flagship DNS resolver 1.1.1.1 — affectionately dubbed "Quad One" by tech enthusiasts — completely vanished from the internet. What followed was a masterclass in how a single configuration error can bring down one of the internet's most critical services.
Security Now hosts Steve Gibson and Leo Laporte dove deep into this incident, revealing not just what happened, but why it matters for every internet user and what we can learn from this digital disaster.
"This resolver is so popular," Gibson noted, emphasizing just how earth-shaking this outage truly was. The incident affected the majority of 1.1.1.1 users globally, and for many, it meant that "basically all Internet services were unavailable."
The numbers tell the story of Cloudflare's dominance in the DNS space:
- Requests for IPv4 addresses make up 62.6% of 1.1.1.1's query volume
- Requests for IPv6 addresses make up 18.8%
- Traditional DNS over UDP still carries 86% of all queries
- DNS over TLS (DoT) handles 7.1% of queries
- DNS over HTTPS (DoH) accounts for 4.7%
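For readers curious what those transports look like in practice, here is a minimal client-side query sketch using the third-party dnspython package (an assumption: install it with `pip install dnspython[doh]`, since the DoH call needs an HTTP client extra). This is ordinary client tooling, not anything Cloudflare-specific.

```python
import dns.message
import dns.query

# One question, sent three different ways.
q = dns.message.make_query("example.com", "A")

# 1. Traditional DNS over UDP, port 53 (the 86% case)
r_udp = dns.query.udp(q, "1.1.1.1", timeout=2)

# 2. DNS over TLS (DoT), port 853
r_dot = dns.query.tls(q, "1.1.1.1", timeout=2)

# 3. DNS over HTTPS (DoH), addressed by hostname rather than the raw IP
r_doh = dns.query.https(q, "https://cloudflare-dns.com/dns-query", timeout=2)

for label, resp in [("UDP", r_udp), ("DoT", r_dot), ("DoH", r_doh)]:
    print(label, [rrset.to_text() for rrset in resp.answer])
```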
As Laporte observed, people using 1.1.1.1 "are more sophisticated than a normal user" — they've deliberately chosen to move away from their ISP's default DNS servers.
The incident wasn't the result of a cyberattack or BGP hijack, but rather an internal configuration error that had been lurking in Cloudflare's systems since June 6th — more than a month before the outage occurred.
Gibson explained the technical details: "This configuration error sat dormant in the production network as the new DLS service was not yet in use, but it set the stage for the outage on July 14th."
The problem stemmed from Cloudflare's Data Localization Suite (DLS), which allows customers to configure services to meet compliance requirements in different regions. A fundamental configuration error linked the universal Quad One resolver IP addresses to a DLS service that should only be available in specific locations.
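To make that failure mode concrete, here is a purely conceptual sketch of the class of mistake being described: a globally anycast prefix accidentally attached to a location-scoped service. The data structures, service names, and the "a scoped claim suppresses the global announcement" rule are illustrative assumptions, not Cloudflare's actual configuration system.

```python
# Conceptual illustration only -- not Cloudflare's configuration format.
GLOBAL_RESOLVER_PREFIXES = {"1.1.1.0/24", "1.0.0.0/24"}

services = {
    # The public resolver should be announced from every data center.
    "public-resolver": {"prefixes": set(GLOBAL_RESOLVER_PREFIXES), "scope": "global"},
    # A hypothetical location-limited Data Localization Suite service.
    "dls-eu-only": {"prefixes": {"203.0.113.0/24"}, "scope": "eu-only"},
}

# The dormant mistake: the resolver prefixes get linked to the scoped service.
services["dls-eu-only"]["prefixes"] |= GLOBAL_RESOLVER_PREFIXES

def announced_globally(prefix: str) -> bool:
    """In this toy model, a prefix stays global only if no scoped service claims it."""
    return not any(
        prefix in svc["prefixes"] and svc["scope"] != "global"
        for svc in services.values()
    )

# Nothing visibly breaks until the scoped service is actually put into use;
# at that point the resolver prefixes stop being announced everywhere.
for p in sorted(GLOBAL_RESOLVER_PREFIXES):
    print(p, "announced globally?", announced_globally(p))
```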
One of the most fascinating aspects of the discussion centered on how Cloudflare's Anycast routing works — and why it made this outage so devastating.
Unlike traditional "unicast" addresses that route to specific physical servers, Cloudflare's 1.1.1.1 is an Anycast address that automatically routes users to the closest Cloudflare data center. As Gibson explained, "that single, ubiquitous quad one IP will automatically cause any client's DNS lookup traffic to be routed to that closest data center for its resolution."
This system explains why Cloudflare consistently performs so well in DNS benchmarks — users are almost always connecting to servers just a few router hops away. But when the BGP routing configuration went wrong, it went wrong globally and instantly.
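A toy model helps show why the failure was so absolute. The sketch below is not real BGP, just an illustration of the best-path idea: every data center announces the same prefix, each client network picks the topologically closest announcement, and a withdrawal that removes the prefix from every data center leaves no route anywhere. The PoP names and path lengths are made up.

```python
from dataclasses import dataclass

@dataclass
class Announcement:
    prefix: str
    pop: str            # hypothetical point-of-presence name
    as_path_len: int    # shorter AS path ~= topologically closer

def best_route(announcements, prefix):
    """Return the 'closest' announcement for a prefix, or None if it was withdrawn."""
    candidates = [a for a in announcements if a.prefix == prefix]
    return min(candidates, key=lambda a: a.as_path_len, default=None)

# Normal anycast operation: 1.1.1.0/24 is announced from many PoPs at once.
table = [
    Announcement("1.1.1.0/24", "FRA", 2),
    Announcement("1.1.1.0/24", "IAD", 4),
    Announcement("1.1.1.0/24", "SIN", 6),
]
print(best_route(table, "1.1.1.0/24"))   # the nearest PoP wins (FRA here)

# The misconfiguration withdrew the prefix from every PoP simultaneously,
# so no network anywhere had a route left -- the failure was instantly global.
table = [a for a in table if a.prefix != "1.1.1.0/24"]
print(best_route(table, "1.1.1.0/24"))   # None: 1.1.1.1 is unreachable
```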
Perhaps most concerning was the parallel Gibson drew between this incident and the CrowdStrike disaster from the previous year. Both incidents shared a critical flaw: "updates to the configuration do not follow a progressive deployment methodology."
"Even though this release was peer-reviewed by multiple engineers," Gibson noted, "the change did not go through a series of canary deployments before reaching every Cloudflare data center."
Just as with CrowdStrike, there was "too much confidence placed in their automation," leading to an all-at-once deployment rather than incremental testing.
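What a progressive rollout might look like, in the abstract: push the change to a small canary slice, verify health, then widen the blast radius in stages. Everything here (the fleet size, stage percentages, deploy_to and health_check) is a hypothetical placeholder, not Cloudflare's deployment tooling.

```python
import random
import time

DATACENTERS = [f"dc-{i:03d}" for i in range(330)]   # assumed fleet size
STAGES = [0.01, 0.05, 0.25, 1.00]                    # canary -> global

def deploy_to(dc: str, config: dict) -> None:
    """Placeholder: push the configuration change to one data center."""
    pass

def health_check(dc: str) -> bool:
    """Placeholder: verify the resolver still answers from this data center."""
    return random.random() > 0.001

def progressive_deploy(config: dict) -> bool:
    done = 0
    for fraction in STAGES:
        target = int(len(DATACENTERS) * fraction)
        for dc in DATACENTERS[done:target]:
            deploy_to(dc, config)
        done = target
        time.sleep(1)  # soak time between stages (minutes or hours in practice)
        if not all(health_check(dc) for dc in DATACENTERS[:done]):
            print(f"Rollback: failure detected after {done} data centers")
            return False
        print(f"Stage OK: {done}/{len(DATACENTERS)} data centers updated")
    return True

if __name__ == "__main__":
    progressive_deploy({"change": "example"})
```

The point of the staged loop is exactly the one Gibson made: a bad change should be caught while it affects one percent of the fleet, not one hundred.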
The hosts emphasized a crucial lesson for users: always configure multiple DNS resolvers. "Standard best practice on the internet has always been to configure a pair, at least a pair, of DNS resolvers," Gibson stressed.
For users whose secondary resolver remained reachable, the impact would have been minimal: just "a brief stutter" as the operating system automatically switched to the backup resolver.
However, Gibson suspected that Cloudflare's primary and secondary addresses, 1.1.1.1 and 1.0.0.1, went down together, meaning users who had configured only that pair would have lost DNS, and effectively internet access, for the full hour.
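The fallback behavior itself is easy to picture: try each configured resolver in order and move on after a short timeout. Here is a minimal sketch using the third-party dnspython package (an assumption: `pip install dnspython`); the resolver list deliberately mixes providers, which is the real lesson of the outage.

```python
import dns.resolver

# Diverse providers, so one operator's outage doesn't take out every entry.
RESOLVERS = ["1.1.1.1", "1.0.0.1", "9.9.9.9", "8.8.8.8"]

def resolve_with_fallback(hostname: str, rtype: str = "A"):
    for server in RESOLVERS:
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [server]
        resolver.lifetime = 2.0        # give up quickly and try the next one
        try:
            answer = resolver.resolve(hostname, rtype)
            return server, [r.to_text() for r in answer]
        except Exception:
            continue                   # resolver down or unreachable: fall back
    raise RuntimeError("All configured resolvers failed")

print(resolve_with_fallback("example.com"))
```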
One interesting technical detail: DNS over HTTPS (DoH) traffic remained relatively stable during the outage. Most DoH users reach the service through the hostname cloudflare-dns.com rather than the raw IP address, and that hostname resolves to different IP addresses that were not affected by the BGP withdrawal.
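That detail is easy to check yourself: a DoH client talks to the hostname, not the anycast IP. The sketch below queries Cloudflare's public JSON DoH endpoint using only the Python standard library; the endpoint and accept header are documented Cloudflare behavior, while the helper function itself is just an illustration.

```python
import json
import urllib.request

def doh_query(name: str, rtype: str = "A") -> list:
    """Resolve a name via Cloudflare's JSON DoH endpoint at cloudflare-dns.com."""
    url = f"https://cloudflare-dns.com/dns-query?name={name}&type={rtype}"
    req = urllib.request.Request(url, headers={"accept": "application/dns-json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        payload = json.load(resp)
    return [answer["data"] for answer in payload.get("Answer", [])]

print(doh_query("example.com"))
```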
Cloudflare has committed to implementing progressive deployment methodologies and moving away from their legacy hard-coded system that proved "error prone." The incident revealed underlying issues with their deployment process that they're now addressing.
For users, this incident serves as a reminder of how dependent we've become on DNS resolution. As Laporte noted, it "shows how dependent on a DNS resolver we are — completely, I mean it is so crucial to the operation of all of the services that we now just take for granted on the internet."
This outage, while disruptive, provides valuable insights into internet infrastructure resilience. Gibson's work on the GRC DNS Benchmark continues to help users make informed decisions about their DNS providers, and incidents like this underscore why having multiple, diverse DNS resolvers configured is more important than ever.
The internet's robust design generally handles failures gracefully, but as Gibson observed, "Internet traffic is great and it works incredibly well right up until it utterly fails, and then it generally fails big."