The Cloudflare Outage of 2022: How AWS also got affected?

HIGHLIGHTS

After 06:27 UTC, users began reporting Cloudflare downtime on DownDetector.
19 datacenters of Cloudflare were down.
As a result of the outage, users started receiving 5xx error messages.
Furthermore, Cloudflare acknowledged the issue and said it is working to resolve it.
The first data center went online at 06:58 UTC.
All data centers were online and functioning correctly by 07:42 UTC.

Introduction to Cloudflare Outage

The Cloudflare outage that took down so many sites on June 21, 2022, was big news, but the real drama was revealed later on as it was discovered that Amazon Web Services went down due to the Cloudflare outage. According to Downdetector, a range of different websites and services were affected by the outage, including Discord, Shopify, Grindr, Fitbit, AWS, and Peloton. How did that happen? Was it really an accident? I’ll tell you how it all went down and what that means for your business!

During the Cloudflare outage of 2022, the widespread disruption didn’t spare even industry giants like AWS, as it had a cascading effect on various parts of the internet ecosystem. The Cloudflare outage, while not directly related to AWS, revealed the intricate interdependencies within the global network infrastructure. AWS services, often relied upon by numerous websites and applications, experienced connectivity issues and latency during the Cloudflare outage, highlighting how even seemingly isolated incidents can ripple through the digital landscape, affecting multiple key players in the cloud computing and content delivery sectors.

What is Cloudflare?

Cloudflare provides a free content delivery network and security service, sitting between users and hosting providers, protecting websites from attacks. It has 150+ data centers around the world. They also offer other services including web acceleration, domain name registration, and DNS management as well as analytics software called Cloudflare Spectrum.

Cloudflare Outage: A look back at what went wrong?

On June 21, 2022, a widespread outage that originated with Cloudflare left multiple websites globally unavailable, causing congestion at key internet hubs like Amazon Web Services (AWS), Twitter, Zerodha, Shopify, Discord, and Canva. As of late, Cloudflare has been converting all of its busiest locations to a more flexible and resilient architecture.

The cause of this outage was a change to the network configuration within Cloudflare’s busiest locations as part of a long-term project to increase resilience.

Background of change in Network configuration

As part of Cloudflare’s ongoing effort to build a more flexible, resilient architecture, they have been converting their busiest locations over the last 18 months. This architecture has been used in 19 data centers: Amsterdam, Atlanta, Ashburn, Chicago, Frankfurt, London, Los Angeles, Madrid, Manchester, Miami, Milan, Mumbai, Newark, Osaka, San Jose, Singapore, Sydney, and Tokyo.

Using this new architecture has improved reliability and allowed for maintenance to be performed without disrupting customer traffic. Since these locations also carry a significant share of Cloudflare’s traffic, any problem there can have a widespread effect. Unfortunately, this is what happened on Tuesday, June 21.

What sites were affected?

According to the outage reporting website DownDetector, multiple websites were experiencing outages, including Discord, Zerodha, Shopify, Amazon Web Services, Twitter, Canva, and even Valorant, a popular battle royale game.

Along with WazirX, Coinbase, FTX, Bitfinex, and OKX, other websites, including Udemy, Splunk, Quora, and Crunchyroll, were unresponsive. The majority of these websites were re-accessible later.

Boost your earning potential with AWS expertise. Explore our certified AWS Courses for a high-paying career

Explore AWS Architect Professional Certification

Timelines:

03:56 UTC: The change was implemented to the first location and no issues were alerted as the data center was having old architecture.

06:17: Except MCP architecture location the change was deployed to all locations of Cloudflare.

06:27: The rollout happened on an MCP-enabled location and the problem started and ultimately affected 19 locations.

06:51: New incident was raised and the root cause was verified.

06:58: The first data center went online.

07:42: All data centers were online and functioning correctly

08:00: Cloudflare closed the incident.

Though this just accounts for 4% of Cloudflare’s total network, it’s 50% of requests impacted.

Why did this affect AWS too?

Cloudflare integrates quickly and easily with AWS. Host your websites and run applications on AWS while keeping them secure, fast, and reliable. Use Cloudflare as a unified control plane for consistent security policies, faster performance, and load balancing for your AWS S3 or EC2 deployment.

Therefore many companies use AWS with Cloudflare and they all got affected due to this Cloudflare outage.

Don’t forget to register for Free Zoom Webinars to boost your knowledge.

Conclusion:

It’s hard to imagine a better time to be an engineer. We have all of these amazing tools at our disposal, and it’s up to us to decide how we will use them. One thing is for sure—things are going to get more interesting as companies continue investing in new ways for us to build things faster and more efficiently. If you are learning cloud computing please check out our AWS solution architect course also check our other blogs for more interesting blogs like this.

All Programs