Amazon says it has fixed the major outage in its cloud computing service that disrupted internet use worldwide on Sunday. The problem affected many online platforms, including social media, gaming, food delivery, streaming, and financial services.
AWS returns to ‘normal operations’
Amazon Web Services (AWS) said its systems have “returned to normal operations,” ending the 15-hour outage. Earlier, the company had said that the main cause of the outage was resolved but that some AWS services were still facing connectivity problems. These included Lambda, its service that lets developers run application code in the cloud without managing servers.
AWS explained that Lambda users might see occasional errors when their functions try to connect to other systems or services while the company works to fix leftover network issues. To limit these problems, AWS temporarily reduced the rate at which Lambda polls Amazon SQS queues for messages. Now that the number of successful operations is increasing and errors are going down, the company is gradually returning the polling rate to normal.
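The idea of cutting a polling rate while errors are high and then restoring it step by step can be sketched roughly as follows. This is only an illustration, not AWS's actual mechanism: the function name `next_polling_rate`, the 5% error threshold, the halving factor, and the 10% recovery step are all assumptions made for the example.

```python
def next_polling_rate(current, normal, error_rate,
                      threshold=0.05, backoff=0.5, step=0.1, min_rate=1.0):
    """Return the polling rate (polls per second) to use for the next interval.

    All parameters are hypothetical tuning knobs for this sketch:
      current    - polling rate used in the last interval
      normal     - the rate to return to once the system is healthy
      error_rate - fraction of failed operations observed (0.0 to 1.0)
    """
    if error_rate > threshold:
        # Too many errors: back off sharply, but never stop polling entirely.
        return max(min_rate, current * backoff)
    # Errors are low: climb back toward normal in small, fixed steps.
    return min(normal, current + step * normal)


if __name__ == "__main__":
    rate = 100.0
    # Simulated error rates: a spike followed by a gradual recovery.
    for observed in [0.30, 0.20, 0.04, 0.02, 0.01, 0.0, 0.0]:
        rate = next_polling_rate(rate, 100.0, observed)
        print(f"error rate {observed:.2f} -> polling rate {rate:.1f}/s")
```

Cutting the rate quickly and raising it slowly is a common choice in this kind of control loop, because it avoids flooding a system that is still recovering.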
By 8 am Eastern Time, AWS had updated the outage status from “degraded” to “impacted,” saying it was still working to clear a backlog of delayed user requests.
The company later clarified that all remaining technical issues were fixed.
What led to the massive AWS outage?
The company said the main reason for the outage was an “underlying DNS issue,” a failure in the Domain Name System, which works like the internet’s phonebook by turning website names into numerical IP addresses.
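To make the phonebook comparison concrete, here is a minimal sketch of a DNS lookup using Python's standard `socket` module. The helper name `resolve` and the example hostnames are assumptions for illustration; the sketch shows only what a lookup looks like from an application's side and says nothing about how AWS's internal DNS is built.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses for a hostname.

    Returns an empty list when the name cannot be resolved, which is
    roughly what an application sees when DNS is failing.
    """
    try:
        results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The lookup failed: no entry, or no resolver could be reached.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({sockaddr[0] for *_, sockaddr in results})


if __name__ == "__main__":
    # Example hostnames only; the addresses printed depend on your network.
    for name in ["example.com", "no-such-host.invalid"]:
        addresses = resolve(name)
        print(name, "->", addresses if addresses else "lookup failed")
```

When the lookup fails, the application has no address to connect to, so a service can be unreachable even though its servers are still running.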
By 6:35 am Eastern Time, AWS said the outage had been “fully mitigated,” meaning most systems were back to normal. However, its engineers later explained that some services were still “experiencing elevated errors.”
The problem started in the US-East-1 (Northern Virginia) region, which is a key hub for AWS operations. This caused a chain reaction that affected many digital services around the world.
Which platforms were affected?
Monitoring site Downdetector showed that users faced problems on several popular platforms such as WhatsApp, Snapchat, Pinterest, Zoom, Signal, Fortnite, Xbox, and YouTube. Work and lifestyle-related apps like Canva, Duolingo, Strava, and Peloton were also affected and showed errors.
Flickr, the photo-sharing website, said it had gone offline for a while because of a major problem with Amazon Web Services.
Will the disruption continue?
AWS engineers said that most systems are becoming stable again, but some minor slowdowns might continue for a while. The company has not yet released a detailed report explaining what caused the problem or how it plans to prevent it from happening again.