Microsoft Azure and Office 365 downed by DNS configuration blunder
A DNS CONFIGURATION BLUNDER has been blamed for an outage that downed Microsoft Azure and Microsoft 365 services.
Microsoft cloud users across the globe this week moaned that they were unable to connect to a plethora of services for several hours as a result of the DNS borkage.
A range of Microsoft services were impacted, including Azure databases, Microsoft 365, Teams, Dynamics, OneDrive, SharePoint Online and Compute services.
Following reports of the flaw, Microsoft posted an Azure status page update stating: “Customers may experience intermittent connectivity issues with Azure and other Microsoft services (including M365, Dynamics, DevOps, etc).
“Engineers are investigating DNS resolution issues affecting network connectivity. Connectivity issues are resulting in downstream impact to Compute, Storage, and Database services, and some customers may be unable to file support requests.
“More information will be provided as it becomes available. Some customers may start to see recovery.”
After investigating the outage, Microsoft confirmed that “users may be unable to access Microsoft 365 services or features”, adding that it had “identified and corrected a DNS configuration issue that prevented users from accessing Microsoft 365 services and features.
“We’ve observed an increase in successful connections and our telemetry indicates that all services are recovering. We’re continuing to monitor the environment to validate that service has been restored.”
The three-hour outage has since come to an end, with Microsoft confirming that its engineers have mitigated the issue and that most services have recovered.
“Engineers identified the underlying root cause as a nameserver delegation change affecting DNS resolution and resulting in downstream impact to Compute, Storage, App Service, AAD, and SQL Database services,” it said.
“During the migration of a legacy DNS system to Azure DNS, some domains for Microsoft services were incorrectly updated. No customer DNS records were impacted during this incident, and the availability of Azure DNS remained at 100% throughout the incident. The problem impacted only records for Microsoft services.
“To mitigate, engineers corrected the nameserver delegation issue. Applications and services that accessed the incorrectly configured domains may have cached the incorrect information, leading to a longer restoration time until their cached information expired.”
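Microsoft's point about cached information explains why recovery lagged behind the fix. The Python sketch below is a toy model of that effect, not Microsoft's resolver or any real DNS library: the `CachingResolver` class, the `service.example` name, the nameserver values and the one-hour TTL are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedAnswer:
    value: str
    expires_at: float  # time in seconds after which the entry is stale


class CachingResolver:
    """Toy resolver that caches each answer for a fixed TTL (hypothetical model)."""

    def __init__(self, authoritative: dict, ttl: float):
        self.authoritative = authoritative  # shared "source of truth" mapping
        self.ttl = ttl
        self.cache = {}

    def resolve(self, name: str, now: float) -> str:
        entry = self.cache.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.value  # served from cache, possibly outdated
        value = self.authoritative[name]  # fresh lookup against the source of truth
        self.cache[name] = CachedAnswer(value, now + self.ttl)
        return value


# Hypothetical data: the delegation initially points at the wrong nameserver.
zone = {"service.example": "ns-wrong.example"}
resolver = CachingResolver(zone, ttl=3600)

before_fix = resolver.resolve("service.example", now=0)     # caches the bad answer
zone["service.example"] = "ns-correct.example"              # the delegation is corrected
during_ttl = resolver.resolve("service.example", now=1800)  # cache entry has not expired
after_ttl = resolver.resolve("service.example", now=3600)   # entry expired, fresh lookup

print(before_fix, during_ttl, after_ttl)
```

In this model the client keeps receiving the bad answer after the record has been corrected, and only picks up the fix once its cached entry expires, which is the "longer restoration time" Microsoft describes.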