This week, Microsoft’s Linux package repositories suffered an outage lasting hours, followed by performance problems that persisted for a day.
Users who rely on the packages.microsoft.com repository to fetch packages for Linux distributions, including Ubuntu, Debian, CentOS, OpenSUSE, and Fedora, received errors.
Microsoft engineers have acknowledged the problem and are working on a fix.
Microsoft’s Linux repositories go down
The packages.microsoft.com repository suffered an extended outage this week.
Linux and Solaris specialist Štefan Jarina first reported the issue on June 16, describing a series of “404 not found” errors when downloading “.deb” files from the repository.
Jarina’s report was later confirmed by other engineers who ran into the same problem, and some saw “500 Internal Server Error” messages when trying to fetch Debian packages.
Microsoft engineer Rahul Bhandari chimed in on the same GitHub thread to confirm:
“Our infrastructure team is working on this. There is a problem with some of the mirrors in packages.microsoft.com so according to them, the current ETA to solve this problem is in the next two hours,” said Bhandari.
Bhandari later confirmed that storage issues were the main cause of the outage.
While the problem was being investigated, several users requested an “incident response report” explaining why the mirror sites had also failed in this outage and why such failures keep recurring.
“Will there be an incident report in response to this? I would be particularly interested in why the mirror sites were not available or, if they are available, why there is a single point of failure that affects them all.”
“We have faced issues in the past where packages failed when a deployment was running, but a catastrophic failure of this nature would have affected many production workloads today.”
“Package managers are the backbone of our industry and we must be able to trust them.”
“I have been forced to remove dependency on Microsoft package repositories in favor of self-hosted ones for the time being, which is unnecessary manual maintenance that I would like to avoid if possible,” said engineer Michael Armitage.
Service restored, but users experience degraded performance
Although Microsoft’s initial ETA to resolve the issue was “approximately two hours”, the outage stretched well beyond 14 hours, and users continue to experience degraded performance.
Microsoft’s chief engineering officer, Ravindra Bhartiya, said:
“We had an incident with packages.microsoft.com which caused the packages to be unavailable.”
“Our engineering team has mitigated the problem and our internal data shows an improvement in availability.”
“If you still have problems, please provide us with more information (“apt-get update | install” output) and we can investigate further,” said Bhartiya.
Even at the time of writing, users are still complaining about slow download speeds when retrieving packages from Microsoft’s repositories.
Some downloads reportedly took two or three to complete, prompting users to investigate workarounds. Performance and availability do, however, appear to be slowly improving and returning to normal.
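For readers weighing the kind of workaround users described, such as falling back to a self-hosted copy when the primary repository returns 404 or 500 errors, the following is a minimal sketch and not anything Microsoft or the affected users published. The function names (http_fetch, fetch_with_fallback), the retry counts, and the URLs in the usage comment are all hypothetical and chosen for illustration only.

```python
import time
import urllib.error
import urllib.request


def http_fetch(url, timeout=30.0):
    """Download a URL and return its body as bytes.

    urllib raises HTTPError (a URLError subclass) for responses
    such as 404 or 500, so failures surface as exceptions.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(urls, fetch=http_fetch, retries=3,
                        base_delay=1.0, sleep=time.sleep):
    """Try each source in order, retrying with exponential backoff.

    `urls` lists the primary repository first, then any fallbacks
    (for example a self-hosted mirror). `fetch` and `sleep` are
    injectable so the logic can be exercised without a network.
    """
    last_error = None
    for url in urls:
        for attempt in range(retries):
            try:
                return fetch(url)
            except (urllib.error.URLError, OSError) as exc:
                last_error = exc
                # Wait 1x, 2x, 4x ... base_delay between retries,
                # but do not wait after the final attempt on a source.
                if attempt < retries - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all package sources failed") from last_error


# Hypothetical usage (both URLs are placeholders, not real paths):
# data = fetch_with_fallback([
#     "https://packages.microsoft.com/<path-to-package>.deb",
#     "https://mirror.example.internal/<path-to-package>.deb",
# ])
```

A fallback like this only helps if the secondary source is independent of the primary one, which is the single-point-of-failure concern raised in the GitHub thread. It also does nothing to verify package signatures, which a real setup would still leave to the package manager.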
Large-scale disruptions of critical systems and CDNs have become a common occurrence of late.
Interestingly, the timing of this outage coincides with the Akamai outage that affected leading Australian banks and organizations yesterday, although the two incidents appear to be unrelated.