Airports, healthcare services and businesses could take weeks to fully recover from an IT failure that has wreaked havoc worldwide, experts have warned, after the “largest outage in history” disrupted the lives of millions.
Flights and hospital appointments were cancelled, payroll systems seized up and TV channels went off air after a botched software upgrade hit Microsoft’s Windows operating system.
The update came from the US cybersecurity company CrowdStrike, and left workers facing a “blue screen of death” as their computers failed to start. Experts said every affected PC may have to be fixed manually.
In the UK, Whitehall crisis officials were coordinating the response through the Cobra committee. Ministers were in touch with their sectors to tackle the fallout from the IT failure, and the transport secretary, Louise Haigh, said she was working “at pace with industry” after trains and flights were affected.
A Microsoft spokesperson said: “We’re aware of an issue affecting Windows devices due to an update from a third-party software platform. We anticipate a resolution is forthcoming.”
CrowdStrike confirmed the outage was due to a software update from one of its products and was not caused by a cyber-attack. Its founder and chief executive, George Kurtz, said he was “deeply sorry for the impact that we’ve caused to customers”, adding there had been a “negative interaction” between the update and Microsoft’s operating system.
Govia Thameslink Railway (GTR) – the parent company of Southern, Thameslink, Gatwick Express and Great Northern – warned passengers to expect delays. According to the service status monitoring website Downdetector, users in the UK were reporting issues with the services of Visa, BT, big supermarket chains, banks, online gaming platforms and media outlets.
The Sky News and CBBC channels were also temporarily off air in the UK before resuming broadcasting, while Australia’s ABC was also affected.
In financial services, Metro Bank reported problems with its phone lines in the UK and Santander said card payments “may be affected”. Monzo said some customers were reporting issues, while some bankers at JP Morgan were unable to log on to their systems and the London Stock Exchange said there were problems with its news service.
Troy Hunt, a leading cybersecurity consultant, said the scale of the IT failure was unprecedented.
“I don’t think it’s too early to call it: this will be the largest IT outage in history,” he tweeted.
“This is basically what we were all worried about with Y2K, except it’s actually happened this time,” he added, referring to the “millennium bug” that worried IT experts in the run-up to 2000 but ultimately did not cause serious damage.
The UK’s chartered institute for IT, the BCS, said it could take days and weeks for systems to recover, although some fixes will be easier to implement.
“In some cases, the fix may be applied very quickly,” said Adam Leon Smith, a BCS fellow. “But if computers have reacted in a way that means they’re getting into blue screens and endless loops it may be difficult to restore and that could take days and weeks.”
Alan Woodward, a professor of cybersecurity at the University of Surrey, said the fix required a manual reboot of affected machines and “most standard users would not know how to follow the instructions”. Organisations with thousands of PCs distributed in different locations face a tougher task, he added.
“It’s just sheer numbers. For some organisations it could certainly take weeks,” he said.
Among the companies affected on Friday was Ryanair, Europe’s largest airline, which said on its website: “Potential disruptions across the network due to a global third-party system outage … We advise passengers to arrive at the airport three hours in advance of their flight to avoid any disruptions.”
Heathrow, Europe’s biggest airport, said it was “working hard” to get passengers “on their way”.
Andrea Chan, 21, a recent graduate from London, was dismayed to find her flight from Heathrow to Rome cancelled on Friday.
She said: “I’ve filed for compensation but I think it takes up to 12 weeks for them to process that so I don’t even know if I’ll get my compensation.
“I got a voucher for £10 but that doesn’t really do much. I’m trying to rebook my flight but everything is extremely expensive now. I know it’s out of their control but I feel like there’s a lot of things to accommodate everyone better.”
A spokesperson for Heathrow said: “We continue to work with our airport colleagues to minimise the impact of the global IT outage on passenger journeys. Flights continue to be operational and passengers are advised to check with their airlines for the latest flight information.”
In the US, flights were grounded owing to communications problems that appeared to be linked to the outage. American Airlines, Delta and United Airlines were among the carriers affected. Berlin airport temporarily halted all flights on Friday. The aviation analytics company Cirium said 4,295 flights – 3.9% of those scheduled – were cancelled globally on Friday, including 143 UK departures.
GP practices in the UK said they were unable to access patient records or book appointments. Surgeries reported on social media that they could not access the EMIS Web system. It is understood that 999 services were unaffected by the outage, but the Royal Surrey NHS Trust, in the south of England, declared a critical incident and cancelled radiotherapy appointments scheduled for Friday morning. The National Pharmacy Association confirmed that UK services could be affected.
A spokesperson for Keir Starmer said they were unaware of the problem having any impact on government services, but added they recognised the impact it was having more broadly.
The Israeli health ministry said “the global malfunction” had affected 16 hospitals, while in Germany the Schleswig-Holstein university hospital in the north of the country said it had cancelled all planned operations in Kiel and Lübeck.
The University of Surrey’s Alan Woodward said the outage was caused by an IT product called CrowdStrike Falcon, which monitors the security of large networks of PCs and downloads a piece of monitoring software to every machine.
“The product is used by large organisations that have significant numbers of PCs to ensure everything is monitored. Sadly, if they lose all the PCs they can’t operate, or only at a much reduced service level,” said Woodward, who added that fixing the problem could take days.
“The major frustration is that to fix the issue will require manual intervention on every affected PC. That will mean enormous delays in recovering and hence disruption for days to come,” he added.
Steven Murdoch, a professor of security engineering at University College London, said many organisations could struggle to carry out the fix swiftly.
“The problem is occurring before the computer is connected to the internet so there is no way to fix the problem remotely, so that requires someone to come out … and fix the problem,” said Murdoch, adding that companies and organisations that have cut back on IT staff or outsourced their IT work would find their ability to address the problem hampered.
However, Ciaran Martin, the former chief executive of the National Cyber Security Centre, said that unlike adversarial cyber-attacks, this problem had already been identified and a solution had been flagged.
“The recovery is not about getting on top of the situation but getting back up. I think it’s unlikely to be very newsworthy in terms of ongoing disruption this time next week,” he said.
CrowdStrike’s chief executive, George Kurtz, tweeted that the incident had been caused by a “defect found in a single content update for Windows hosts”. He added: “This is not a security incident or cyber-attack. The issue has been identified, isolated and a fix has been deployed.”
The difficulties for businesses in the US were compounded by problems with Microsoft’s Azure cloud computing business that occurred on Thursday.
Source: theguardian.com