Worse than a bad headline … is having to testify about your public failures before a legislative panel.
On June 6, 2018, executives of TSB Bank faced questioning by the UK Parliament’s influential Treasury Committee concerning what had been variously described as a “botched IT migration,” “a systems debacle,” and an “IT meltdown.”
They were summoned to answer questions from lawmakers about the bank’s failed attempt at migrating its IT systems and the spiralling problems that ensued, affecting millions of customers.
The inquiring MPs concluded afterwards: “The committee has lost confidence in [the bank chief executive’s] ability to provide a full and frank assessment of the problems at TSB, and to deal with them in the best interests of its customers. It is concerned that, if he continues in his position, this could damage trust not only in TSB, but in the retail banking sector as a whole.” That CEO was replaced within three months.
In 2015, after an ownership change, the bank began planning to shift 1.3 billion customer records to a platform built by its new owner. The changeover, which started on Friday, April 20, 2018, was supposed to be completed over the weekend. However, by Monday morning, millions of customers were unable to use online or mobile banking and some were being given access to other people’s accounts.
When the initial cutover went wrong, developers who had access to the system began making live fixes to the production version. There appeared to be no contingency plan and no option to revert to the original platform. The resulting chaos continued for the best part of the following week, with customers locked out of their accounts for days.
The bank struggled to cope with an avalanche of customer complaints, building up a monumental backlog despite quadrupling its complaints team to 260 customer service personnel and reportedly seeking to double its number of so-called “complaint-wranglers” again. Worse, the bank was overwhelmed by fraud attacks at 70 times the expected level, well above the fourfold increase normally anticipated with this type of systems migration. Around 2,200 alerts amounted to “actual attempts” at cyber fraud.
Had inadequate testing let everyone down? In his testimony, the CEO alluded to their unit, integration, and volume tests but admitted, “What happened after the migration… clearly showed those tests were misleading.” 
The mechanics seemed reasonable: nine migration acceptance cycles, in which data extracted from the legacy system was loaded into the new one, as well as five dress rehearsals for the changeover weekend.
However, concerns were expressed that business pressures might have pushed the system transfer ahead without adequate scrutiny. Certainly, most traditional development projects relegate testing to the late stages and provide too few resources and too little slack time to respond to test findings.
Subsequent analysis questioned whether TSB had progressed carefully enough in validating each phase of the designed transition to ensure success before moving on to the next. Smaller-scale proof-of-concept testing with limited test accounts seemed lacking. It was also clear that early live support was not present, shown by the lack of sufficient highly skilled staff available immediately after such a challenging data transfer.
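To make the idea of validating each phase before moving on more concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how TSB's migration was actually built: the function names (`reconcile`, `migrate_in_phases`), the dictionary-based "systems," and the account keys are all hypothetical. The sketch moves records in small phases, starting with a pilot batch, and stops at the first phase whose loaded data does not match the source.

```python
from typing import Callable, Dict, List, Tuple

Records = Dict[str, str]  # hypothetical stand-in for customer records


def reconcile(batch: Records, target: Records) -> List[str]:
    """Return the keys in batch that are missing or different in target."""
    return [key for key, value in batch.items() if target.get(key) != value]


def migrate_in_phases(
    source: Records,
    phase_sizes: List[int],
    load: Callable[[Records, Records], None],
) -> Tuple[int, List[str]]:
    """Migrate in phases, verifying each phase before starting the next.

    Returns (records verified so far, mismatched keys in the failing phase).
    An empty mismatch list means every phase passed its check.
    """
    target: Records = {}
    keys = sorted(source)
    verified = 0
    for size in phase_sizes:
        batch = {k: source[k] for k in keys[verified:verified + size]}
        load(batch, target)
        mismatches = reconcile(batch, target)
        if mismatches:
            # Gate: do not proceed to a larger phase on unverified data.
            return verified, mismatches
        verified += len(batch)
    return verified, []


def good_load(batch: Records, target: Records) -> None:
    target.update(batch)


def faulty_load(batch: Records, target: Records) -> None:
    # Simulated defect: one record is silently dropped.
    for key, value in batch.items():
        if key != "acct-03":
            target[key] = value


source = {f"acct-{i:02d}": f"balance-{i}" for i in range(1, 7)}
print(migrate_in_phases(source, [1, 2, 3], good_load))
print(migrate_in_phases(source, [1, 2, 3], faulty_load))
```

With the faulty loader, the run halts in the second, still small, phase, so the defect surfaces while only a handful of test accounts are affected rather than the full data set.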
Adequate testing expands its range beyond so-called “happy path” conditions, where all works well. Testing should reveal not only whether systems might fail but also how well they can recover from those failures. Planning for disaster recovery is a fundamental business practice. Evaluating that recovery ought also to be seen as a fundamental testing practice.
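As a rough illustration of testing the recovery path alongside the happy path, consider the following Python sketch. Everything in it is an assumption made for the example: `migrate_with_rollback`, the `fail_at` fault-injection parameter, and the in-memory dictionary standing in for a target platform are hypothetical, and a real rollback would rely on backups rather than an in-process copy.

```python
import copy


class MigrationError(Exception):
    """Raised when a migration step fails."""


def migrate_with_rollback(target, records, fail_at=None):
    """Load records into target; restore the prior state on failure.

    fail_at is a hypothetical fault-injection hook: the index of the
    record at which to simulate a failure so tests can reach the
    recovery path. Returns True on success, False after a rollback.
    """
    snapshot = copy.deepcopy(target)  # stand-in for a real backup
    try:
        for index, (key, value) in enumerate(records):
            if fail_at is not None and index == fail_at:
                raise MigrationError(f"injected failure at record {index}")
            target[key] = value
        return True
    except MigrationError:
        # Recovery: discard partial changes and restore the snapshot.
        target.clear()
        target.update(snapshot)
        return False


def test_happy_path():
    target = {}
    assert migrate_with_rollback(target, [("a", 1), ("b", 2)])
    assert target == {"a": 1, "b": 2}


def test_recovery_path():
    target = {"existing": 0}
    succeeded = migrate_with_rollback(target, [("a", 1), ("b", 2)], fail_at=1)
    assert not succeeded
    # The partially loaded record "a" must be gone after rollback.
    assert target == {"existing": 0}


test_happy_path()
test_recovery_path()
```

The second test is the one that happy-path suites tend to omit: it forces a failure partway through and checks that the system ends up exactly where it started.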
Testing – and subsequent improvement – of the bank’s disaster recovery plans might well have reduced the scale of this particular disaster.