Data loss can come from any number of sources, whether it’s a hurricane that causes major flooding in a server room or a bad actor threatening the safety of critical data by demanding a ransom for its safe return. If an organization only has a single copy of data on that waterlogged hard drive or in the hands of a cybercriminal, the odds of recovery are slim to none.
In Veeam’s 2023 Data Protection Trends report, 79% of those surveyed noted a protection gap, meaning that, in the event of a disaster, they feel inadequately prepared to recover their data safely. And as cyberattacks continue to evolve year after year, data loss may be inevitable for many businesses, making backup & recovery more important than ever to a business’s operations.
The cost of permanent data loss
Obviously, losing crucial data due to natural disaster or cyberattack is never a desired outcome, but it can sometimes be hard to avoid. Regularly tested backup and recovery processes are a great way to ensure the safety of data, but what happens if your organization loses business-critical files, and you have no way to recover them?
The following stories of backups gone awry are great examples of how a proper backup strategy can really save your bacon when the unthinkable happens.
Colonial Pipeline Cyberattack
One of the most infamous cyberattacks in history occurred in May 2021, infecting some of Colonial Pipeline’s systems and shutting them down for several days. The Colonial Pipeline Company halted its pipeline operations in an attempt to contain the attack, which massively impacted US oil infrastructure. The attackers, a hacker group called DarkSide, breached the network through an exposed VPN account password, likely obtained in a separate data breach.
After the network had been offline for five days, Colonial Pipeline paid the 75-bitcoin (roughly $4.4 million USD) ransom, and the attackers provided a tool to restore the systems. Though the FBI was able to recover 64 of the 75 bitcoin paid to DarkSide, a large amount of money was still lost in the breach.
And even with the ransom paid, Colonial Pipeline ended up restoring from its own backups anyway, because restoration via the attackers’ tool was too slow. It’s not publicly known how much data was lost or why backups were not used from the start, but it’s an important lesson: even if you have a process in place, backups must always be tested.
Toy Story 2 Deleted
You may have seen this story make the rounds in various backup horror story posts, but if you haven’t, it’s a good one. Toy Story 2 was in development in 1998 when, one day, an employee looking through the directory that stored the assets for the character Woody noticed that the list of files was shrinking with each refresh.
It turned out a command had been run to clear out some unwanted files, but it had unfortunately been run at the root level of the Toy Story 2 project, and the system was slowly working its way through all of the files. The team scrambled to cut power to the server in a rushed attempt to stop the command, but when the machine was brought back online a few hours later, 90% of the work had been deleted by the stray command.
Pixar was no stranger to deleted data and had tape backups running on a regular schedule. Unfortunately, those backups were never tested: the tape drive had hit its file limit, so new data had silently stopped being written to it. The restores the team did manage were full of errors, and no one was sure how they’d recover the lost data.
Fortunately, long story short, the movie’s Supervising Technical Director (Galyn Susman) happened to have a backup stored at her house as she’d been working from home after she gave birth to her son. The backup was about two weeks old, but it was better than nothing. This offsite backup saved the day, and the movie. (Until it was scrapped and re-animated, though not for backup and recovery reasons…)
This story goes to show that, even if you believe you’re backing up your data, testing is essential.
Data Loss Nightmares from Reddit
Not every story of data loss comes from a large company; many happen in the smaller IT organizations you yourself may be a part of. There are loads of stories of sysadmins losing VM backups, suffering failed drives, accidentally deleting databases, and getting burned by permissions that were never locked down.
Tips for improving your backup strategy
So, you’ve read through some horror stories and you’re ready to make sure your backup strategy is up to snuff, but you’re not really sure what you need to check. Our Tome of Backup Best Practices is a great asset to have in your back pocket and includes tips on choosing your archival method & storage destination, along with maintenance & restoration.
In the meantime, here are a few steps you can add into your backup strategy to ensure that the backups perform as intended:
Enable comprehensive backup alerts
Don’t just rely on failure alerts; include alerts for successful backups, cloud syncs, backup duration, and so on.
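As a rough sketch, alerting on more than just failures might look like the following. The run records, field names, and thresholds here are all hypothetical; a real version would pull this data from your backup software’s logs or API:

```python
from datetime import datetime, timedelta

def check_backups(runs, now, max_age=timedelta(hours=24),
                  max_duration=timedelta(hours=2)):
    """Return alert strings covering more than just failures:
    stale successes and unusually long runs also get flagged.
    Each run is a dict with 'status', 'finished', and 'duration' keys."""
    alerts = []
    successes = [r for r in runs if r["status"] == "success"]
    if not successes:
        alerts.append("no successful backups on record")
    else:
        latest = max(r["finished"] for r in successes)
        if now - latest > max_age:
            # A silent gap in successes is as bad as an explicit failure.
            alerts.append(f"last successful backup is older than {max_age}")
    for r in runs:
        if r["status"] == "failure":
            alerts.append(f"backup failed at {r['finished']}")
        elif r["duration"] > max_duration:
            # Backups that suddenly take much longer often signal trouble.
            alerts.append(f"backup at {r['finished']} ran longer than {max_duration}")
    return alerts
```

The point is that an empty alert list is itself meaningful: “no news” should only ever mean a fresh, fast, successful backup.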
Keep your backup software up-to-date
Update backup software regularly as your backup vendor may introduce new features or fix critical bugs.
Consider your recovery point & time objectives
Recovery point objective (RPO) defines the maximum amount of data loss, measured in time, that your business can tolerate, and recovery time objective (RTO) defines how quickly you need the data restored. These metrics will inform how often you run backups as well as your restoration methods.
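The link between RPO and backup frequency is simple arithmetic: worst-case data loss is the time since the last good backup, so the backup interval must never exceed the RPO. A minimal sketch (the safety factor is an assumption, meant to leave headroom for failed or slow runs):

```python
from datetime import timedelta

def max_backup_interval(rpo: timedelta, safety_factor: float = 0.5) -> timedelta:
    """Worst-case data loss equals the time since the last good backup,
    so the interval between backups must stay at or below the RPO.
    The safety factor leaves headroom for a failed or delayed run."""
    return rpo * safety_factor

# An RPO of 4 hours with a 0.5 safety factor -> back up at least every 2 hours.
interval = max_backup_interval(timedelta(hours=4))
```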
Consistently audit and test your backup process
This may be the most important takeaway from the stories above. Backups are only good if they’re successful, so make sure you’re auditing your backups regularly and consistently.
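One concrete form of testing is verifying that restored files are byte-for-byte identical to the originals. A minimal sketch using checksums (the function names are mine, not any particular backup tool’s):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restore(source: Path, restored: Path) -> bool:
    """A restore only counts as successful if the restored bytes
    match the original exactly."""
    return sha256_of(source) == sha256_of(restored)
```

Run a check like this against a sample of files after every test restore; a backup that restores with silent corruption (as Pixar discovered) is no backup at all.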
Develop a disaster recovery checklist
List everything that needs to be done in case of data failure, including your potential RPO & RTO, who is involved in recovery, where the backup is stored, etc.
Make sure your backups have plenty of redundancy
I’m sure the 3-2-1 backup rule has been covered to death, but there’s a reason: keep at least three copies of your data, on two different types of media, with one copy offsite. That way, no matter the disaster, you always have a copy to recover from.
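The rule is mechanical enough to audit in code. A sketch of a 3-2-1 checker over a hypothetical inventory of backup copies (the `medium`/`offsite` fields are assumptions for illustration):

```python
def satisfies_3_2_1(copies) -> bool:
    """copies: list of dicts, each with a 'medium' string (e.g. 'disk',
    'tape', 'cloud') and an 'offsite' bool. The 3-2-1 rule requires at
    least 3 copies, on at least 2 distinct media, with at least 1 offsite."""
    return (len(copies) >= 3
            and len({c["medium"] for c in copies}) >= 2
            and any(c["offsite"] for c in copies))
```

Running a check like this as part of your regular backup audit catches the quiet drift where, say, a retired tape rotation silently leaves you with three copies on the same disk array.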
Keep your documentation constantly updated
Document the restoration process and keep it current, so that anyone on the team, not just one person, can manage a recovery.
Backing up SaaS data is still important
Just because data is being stored in “the cloud” doesn’t mean that it’s being backed up by your cloud provider, so run consistent backups of your cloud data.
Print out your disaster recovery plans
If you’re trying to restore data and only had a digital copy of your disaster recovery plans, that document is likely part of what was lost, and you may have trouble getting access to it. A physical copy ensures accessibility.
Don’t neglect your networking hardware
Backups aren’t just for operating systems, servers, and data but also for your switch configs, firewall configs, and more.
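Config backups can be as simple as timestamped snapshots. A sketch of the archival half of that job (how you fetch the config text, over SSH from a switch, via a vendor API, etc., is left out, and all names here are illustrative):

```python
from datetime import datetime
from pathlib import Path

def archive_config(device_name: str, config_text: str, dest: Path) -> Path:
    """Write a timestamped copy of a device config so every change
    leaves a restorable snapshot rather than overwriting the last one."""
    dest.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = dest / f"{device_name}-{stamp}.cfg"
    path.write_text(config_text)
    return path
```

Schedule something like this nightly and you can diff any two snapshots to see exactly when and how a device’s config changed.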