Backup

(partly taken from https://np.reddit.com/r/linux4noobs/comments/68mjjp/psa_what_is_the_321_rule_for_backups_and_why_you/)

Why should I do backups?

Statistically speaking, you will eventually lose something that can't be replaced. Family photos, your thesis, the notes from that meeting, the song you recorded with your dad. It's just a matter of when, what, and how much you lose due to

What should I back up?

You should have backups of any data that A) is useful to you and B) can't be reliably replaced (by rescanning it, re-downloading it, or getting a copy from uncle JimmyJoBob).

Is RAID a backup?

No. A RAID is "fault tolerant" in that it can survive some drive problems or failures, but fault tolerance is not an additional copy of the data. It doesn't count as a backup, because your system is still prone to most of the scenarios listed above.

3-2-1 Rule

The first rule is 3-2-1. Tried and true. You should have at least three copies of your data, on two different media, with one of those copies offsite.

Having two copies of the data on the same hard drive (same media) is NOT a backup. If the single drive fails, you've lost both copies. Likewise, having two copies in the same email account (same media) is NOT a backup. If that email service craps out, you've lost both copies.

You don't want a single failure to wipe out both copies. Keep the extra copies on cloud storage, removable media, a NAS, or something else that is not a permanent part of your computer.
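The rule is simple enough to check mechanically. The following is only a minimal sketch, assuming the usual reading of 3-2-1 (three copies, two different media, one offsite) and assuming you describe each copy by hand. The `BackupCopy` class, the `check_3_2_1` function, and the medium labels are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label, e.g. "laptop"
    medium: str    # one identifier per physical drive or service;
                   # two copies on the same drive share the same medium
    offsite: bool  # True if the copy lives away from your home/office


def check_3_2_1(copies: List[BackupCopy]) -> Tuple[bool, List[str]]:
    """Return (ok, problems). The original data counts as one of the copies."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    media = {c.medium for c in copies}
    if len(media) < 2:
        problems.append(f"only {len(media)} distinct media, need at least 2")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return (not problems, problems)


if __name__ == "__main__":
    plan = [
        BackupCopy("laptop", "internal-ssd", offsite=False),
        BackupCopy("duplicate folder", "internal-ssd", offsite=False),
    ]
    ok, problems = check_3_2_1(plan)
    print("3-2-1 satisfied" if ok else "Problems: " + "; ".join(problems))
```

The example plan fails all three checks, which matches the point above: a second copy on the same drive adds a copy but no safety.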

Offsite backups

If you have 50 copies of your critical files, stored on 50 different media, but they are all at your home/office… well what if the house burns down or floods? What if someone steals it all? These things can happen. So at least ONE of those copies should be offsite. This could include:

Example

Copy 1: Local HDD. Duh.

Copy 2: Remote storage. For this, I recommend either a FOSS option (Nextcloud, duplicity, rdiff-backup, borgbackup, or rsync) coupled with a remote server you own or have access to, or a proprietary program such as CrashPlan, SpiderOak, or Carbonite. These can all be configured to do incremental backups at set times, near-real-time backups, and versioning.

A real-time sync is NOT a remote backup if it doesn't have versioning. That is, if deleting the local file also deletes the remote file in real time, and there is no way to get an older archived version of the file, that is not a backup.

Copy 3: For this, since we've already satisfied the off-site aspect, we can just keep a second copy on a local NAS or on removable media (HDD/DVD/USB). This allows fast recovery if something as mundane as a local file corruption or primary HDD crash happens.
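To make the versioning point under Copy 2 concrete, here is a toy comparison of a plain mirror and a snapshot-style backup. It is a sketch under stated assumptions, not how any of the tools named above actually work: `mirror_sync`, `snapshot_backup`, and `demo` are hypothetical helpers, the "remote" side is just another local directory, and every snapshot is a full copy rather than an incremental one.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import List, Tuple


def mirror_sync(source_dir: Path, mirror_dir: Path) -> None:
    """Naive mirror: make mirror_dir match source_dir, deletions included."""
    mirror_dir.mkdir(parents=True, exist_ok=True)
    source_names = {p.name for p in source_dir.iterdir() if p.is_file()}
    for p in mirror_dir.iterdir():
        if p.is_file() and p.name not in source_names:
            p.unlink()  # the deletion is propagated to the mirror
    for name in source_names:
        shutil.copy2(source_dir / name, mirror_dir / name)


def snapshot_backup(source_dir: Path, backup_root: Path) -> Path:
    """Copy source_dir into a new numbered snapshot; older snapshots are never touched."""
    backup_root.mkdir(parents=True, exist_ok=True)
    index = sum(1 for _ in backup_root.iterdir())
    stamp = time.strftime("%Y%m%dT%H%M%S")
    snapshot = backup_root / f"{index:04d}-{stamp}"
    shutil.copytree(source_dir, snapshot)
    return snapshot


def demo() -> Tuple[bool, List[str]]:
    """Delete a file, then see which strategy still has it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src, mirror, snaps = root / "src", root / "mirror", root / "snaps"
        src.mkdir()
        thesis = src / "thesis.txt"
        thesis.write_text("draft 1")

        mirror_sync(src, mirror)
        snapshot_backup(src, snaps)

        thesis.unlink()  # accidental deletion on the primary drive
        mirror_sync(src, mirror)
        snapshot_backup(src, snaps)

        in_mirror = (mirror / "thesis.txt").exists()
        versions = [p.read_text() for p in sorted(snaps.glob("*/thesis.txt"))]
        return in_mirror, versions


if __name__ == "__main__":
    in_mirror, versions = demo()
    print("still in mirror:", in_mirror)
    print("recoverable versions:", versions)
```

After the deletion the mirror no longer has the file, while the first snapshot still holds the old version. Real tools get the same effect far more efficiently with incremental or deduplicated storage.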

Test backups

Okay, NOW you have backups. You know what else is not a backup? One that hasn't been tested. If you haven't tested your ability to recover data from a backup, you have no verified backup.

How do you test this? Easy enough. Back up some trivial data you don't care about to all three locations from time to time. Then delete it from your primary drive.

Now, and this is critical, wait a reasonable amount of time. Think about how long it might be before you realize that a file you needed has been corrupted or is missing. A few hours? A few days? You want your test to mimic that scenario as much as reasonably possible.

Now try to recover a copy from each of your backups. If you can do so, you are golden. If not, then one or more of your backup locations is not reliable and should be replaced or fixed.
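One way to decide whether a recovered copy really counts as recovered is to compare checksums rather than just checking that a file came back. The sketch below assumes you record a SHA-256 digest of the test file before deleting it and then restore it from each location by whatever means your backup tool provides. The function names, file names, and location labels are hypothetical, and `simulated_drill` only fakes the restore step with local files.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restores(expected: str, restored: Dict[str, Path]) -> Dict[str, bool]:
    """Map each backup location to True if its restored file matches the digest."""
    return {
        location: path.is_file() and sha256_of(path) == expected
        for location, path in restored.items()
    }


def simulated_drill() -> Dict[str, bool]:
    """Fake a restore test: one good copy, one corrupted copy, one missing copy."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        canary = root / "canary.txt"
        canary.write_text("trivial test data")
        expected = sha256_of(canary)  # record this BEFORE deleting the original

        good = root / "restored_from_nas.txt"
        good.write_bytes(canary.read_bytes())
        bad = root / "restored_from_cloud.txt"
        bad.write_text("truncated")
        canary.unlink()  # the test deletion from the primary drive

        return verify_restores(expected, {
            "nas": good,
            "cloud": bad,
            "usb": root / "never_restored.txt",
        })


if __name__ == "__main__":
    for location, ok in simulated_drill().items():
        print(f"{location}: {'OK' if ok else 'FAILED'}")
```

In a real drill, replace the simulated files with the paths you actually restored to. Any location reported as failed is one of the unreliable backup locations described above.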

Reasons against backups

My cloud provider already does backups for me!

You should always have a disaster recovery plan which doesn't put all your eggs in one basket. If the backup data center is the same data center where your production data lies, it's no surprise you lose both if the data center is hit by a natural disaster or mismanagement.

[…] many customers have been surprised that disaster plans are their responsibility and that their data has been lost: "Some customers don’t understand exactly what they bought[…]" 1

Server

Client

What to save in a backup

Further reading