Vivek Haldar

Backup in the age of the cloud

Since I moved entirely to ChromeOS I’ve had to rethink my backup strategy. I’ve gone from having beefy machines with tons of local storage to not-so-beefy-yet-capable machines where local storage is small, irrelevant and used as a secondary cache.

The “golden copy” of my data used to be on local disk, with remote copies serving as backups. Now things are a little different. In a cloud world, the golden copy is in the cloud. (In my specific case, “disk” is now Google Drive.) Files are not saved to local disk; they are saved directly to the cloud.

How should backup strategies and techniques change given the transition to the cloud?

The answer has to be motivated by the question: why do you need the backup? That sounds like a silly question, because, well, obviously you need the backup to avoid data loss in the event of losing one copy. But the deeper question is about the risk profile of your data’s copies.

In the large-local-disk scenario, the remote backup was protection against the array of physical risks associated with local storage: disk failure, accidental erasure, loss of laptop. In the cloud world, the risk profile is very different. If you’re using any decent provider, the chances of data loss in cloud storage are negligible. For one, cloud storage options usually employ some degree of redundancy. And more importantly, there is an army of ops folks carrying pagers tending to the care and feeding of the machines and systems that contain your precious data. All this means that the chance of data loss is much, much smaller than with local disk. Actual data integrity has taken a giant leap with cloud storage.

So what is the risk? It shifts, from loss of data to loss of access to data. They sound like the same thing, but data loss is permanent, whereas loss of access may be temporary. There might be service outages. Your wifi or ISP might go down. You might forget your password. The cloud company might go out of business and you might not be able to get your data out in time.

How do you mitigate these risks of cloud storage? Periodically make a full local copy! We’ve made a full reversal: from local disk as golden, to local disk as backup. This is progress overall, because I think the probability of data loss in a cloud world is much, much smaller, and also the entire setup is much simpler. There is also the incremental but very nice benefit of much higher chances of survival for data that is not backed up yet (that is, saved to the golden source but not yet copied to the backup), because the probability of your single disk failing is much higher than that of a cloud service losing your data.
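
The periodic full local copy can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the post: it assumes the cloud data is already reachable as an ordinary local folder (for example an exported or synced directory), and the function names `snapshot` and `verify` are made up for this example. It copies the whole folder into a new timestamped snapshot and then compares checksums to confirm the copy is complete.

```python
from __future__ import annotations

import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(source: Path, backup_root: Path) -> Path:
    """Copy everything under `source` into a new timestamped folder.

    Assumption: `source` is a local folder that mirrors the cloud data
    (e.g. an export); this sketch does not talk to any cloud API.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = backup_root / f"snapshot-{stamp}"
    # copytree creates dest (and missing parents) and raises if dest exists.
    shutil.copytree(source, dest)
    return dest


def verify(source: Path, dest: Path) -> list[str]:
    """Return relative paths whose copy is missing or differs from the source."""
    bad = []
    for src_file in source.rglob("*"):
        if src_file.is_file():
            rel = src_file.relative_to(source)
            copy = dest / rel
            if not copy.is_file() or file_digest(copy) != file_digest(src_file):
                bad.append(str(rel))
    return sorted(bad)
```

Calling `snapshot(Path("drive-export"), Path("backups"))` (both paths hypothetical) and then `verify` on the result gives an empty list when every file made it across. Keeping each run in its own timestamped folder is a deliberate simplification: it costs more space than an incremental scheme, but an older snapshot is never overwritten by a bad one.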