I've been using Ubuntu's built-in backup tool, Duplicity, for quite a while. Yet, all along, Duplicity felt old. It essentially does backups with tar, compresses them with gzip and encrypts them with gpg, so the whole backup process feels linear. While Duplicity does use GNU Tar's form of incremental backups and adds some partitioning for each increment, restoring a single file can still mean that Duplicity has to decrypt and decompress an entire block if your file happens to sit at the end of that stream.
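
As a rough sketch, that pipeline is conceptually close to the commands below (the file names and paths are only illustrative, not Duplicity's actual layout):

    # Archive (full, or incremental via the .snar state file), compress,
    # then encrypt, all as one linear stream:
    tar --create --gzip --listed-incremental=backup.snar \
        --file=backup.1.tar.gz /home/user
    gpg --symmetric backup.1.tar.gz

    # Restoring a single file still decrypts and scans through the
    # stream up to that file:
    gpg --decrypt backup.1.tar.gz.gpg | \
        tar --extract --gzip --file=- home/user/some/file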

Those old tools have the obvious benefit of being reliable and proven. Still, they are slow and inconvenient enough that many users end up using Git as a form of backup. This suggests that there are different kinds of "backups" that differ in their usage patterns, and that a different backup tool may be preferable for each kind.

One backup "kind" is the full system backup, be it a full "byte-level" clone of the disk at the dd level, or a file system one using tar. It's fine if they are slower and more linear, as this kind of backup is done less often, with lesser concerns about optimizing the increments (if any). The goal is to restore the whole system back to a stable state, and less about restoring a specific file from a history of file changes.

A system image clone can be done "offline" by booting into something like Clonezilla. If you use LVM, ZFS or Btrfs, you can also take a "live snapshot" of a file system at a point in time, and then clone that snapshot while the system is still running.
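
For example, here is a minimal sketch of the LVM variant; the volume group "vg0", the volume names and the snapshot size are placeholders for your own setup:

    # Freeze a point-in-time view of the "root" logical volume:
    lvcreate --size 5G --snapshot --name rootsnap /dev/vg0/root

    # Clone the frozen snapshot while the system keeps running:
    dd if=/dev/vg0/rootsnap of=/mnt/backup/root.img bs=4M status=progress

    # Drop the snapshot once the clone is done:
    lvremove /dev/vg0/rootsnap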

The other backup "kind" would then be the user files. They are not tied to a specific system, so they can be safely restored on a new computer. Typically, individual files are restored rather than all of them at once, and since user files change often, a longer backup history for each is preferable. In practice, this means that restoring user files is more often non-linear than a full system restore.

And this is where something like tar doesn't work well. Not all user data compresses well, and it can be argued that most of it, by size at least, is media files that are already compressed. User data is often organized in separate folders with different backup needs, especially around backup frequency. The files could come from multiple computers and yet be backed up to the same destination. User data is typically more "valuable", so it is often backed up online, where storage and bandwidth are ongoing costs. This also means that encryption of those backups should be enabled by default.

There is already quite a lot of commercial backup software made for user data, CrashPlan and Backblaze to name a few. But they often don't support Linux, and users are forced to use their proprietary online storage services. Restic is an open-source backup tool that is storage-agnostic and cross-platform, and it is a great alternative to those commercial tools.

Compared to Duplicity, Restic has some great advantages that are quickly visible. First, it is fast. If the first thing you do (and you should) is run a small local backup and fully verify it, you'll find that Restic is several times faster; in my experience, Restic takes seconds for what would take minutes with Duplicity, with the exact same local backup storage location. Second, it is quite flexible around file sets and backup scheduling. Put simply, you can add new or updated files to your backup destination (the "repository") whenever you want, in any combination. While this means you have to build your own scheduled tasks to run the backups, you are then free to choose which files, from which machine, and when backups will run, all to the same backup destination.
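
A minimal sketch of that workflow, assuming a local repository under /mnt/backup/restic and the repository password stored in a file (all paths and schedules here are placeholders):

    # Restic reads the repository password from this file:
    export RESTIC_PASSWORD_FILE=~/.config/restic/password

    # Create the repository once, then back up and fully verify:
    restic -r /mnt/backup/restic init
    restic -r /mnt/backup/restic backup ~/Documents ~/Photos
    restic -r /mnt/backup/restic check --read-data

    # Scheduling is then plain cron entries (crontab -e), for example
    # hourly for documents and nightly for photos:
    # 0 * * * *  restic -r /mnt/backup/restic backup ~/Documents
    # 30 2 * * * restic -r /mnt/backup/restic backup ~/Photos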

Restic natively supports a few backup storage types. It can also integrate with rclone to support, through it, an even larger number of storage locations. For example, I've set up my OneDrive account in rclone and pointed Restic at it as my backup storage location, even though Restic doesn't natively support OneDrive.
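
That setup looks roughly like this (the remote name "onedrive" and the path are whatever you configured in rclone):

    # One-time interactive rclone setup, creating a remote named "onedrive":
    rclone config

    # Point Restic at that remote through its rclone backend:
    restic -r rclone:onedrive:backup init
    restic -r rclone:onedrive:backup backup ~/Documents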

There are a few disadvantages to Restic, though. It doesn't support per-file compression, as that would reduce the encryption's security, and Restic favours security over disk usage. It still uses binary deltas to make backup increments smaller, so it's not that bad. It is a standalone command-line tool, so it is not as user-friendly as the commercial tools, and it isn't "integrated" with the system the way Duplicity is within Ubuntu, for example. This also means that while it is cross-platform, it doesn't use many OS-specific features, particularly on Windows and macOS. Restic is also a relatively new backup tool, not yet considered "1.0", so it's still highly recommended that you test your backups thoroughly (Restic has built-in tools to fully verify your backups).
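
As a sketch of those verification and restore tools (again with placeholder paths), note that restoring a single file does not require streaming through the whole backup:

    # List the snapshots stored in the repository:
    restic -r /mnt/backup/restic snapshots

    # Restore one file from the latest snapshot into /tmp/restored:
    restic -r /mnt/backup/restic restore latest \
        --target /tmp/restored --include /home/user/Documents/report.txt

    # Fully read back and verify the repository's data:
    restic -r /mnt/backup/restic check --read-data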

Even with all those caveats, Restic has become my preferred Linux backup tool, for both local and online destinations.

Published on November 28, 2020 at 10:55 EST
