I've probably spent way too much time thinking about Linux backup over the years. But thankfully, I found a setup in 2018 or so that works really well for me, I've used it ever since, and I wrote up a detailed blog post about it just a month ago:
https://amontalenti.com/2024/06/19/backups-restic-rclone
The tools I use on Linux for backup are restic + rclone, storing my restic repo on a speedy USB3 SSD. For offsite, I use rclone to incrementally upload the entire restic repository to Backblaze B2.
The net effect: I have something akin to Time Machine (macOS) or Arq (macOS + Windows), but on my Linux laptop, without needing to use ZFS or btrfs everywhere.
Using restic + some shell scripting, I get full support for de-duplicated, encrypted, snapshot-based backups across all my "simpler" source filesystems. Namely: across ext4, exFAT, and (occasionally) FAT32, which is where my data is usually stored. And pushing the whole restic repo offsite to cloud storage via rclone + Backblaze completes the "3-2-1" setup straightforwardly.
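The core of it fits in a couple of commands. Roughly (the repo path, bucket, and rclone remote name here are placeholders, not my exact setup):

    # local, encrypted, deduplicated snapshot into a restic repo on the USB3 SSD
    export RESTIC_REPOSITORY=/media/usb-ssd/restic-repo
    export RESTIC_PASSWORD_FILE=$HOME/.config/restic/password
    restic backup $HOME --exclude $HOME/.cache

    # offsite: incrementally push the whole restic repo to Backblaze B2
    # (assumes an rclone remote named "b2" has already been configured)
    rclone sync /media/usb-ssd/restic-repo b2:my-backup-bucket/restic-repo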
One problem with file-based backups is that they are not atomic across the filesystem. If you ever back up a database (or really any application that expects atomicity while it’s running), then you might corrupt the database and lose data. This might not seem like a big problem, but it can affect e.g. SQLite, which is quite popular as a file format.
Then again, the likelihood that the backup will be inconsistent is fairly low for a desktop, so it’s probably fine.
I think the optimal solution is:
1) A filesystem-level atomic snapshot (ZFS, btrfs, etc.)
2) Back up the snapshot at the file level (restic, borg, etc.)
This way you get atomicity as well as a file-based backup which is redundant against filesystem-level corruption.
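With btrfs, for example, the whole dance is only a few commands (subvolume and repo paths here are made up for illustration):

    # assumes /home is a btrfs subvolume: take a read-only, atomic snapshot of it
    mkdir -p /home/.snapshots
    btrfs subvolume snapshot -r /home /home/.snapshots/backup-now

    # run the file-level backup against the frozen snapshot, not the live filesystem
    restic -r /mnt/backup/restic-repo backup /home/.snapshots/backup-now

    # drop the snapshot once the backup has finished
    btrfs subvolume delete /home/.snapshots/backup-now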
Windows' Volume Shadow Copy Service[1] allows applications like databases to be informed[2] when a snapshot is about to be taken, so they can ensure their files are in a safe state. They also participate in the restore.
While Linux is great at many things, backups are one area I find lacking compared to what I'm used to from Windows. There I take frequent incremental whole-disk backups. The backup program uses the Volume Shadow Copy Service to provide a consistent state (as much as possible). Being incremental, they don't take much space.
If my disk crashes I can be back up and running like (almost) nothing happened in less than an hour. Just swap out the disk and restore. I know, as I've had to do that twice.
[1]: https://learn.microsoft.com/en-us/windows/win32/vss/the-vss-...
[2]: https://learn.microsoft.com/en-us/windows/win32/vss/overview...
LVM snapshots are copy-on-write and can be used the same way.
Any backup software that utilizes LVM in this way?
I.e. one that automatically creates a snapshot and sends the incremental changes since the previous snapshot to a backup destination like a NAS or S3 blob storage.
I think block-level snapshots would be very difficult to use this way.
I just make full deduplicated backups from LVM snapshots with kopia, but I've set that up on only one system; on the others I just use kopia as-is.
It takes some time, but that's fine for me. The previous backup of 25 GB, an hour ago, took 20 minutes. I suppose if it only walked files it knew had changed it would be a lot faster.
Thanks, sounds interesting. So you create a snapshot, then let kopia process that snapshot rather than the live filesystem, and then remove the snapshot?
Right, for me I'd want to set it up to do the full disk, so could be millions of files and hundreds of GB. But this trick should work with other backups software, so perhaps it's a viable option.
Exactly so.
Here's the script, should it be of benefit to someone, even if it of course needs to be modified:
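In outline it's just snapshot, mount, kopia, clean up (simplified here, with placeholder VG/LV/mount names rather than my real ones):

    #!/bin/sh
    set -e

    # create a snapshot of the logical volume holding the data
    lvcreate --snapshot --size 5G --name home-snap /dev/vg0/home

    # mount the snapshot read-only and let kopia back up the frozen view
    mkdir -p /mnt/home-snap
    mount -o ro /dev/vg0/home-snap /mnt/home-snap
    kopia snapshot create /mnt/home-snap

    # clean up
    umount /mnt/home-snap
    lvremove -f /dev/vg0/home-snap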
Awesome, thanks!
wyng backup does this. It uses the device mapper's thin_dump tools to allow for incremental backups between snapshots, too:
https://github.com/tasket/wyng-backup
edit: requires LVM thin-provisioned volumes
There is also thin-send-recv, which basically does the same as zfs send/recv, just with LVM:
https://github.com/LINBIT/thin-send-recv
It uses the same functions of the device mapper to allow incremental sync of LVM thin volumes.
Thanks for the pointers, looks very relevant.
It's just such low-effort peace of mind. Just a few clicks and I know that, regardless of what happens to my disk or my system, I can be up and running in very little time with very little effort.
On Linux it's always a bit more work, but backup and restore is one of those things I prefer not to be too complicated, as the stress level is usually high enough when you need to do a restore without also worrying about forgetting some incantation steps.
It depends. Doing a complete disaster recovery of a Windows system can, IMHO, be a real struggle, especially if you have to restore a system to different hardware, which the system state backup that Microsoft offers does not support AFAIK.
Backing up a Linux system in combination with ReaR:
https://github.com/rear/rear
and a backup utility of your choice for the regular backups has never failed me so far. I've used it to restore Linux systems to completely different hardware without any trouble.
I don't think the diffs are usable that way. They're actually more like an "undo log", in that the snapshot space is taken up by "old blocks" while the actual volume is taking writes. It's useful for the same reasons as Volume Shadow Copy: a consistent snapshot of the block device. (Also, this can be very bad for write performance, as every write is doubled - to the snapshot and to the real device.)
Yeah ok, that makes sense. Write performance is a concern, but usually the backups run when there's little activity.
While I do that, is that really the case? I can imagine database snapshots are consistent most of the time, but it can't be guaranteed, right? In the end it's like a server crash: the database suddenly stops.
Your DB is supposed to guarantee consistency even in server crashes (the Consistency and Durability parts of ACID).
That consistency is built on assumptions about the filesystem that may not hold true of a copy made concurrently by a backup tool.
e.g. The database might append to write-ahead logs in a different order than the order in which the backup tool reads them.
That's why you do a filesystem snapshot before the backup, something supported by all systems. The snapshot is constant to the backup tool, and read order or subsequent writes don't matter.
The main difference is that Windows and MacOS have a mechanism that communicates with applications that a snapshot is about to be taken, allowing the applications (such as databases) to build a more "consistent" version of their files.
In theory, of course, database files should always be in a logically consistent state (what if power goes out?).
Well, supported by Windows and macOS. On Linux, only if you happen to use zfs or btrfs, and also only if the backup tool you use happens to rely on those snapshots.
I believe basically any filesystem will work if you have it on LVM. Bonus: LV snapshots can be thin snapshots, too.
That works if the backup uses a snapshot of the filesystem or a point in time. Then the backup state is equivalent to what you'd get if the server suddenly lost power, which all good ACID databases handle.
The GP is talking about when the backup software reads database files gradually from the live filesystem at the same time as the database is writing the same files. This can result in an inconsistent "sliced" state in the backup, which is different from anything you get if the database crashes or the system crashes or loses power.
The effect is a bit like when "fsync" and write barriers are not used before a server crash, and an inconsistent mix of things end up in the file. Even databases that claim to be append-only and resistant to this form of corruption usually have time windows where they cannot maintain that guarantee, e.g. when recycling old log space if the backup process is too slow.
You can also use LVM2, and then you get atomic snapshots with any filesystem (I think it needs to support fsfreeze, but I guess all of them do).
I never knew this. Thanks for sharing!
LVM requires unallocated space in the volume group, which makes it kind of garbage to use for snapshots.
I agree with you, of course. On macOS, Arq uses APFS snapshots, and on Windows, it uses VSS. It'd be nice to use something similar on Linux with restic.
In my linked post above, I wrote about this:
"You might think btrfs and zfs snapshots would let you create a snapshot of your filesystem and then backup that rather than your current live filesystem state. That’s a good idea, but it’s still an open issue on restic for something like this to be built-in (link). There’s a proposal about how you could script it with ZFS in this nice article (link) on the snapshotting problem for backups."
The post contains the links with further information.
My imperfect personal workaround is to run the restic backup script from a virtual console (TTY) occasionally with my display server / login manager service stopped.
I run this from a ZFS snapshot. What I want backed up from my home dir lives on the same volume, so I don't have to launch restic multiple times. I have dedicated volumes for what I specifically want excluded from backups and ZFS snapshots (~/tmp, ~/Downloads, ~/.cache, etc).
I've been thinking of somehow triggering restic from zrepl whenever it takes a snapshot, but I haven't figured out a way of securely grabbing credentials for it to unlock the repository and to upload to S3 without requiring user intervention.
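The snapshot part of it is roughly this (dataset, snapshot, and repo names are placeholders):

    # take an atomic snapshot of the home dataset
    zfs snapshot tank/home@restic-now

    # ZFS exposes snapshots under the hidden .zfs directory, so restic can
    # read a frozen view while the filesystem stays in use
    restic -r /mnt/backup/restic-repo backup /home/.zfs/snapshot/restic-now

    # drop the snapshot afterwards
    zfs destroy tank/home@restic-now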
Enjoyed the post, thanks. One question: why don’t you use restic+rclone on macOS? They both support it and I’d assume you could simplify your system a bit…
I only have one macOS system (a Mac Mini) and Arq works well for me. Also I prefer to use Time Machine for the local backups (to a USB3 SSD) on macOS since Apple gives Time Machine all sorts of special treatment in the OS, especially when it comes time to do a hardware upgrade.
I've also found Arq to be brilliant on macOS. It's especially nice on laptops, where you can e.g. set it to pause on battery and during working hours. Also, APFS snapshots are a nice thing given how many Mac apps use SQLite databases under the hood (Photos, Notes, Mail, etc.).
On Linux, the system I liked best was rsnapshot: I love its brutal simplicity (cron + rsync + hardlinks), and how easy it is to browse previous snapshots (each snapshot is a real folder with real files, so you can e.g. ripgrep through a date range). But when my backups grew larger I eventually moved to Borg to get better deduplication + encryption.
rsnapshot was definitely my favorite Linux option before restic. I find that restic gives me the benefits of chunk-based deduplication and encryption, but via `restic find` and `restic mount` I can also get many of the benefits of rsnapshot's simplicity. If you use `restic mount` against a local repo on a USB3 SSD, the FUSE filesystem is actually pretty fast.
Thanks for the info, I’ll have a closer look at Restic then. Borg also has a FUSE interface, but last time I tried it I found it abysmally slow – much slower than just restoring a folder to disk and then grepping through it. I used a Raspberry Pi as my backup server though, so the FUSE was perhaps CPU bound on my system.
Yea, I don't want to oversell it. The restic FUSE mount isn't anywhere near "native" performance. But, it's fast enough that if you can narrow your search to a directory, and if you're using a local restic repo, using grep and similar tools is do-able. To me, using `restic mount` over a USB3 SSD repo makes the mount folder feel sorta like a USB2 filesystem rather than a USB3 one.
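Concretely, the workflow looks something like this (repo and mount paths are placeholders):

    # in one terminal: expose all snapshots as a read-only FUSE filesystem
    mkdir -p /mnt/restic
    restic -r /media/usb-ssd/restic-repo mount /mnt/restic

    # in another: snapshots appear as ordinary directories, so normal tools work
    ls /mnt/restic/snapshots/
    rg "search term" /mnt/restic/snapshots/latest/home/

    # or search for files across snapshots without mounting at all
    restic -r /media/usb-ssd/restic-repo find "*.odt"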
Do you have much of an opinion on why you went with Restic over Borg? The single Go binary is an obvious one; perhaps that alone is enough. I remember some people having unbounded memory usage with Restic, but that might have been a very old version.
The big one for me was https://borgbackup.readthedocs.io/en/stable/faq.html#can-i-b....
This was basically one big reason why I went with https://kopia.io . The other might have been its native S3 support.
For me, these traits made restic initially attractive:
- encrypted, chunk-deduped, snapshotted backups
- single Go binary, so I could even back up the binary used to create my backups
- reasonable versioning and release scheme
- I could read, and understand, its design document: https://github.com/restic/restic/blob/master/doc/design.rst
I then just tried using it for a year and never hit any issues with it, so kept going, and now it's 6+ years later.
I use both to try to mitigate the risk of losing data due to a backup format/program bug[1]. If I wasn't worried about that, I'd probably go with Borg but only because my offsite backup provider can be made to enforce append-only backups with Borg, but not Restic, at least not that I could find.[2] Otherwise, I have not found one to be substantially better than the other in practice.
1 - some of my first experiences with backup failures were due to media problems -- this was back in the days when "backup" pretty much meant "pipe tar to tape" and while the backup format was simple, tape quality was pretty bad. These days, media -- tape or disk -- is much more reliable, but backup formats are much more complex, with encryption, data de-dup, etc. Therefore, I consider the backup format to be at least as much of a risk to me now as the media. So, anyway, I do two backups: the local one uses restic, the cloud backup uses borg.
2 - I use rsync.net, which I generally like a lot. I wrote up my experiences with append-only backups, including what I did to make them work with rsync.net here: https://marcusb.org/posts/ransomware-resistant-backups/
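The borg half of that is essentially the standard trick of pinning the client's SSH key to an append-only "borg serve" on the server (paths here are illustrative; the rsync.net-specific details are in the post):

    # ~/.ssh/authorized_keys on the backup host: this key may append to the repo
    # but not delete or prune, so a compromised client can't destroy old backups
    command="borg serve --append-only --restrict-to-path /data/backups/laptop",restrict ssh-ed25519 AAAA... backup-key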
I use both, and I never had problems with either of them. Restic has the advantage that it supports a lot more endpoints than ssh/borg, e.g. S3 (or anything that rclone supports). Also, borg might be a little more complicated to get started with than restic.
One question, why use rclone for the Backblaze B2 part? I use restic as well, configured with autorestic. One command backs up to the local SSD, local NAS, and B2.
I explain in the post. Here's a copypasta of the relevant paragraph:
"My reasoning for splitting these two processes — restic backup and rclone sync — is that I run the local restic backup procedure more frequently than my offsite rclone sync cloud upload. So I’m OK with them being separate processes, and, what’s more, rclone offers a different set of handy options for either optimizing (or intentionally throttling) the cloud-based uploads to Backblaze B2."
So you did! Sorry, hadn't read the post beforehand. Oh, and I too mourned the loss of CrashPlan. Being in Canada, I didn't have the option offered to have a restore drive sent if needed, but thought it was a brilliant idea. On the other hand, I think Backblaze might!
Do you only back up your home directory, or also others? I didn't find info about that in your post.
I back up everything except for scratch/tmp/device-style directories. Bytes are cheap to store, my system is a rounding error vs my /home, and deduping goes a long way.
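In restic terms, that's roughly (the exclude list is just illustrative; repository and password are set via the usual RESTIC_* environment variables):

    # back up the whole system, skipping virtual/scratch/cache directories
    restic backup / \
        --exclude /proc --exclude /sys --exclude /dev --exclude /run \
        --exclude /tmp --exclude /var/tmp --exclude '/home/*/.cache'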
I'm less worried about the size and more about something breaking when doing a recovery.
Let's say you're running Fedora with Gnome and you want to switch to KDE without doing a fresh install. You make a backup, then go through the dozens of commands to switch, with new packages installed, some removed, display managers changed etc. Now something doesn't work. Would recovering from the restic backup reliably bring the system back in order?
The tool from the original post seems to be geared towards that, while most Restic and rclone examples seem to be geared towards /home backup, so I wonder how much this is actually an alternative.
I've been mulling over setting up restic/kopia backups - and having recently discovered that httm[1] supports restic directly, in addition to zfs and more - I think I finally will.
[1] https://github.com/kimono-koans/httm
I only discovered httm thanks to this thread, and I'll definitely be trying it out for the first time today. Maybe I'll add an addendum to my blog post about it.
I used to use restic with scripting, then I discovered resticprofile and swiftly replaced all my scripts with it.
https://github.com/creativeprojects/resticprofile
I also use Kopia as an alternative to Restic, in case some critical bugs happen to either one of them.
https://kopia.io/
Personally, I've had some issues with Kopia.
I found their explanation here:
https://github.com/kopia/kopia/issues/1764
https://github.com/kopia/kopia/issues/544
Still not solved after many years :(
Now I use Borg + Restic and I am happy
+ GUI for Restic https://github.com/garethgeorge/backrest
+ GUI for Borg https://github.com/borgbase/vorta
For home backup, I have a similar setup with dedup and local+remote backups.
Borgbackup + rclone (or aws) [1]
It works so well, I even use this same script on my work laptop(s). rclone enables me to use whatever quirky file sharing solution the current workplace has.
[1]: https://github.com/kmARC/dotfiles/blob/master/bin/backup.sh
I have used pretty much the same setup for the last 6 years. I run borg to a small server then rclone the encrypted backup nightly to B2 storage.
I have ended up with something very similar. Restic/rclone is awesome combo. https://bobek.cz/restic-rclone/