Bcachefs Merged into the Linux 6.7 Kernel

Ecco
95 replies
6d5h

Having just heard of bcachefs when reading this article, I tried to understand what makes it better than other existing filesystems, but couldn't quite find a clear answer. It feels like its feature set is equivalent to ZFS's.

Do you guys know why someone should get excited by bcachefs?

pantalaimon
57 replies
6d5h

ZFS will never be integrated into the Linux kernel due to its licence. btrfs is complicated to use and has many pitfalls that can lead to it eating your data.

ndsipa_pomu
56 replies
6d5h

> btrfs is complicated to use and has many pitfalls that can lead to it eating your data

I use btrfs in preference over ext4 for Linux filesystems and turn on zstd compression for performance and a bit of space saving. It seems simple enough for my use case, though I'm not doing any snapshots etc.

What are some of the potential pitfalls?

pantalaimon
18 replies
6d5h

I was very excited about btrfs' advanced features, but it bit me multiple times when I expected it to 'just work™':

- RAID5/6 are still not stable

- it will not mount a RAID in degraded mode automatically, failing the high availability promise that might tempt you towards RAID.

- swapfile support exists, but it breaks snapshots (and I don't want to snapshot the swapfile)

- Just an Ubuntu/Debian thing, but snapshots are not integrated into the update process unless you install `apt-btrfs-snapshot` (and know that package exists)

rini17
6 replies
6d3h

> RAID5/6 are still not stable

Why does everyone insist on RAID5? I am completely fine with btrfs RAID1; it has survived a few disk crashes as advertised.

> it will not mount a RAID in degraded mode automatically

it will if using -o degraded; it's the tooling that generally sucks and won't support this. I even had problems booting at all with / on a multi-device btrfs using the recommended tools (dracut, grub-mkconfig); the bugs are known and unfixed, so I ended up rolling my own initramfs.

> swapfile support exists, but it breaks snapshots

you are supposed to have all your stuff in a subvolume and snapshot that, not the whole top-level root filesystem... and yes, it should be documented better that it interferes with snapshots (see the sketch below)

> but snapshots are not integrated..

yep tooling sucks
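
A minimal sketch of that layout (paths are illustrative): keep the data you care about in a dedicated subvolume and snapshot only that, so the toplevel (and a swapfile living there) never ends up in a snapshot:

    # create a subvolume for the data you actually want snapshotted
    btrfs subvolume create /mnt/@home
    # read-only snapshot of just that subvolume; the toplevel and its swapfile are untouched
    btrfs subvolume snapshot -r /mnt/@home /mnt/.snapshots/@home-2023-12-01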

dinosaurdynasty
5 replies
6d2h

RAID1 is a lot more expensive and/or stores a lot less data than RAID5/6.

rini17
4 replies
6d2h

How much is the "a lot"?

pantalaimon
3 replies
6d2h

You can try yourself: https://carfax.org.uk/btrfs-usage/

But generally, RAID1 gets you 50% of storage space, whereas RAID5 gets you 66% (and any odd combination of disks).

bmicraft
2 replies
6d

RAID5 should actually get you n-1 disks of space (if they're equally large), while raid1 only gets you n/2

__david__
1 replies
5d18h

Raid1 actually gets you less than that. If you mirror 5 drives that only gives you 1 drive worth of data (with 4 redundancies).

But… you have to weigh it against losing all your data. With raid5 if any 2 of your disks die at the same time then you've lost your volume and most likely 100% of your data (no matter how many disks you have).

With raid1 you have to lose _all_ your disks before that happens. Typically that's also just 2 but you can mirror 3 or more drives if you need some data to be _really_ resistant to disk failures and you don't care about "wasting" n-1 times the space.

So you end up trading off efficiency of storage space and resiliency to data loss.

Myself, disks got cheap enough that I always just buy 2 disks and mirror them. I find it easier to reason about overall, especially in the face of a degraded array.

rini17
0 replies
5d17h

We're talking about btrfs raid1 here, which always gives you half of the total capacity of all disks (except for extreme cases). With 5 drives that's 2.5 drives' worth of data. If 2 drives die, roughly 1 drive's worth of data is lost (more or less, depending on balance).

keep_reading
5 replies
6d3h

> it will not mount a RAID in degraded mode automatically,

I remember when btrfs was very young and they announced the ability to create mirrors. I tested this out and was pleased, and then I tested the failure scenario: pull a drive, try to boot.

It wouldn't boot!

I jumped into IRC and asked if it was expected that you can't boot from a degraded mirror, and the answer was "not supported", which makes the mirror pointless.

Obviously it has improved since then as there's a way to force it to work, but I returned to ZFS on FreeBSD and never looked back

nwmcsween
4 replies
5d18h

Yeah, this is probably the worst default for a fs that supports RAID.

wtallis
3 replies
5d17h

It's the safest default. If you want a filesystem that automatically downgrades the safety of your data when things go wrong, you should have to opt-in to such behavior (and btrfs does make that possible).

keep_reading
2 replies
5d16h

There's less risk of damage from running off a single disk in a faulted mirror than from rebuilding the mirror. The stress the drive is under during a rebuild can and will kill drives at the end of their life.

wtallis
0 replies
5d15h

If one half of a mirrored pair dies, you're left with one copy of the data on a drive that's at risk of also dying, plus whatever portion of the data you have backed up. If you then write new data to that drive, it isn't mirrored anywhere and is also guaranteed to not be backed up yet. So the newly written data is at equal or higher risk than the data that was already on the surviving drive. It's not hard to see how silently accepting new writes that cannot be mirrored could be an unacceptable risk to some end users.

With btrfs, you can easily add a new drive so that new writes can be accepted and mirrored immediately, then start rebuilding the old data with lost redundancy (possibly after running another incremental backup, which will be less stressful to the drive than the full rebuild). But the "add a new drive" step happens outside the kernel, in userspace and possibly in meatspace, so it can't be a default action for the kernel to take.

ndsipa_pomu
0 replies
5d5h

If that is a concern, then it sounds like running mirrored disks isn't sufficient for your use case, and you may need three disks so that a disk failure and a rebuild don't cause an issue.

Also, if you're happy running off a single disk in a faulted mirror, then I'd question why you've got the mirror setup at all.

webstrand
3 replies
6d3h

Can you not just place the swapfile inside of another subvolume? Since subvolumes are not included in snapshots. There's generally a bunch of stuff in /var that you don't want to include in snapshots, too, so it's not like putting the swapfile inside of a subvolume is an exotic task.

pantalaimon
2 replies
6d3h

I guess so, I just didn't know when I created the swapfile and was surprised later when I tried to create a snapshot.

xorcist
1 replies
5d23h

Be careful, just like disk images, you probably don't want to place swapfiles on a CoW filesystem.

Bu9818
0 replies
5d8h

You can mark individual files as No_COW in btrfs, and No_COW + preallocation is a requirement for swapfiles anyway due to how the swap subsystem works.
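
For reference, a minimal sketch of creating such a swapfile (path and size are illustrative; recent btrfs-progs also have a mkswapfile helper, if I remember correctly):

    truncate -s 0 /swap/swapfile    # empty file, so the attribute applies before any extents exist
    chattr +C /swap/swapfile        # mark it No_COW
    fallocate -l 8G /swap/swapfile  # preallocate
    chmod 600 /swap/swapfile
    mkswap /swap/swapfile
    swapon /swap/swapfile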

ndsipa_pomu
0 replies
6d4h

> Just an Ubuntu/Debian thing, but snapshots are not integrated into the update process unless you install `apt-btrfs-snapshot` (and know that package exists)

Thanks - I did not know about that.

I agree with the stance of not mounting a degraded RAID automatically, as the danger is that someone might not notice and suffer total data loss later on. The best option would be to allow overriding that choice if the RAID is otherwise monitored.

lproven
12 replies
6d2h

Aside from the issues described further downthread:

• On Btrfs the `df` command lies. You can't get an accurate count of free space.

• There is no working `fsck` and the existing repair tools come with dire warnings. Take these very very seriously. I have tested them. They do not work and will destroy data.

• The main point of Btrfs is snapshots. [open]SUSE, Spiral Linux, Garuda Linux and siduction all use these heavily for transactional updates.

But the snapshot tool cannot test that there's enough free space for the snapshot, because `df` lies. So, it will fill up your disk.

Writing to a full Btrfs volume will corrupt it. In my testing it destroyed my root partition roughly once per year. It was the most unstable fs I have tried since the era of ext2 in the mid-1990s. (Yes I am that old.)

creatonez
3 replies
5d18h

> There is no working `fsck` and the existing repair tools come with dire warnings

This stems from a misunderstanding. Fsck and fsck-adjacent tools have three purposes:

1. Replay journal entries in a journalled filesystem, so that the filesystem is repaired to a good state for mounting

2. Scrub through checksums and recover any data/metadata that has a redundant copy

3. In rare cases, a fsck tool encountering invalid data can make guesses as to how the filesystem should be structured -- basically, shot-in-the-dark attempts at recovery.

Btrfs does not need #1 because it is not journalled. Assuming write barriers are working, any partially written copy of the filesystem is valid and will simply appear as if the pending writes had been rolled back. This aspect alone greatly diminishes the need for a fsck tool that filesystems like ext4 have.

As for #2, Btrfs already has scrub support. No issues there.

As for #3, it's questionable whether you should ever rely on such functionality, and fsck tools that do implement such functionality tend to have little maneuverability in the first place.
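
For reference, the #2 case is the periodic scrub, e.g.:

    btrfs scrub start /mnt    # walk all data and metadata, verifying checksums; repair from a good copy if one exists
    btrfs scrub status /mnt   # progress, plus counts of corrected and uncorrectable errors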

lproven
1 replies
5d4h

See my reply below.

I disagree.

You seem to be attempting to justify a profound failing by quibbling about the meaning of words or commands.

The real problem is: Btrfs corrupts readily, and it lacks tools to fix the corruption.

What the tools are called, what their functional role is theoretically meant to be, and whether this is justified in a tool of a given name is tangential and relatively speaking unimportant.

curt15
0 replies
4d22h

>The real problem is: Btrfs corrupts readily, and it lacks tools to fix the corruption.

Here's a recent example of corruption unearthed by users after Fedora started defaulting to btrfs: https://bugzilla.redhat.com/show_bug.cgi?id=2169947.

hulitu
0 replies
4d7h

I thought that the main purpose of fsck was to rebuild the inode table and to put the filesystem in a correct state (all inodes linked in inode table).

jhoechtl
2 replies
6d

When did you last run into any of these issues? They were a fact in the past but have been sorted out.

df is not lying; the discrepancy isn't the fs layer reporting corrupt data, it's due to dedup and fs layering.

lproven
1 replies
5d4h

I left SUSE close to the end of 2021, and I had had to reinstall my work laptop twice that year alone. I consider that recent enough to call it current.

> df is not lying

To me, that reads as "df isn't lying because $EXCUSES."

I disagree. I don't care about excuses. I want a 100% accurate accounting of free space at all times via the standard xNix free-disk-space reporting command, and the same from the APIs that command uses so that applications can also get an accurate report of free space.

If a filesystem cannot report free space reliably and accurately, then that filesystem is IMHO broken. Excuses do not exonerate the FS, and having other FS-specific commands that can report free space do not exonerate it. The `df` command must work, or the FS is broken.

The primary point of Btrfs is that it is the only GPL snapshot-capable FS. The other stuff is gravy: it's a bonus. There are distros that use Btrfs that don't use snapshots, such as Fedora.

Some Btrfs advocates use this to claim that the problems are not problematic. If the filesystem is of interest on the basis of feature $FOO, then "product $BAR does not exhibit this problem" is not an endorsement or a refutation if $BAR does not use feature $FOO.

Btrfs RAID is broken in important ways, but that is not a deal-breaker, because there are other perfectly good ways of obtaining that functionality using other parts of the Linux stack. If no feature or functionality is lost considering the OS and stack as a whole, then that isn't a problem. It remains a serious issue, however.

Additional problems include:

• Poor integration into the overall industry-wide OS stack.

Examples:

- Existing commands do not work or give inconsistent results.

- Duplication of functionality (e.g. overlap with `mdraid`)

• Poor integration into specific vendors' OS stacks.

Examples:

- SUSE uses Btrfs heavily.

But SUSE's `zypper` package manager is not integrated with its `snapper` tool. Zypper doesn't include snapshot space used by Snapper in its space estimation.

Snapper is integrated with Btrfs; licence restrictions notwithstanding, I would be much reassured if Snapper supported other COW filesystems.

(This has been attempted but I don't think anything shipped -- https://github.com/openSUSE/snapper/issues/145 . I welcome correction on this!)

The transactional features of SUSE's MicroOS family of distros rely heavily on Btrfs. As such, this lack of awareness of snapshot space utilization deeply worries me. I have raised it with SUSE management, but my concerns were dismissed.

What I want to see, for clarity, is for Zypper to look at what packages will be replaced, then ask the FS how big the consequent snapshot will be, and include that snapshot in its space estimation checks before beginning the operation so that at least the packaging operation can be safely aborted before starting.

A better implementation would be to integrate package management with snapshot management so that older snapshots could be automatically pruned to ensure necessary space is made available, while also ensuring that a pre-operation snapshot is retained for rollback. That's harder but would work better.

As it is, currently neither is attempted, and Zypper will start actions that result in filling the disk and thus trashing the FS, and there are no working repair tools to recover.

- Red Hat removed Btrfs support from RHEL. As a result it has had to bodge transactional package management together by grafting Git-like functionality into OStree, then building two entirely new packaging systems around OStree, one for the OS itself and a different one for GUI-level packages. The latter is Flatpak, of course.

This strikes me as prime evidence that:

1. Btrfs isn't ready.

2. Linux needs an in-kernel COW filesystem -- because much of the complexity of OStree, Flatpak, Nix/NixOS, Guix, SUSE's `transactional-update` commands and so on would be rendered unnecessary if it were in there.

yencabulator
0 replies
3d21h

> To me, that reads as "df isn't lying because $EXCUSES."

In that case, df also lies on ZFS, bcachefs, and LVM+snapshots. It's in the nature of thin allocation and CoW; if you ask two things sharing storage how much space they have available, that doesn't mean that space is available to both of them at the same time.
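
Which is why btrfs ships its own reporting commands alongside plain df, e.g.:

    df -h /mnt                    # generic view; only an estimate under CoW/thin allocation
    btrfs filesystem df /mnt      # per block-group type: data, metadata, system
    btrfs filesystem usage /mnt   # allocated vs. unallocated, per device and per profile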

ThatPlayer
2 replies
5d12h

The df issue seems to me like an artifact of new data's RAID level being undecided, since one of the planned features is per-subvolume RAID levels, and at the very least metadata and data can already have different RAID levels.

lproven
1 replies
5d4h

Nope. It's still an issue with a single volume on a single disk.

ThatPlayer
0 replies
5d

You can use the DUP profile on a single disk. IIRC metadata defaults to that.
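
i.e. something like (device name is a placeholder):

    mkfs.btrfs -d single -m dup /dev/sdX   # one copy of data, two copies of metadata on the same disk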

VancouverMan
1 replies
6d1h

I've experienced corruption and data loss with Btrfs each of the times I've tried using it, too, after only about a week of use at most.

Thankfully, all of those incidents were with some non-critical, throw-away VMs where the data loss wasn't really an issue.

I've also used ext4 under the same circumstances for years, and I can't think of a single time that I've lost data, nor have I experienced corruption that fsck couldn't easily deal with.

I, too, would have to go back to the 1990s to think of a filesystem I used that was that unreliable.

After what I experienced, I don't trust Btrfs at all, and I have no plans to ever use it again.

lproven
0 replies
6d

I am glad to hear it's not just me!

I worked at SUSE for four years and used it every day. The company is in deep denial about its problems, or even that there are any problems, and when I pointed at ZFS as a more mature tool, this was actually mocked.

viraptor
9 replies
6d5h

One that seems to catch out many people is that you need to explicitly add the "degraded" option to allow mounting raid volumes which contain a broken drive. This is opposite to almost all other filesystems. I've seen people confused and thinking they lost the whole volume, even though they just need to replace the bad drive and rebuild as usual.
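
A minimal sketch of the recovery path (device names and the devid are placeholders):

    mount -o degraded /dev/sdb1 /mnt          # mount with the surviving member(s)
    btrfs replace start -B 2 /dev/sdd1 /mnt   # replace missing devid 2 with the new drive
    btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt   # re-mirror chunks written while degraded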

ndsipa_pomu
8 replies
6d4h

Personally, I agree with that. It's good practice for systems to fail quickly if they find themselves in an uncertain state, and with RAID it's important for the operator to know if they've suddenly lost redundancy so that they can resolve the issue.

keep_reading
5 replies
6d3h

If you can't even boot because of a mystery error, how are you supposed to resolve the issue?

It should instead give the user error messages on their terminal, in logs, etc., rather than breaking the entire system until the user finds the manual.

wtallis
4 replies
6d2h

If your NAS has a hot spare drive installed then it should probably include the degraded mount option by default and automatically add the hot spare drive to the filesystem in the event of a failure. Alternatively, if there's enough free space on the surviving drives to rebalance the array and restore redundancy without replacing the failed drive, that operation could be kicked off automatically. Or the filesystem can be (hopefully temporarily) set to not store new data redundantly, if that is an acceptable risk for the user. But the filesystem cannot know which method the user would prefer; automatically rebuilding the array involves policy decisions that are outside the scope of the filesystem and requires userspace tooling.

If the system doesn't have spare capacity ready, the only sane response is to not boot/mount normally. "giving the user error messages written to their terminal, in logs, etc" isn't a real solution for something like a NAS with no terminal connected and nobody looking at the logs as long as they can still establish a SMB connection; it's too likely to be a silent failure in practice. Mounting the filesystem degraded but read-only makes sense if it's necessary to boot the system so that the user (or their pre-configured userspace tooling) can decide how to deal with the problem, but a lot of Linux distros aren't happy with the root filesystem being read-only.

In summary: there's no single right answer to the problem of a failed drive, and btrfs defaults to what is the safest behavior based on the information available to the filesystem itself. Userspace tooling with more information can make other, less universal choices. A distro that tries to simply adopt btrfs as a drop-in replacement for ext4 probably doesn't have all the tooling necessary to make good use of the unique features of btrfs.

keep_reading
3 replies
5d23h

> If the system doesn't have spare capacity ready, the only sane response is to not boot/mount normally.

It doesn't need the spare to "boot normally" and the system can turn on a scary LED, ring bells, call you, text you, hit you up on WhatsApp, DM you on Instagram, or whatever method you want your NAS to use to notify you there's a degradation. (You're monitoring it right??)

This explanation of "it's dangerous to boot off a degraded array" is lunacy. I will not take this terrible advice from armchair experts when I've been doing this for over 25 years

wtallis
2 replies
5d23h

I didn't say it's dangerous to boot off a degraded array. I said it's dangerous to boot off a degraded array normally. Mounting it degraded but read-only is reasonable, because that prevents silently writing new data without the level of redundancy the user previously requested.

There's nothing terrible about advice against responding to a drive failure by putting the system into an even more precarious state without user interaction.

nwmcsween
1 replies
5d18h

Just wondering: have you worked on any large DCs or large NAS or SAN systems? Drive failures are a daily occurrence in places with a lot of spinning metal; having things fail to boot by default would be a nightmare.

wtallis
0 replies
5d17h

> having things fail to boot by default would be a nightmare.

Having things fail to boot would just mean you haven't configured your system appropriately for your environment. If you are using a btrfs RAID filesystem for your root filesystem, and you need that fs to be writeable in order to boot, and you want it to boot even if it's missing a drive, then you need to add an extra mount option and a few lines to your init scripts to persist new downgraded RAID settings in the event a degraded mount was necessary.

But that's hardly the only valid use case for btrfs; plenty of users want strong guarantees about the redundancy of their data rather than silent downgrading.

Also, do you really expect me to believe that any of the large shops still running enough spinning rust to have daily drive failures are still booting off those arrays instead of having separate SSDs as their boot drives? Separate storage of the OS from storage of the important data is such a common and long-ingrained practice that it is embodied in the physical layout of typical server systems, and the primary reason for it is the need for different tradeoffs between performance, redundancy, capacity and cost.

viraptor
1 replies
6d4h

It really depends on your situation. At home I probably want the disks to stay idle until replacement. In a bigger production system I want to keep the availability and ping the on-call person to replace the drive in the background.

ndsipa_pomu
0 replies
6d4h

Exactly - the default should be to get someone to pay attention to it when something breaks, and if you're planning on high availability, then you can choose that, assuming you keep an eye on the health of the RAID.

throw0101c
6 replies
6d5h

> What are some of the potential pitfalls?

RAID-5/6:

* https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid5...

viraptor
5 replies
6d4h

Wouldn't describe that as a pitfall. As far as I can tell, every place in the docs/help which talks about those raid modes tells you they're experimental and shouldn't be used. At that point it's "you've done a stupid thing and discovered the consequences you were told about".

throw0101c
3 replies
6d4h

It's a feature that other file systems have: why should I choose Btrfs when (e.g.) ZFS can do everything it can and more on top of that?

I remember when Btrfs was announced in 2007 as I was already running Solaris 10 with ZFS in production (and ZFS had non-"experimental" RAID-5-like RAID-Z from day one). Here we are 15+ years later and Btrfs still doesn't have it?

viraptor
0 replies
6d4h

I'm just saying it's not a pitfall, because it's clearly documented up front and warned about. That's orthogonal to whether you should choose this filesystem and why a feature is implemented or not.

pantalaimon
0 replies
6d4h

The nice thing about btrfs is that you can add/remove drives at will. That allows for easy expansion of an existing pool.

chasil
0 replies
5d22h

The one thing that ZFS cannot do is defragment a pool.

There are abuse patterns that are toxic for ZFS pools (and all other filesystems). Btrfs appears to be able to repair this damage.

https://tim.cexx.org/?p=1236

https://www.usenix.org/system/files/login/articles/login_sum...

pantalaimon
0 replies
6d4h

Eh, 10 years ago RAID56 was already declared 'pretty much ready'; that warning was only added later, when major data-loss bugs were discovered.

curt15
2 replies
6d4h

It performs poorly for certain workloads -- notably DBs -- unless you disable copy-on-write, compression, and checksumming. But then why use it over ext4 in the first place?

ndsipa_pomu
1 replies
6d4h

That's not surprising though, as DBs usually work best when given raw storage - the features of the filesystem are being duplicated by the DB and thus doing almost twice the work.

What seems particularly interesting about Bcachefs is how much it seems to be using database concepts to implement a general filesystem. Ultimately, it seems inevitable that filesystems and databases will converge as they're both supposed to manage data.

PlutoIsAPlanet
0 replies
6d1h

The "fix" to databases and VMs on btrfs by disabling CoW unfortunately disables nearly all the useful features.

Ridj48dhsnsh
2 replies
6d4h

Just an anecdote, but when I used gocryptfs on a btrfs partition, I'd always end up with a few corrupted files on power failure. After switching to gocryptfs on ext4, I never have any corruption.

redneb
1 replies
5d15h

I have used heavily[1] the combination btrfs+gocryptfs for many years and had no problems. The only quirk is that I have to pass the -noprealloc flag to gocryptfs, otherwise the performance is really bad.

[1] by heavily, I mean that I use it for my home directory

Ridj48dhsnsh
0 replies
5d9h

I used it for my home directory too for several years up until a few months ago, but with default settings. The corrupted files were almost always Firefox cookies db and cache files. Maybe there's something specific about how Firefox writes them that makes them prone to corruption.

chasil
0 replies
5d23h

With ZFS, if a redundant member of a pool goes offline for a time but is then returned, only the updated blocks are written to bring it up to date, and this happens automatically.

Unfortunately, btrfs is not that smart, and you must trigger a rebalance event to rewrite every block in the filesystem to return to full redundancy.

This rebalance behavior is a deal-killer for many uses.

kevincox
15 replies
6d2h

IMHO bcachefs has important advantages over ZFS. It is far more flexible. ZFS is really similar to traditional block-based RAID. You can get pretty flexible configurations but 1. They are largely fixed after creation and 2. They only operate at "dataset" level granularity.

bcachefs has a really flexible design here where you basically add all of your disks to the storage pool and then you can pick redundancy and performance settings per folder (arbitrary subtrees, not just datasets decided at setup time) or even per file. For example you can configure a default of 2 replicas for all data, but for your cache directory set it to 1 replica. If you have an important documents folder you can set that to 3 replicas, or 4.2 erasure coding.

Similarly you can tell it to put your cache folder on devices labeled "ssd" but your documents folder should write to "ssd" but then be migrated to "hdd" when they are cold.

And again, all of this can be set at any time on any subtree. Not just when you initially set up your disks or create the directories.
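
As I understand it, these per-subtree settings are exposed through extended attributes in a bcachefs namespace; a sketch, with option names as I remember them from the docs (treat them as assumptions, not gospel):

    setfattr -n bcachefs.data_replicas     -v 1   ~/.cache      # don't bother replicating cache
    setfattr -n bcachefs.data_replicas     -v 3   ~/documents   # extra redundancy for important data
    setfattr -n bcachefs.background_target -v hdd ~/documents   # migrate cold data to the "hdd" group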

FullyFunctional
7 replies
6d1h

The #1 point of ZFS is protection against bitrot, i.e. checksums on all data. Does bcachefs do this?

lizknope
5 replies
6d

https://bcachefs.org/

It's literally the second item listed on the main web site

> Full data and metadata checksumming

FullyFunctional
4 replies
5d17h

Thanks, that's good news. Unfortunately it seems bcachefs isn't using pools like ZFS but instead, like btrfs, creates file systems directly on a collection of disks. That's a bummer if true.

wtallis
2 replies
5d16h

What are the use cases where you find micromanaging vdevs worthwhile compared to automatic storage allocation that can be changed and rebalanced later? bcachefs does offer several per-device controls over data placement and redundancy that btrfs lacks; are those still not enough?

FullyFunctional
1 replies
5d14h

Exactly, I _don't_ want to micromanage devices so I throw them all into a pool (with defined redundancy) and then I allocate file systems and vdevs (for iSCSI) from that. With btrfs (and bcachefs) I have to manually assign devices to file systems and I can't have multiple file systems (well, you can have sub-volumes but you can't avoid a big file system in that collection of devices).

Maybe I'm missing something, but the pool abstraction makes this very clean and clear.

wtallis
0 replies
5d13h

> so I throw them all into a pool (with defined redundancy) and then I allocate file systems and vdevs (for iSCSI) from that.

That sounds backwards. Don't you have to manually define the layout of the vdevs first in order to establish the redundancy, and then allocate the volumes you use for filesystems or iSCSI? If you just do a `zpool create` and give it a dozen disks and ask for raidz2, you're just creating a single vdev that's a RAID6 over all the drives. There's an extra step compared to the btrfs workflow, but if you're not using that opportunity to micromanage your array layout I don't see why you'd prefer that extra step to exist.

> and I can't have multiple file systems (well, you can have sub-volumes but you can't avoid a big file system in that collection of devices).

Isn't this a purely cosmetic complaint? With at least btrfs, you don't even have to mount the root volume, you can simply mount the subvolumes directly wherever you want them and pretty much ignore the existence of the root volume except when provisioning more subvolumes from the root. You can pretend that you do have a ZFS-style pool abstraction, but one that's navigable like a filesystem in the Unix tradition instead of requiring non-standard tooling to inspect.
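
For example (subvolume names are just a common convention, device name is a placeholder):

    mount -o subvol=@     /dev/sdb1 /
    mount -o subvol=@home /dev/sdb1 /home
    mount -o subvolid=5   /dev/sdb1 /mnt/btrfs-root   # only when you need to manage subvolumes/snapshots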

accelbred
0 replies
5d1h

If you want volume manager features, you can put btrfs/bcachefs on top of LVM

Tuna-Fish
0 replies
5d22h

bcachefs currently has full metadata and data checksumming, but there is no scrub implementation. This will likely be in the works soon now that it has been merged.

jl6
5 replies
6d

If you mix a small SSD and a large HDD into a single bcachefs pool, and set it to 2 replicas, do writes have to succeed on both devices before returning success? I.e. is performance constrained by the slowest device? And what happens when the small SSD fills up? Does it carry on writing to the HDD but with only one replica, or does it report no space left?

wtallis
3 replies
5d22h

Are you trying to ask if bcachefs lets you set up stupid configurations? If you only have two devices and you configure it use replication to store two copies of all your data, then you are unavoidably constrained by the capacity of the smaller device. Whether the smaller device is a hard drive or SSD is irrelevant, because neither copy can be regarded as a discardable cache when they're both necessary to maintain the requested level of redundancy.

jl6
2 replies
5d22h

I am trying to understand bcachefs by asking about an edge case that might be illustrative.

> you are unavoidably constrained by the capacity of the smaller device

Sure, so what does bcachefs actually do about it? ENOSPC?

Answers to the question on synchronous write behavior also welcome.

wtallis
1 replies
5d18h

> I am trying to understand bcachefs by asking about an edge case that might be illustrative.

It looks more like you're questioning whether the filesystem that Linus just merged does obviously wrong things for the simplest test cases of its headline features. Has something given you cause to suspect that this filesystem is so deeply and thoroughly flawed? Because this doesn't quite look like trying to learn, it looks like trying to find an excuse to dismiss bcachefs before even reading any of the docs. Asking if everything about the filesystem is a lie seems really odd.

hulitu
0 replies
4d8h

> Asking if everything about the filesystem is a lie seems really odd.

This person was asking pertinent questions, no need to bash it.

I lost some data with XFS after it was declared "stable" in the Linux kernel. Every day at around 00:00 the power would fail for around 2 seconds. Other filesystems (ext2, jfs, reiser) would do a fsck, but xfs was smarter. After 2 or 3 crashes the xfs volume would not be usable anymore (no fsck possible).

So yes, some of us do need more than a "trust me, it's ok" when we are talking about our data.

thomastjeffery
0 replies
5d21h

You explicitly configure the cache, as laid out here: https://bcachefs.org/Caching/

In the common case that you mentioned, data present on the full SSD would be overwritten "in standard LRU fashion"; meaning the "Least Recently Used" data would no longer be cached. New data would be written to the SSD while a background "rebalance thread" would copy that data to the HDD. I assume that the "sync" command would wait for the "rebalance thread" to finish, though I will admit my own ignorance on that front.
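
The configuration on that page boils down to labelling devices into groups and pointing the foreground/promote/background targets at them. Roughly, as I read the docs (device names are placeholders):

    bcachefs format \
        --label=ssd.ssd1 /dev/nvme0n1 \
        --label=hdd.hdd1 /dev/sda \
        --label=hdd.hdd2 /dev/sdb \
        --foreground_target=ssd \
        --promote_target=ssd \
        --background_target=hdd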

soupdiver
0 replies
6d2h

that actually sounds quite neat

fodkodrasz
9 replies
6d5h

It is (soon) officially in the kernel, as opposed to zfs.

It has a more limited feature set and is said to have a simpler codebase than zfs/btrfs. It has a single outstanding non-stable feature.

It seems to be in active development, while btrfs seems to have become stagnant/abandonware before it was finished/stabilised completely. I have read several horror stories about data loss, so I have avoided it so far.

On the other hand it is not widely deployed yet, there is less accumulated knowledge than in case of zfs.

I'm looking forward to trying it in my NAS when buying new disks next year. The COW snapshots would fit my needs (automatic daily snapshots, weekly backups).

(I'm now using LUKS+LVM+ext4; this would give a better, more integrated, deduplicated solution, as I have lots of duplicated data right now.)
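
If the flags are still what the manual shows, the whole LUKS+LVM+ext4 stack collapses into a single format call. A sketch, not verified against the just-merged version (device names are placeholders):

    bcachefs format --encrypted --compression=zstd --replicas=2 /dev/sdb /dev/sdc
    bcachefs unlock /dev/sdb   # prompts for the passphrase before the first mount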

viraptor
8 replies
6d4h

> while btrfs seems to have become stagnant/abandonware before it was finished/stabilised completely

Why would you think so? I can't remember the last time a kernel was released without something at least a bit exciting about btrfs

https://kernelnewbies.org/LinuxChanges#Linux_6.5.File_system...

Geezus_42
6 replies
6d4h

Because they still haven't fixed the write hole issues in certain RAID configurations, which have been known for a decade or more.

viraptor
5 replies
6d4h

A project not finishing some feature is not the same as being abandoned. It seems lots of people are happy to use btrfs in production without that raid mode. In other words, for all the complaining about raid5 that happens every time btrfs is mentioned, you'd think there would be at least one person who cares enough to implement it. Yet people use it in production and keep improving the other parts of that project instead.

Geezus_42
4 replies
6d3h

It's fine for a single disk or something that presents as a single disk, like a SAN. RAID1 seems fine also. I really wanted to love it, but after it ate my data a couple of times, I gave up trying to use it for mass storage. At that time they didn't have warnings in the documentation and had stated that RAID5/6 were basically complete.

I found using mirrored vdevs in ZFS much easier to manage and much more stable.

creatonez
2 replies
5d19h

RAID1 in Btrfs is not entirely fine. It won't nuke your data and there is no write-hole issue, but if a disk fails you'll have to go into a read-only mode during the rebuild and deal with various hurdles in getting it rebuilt.

wtallis
0 replies
5d17h

> but if a disk fails you'll have to go into a read-only mode during the rebuild and deal with various hurdles in getting it rebuilt.

I can't say for sure that this never happens, but that's certainly not been the failure mode for any of the drive failures my btrfs RAID1 has experienced. I don't think I've ever needed to reboot or even remount my filesystem, just replace the failed drive (physically, then in software). But I always have more than two drives in the filesystem, so a single drive failure only puts a fraction of my data at risk, not everything.

viraptor
0 replies
5d16h

> if a disk fails you'll have to go into a read-only mode during the rebuild

That's not correct, you can rebuild online. The readonly mode is only relevant when you reboot during the failure and don't have the right options set on the volume.

But you totally can replace a live drive without affecting the availability.

wtallis
0 replies
5d20h

> I found using mirrored vdevs in ZFS much easier to manage and much more stable.

That's not exactly a fair comparison. If you restricted your usage of btrfs to a similarly narrow range of features, you would probably have had a much better experience.

creatonez
0 replies
5d19h

> I can't remember the last time a kernel was released without something at least a bit exciting about btrfs

They are fixing the fixable issues, but the on-disk format still makes some gotchas inevitable. It sounds like there's never going to be a great solution to live rebuilding of redundancy.

c0balt
5 replies
6d5h

The main advantage, if I understood it correctly, is supposed to be performance. The promise is to have similar speeds to ext4/xfs with the feature set of btrfs/ZFS. While that sounds nice it took a lot of time to get it stable and upstreamed. Like any FS you might not want to go with the latest shiny thing but there are some that are willing to risk it, similar to debates around Btrfs vs ZFS.

The last benchmarks from Phoronix are a few years old but look promising: https://www.phoronix.com/review/bcachefs-linux-2019

ndsipa_pomu
4 replies
6d4h

From that benchmark article

> The design features of this file-system are similar to ZFS/Btrfs and include native encryption, snapshots, compression, caching, multi-device/RAID support, and more. But even with all of its features, it aims to offer XFS/EXT4-like performance, which is something that can't generally be said for Btrfs.

I was surprised at that, as I believed that btrfs is generally faster than ext4. Looking ahead to the last page, the geometric mean of the benchmarks supports that view too.

yencabulator
1 replies
3d21h

I worked on Ceph, a distributed storage system, for a while. Here's what we learned from benchmarks (over a decade ago):

Btrfs goes very fast at first but slows down when it has to start pruning/compacting its on-disk btree structures, and then the performance suffers badly. Thus, btrfs works best when you have a spiky workload that lets it "catch up", and you never fill the disk. Concrete example: historically, removing a snapshot while under load was a disaster, with IO waits over 2 minutes. So, both sides are correct: btrfs is very fast and btrfs is very slow.

XFS was never crazy fast, and has the smallest feature set of the bunch, but it just kept chugging at the same pace with almost no change, regardless of what the workload did. In more complex use, you had to avoid triggering bad behavior; e.g. there was a fixed number of write streams open, something like 8, and if you had more than that many concurrent writes going, your blocks got fragmented. It was very much a freight train; not particularly fast but very predictable performance, and no serious degradation ever.

ext4 was sort of in between those; mostly very fast, with some hiccups in performance. Great as long as your storage is 100% reliable -- we had scrubbing in-product on top of the filesystem.

We ended up recommending xfs to most customers, at the time. Predictability trumped minor gain in performance, for most uses.

bcrl
0 replies
3d18h

I had a test case with ext4 where it would take 80 seconds to write out an 8MB file on a 16TB filesystem. ext4 does not handle free space fragmentation very well as it still relies on block group bitmaps. Oh the legacy baggage it carries...

BenjiWiebe
1 replies
6d3h

The geometric mean on the last page shows ext4 to be faster than btrfs.

ndsipa_pomu
0 replies
6d3h

Oops - I was thinking that smaller was better.

linsomniac
2 replies
6d4h

Working deduplication would be amazing! ZFS has deduplication, but every time I've tried it, it has ended in a world of pain. Maybe they've fixed it in a more recent release, but the amount of RAM required for deduplication always outstripped the amount of RAM I had available to give it (the deduplication tables have to reside in RAM).

ptman
0 replies
6d4h

ZFS now has reflink support, which doesn't require lots of RAM, but isn't done automatically while writing. You need to run something like https://github.com/markfasheh/duperemove
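
e.g., a sketch (paths are placeholders):

    duperemove -dr --hashfile=/var/tmp/dedupe.hash /tank/data   # -d submits the actual dedupe ioctls, -r recurses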

mastax
0 replies
6d3h

The deduplication tables can be put in a special vdev now, I think.
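
For reference, OpenZFS allocation classes make that look something like this (a sketch; pool and device names are placeholders):

    zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1   # metadata (and optionally small blocks)
    zpool add tank dedup   mirror /dev/nvme2n1 /dev/nvme3n1   # dedicated vdev class for the dedup table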

lproven
0 replies
6d2h

I tried to explain in this piece:

https://www.theregister.com/2022/03/18/bcachefs/

2OEH8eoCRo0
0 replies
6d5h

> It feels like it's feature set is equivalent to ZFS.

It does something that ZFS can't: be merged into the kernel.

ksec
12 replies
6d2h

Am I correct to assume this is GPL and not BSD or MIT? If this is GPL I guess there is no chance this will ship with BSD. And ZFS is sadly (but not deal-breaking) CDDL.

nolist_policy
6 replies
5d23h

BSD and GPL are compatible though, you just have to ship the resulting work under GPL ;)

yjftsjthsd-h
3 replies
5d19h

Thus making it just as bad as CDDL+GPL; you can do it out of tree, but it'll never get mainlined.

arp242
2 replies
5d12h

You mean like ZFS and Dtrace never got mainlined in FreeBSD?

yjftsjthsd-h
1 replies
5d11h

CDDL is copyleft but not viral, which makes it much easier (read: politically possible) to include in a permissively licensed project than anything under a viral copyleft license like GPL.

arp242
0 replies
5d2h

FreeBSD has tons of GPL code, including in the kernel. There's been some effort to move away from that, but it's far from a foregone conclusion that it's impossible to merge new GPL code.

A bigger question would be "why bcachefs when we have a stable ZFS in base?"

klardotsh
1 replies
5d15h

... which, in the context of a BSD OS ever supporting a bcachefs drive, means bcachefs is not compatible with BSD OSes, given that its SPDX identifier is GPL-2.0: https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/...

Yes, you can bundle GPL software with BSD software, much the same as I can bundle EULA'd software with BSD software: under the stricter terms. Obviously that's not the intent of the question, but sure, it's theoretically possible in some future fictional world where the BSDs are GPL'd.

arp242
0 replies
5d12h

You don't need to re-license the BSD code to GPL, as that code is not derived from the GPL'd bcachefs code. Distributing binaries is the only tricky bit: a kernel with GPL code would need to be distributed as GPL. This can be solved with loadable kernel modules. There isn't much stopping anyone from using the code from a license perspective.

The problem is this sort of code is strongly tied to the operating system, and porting it would require significant effort, if even feasible at all.

rollcat
4 replies
6d1h

Which BSD? ;) DragonFly has HAMMER2, and the CDDL doesn't seem to be negatively impacting the development and integration of ZFS in FreeBSD.

I too am not a big fan of the GPL, especially if the text of the license is longer than the program itself. But any filesystem (let alone a modern one) is very much a non-trivial feat of engineering; the author should have the full right to protect their (and their users') interests.

ksec
3 replies
5d22h

I am just thinking about whether we could have one decent FS across both BSD and Linux. And it would still be better than CDDL.

rollcat
2 replies
5d20h

If you just want to exchange data on removable media, use ExFAT. What problem are you trying to solve?

Different OS's prefer different filesystems, because filesystems tend to be both complicated and heavily opinionated in design and implementation - just like different OS's. Linux is the odd one by supporting several dozen, all the other OS's stick to 1 or 2 (usually "old" and "new" like HFS+/APFS, FAT/NTFS, etc), plus UDF&FAT as the lowest common denominator for data interchange. There is very little precedent / use cases for sharing volumes like you suggest: non-removable disks tend to stay in one machine for their lifetime; dual-booting is extremely niche (where Linux/BSD themselves are all niche) and mostly a domain of enthusiasts.

klardotsh
1 replies
5d15h

Sure, but something as mundane as licensing being the reason we can't even try is kinda depressing. One ZFS-grade filesystem that could have Windows, Linux, and BSD implementations, allowing robust external storage with all the promises of modern snapshotting filesystems? Ah. What a dream.

Maybe in 10-20 years if someone white-rooms a bcachefs or ZFS implementation, I guess.

rollcat
0 replies
5d6h

Again, CDDL is effectively "free enough" (it's basically a fork of MPL/weak copyleft) that both FreeBSD and Linux (although not mainline) incorporate ZFS. Whether it can be mixed with GPL is a very good question, the facts are that 1. it wasn't the intention for it to be incompatible, 2. it was never tested in court, 3. several popular distros nowadays ship the spl.ko & zfs.ko binaries.

I recommend this talk by Bryan Cantrill, which provides more context on the whole licensing story for Solaris: <https://www.youtube.com/watch?v=-zRN7XLCRhc&t=1375s> TL;DW: lots of effort and good will was put into making this code as free as possible, with the intention of making it broadly usable.

There are other reasons why e.g. OpenBSD won't adopt ZFS, the main one being the sheer complexity: <https://flak.tedunangst.com/post/ZFS-on-OpenBSD>. Again, different projects, different goals.

Apple heavily considered ZFS (even advertised it as an upcoming feature), but then gave in to NIH. Probably because it didn't fit their plan for mobile, and seeing the runaway success in that dept you can't really blame them.

But we digress! Is ZFS even a good system for removable media? Heck no. What do you want to use it with? Digital cameras? Portable music players? 2007 called and wants its toys back. Backups? Yeah, that can work, but there's little value in being able to recover backups on a foreign OS. Giving someone a random file on a thumb drive? Use ExFAT (or even plain old FAT32), just keep it simple!

jbverschoor
9 replies
6d4h

From the FAQ https://bcachefs.org/FAQ/ :

> Bcachefs is safer to use than btrfs and is also shown to outperform zfs in terms of speed and reliability

So what makes it more reliable? I can't find a simple overview of the design / reasoning behind the whole thing and what makes it 'better' than the rest.

__turbobrew__
5 replies
6d

>Bcachefs is safer to use than btrfs

That is a pretty bold claim given that Facebook runs btrfs in prod across the majority of their fleet and almost nobody uses bcachefs.

KennyBlanken
1 replies
5d23h

Facebook et al generally structure their systems to tolerate node failures through redundancy at higher levels. In short: they can afford not to care about, or work around, design problems most others can't - or they use it for a specific purpose, such as logging, etc.

Btrfs was terrible in the early days, took ages to "git gud", and given a filesystem is supposed to be among the most stable code in the OS, that burned a lot of bridges. It wasn't until fairly recently that btrfs could tolerate being completely filled.

I have no idea how valid the claims are, but bcache's developer claims that btrfs suffers from a lot of terrible early design decisions that can't be undone.

The show-stopper for me is that bcachefs lacks a scrub:

> We're still missing scrub support. Scrub's job will be to walk all data in the filesystem and verify checksums, recovery bad data from a good copy if it exists or notifying the user if data is unrecoverable.

The only argument I see for btrfs is that it supports throwing random drives into a pool and btrfs magically handles redundancy across them.

...but it doesn't support tiered storage like bcachefs does. We're well past "I want to have a redundant filesystem I can randomly add a drive to and magically my shit is mirrored." These days people want to have a large pile of spinning rust with some SSD in front of it, and doing that in all but ZFS is kind of a pain.

devit
0 replies
5d19h

Can't you do most of what scrub would do with something like:

   find . -xdev -print0|xargs -0 cat > /dev/null
(ideally replacing cat with something that continues after errors)
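
One way to keep going past read errors and log which files failed (a sketch):

    find . -xdev -type f -print0 |
      xargs -0 -n1 sh -c 'cat -- "$1" > /dev/null || echo "read error: $1" >&2' _
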
stonogo
0 replies
5d23h

Citing the origin of "move fast and break things" as the exemplar of safety is itself pretty bold.

Nathanba
0 replies
5d15h

I only read that they use it for their build servers and use the snapshotting feature for quickly resetting the build environments. It's not a common use of a filesystem imo

MrDrMcCoy
0 replies
5d23h

Could just mean that RAID5/6 isn't broken...

kevincox
0 replies
6d2h

> Bcachefs is safer to use than btrfs

Citation needed. With a sample size of 1 it ate my data and BTRFS has been running perfectly fine on that system (after I bailed off of bcachefs) and other systems. I think it is great that they consider data safety very important but it will take lots of testing and real-world experience to validate that claim.

dralley
0 replies
6d3h

Well, keep in mind that language has remained unchanged for a few years now. Certainly BcacheFS has some theoretical advantages (lack of write hole etc.) but BTRFS has nonetheless improved since then and BcacheFS has gotten more complex.

It's still a very exciting filesystem so I'm sure we'll be seeing third parties test it rigorously very soon.

Tuna-Fish
0 replies
5d22h

The bcachefs architecture overview is here: https://bcachefs.org/Architecture/

The claim for reliability comes from the idea that bcache has been in heavy production use for a decade, and considered rock solid with plenty of testing of corner cases, and that bcachefs builds a filesystem over the bcache block store so that most of the hard things (like locking and such) are managed by the underlying block store, not the bcachefs layer. This way, the filesystem code itself is simple and easy to understand.

throw0101c
8 replies
6d4h

One pet peeve is with some of the nomenclature/UI that they are using (§2.5):

> Snapshots are writeable and may be snapshotted again, creating a tree of snapshots.

* https://bcachefs.org/bcachefs-principles-of-operation.pdf

> bcachefs provides btrfs style writeable snapshots, at subvolume granularity.

* https://bcachefs.org/Snapshots/

Every other implementation of the concept has the implicit idea that snapshots are read-only:

* https://en.wikipedia.org/wiki/Snapshot_(computer_storage)

The word "clone" seems to have been settled on for a read-write copy of things.

pm215
6 replies
6d4h

I'm not sure that's completely standard terminology. For example LVM has "snapshots" that are read-write.

throw0101c
3 replies
6d4h

From lvcreate(1) on an Ubuntu 22.04 LTS system I have CLI on:

       -s|--snapshot
              Create a snapshot. Snapshots provide a "frozen image" of an ori‐
              gin LV.  The snapshot LV can be used, e.g. for backups, while
              the origin LV continues to be used.  This option can create a
        […]

* Also: https://manpages.ubuntu.com/manpages/lunar/en/man8/lvcreate....

The word 'frozen' to me means unmoving / fixed.

As mentioned in the Wikipedia article, the analogy comes from photography where a picture / snap(shot) is a moment frozen in time.

I've admined NetApps in the past, used Veritas VxFS back in the day, and currently run a lot of ZFS (first using it on Solaris 10), and "snapshot" has meant read-only for the past few decades whenever I've run across it.

pm215
2 replies
6d3h

The LVM howto https://tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html says "In LVM2, snapshots are read/write by default". And RedHat's docs https://access.redhat.com/documentation/en-us/red_hat_enterp... include text "Since the snapshot is read/write".

So I think LVM2 snapshots are indeed read/write. Perhaps that manpage sentence was not updated since LVM1's read-only snapshots?

(I agree with you that 'snapshot' to me strongly suggests read/write; I'm just saying that you can't actually rely on that assumption because it's not just bcachefs that doesn't use that meaning.)

yrro
0 replies
6d3h

The man page continues:

> This option can create a COW (copy on write) snapshot, or a thin snapshot (in a thin pool.) [...] COW snapshots are created when a size is specified. The size is allocated from space in the VG, and is the amount of space that can be used for saving COW blocks as writes occur to the origin or snapshot.

Likely snapshots _were_ originally read-only, and the description of creating thin and COW snapshots was added later, but the man page text was not re-written completely; rather the description of thin and COW snapshots were added to the end of the existing text.

pm215
0 replies
5d7h

I've just noticed that I wrote "I agree with you that 'snapshot' to me strongly suggests read/write" and of course I meant "read only"! Hope that wasn't too confusing...

tw04
0 replies
6d4h

It is basically universal in the enterprise storage world.

brnt
0 replies
6d4h

I'd call a RW snapshot a "fork".

hulitu
0 replies
4d8h

How does this compare to file versioning?

the_duke
8 replies
6d5h

Exciting!

I've played around with it a few times, since it's been easily available in NixOS for a while. Didn't run into any issues with a few disks and a few hundred GB of data.

Some very interesting properties, including (actually efficient) snapshots, spreading data over multiple disks, using a fast SSD as a cache layer for HDDs, built-in encryption (not audited yet though!), automatic deduplication, compression, ...

A lot of that is already available through other file systems (btrfs, zfs) and/or by layering different solutions (LVM, dm-crypt, ...), but getting all of it out of the box with a single FS that's in the mainline kernel is quite appealing.

mike_hock
4 replies
6d3h

I don't find this lack of separation of concerns appealing, especially for crypto since it dilutes auditing resources. And blockdev-level crypto is simpler and harder to fuck up.

Snapshots and compression OTOH are better done on the FS level.

kevincox
3 replies
6d2h

> blockdev-level crypto is simpler and harder to fuck up.

Simpler yes, but since generally no extra metadata is allocated, it is vulnerable to some cryptanalysis, notably comparing different snapshots of the drive. Doing this properly requires storing unique keys for different versions of data. Doing that with typical blockdev-level encryption is very expensive: you either need to reduce the effective block size, which disrupts lots of software that assumes things about block size, or store the data out-of-line (typically at the end of the disk), which requires up to 2x writes. Doing this in the filesystem allows strong encryption with minimal performance impact (as the IV write is co-located with data that is changing anyway).

Hello71
2 replies
6d1h

> or store the data out-of-line [...] which requires up to 2x writes

and because writes to separate sectors aren't atomic, you probably want to add journaling or some kind of CoW for crash safety, and oh look now you're actually just writing a filesystem and it's not simpler anymore.

mike_hock
1 replies
5d23h

No, you're writing something that happens to resemble a small subset of the functionality of a filesystem.

Most importantly, you're not duplicating the effort for every filesystem that you want to support encryption for, and the code can largely remain fixed once mature.

Tobu
0 replies
5d20h

Kernel developers are free to factor common functionality out into shared code, calling into it library-style, without making it an externally visible layer. IIRC fs encryption and case insensitivity are often done that way, and I think I'd count the page cache and bios as larger library-style components as well (as opposed to the VFS layer, which is more of a framework).

jhoechtl
2 replies
6d

It should be highlighted that all these features are supported by btrfs too, which has certainly seen much more testing in recent years.

Yes, there have been issues in the past, but it has been stable and feature-rich for quite a while now. Easily the most advanced free file system.

the_duke
1 replies
6d

btrfs does not have the fast-disk cache feature I think, and definitely not the built-in encryption.

MrDrMcCoy
0 replies
5d23h

It also doesn't do block volumes or active dedupe. Btrfs is great, but I'm very excited about having all these features in one place!

sekao
8 replies
6d4h

As I said in the last thread, the author could use support: https://www.patreon.com/join/bcachefs

This has been a solitary and largely self-financed effort by Kent over many years. He must feel pretty great to finally see this happen!

KMag
7 replies
6d2h

When I became a supporter yesterday, the average was under $10 per supporter per month. There's no shame in being a small supporter.

Edit: I currently see 279 supporters for a total of $2,328 per month, so $8.34 average per month per supporter.

KennyBlanken
6 replies
5d23h

It's currently 428 @ $2382, $5.5/per.

The pre-populated options are $20 and $100 a month, which is a lot. I think he'd get a lot more supporters if he dropped those asks to more like $1 and $5.

tpetry
1 replies
5d21h

It's not like more people will pay at $1. Most probably the same number of people will pay, but he'll just get only 10% of the funding.

unixhero
0 replies
4d22h

Huh 10%???

sroecker
1 replies
5d9h

I can't even see the pay what you like feature on the app.

KMag
0 replies
5d1h

Not sure about the app, but on the website, it was below the tiers. On my laptop screen, the pay-what-you-like option was below the fold.

slyall
0 replies
5d12h

I used to support him at $1/month; I think it was a specific tier.

From memory I stopped because I was adding other creators and I got the impression he was doing okay.

KMag
0 replies
5d23h

I should have said "paid members" instead of "supporters" to be more precise. He's currently at 432 members, but of those only 283 are paid members.

Either way, $5 is probably both the median and the mode for the payment distribution, whether you include non-paying members or not.

carlhjerpe
2 replies
5d22h

It's surprising no one has mentioned the fact that ZFS doesn't support suspend/resume. So it's a fat no-go for laptops, whereas btrfs and hopefully bcachefs can shine bright supporting all cool features on my laptop so I can learn by playing with them.

(LVM + LUKS + BTRFS does it for me right now)

kaliszad
1 replies
5d17h

Why wouldn't ZFS support suspend/resume? Several other people I know and I happily suspend/resume a laptop with Root-on-ZFS with Debian or FreeBSD every day without problems.

carlhjerpe
0 replies
5d17h

https://github.com/openzfs/zfs/issues/260

I stand corrected; last time I checked, they didn't support freeze/thaw on ZoL. It's always worked on FreeBSD, though.

contr-error
1 replies
6d3h

So not Bca chefs!

withinboredom
0 replies
6d2h

I can't unsee this. Thanks!

6d4h
shmerl
0 replies
6d1h

Nice and congrats!

How stable it to use for day to day desktop tasks?

seanw444
0 replies
6d3h

Finally! I've been waiting to give this a try. I thought about adding it manually, but didn't realize it required patching and recompiling my kernel, which wasn't a worthwhile endeavour to me.

olavgg
0 replies
6d2h

I love storage and filesystems, and I am really looking forward to playing with bcachefs. Now bcachefs can be tested easily by millions and build a solid reputation as a true next-generation Linux filesystem.

_joel
0 replies
6d4h

Brilliant, I played with this for Ceph OSD's waaaaay back and it worked quite well, albeit a little fragile to deploy.