Visualizing Ext4

aunderscored
7 replies
11h43m

Very cool. This kind of data visualisation can really help in understanding some of the intricacies of how the disk format actually puts things on disk, e.g. the metadata carefully preallocated for at least some usage. I was interested to see what would happen when it ran out of space, but unfortunately the animation stopped before it reached that point.

pocketarc
6 replies
11h34m

The first thing that popped into my mind was the old "defragment disk" visualisation. It looked a lot like the later XP version[0].

Sitting in front of the computer watching the old 95/98 defrag program doing its thing[1] is a nice childhood memory for me.

[0]: https://qvdesign.files.wordpress.com/2012/03/defrag-original...

[1]: https://academy.avast.com/hs-fs/hubfs/New_Avast_Academy/how_...

iforgotpassword
2 replies
11h17m

The crazy thing about the 9x defrag was that it required that no other program access the disk while it was working. When anything else wrote to disk, it would say "disk contents changed, starting over..."

You were expected to close down all programs and little utilities that hid in the systray (sorry, notification area). The OS itself was just the OS, there was no indexing happening, no update check, no random nonsense nobody understands. You could be absolutely sure nothing would access the disk on a cleanly booted win9x.

pixl97
0 replies
2h41m

As 95 went on, making sure hidden applications didn't exist got harder and harder too. Over time, more applications seemed to use the equivalent of a TSR that ran in the background with no icon in the systray/start bar and would regularly fiddle with the disk, restarting the entire process.

I remember some commonly used winmodem driver would cause this behavior.

mjevans
0 replies
9h48m

At that point I'd rather it ran like the old DOS versions or later Filesystem checks at startup (very useful for boot partitions).

aunderscored
2 replies
11h29m

I have a similar memory. There's something entrancing about watching a computer do a computationally (in space or time) difficult task with a visualisation of what exactly is happening. For defrag it was mostly a progress bar, but it was still fascinating to young me.

samsquire
1 replies
10h20m

I kind of think of computer microarchitecture as a factory, as in Factorio (which I admit I have not played, because I've been warned against it).

There's an instruction stream that flows into the L3, L2 and L1 caches from DRAM, PCIe and DMA.

Then there are multiple cores, each with its own register file; they shuffle numbers between registers and the caches.

There's reordering going on, there's parallelisation going on, lots of conveyor belts.

It's all one very complicated factory.

aunderscored
0 replies
7h46m

Absolutely, it's just that the results of the factory are not physical. You can take this analogy quite far in software development: engineers are factories producing data for a compute factory to process further, and so on. I'll also echo the warnings about Factorio: you will lose days. But it's so very worth it.

densh
4 replies
5h25m

Looking at these diagrams, I wonder if there are any file systems that allow metadata to be stored on a separate device, for example data on an HDD and metadata on an associated SSD. I guess the benefits would not be large enough to outweigh the added complexity, since metadata is much easier to cache in memory.

rubiquity
0 replies
2h36m

BcacheFS does this

riddley
0 replies
4h20m

ZFS does. I've heard of other file systems that can put their journal on a separate device but web search sucks these days and I ran out of time to figure out which.

pixl97
0 replies
2h35m
loeg
0 replies
2h6m

Facebook abuses XFS realtime mode to do this. Omar discusses it some here: https://lwn.net/Articles/943693/

d33
3 replies
6h48m

This inspired me to do this experiment:

dd if=/dev/zero bs=1K count=$(( 256 * 3 )) of=a.ext4

mkfs.ext4 a.ext4

mkdir a

sudo mount a.ext4 a

cd a

sudo chown 1000:1000 .

python3 -c 'open("a", "wb").write(b"\xff\x00\x00" * 2000)'

python3 -c 'open("b", "wb").write(b"\xff\xff\x00" * 2000)'

python3 -c 'open("c", "wb").write(b"\xff\x00\xff" * 2000)'

cd ..

sudo umount a

(printf 'P6\n512 512\n255\n' ; cat a.ext4 ) > a.ppm

convert a.ppm a.png

The resulting a.png is reversible: you can convert it back to a .ppm file, skip the first 15 bytes, and you should get a valid .ext4 image back.
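For anyone who would rather avoid the shell quoting, here is a rough Python sketch of the same wrap/unwrap step. The function names are made up for illustration, and it assumes the 768 KiB image and the 512x512 binary PPM (P6) header from the commands above. Going back from a .png also assumes the converter writes exactly this header, which is worth checking rather than trusting.

```python
WIDTH, HEIGHT = 512, 512                        # 512 * 512 * 3 bytes = 768 KiB
HEADER = b"P6\n%d %d\n255\n" % (WIDTH, HEIGHT)  # 15 bytes for a 512x512 image


def wrap_as_ppm(raw: bytes) -> bytes:
    """Prefix a raw disk image with a binary PPM header (hypothetical helper)."""
    expected = WIDTH * HEIGHT * 3
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return HEADER + raw


def unwrap_ppm(ppm: bytes) -> bytes:
    """Strip the header again to recover the raw image (hypothetical helper)."""
    if not ppm.startswith(HEADER):
        raise ValueError("unexpected PPM header; cannot strip it blindly")
    return ppm[len(HEADER):]
```

Usage would be along the lines of `open("a.ppm", "wb").write(wrap_as_ppm(open("a.ext4", "rb").read()))`.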

pizzafeelsright
2 replies
2h42m

If Twitter didn't compress images, it would be fun to store large files as images, with Twitter as a filesystem.

tecleandor
1 replies
1h56m

Well, you could maybe post some QR codes, which are less sensitive to compression XD. OK, OK, it was a joke, just do some steganography.

colejohnson66
0 replies
42m

It's possible to abuse Unicode codepoints to store bytes: https://github.com/qntm/base131072

xt00
2 replies
9h49m

I think many people's goal to "simplify using a computer" ends up removing things that could easily be educational without actively trying to teach you anything: things that spark curiosity and inform a bit (like the great example the author shows here). One example that used to exist on actual computers is the trusty old red hard drive light telling you that the hard disk is active. If you were like me, you knew the game was actually going to load this time when the light showed a particular pattern and you heard the hard drive make a satisfying sequence of fast disk reads. A nice compromise seems to be hiding the "advanced view" but keeping it there for the curious people who will likely be the next generation of computer nerds making the world go 'round.

zontorol
1 replies
8h46m

When you were a young little nerd, old-timers were saying "such a shame modern computers don't have LEDs showing the individual state of each bit of the control registers like our mainframe did. Computers these days are so dumbed down. Users can't even see where the instruction pointer is, which is so useful to get a feel for what the hardware is actually doing."

jjoonathan
0 replies
2h22m

Sure, but I think it's fair to say that feedback has gotten too minimal when you have to put sleep(10) all over your diagnostic processes because there is no other way to tell if the computer/app is laggy or dead.

anupcshan
1 replies
10h42m

I found this nbdkit demo for visualizing filesystem IO interesting - https://rwmj.wordpress.com/2018/11/04/nbd-graphical-viewer/

Levitating
0 replies
1h55m

The author of which is in this thread as well

sgarland
0 replies
6h34m

This reminded me of innodb_ruby [0]. Super useful set of tools to visualize and learn about the InnoDB structure. Example usage [1].

[0]: https://github.com/jeremycole/innodb_ruby

[1]: https://blog.jcole.us/2014/10/02/visualizing-the-impact-of-o...

rwmj
0 replies
6h42m

I did a true graphical visualisation of ext4 at FOSDEM a few years ago. The video is here, the visualisation starts at about 20 minutes:

https://archive.fosdem.org/2019/schedule/event/nbdkit/

Edit: If you're confused about the bit where I talk about the filesystem trims in "blue", well that's because apparently the projector at FOSDEM could not render the light blue colour I was using. I didn't know about this while giving the talk, it looked fine on the laptop screen. There's an accompanying video on my blog which is rendered correctly: https://rwmj.wordpress.com/2018/11/04/nbd-graphical-viewer/

H8crilA
0 replies
5h24m

You can use the Kaitai IDE to visualize various binary formats, down to each byte (or bit). If I remember correctly it has definition files for ext4.
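To give a flavour of what such a byte-level view decodes, here is a small Python sketch that reads a handful of fields from the primary ext4 superblock by hand. It is not Kaitai's definition file, just an illustration; the function name is made up, and only four fields are read (the superblock starts 1024 bytes into the device, with the magic number 0xEF53 at offset 0x38).

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the primary superblock starts 1024 bytes in
SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53        # s_magic, shared by ext2/ext3/ext4


def read_superblock(image: bytes) -> dict:
    """Decode a few little-endian superblock fields (illustrative helper)."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("image too small to contain a superblock")

    inodes_count, blocks_count_lo = struct.unpack_from("<II", sb, 0x00)
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)
    (magic,) = struct.unpack_from("<H", sb, 0x38)

    if magic != EXT_MAGIC:
        raise ValueError(f"bad magic 0x{magic:04X}, not an ext2/3/4 image")

    return {
        "inodes_count": inodes_count,
        "blocks_count_lo": blocks_count_lo,
        "block_size": 1024 << log_block_size,  # 2 ** (10 + s_log_block_size)
    }
```

Something like `read_superblock(open("a.ext4", "rb").read())` should work on a small image file; for a real disk you would only read the first 2 KiB.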

Cieplak
0 replies
8h22m

There's a command-line utility called pixd [1] that generates similar data visualizations on the command line. That said, it only shows static representations of binary data and is not nearly as cool as buredoranna's animated gifs showing filesystem changes over time.

It can be helpful to plot these sorts of pixel arrangements on a Hilbert curve, rather than plotting pixels line by line. I learned this trick from a Ghidra plugin called cantordust [2]. 3blue1brown offers some mathematical intuition for the effectiveness of a Hilbert curve pixel arrangement [3].

[1] https://github.com/FireyFly/pixd

[2] https://inside.battelle.org/blog-details/battelle-publishes-...

[3] https://www.youtube.com/watch?v=3s7h2MHQtxc&t=311s
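A minimal Python sketch of the Hilbert-curve layout mentioned above, using the well-known iterative index-to-coordinate conversion. This is not taken from cantordust or pixd; the function names and the `order` parameter are chosen here for illustration. The point is that bytes close together in the file end up close together in the picture, which line-by-line plotting does not give you.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_layout(data, order):
    """Place the first (2**order)**2 bytes of data on a grid along the curve."""
    n = 1 << order
    grid = [[0] * n for _ in range(n)]
    for d, value in enumerate(data[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid
```

Each cell of the returned grid can then be written out as one pixel, for example as the rows of a greyscale image.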