My big pet peeve is AWS adding buttons in the UI to make "folders".
It is also a fiction! There are no folders in S3.
When you create a folder in Amazon S3, S3 creates a 0-byte object with a key that's set to the folder name that you provided. For example, if you create a folder named photos in your bucket, the Amazon S3 console creates a 0-byte object with the key photos/. The console creates this object to support the idea of folders.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-...
Is that really so different from how folders work on other systems? A directory inode is just an inode.
Yes. It is, in practice, incredibly different.
Imagine you have a file named /some/dir/file.jpg.
In a filesystem, there’s an inode for /some. It contains an entry for /some/dir, which is also an inode, and then in the very deepest level, there is an inode for /some/dir/file.jpg. You can rename /some to /something_else if you want. Think of it kind of like a table:
In S3 (and other object stores), the table is like this:
The kind of queries you can do is completely different. There are no inodes in S3. There is just a mapping from keys to objects. There’s an index on these keys, so you can do queries, but the / character is NOT SPECIAL and does not actually have any significance to the S3 storage system and API. The / character only has significance in the UI.
You can, if you want, use a completely different character to separate “components” in S3, rather than using /, because / is not special. If you want something like “some:dir:file.jpg” or “some.dir.file.jpg” you can do that. Again, because / is not special.
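To make the "/ is not special" point concrete, here is a minimal sketch that models a bucket as a plain Python dict and imitates the shape of a prefix/delimiter listing. The `list_keys` helper and the sample keys are made up for illustration; this is a toy model under those assumptions, not the S3 API, although the real listing call also takes a prefix and an optional delimiter.

```python
def list_keys(keys, prefix="", delimiter=None):
    """Imitate a prefix/delimiter listing over a flat set of keys.

    Returns (contents, common_prefixes), both sorted.
    """
    contents, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll up everything through the first delimiter after the
            # prefix into a single "common prefix" (what a UI draws as a folder).
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common)


# Hypothetical bucket: just a mapping from key strings to bytes.
bucket = {
    "some/dir/file.jpg": b"...",
    "some/other.txt": b"...",
    "some:dir:file.jpg": b"...",
}

slash_view = list_keys(bucket, prefix="some/", delimiter="/")
colon_view = list_keys(bucket, prefix="some:", delimiter=":")
raw_view = list_keys(bucket, prefix="so")
print(slash_view)
print(colon_view)
print(raw_view)
```

The "folder" `some/dir/` only shows up because the caller asked to group on "/"; asking to group on ":" works exactly the same way, and with no delimiter at all you just get the matching keys.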
Thank you, now I understand what the special 0-byte object refers to. It represents an empty folder.
Fair enough, basing folders on object names split by / is pretty inefficient. I wonder why they didn't go with a solution like git's trees.
Alas, no. It represents a tag, e.g. «folder/», that points to a zero byte object.
You can then upload two files, e.g. «folder/file1.txt» and «folder/file2.txt», delete the «folder/», being a tag, and still have the «folder/file1.txt» and «folder/file2.txt» file intact in the S3 bucket.
Deleting «folder/» in a traditional file system, on the other hand, will also delete «file1.txt» and «file2.txt» in it.
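The upload-then-delete sequence on the S3 side can be sketched with a plain dict standing in for the bucket. This is an illustrative model only (the dict and key names are assumptions, not an S3 client), but it shows why removing the marker does not touch the other keys: they are independent entries.

```python
bucket = {}

bucket["folder/"] = b""                 # the 0-byte marker object
bucket["folder/file1.txt"] = b"one"
bucket["folder/file2.txt"] = b"two"

del bucket["folder/"]                   # delete only the marker

remaining = sorted(bucket)
print(remaining)
```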
It's a matter of client UI implementation. You can't delete a non-empty folder with the POSIX API on common filesystems, or over FTP, either.
However, there are file managers, FTP clients, and S3 clients that will do that for you by deleting individual files.
But if the S3 semantics are not helping you, e.g. with multiple clients doing copy/move/delete operations in the hierarchy you could still end up with files that are not in "directories".
So essentially an S3 file manager must be able to handle the situation where there are files without a "directory", and I assume that is also the most common case for S3. The "directories" might just not be there in the first place.
I have personally never seen the 0-byte files people keep talking about here. In every S3 bucket I’ve ever looked at, the “directories” don’t exist at all. If you have a dir/file1.txt and dir/file2.txt, there is NO such object as dir. Not even a placeholder.
Yeah, this post was the first one I had even heard of them.
Deleting folder/ in a traditional file system will _fail_ if the folder is not empty. Userspace needs to recurse over the directory structure to unlink everything in it before unlinking the actual folder.
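A quick way to see this for yourself, as a small sketch using only the Python standard library and a throwaway temp directory (the file and directory names are arbitrary):

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
folder = os.path.join(root, "folder")
os.mkdir(folder)
with open(os.path.join(folder, "file1.txt"), "w") as f:
    f.write("hello")

try:
    os.rmdir(folder)            # plain rmdir refuses a non-empty directory
    rmdir_failed = False
except OSError as exc:
    rmdir_failed = True
    print("rmdir failed:", exc)

# What file managers and recursive deletes do instead:
# unlink every entry first, then remove the directory itself.
shutil.rmtree(folder)
folder_gone = not os.path.exists(folder)
os.rmdir(root)                  # the temp root is empty now, so this succeeds
```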
What, exactly, is inefficient about it?
Think for a moment about the data structures you would use to represent a directory structure in a filesystem, and the data structures you would use to represent a key/value store.
With a filesystem, if you split a string /some/dir/file.jpg into three parts, “some”, “dir”, “file.jpg”, then you are actually making a decision about the tree structure. And here’s a question—is that a balanced tree you got there? Maybe it’s completely unbalanced! That’s actually inefficient.
Let’s suppose, instead, you treat the key as a plain string and stick it in a tree. You have a lot of freedom now, in how you balance the tree, since you are not forced to stick nodes in the tree at every / character.
It’s just a different efficiency tradeoff. Certain operations are now much less efficient (like “rename a directory” which, on S3, is actually “copy a zillion objects”). Some operations are more efficient, like “store a file” or “retrieve a file”.
I think what you’re describing is simply not a hierarchical file system. It’s a different thing that supports different operations and, indeed, is better or worse at different operations.
I think it is fair to say that S3 (as named files) is not a filesystem and it is inefficient to use it directly as such for common filesystem use cases; the same way that you could say it for a tarball[0].
This does not make S3 bad storage, just a bad filesystem; not everything needs to be a filesystem.
Arguably it is good that S3 is not a filesystem, as that can be a leaky abstraction, e.g. in git you cannot have two tags named "v2" and "v2/feature-1", as you cannot have both a file and a folder with the same name.
For something more closely related to URLs than filenames forcing a filesystem abstraction is a limitation as "/some/url", "/some/url/", and "/some/url/some-default-name-decided-by-the-webserver" can be different.[1]
[0] where a different tradeoff is that searching a file by name is slower but reading many small files can be faster.
[1] maybe they should be the same, but enforcing it is a bad idea
"folders" do not exist in S3 -- why do you keep insisting that they do?
They appear to exist because the key is split on the slash character for navigation in the web front-end. This gives the familiar appearance of a filesystem, but the implementation is at a much higher level.
Except, S3 does let you query by prefix and so the keys have more structure than the second diagram implies: they’re not just random keys, the API implies that common prefixes indicate related objects.
That’s kind of stretching the idea of “more structure” to the breaking point, I think. The key is just a string. There is no entry for directories.
That’s something users do. The API doesn’t imply anything is related.
And prefixes can be anything, not just directories. If you have /some/dir/file.jpg, then you can query using /some/dir/ as a prefix (like a directory!) or you can query using /so as a prefix, or /some/dir/fil as a prefix. It’s just a string. It only looks like a directory when you, the user, decide to interpret the / in the file key as a directory separator. You could just as easily use any other character.
One operation where this difference is significant is renaming a "folder". In UNIX (and even UNIX-y distributed filesystems like HDFS) a rename operation at "folder" level is O(1) as it only involves metadata changes. In S3, renaming a "folder" is O(number of files).
From reading the above, if you have a folder 'dir' and a file 'dir/file', after renaming 'dir' to 'folder', you would just have 'folder' and 'dir/file'.
There is really no such thing as a folder in S3.
If you have something which is dir/file, then NORMALLY “dir” does not exist at all. Only dir/file exists. There is nothing to rename.
If you happen to have something which is named “dir”, then it’s just another file (a.k.a. object). In that scenario, you have two files (objects) named “dir” and “dir/file”. Weird, but nothing stopping you from doing that. You can also have another object named “dir///../file” or something, although that can be inconvenient, for various reasons.
Imho, renaming "folders" on S3 results in copying and deleting O(number of files)
More like O(max(number of files, total file size)). You can’t rename objects in S3. To simulate a rename, you have to copy an object and then delete the old one.
Unlike renames in typical file systems, that isn’t atomic (there will be a time period in which both the old and the new object exist), and it becomes slower the larger the file.
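Here is a rough sketch of what "renaming a folder" amounts to in a flat key/value store, again with a dict standing in for the bucket. `rename_prefix` is a hypothetical helper for illustration, not an S3 call; a real client would issue one copy and one delete request per object.

```python
def rename_prefix(store, old, new):
    """Simulate renaming a "folder" by rewriting every key under a prefix.

    Returns the number of copy+delete pairs performed.
    """
    moved = 0
    # Snapshot the matching keys first so the dict is not mutated mid-iteration.
    for key in [k for k in store if k.startswith(old)]:
        store[new + key[len(old):]] = store[key]   # copy: cost grows with object size
        # At this point both the old and the new key exist, so a concurrent
        # reader could observe a half-renamed "folder".
        del store[key]                             # then delete the original
        moved += 1
    return moved


store = {"dir/a.jpg": b"a", "dir/b.jpg": b"bb", "dirty.txt": b"x", "other/c": b"c"}
moved = rename_prefix(store, "dir/", "folder/")
print(moved, sorted(store))
```

Note that `dirty.txt` is untouched: the operation is on the string prefix `dir/`, and the work done is proportional to the number of matching objects (plus their sizes, for the copies).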
Querying ids by prefix doesn’t make any sense for a normal ID type. Just making this operation available and part of your public API indicates that prefixes are semantically relevant to your API’s ID type.
“Prefix” is not the same thing as “directory”.
I can look up names with the prefix “B” and get Bart, Bella, Brooke, Blake, etc. That doesn’t imply that there’s some kind of semantics associated with prefixes. It’s just a feature of your system that you may find useful. The fact that these names have a common prefix, “B”, is not a particularly interesting thing to me. Just like if I had a list of files, 1.jpg, 10.jpg, 100.jpg, it’s probably not significant that they’re being returned sequentially (because I probably want 2.jpg after 1.jpg).
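As a tiny illustration of that ordering, here is what plain string sorting does to those names. Python sorts strings by code point, which I'm using here as a stand-in for a key index's byte ordering (an assumption that holds for these ASCII names).

```python
names = ["2.jpg", "100.jpg", "1.jpg", "10.jpg", "Bella", "Alice", "Bart"]

ordered = sorted(names)
b_names = [n for n in ordered if n.startswith("B")]

print(ordered)
print(b_names)
```

"2.jpg" lands after "100.jpg" because the comparison is character by character, and the "B" lookup is just a filter on the leading character; neither says anything about directories.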
By this logic the file "foo/bar/" corresponds to the filename "f:o:o:/:b:a:r:/" (using a different character as separator).
"filesystem" is not a name reserved for Unix-style file systems. There are many types of file system that are not built according to your description. When I was a kid, I used systems which didn't support directories, but they were still file systems.
It's an incorrect take that a system to manage files must follow a set of patterns like the ones you mentioned to be called "file system".
Terms evolve, and now "filesystem" and "system of files" mean different things.
I would argue that not supporting folders or many other file operations make something not a filesystem today.
You're free to argue whatever you want, but claiming that a file system should have folders as the parent commenter did, or support specific operations, seems a bit meaningless.
I could create a system not supporting folders because it relies on tags or something else. Or I could create a system which is write-only and doesn't support rename or delete.
These systems would be file systems according to how the term has been used for 40 (?) years at least. Just don't see any point in restricting the term to exclude random variants.
Yeah, "hacker" used to not mean someone hacking into a computer and breaking a password; then it did, and now it means both that and a tech tinkerer.
Let’s start with the fact that you’re talking to an HTTP api… Even if S3 had web3.0 inodes, the querying semantics would not make sense. It’s a higher level API, because you don’t deal with blocks of magnetic storage and binary buffers. Of course s3 is not a filesystem, that is part of its definition, and reason to be…
I think if you focus too narrowly on the details of the wire protocol, you’ll lose sight of the big picture and the semantics.
S3 is not a filesystem because the semantics are different from the kind of semantics we expect from filesystems. You can’t take the high-level API provided by a filesystem, use S3 as the backing storage, and expect to get good performance out of it unless you use a ton of translation.
Stuff like NFS or CIFS are filesystems. They behave like filesystems, in practice. You can rename files. You can modify files. You can create directories.
Right, the NFS/CIFS support writing blocks, but S3 basically does HTTP get and post verbs. I would say that these concepts are the defining difference. To call S3 a filesystem is not wrong in abstract, but it’s not different than calling Wordpress a filesystem, or DNS, or anything that stores something for you. Of course, it will be inefficient to implement a block write on top of any of these, that’s because you have to literally do it yourself. As in, download the file, edit it, upload again.
I think the blocks are one part of it, and the other part is that S3 doesn’t support renaming or moving objects, and doesn’t have directories (just prefixes). Whenever I’ve seen something with filesystem-like semantics on top of S3, it’s done by using S3 as a storage layer, and building some other kind of view of the storage on top using a separate index.
For example, maybe you have a database mapping file paths to S3 objects. This gives you a separate metadata layer, with S3 as the storage layer for large blocks of data.
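A minimal sketch of that idea, assuming an in-memory SQLite table as the metadata layer and opaque UUID strings as the object keys. The table layout and the `put` / `rename_dir` helpers are hypothetical, and the actual upload of the blob is left out.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, object_key TEXT NOT NULL)")


def put(path):
    """Record a file; the blob itself would be stored under the opaque key."""
    key = uuid.uuid4().hex
    db.execute("INSERT INTO files VALUES (?, ?)", (path, key))
    return key


def rename_dir(old, new):
    """Rename a directory by rewriting index rows only; no object is copied."""
    with db:  # one transaction, so readers see all-or-nothing
        cur = db.execute(
            "UPDATE files SET path = ? || substr(path, ?) "
            "WHERE substr(path, 1, ?) = ?",
            (new, len(old) + 1, len(old), old),
        )
    return cur.rowcount


k1 = put("photos/2024/a.jpg")
k2 = put("photos/2024/b.jpg")
put("docs/readme.txt")

changed = rename_dir("photos/", "pictures/")
rows = db.execute("SELECT path, object_key FROM files ORDER BY path").fetchall()
print(changed, [path for path, _ in rows])
```

The rename still touches one index row per file, but it is atomic and never moves the stored data, which is the point of keeping the path-to-object mapping separate from the object store.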
Even youngsters are yelling at clouds now. Just a different kind of cloud.
In S3 each file is identified with a full path.
Not only can you not rename a single file, but you also cannot rename a "folder" (because that would imply a bulk rename on a large number of children of that "folder").
This is the fundamental difference between a first class folder and just a convention on prefixes of full path names.
If you don't allow renames, it doesn't really make sense to have each "folder" store the list of the children.
You can instead have a giant ordered map (some kind of b-tree) that allows efficient lookup and scanning of neighbouring nodes.
UMich LDAP server, upon which many were based, stored entries’ hierarchical (distinguished) names with each entry, which I always found a bit weird. AD, eDirectory, and the OpenLDAP HDB backend don’t have this problem.
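As a rough sketch of the ordered-map idea, a sorted Python list plus `bisect` can stand in for the b-tree: one binary search finds where a prefix would start, and the matching keys are the contiguous run after it. The `scan_prefix` helper and the sample keys are assumptions for illustration.

```python
import bisect

keys = sorted([
    "/some/dir/file.jpg",
    "/some/dir/other.png",
    "/some/thing.txt",
    "/tmp/x",
])


def scan_prefix(sorted_keys, prefix):
    """Return all keys starting with prefix: one binary search, then a forward scan."""
    i = bisect.bisect_left(sorted_keys, prefix)
    out = []
    # In sorted order, every key sharing the prefix sits in one contiguous run.
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        out.append(sorted_keys[i])
        i += 1
    return out


print(scan_prefix(keys, "/some/dir/"))
print(scan_prefix(keys, "/so"))
print(scan_prefix(keys, "/some/dir/fil"))
```

The prefix does not have to end at a "/": `/so` and `/some/dir/fil` are just as valid as `/some/dir/`, because the structure only knows about ordered strings.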
Another challenge is directory flattening. On a file system "a/b" and "a//b" are usually considered the same path. But on S3 the slash isn't a directory separator, so the paths are distinct. You need to be extra careful when building paths not to include double slashes.
Many tools end up handling this by showing a folder named "a" containing a folder named "" (empty string). This confuses users quite a bit. It's more than the inodes, it's how the tooling handles the abstraction.
Coincidentally I ran into an issue just like this a week ago. A customer facing application failed because there was an object named “/foo/bar” (emphasis on the leading slash).
This created a prefix named “/“ which confused the hell out of the application.
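Both of these pitfalls (double slashes and a leading slash) are easy to reproduce with plain strings. The sketch below uses a dict as a stand-in bucket, and `join_key` is a hypothetical helper showing one way to build keys defensively, not something from an SDK.

```python
import posixpath

# A filesystem-style normalizer collapses the double slash...
normalized = posixpath.normpath("a//b")

# ...but as object keys the two strings are simply different.
store = {"a/b": b"1", "a//b": b"2"}

# Splitting on "/" is what produces the empty-named "folder" in many tools.
double_parts = "a//b".split("/")
leading_parts = "/foo/bar".split("/")


def join_key(*parts):
    """Join key components with single slashes and no leading slash."""
    cleaned = [p.strip("/") for p in parts]
    return "/".join(p for p in cleaned if p)


print(normalized, len(store), double_parts, leading_parts)
print(join_key("a/", "/b"), join_key("/foo", "bar"))
```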
You can create a simulated directory, and write a bunch of files in it, but you can't atomically rename it--behind the scenes each file needs to be copied from old name to new.
The payload still contains a list of other inodes though
What exactly do you think a folder is? It’s just an abstraction for organising data.
S3 doesn’t have that abstraction.
The console UI shows folders but they don’t actually exist in S3. They’re made up by the UI.
It sounds like they have that abstraction in the UI. But if the CLI and API don't have it too, that's weird.
Yeah, the UI and CLI show you “folders”. It’s a client-side thing that doesn’t exist in the actual service. Behind the scenes, the clients are making specific types of queries on the object keys.
You can’t examine when a folder was created (it doesn’t exist in the first place), you can’t rename a folder (it doesn’t exist), you can’t delete a folder (again, it doesn’t exist).
That's just an implementation detail of well known filesystems.
Yes, which is why it's not ideal to reuse the folder metaphor here. Users have an idea how directories work on well-known filesystems and get confused when these fake folders don't behave the same way.
Are all your S3 keys opaque strings (like UUIDs)? Do you use / (slash) in your keys?
If you truly believe S3 has absolutely no connection to folders, you would answer Yes and No.
I don’t think that’s a defensible standpoint.
Folders are an important part of the way most people use filesystems.
Similarly the UI in linux is making up the notion of folders and files in them. But we don't say it doesn't exist.
No, they're not made up. A folder (or directory) is a specific type of inode, just as a file is.
S3 doesn't have folders. The UI fakes them by creating a 0-byte object (or file, if you will). It's a kludge.
The UI will fake them without even creating the 0-byte object.
Directories actually exist on the filesystem, which is why you have to create them before use and why they can exist while empty. They don't exist in S3, and neither of those properties holds there. Similarly, common filesystem operations on directories (like efficiently renaming them, and thus the files under them) are not possible in S3.
Of course it can still be useful to group objects in the S3 UI, but it would probably be better to use some kind of prefix-centric UI rather than reusing the folder metaphor when it doesn't match the paradigm people are used to.
Speaking of user interfaces with optical illusions about directory separators:
On the Mac, the Finder lets you have files with slashes in their names, even though it's a Unix file system underneath. Don't believe me? Go try to use the Finder to make a directory whose name is "Reports from 2024/03/10". See?
But as everyone knows, slash is the ONLY character you're not allowed to have in a file or directory name under Unix. It's enforced in the kernel at the system call interface. There is absolutely no way to make a file with a slash in it. Yet there it is!
The original MacOS operating system used the ":" character to delimit directory names, instead of "/", so you could have files and directories with slashes in their names, just not with colons in their names.
When Apple transitioned from MacOS to Unix, they did not want to freak out their users by renaming all their files.
So now try to use the Finder (or any app that uses the standard file dialog) to make a folder or file with a ":" in its name on a modern Mac. You still can't!
So now go into the shell and list out the parent directory containing the directory you made with a slash in its name. It's actually called "Reports from 2024:03:10"!
The Mac Finder and system file dialog user interfaces actually switch "/" and ":" when they show paths on the screen!
Try making a file in the shell with colons in it, then look at it in the finder to see the slashes.
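The swap described above can be modeled as a simple character translation. This is only a toy model of the behavior as described, with a made-up `swap_separators` helper; it is not how the Finder is actually implemented.

```python
# Map "/" to ":" and ":" to "/" in one pass.
SWAP = str.maketrans("/:", ":/")


def swap_separators(name):
    """Translate between the name shown on screen and the name stored on disk."""
    return name.translate(SWAP)


on_screen = "Reports from 2024/03/10"
on_disk = swap_separators(on_screen)
round_trip = swap_separators(on_disk)
print(on_disk)
```

Because the mapping is its own inverse, the same function covers both directions.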
However, back in the days of the old MacOS that permitted slashes in file names, there was a handy network gateway box called the "Gatorbox" that was a Localtalk-to-Ethernet AFP/NFS bridge, which took a subtly different approach.
https://en.wikipedia.org/wiki/GatorBox
It took advantage of the fact (or rather it triggered the bug) that the Unix NFS implementation boldly made an end-run around the kernel's safe system call interface that disallowed slashes in file names. So any NFS client could actually trick Unix into putting slashes into file names via the NFS protocol!
It appeared to work just fine, but then down the line the Unix "restore" command would totally shit itself! Of course "dump" worked just fine, never raising an error that it was writing corrupted dumps that you would not be able to read back in your time of need, so you'd only learn that you'd been screwed by the bug and lost all your files months or years later!
So not only does NFS stand for "No File Security", it also stands for "Nasty Forbidden Slashes"!
https://news.ycombinator.com/item?id=31820504
[...]
From the Unix-Haters Handbook:
https://archive.org/stream/TheUnixHatersHandbook/ugh_djvu.tx...
Don't Touch That Slash!
UFS allows any character in a filename except for the slash (/) and the ASCII NUL character. (Some versions of Unix allow ASCII characters with the high-bit, bit 8, set. Others don't.)
This feature is great — especially in versions of Unix based on Berkeley's Fast File System, which allows filenames longer than 14 characters. It means that you are free to construct informative, easy-to-understand filenames like these:
1992 Sales Report
Personnel File: Verne, Jules
rt005mfkbgkw0.cp
Unfortunately, the rest of Unix isn't as tolerant. Of the filenames shown above, only rt005mfkbgkw0.cp will work with the majority of Unix utilities (which generally can't tolerate spaces in filenames).
However, don't fret: Unix will let you construct filenames that have control characters or graphics symbols in them. (Some versions will even let you build files that have no name at all.) This can be a great security feature — especially if you have control keys on your keyboard that other people don't have on theirs. That's right: you can literally create files with names that other people can't access. It sort of makes up for the lack of serious security access controls in the rest of Unix.
Recall that Unix does place one hard-and-fast restriction on filenames: they may never, ever contain the magic slash character (/), since the Unix kernel uses the slash to denote subdirectories. To enforce this requirement, the Unix kernel simply will never let you create a filename that has a slash in it. (However, you can have a filename with the 0200 bit set, which does list on some versions of Unix as a slash character.)
Never? Well, hardly ever.
Apparently Sun's circa 1990 NFS server (which runs inside the kernel) assumed that an NFS client would never, ever send a filename that had a slash inside it and thus didn't bother to check for the illegal character. We're surprised that the files got written to the dump tape at all. (Then again, perhaps they didn't. There's really no way to tell for sure, is there now?)
I'm having a lot of fun imagining this being said to a kid who's trying to buy some folders for school.
Is it an abstraction for requesting the data you want, or an abstraction for storing the data in a retrievable manner?
Weird that it says folders now. I remember it being very strictly called a prefix when I was at AWS.
I think it's just the web console. It's still "prefix" in the APIs and CLI.
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObje...
The web console even collapses them like folders on slashes, further obfuscating how it actually works. I remember having to explain to coworkers why it was so slow to load a large bucket.
Hmm well there's no folders but if you interact with the object the URL does become nested. So in a sense it does behave exactly like a folder for all intents and purposes when dealing with it that way. It depends what API you use I guess.
I use S3 just as a web bucket of files (I know it's not the best way to do that but it's what I could easily obtain through our company's processes). But in this case it makes a lot of sense though I try to avoid making folders. But other people using the same hosting do use them.
Except stuff like the s3 cli has all these weird names for normal filesystem items, and you have to bang your head to figure out what it all means.
(also don't get me started on the whole s3api thing)
I see you getting downvotes, but you’re speaking the honest truth, here.
This!
I’m fine with it, I actually appreciate the logic and simplicity behind it, but the amount of times I’ve tried to explain why “folders” on S3 keep disappearing while people stare at me like I’m an idiot is really frustrating.
(When you remove the last file in a “folder” on S3, the “folder” disappears, because that pattern no longer appears in the bucket k/v dictionary so there’s no reason to show it as it never existed in the first place).
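A small sketch of why that happens, with a dict as the bucket and a hypothetical `visible_folders` helper that derives top-level "folders" purely from the keys that currently exist, roughly the way a console-style client would:

```python
def visible_folders(keys, delimiter="/"):
    """Top-level "folders" a client would display, computed from the keys alone."""
    return sorted({k.split(delimiter, 1)[0] + delimiter for k in keys if delimiter in k})


bucket = {"reports/q1.csv": b"...", "reports/q2.csv": b"...", "logo.png": b"..."}

before = visible_folders(bucket)
del bucket["reports/q1.csv"]
middle = visible_folders(bucket)
del bucket["reports/q2.csv"]        # the last object under "reports/"
after = visible_folders(bucket)

print(before, middle, after)
```

Nothing was ever stored for `reports/` itself, so once no key contains that prefix there is nothing left to display.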
I don't know why you are being downvoted, what you said is true and confuses many newcomers.