As I've said in other threads https://news.ycombinator.com/item?id=38847750#38862450, I highly recommend writing an ELF by hand at least once. It's a great exercise to understand the basic parts of an executable. It's also helpful if you want to go the opposite direction of this article - bottom-up instead of top-down.
Lots of other great discussion in various threads on that other HN post.
Writing an ELF file by hand is something I did recently: https://github.com/avik-das/garlic/blob/master/recursive/elf...
To explain the format to myself and others, I also created an interactive visualization for the bytes in the file. It helps me to click on a byte, see an explanation for it and see related bytes elsewhere in the file highlighted. https://scratchpad.avikdas.com/elf-explanation/elf-explanati...
That's an awesome interactive page! Did you write it by hand, or did you use some sort of generator/tool?
I agree it’s very nice. I’d also like to know how it was done.
Also if you click two bytes that are in the same caption group one after another then it bugs out.
Thanks for the feedback. I replied in a sibling comment about how I made it.
For the bug, feel free to email me at avik at avikdas dot com if you'd like. The behavior I verified just now (for me) is that if you click one byte to highlight it, then clicking any other byte in the same group will remove the highlighting.
I wrote it all by hand :)
Lately, I've been using Svelte for interactive visualizations (see my post on using a tool called Astro with Svelte: https://avikdas.com/2023/12/30/interactive-demos-using-astro... ). But this one is all hand-written JS!
You and your web site are huge inspirations for me.
Wow. Cool! I’m a CS teacher and definitely going to use this. Thanks for your work! (Anyone aware of a Windows or Mac equivalent?)
Wow! Very nice work. This is super educational resource. Again nice work.
I’ve had such fun making interactive educational visualisations like this. My life’s work is going into an interactive simulation of the USB protocol. Unfortunately I’m yet to bang it out over a weekend.
Similarly, I'd recommend writing a simple ELF loader. There's a fair bit of implementation complexity in dynamic linking, but if you only support static ELFs then it's straight-forward.
Yes, likewise I wrote a reader that simply tried to parse every bit of a complex ELF binary to report its structure and quickly found myself in poorly documented territory. It’s an education if you want it.
I assume there's a better modern source (assuming for some reason you don't want to reference libbfd &c, but really, you at least want to cross-check with it) but BITD there was an AT&T System V book - I think it was https://books.google.com/books?id=mrImAAAAMAAJ but only at about 70% confidence, it's part of a series and might have been one of the others - that was "arcane but true" for ELF on existing platforms (at the time, which was the mid 1990s, which is why I hope there's a better starting point now...)
I had written a static ELF loader for reasons, but when I was no longer to compile a static version of the binary I wanted to load, I found it wasn't too hard to load the system's dynamic loader instead. That's kind of the best of both worlds --- I can run a dynamicly linked binary, and I didn't have to do the linking and relocations.
I've seriously considered writing an ELF loader that uses a special symbol (like _resolve) where dynamic library resolving is done imperatively. The flexibility from libdl always feel underwhelming.
Any good resources on the matter? I'm gonna need to write a fully featured ELF loader for my language soon. I need to prepare.
Modifying existing ELFs can also be extremely educational and fun. It's a bit frustrating at first because it's more or less impossible to debug this stuff when it doesn't work but when things finally start working it's awesome. Turns out it's possible to patch ELFs in all sorts of interesting ways. With the auxiliary vector you can even have introspection at runtime: Linux gives us the address of the program header table and from there you can get to anywhere. Just gotta extend a LOAD segment to cover the whole binary.
For example I wrote tools to embed lisp modules and code right into my lisp's interpreter executable. The embedded segment is loaded from the ELF automatically, the interpreter just finds it and runs the code. I'm so proud of this little feature I wrote an article about it.
https://www.matheusmoreira.com/articles/self-contained-lone-...
Would be cool if mainstream languages adopted this method.
These tricks can be easily done simply and portably, without caring about the underlying executable format.
One trick is that you can reserve some global array in the executable, which is prefixed by a byte sequence that doesn't occur anywhere. A small utility can find that byte sequence and write custom data after it to create a customized executable.
I think that, also, many executable formats don't mind if something is appended to the executable. If the executable somehow knows its original size (you can write that size somewhere using the previous trick: no grotty executable format parsing required), it can open itself, seek to that offset and read the data.
I think this might be how CLISP creates an .exe file on Windows; I think it takes the base clisp.exe and combines it with the lispinit.mem image into one file.
Put the offset at the end! Famously, the table of contents of a .zip file are the end rather than the beginning, which has many useful properties (such as being able to patch the contents by only appending to an existing file). And you can concatenate an executable and a .zip and get a file which is both.
Yes. Cosmopolitan libc has support for exactly this. It contains a lot of platform-specific hacks in order to open the executable though. I went through the implementation.
I think the problem is this notion of a "memory image". It would be so much easier if the kernel just copied the entire file into memory and called it a day.
This is neat but has a limitation in that it cannot be expanded after the program is compiled and linked. Resizing the array would invalidate all pointers that follow as well as render incorrect any code that takes its size.
This can be solved with a layer of indirection: just append the data to the executable and write its size and file offset in the array. That way the data block can be freely resized. That's the solution people told me to use and indeed the one that I usually see in existing repositories. The problem is you run into some additional complexity later which results in the loss of portability and thus the main reason to choose this method.
That's the crux of the issue. How does an executable open itself? That's where portability goes out the window. I've seen source code that opens argv[0] which is under the control of the parent program and therefore unreliable. I've seen code which opens /proc/self/exe which is Linux specific. I've seen code that calls Win32 API functions to get the path to the executable. All this just so it can open and load into memory a file which the kernel has already loaded, just so it can read some additional data off of it.
My solution sidesteps that question entirely. It just adds a LOAD segment for the embedded data which instructs the kernel to map it in automatically before the program even runs. There's no need to open, seek or read anything, it will already be there by the time execution begins.
The auxiliary vector contains a pointer to the program's segments table so it can reach the data from there. Then it's just a matter of walking this table looking for the custom descriptor segment. It's all done in a structured way, using the standard magic number locations and ranges. There's no chance of a magic number being recognized by mistake.
The only possible portability issue is the availability of the AT_PHDR, AT_PHENT and AT_PHNUM entries in the auxiliary vector. I'm not sure if they're standard. I know Linux has them and it's all I personally care about but if these entries do turn out to be standard then I can confidently say that my method is portable to any ELF-based operating system.
is ELF unique to x86/x64?
It is not, and it has never been.
ELF is platform agnostic, and has been used in operating systems on nearly every existing CPU platform since mid 90's (with a few notable exceptions being OS X, AIX, the embedded world and Windows).
Seconded.
Doing it is basically a hand assembly. One reads the documentation, selects the bytes needed using a processor data sheet, orders them into the various sections, populates the ELF fields and then it really does boil down to typing them all in.
Pre-ELF times, on say an 8 bit Apple 2, the machine code monitor, allowed input of the program bytes directly. Those are then executed.
Storing to disk is only a bit more involved, and there is another opportunity! Disk sector editors allow one to create a file...
...and so it goes!
Chris Wellons' "A Magnetized Needle and a Steady Hand," a piece on building an ELF executable from scratch:
https://nullprogram.com/blog/2016/11/17/