Making a PDF that's larger than Germany

jl6
34 replies
10h25m

PDF is a fabulous format. I mean, it’s an awful format in so many ways, technically speaking, but the net effect of having a self-contained static file in your custody stands in blissful contrast to the user-hostile dynamic/SaaS website that can be taken away at a moment’s notice. PDF/A is the true PDF - it strips out most of the dangerous cruft.

Anyway, if you like weird PDF hijinks, here’s a polyglot PDF/A CSV file that is also its own original soundtrack as a polyglot Amiga soundtracker mod:

https://www.lab6.com/6

amelius
19 replies
8h1m

PDF is an executable file. Many people are worried about running Javascript but still use PDF files without problems.

shp0ngle
8 replies
7h57m

Yeah but the javascript can only do things inside the pdf.

berkes
5 replies
7h32m

What javascript can escape my browser, (edit: or an HTML page) for example?

lukan
2 replies
5h41m

XMLHttpRequest to send anything the site knows anywhere.

And row hammer, to breach the sandbox.

https://en.wikipedia.org/wiki/Row_hammer

berkes
1 replies
3h46m

Row hammer is an exploit. It wasn't "by design".

While that may technically be "escaping the sandbox" it's a different case, because it was never meant to work, will be fixed and often is fixed.

afiori
0 replies
2h43m

Almost every "escaping the sandbox" is due to some kind of bug.

Sure if the PDF standard exposed a "globalThis.runBlobAsNativeExecutable" function it would be worse, but it is still escaping the sandbox.

grotorea
1 replies
4h15m

Are the non-browser PDF readers more vulnerable? Do most even execute the Javascript?

kevincox
0 replies
2h57m

I would expect so simply because browsers are fairly hardened pieces of software. Adobe Acrobat is decently hardened but it seems to be far behind browsers.

It is worth noting that Chromium and later Firefox both added PDF viewers that live inside the browser sandbox. They are essentially web-apps that render the PDF. When I worked at Google they strongly recommended using Chrome for opening PDF files because they felt much more comfortable about its security and sandboxing than other PDF readers.

Another perspective is that you are likely browsing the internet anyway. In fact you likely got the PDF by visiting a website. So you have already exposed a huge attack surface (your browser) to a possibly hostile adversary. It is better to expose them to the same attack surface again (plus whatever security the PDF reader itself provides) than to give them a fresh new attack surface.

muskypirate
0 replies
5h42m

It is not about JS. Look into BadPDF as an example.

jk3000
0 replies
7h29m

Famous last words :)

JKCalhoun
4 replies
4h46m

For better or worse, the years I spent working on Preview for Apple (and PDFKit) I felt bad that our (Apple's) PDF implementation was far short of Adobe's.

Radars would show up with PDFs attached, "Preview Does Not Display 3D Image in PDF Like Acrobat" or similar. And I would feel so ... inadequate.

PDFKit could render and capture basic annotations ... and that was about it. We could show you forms, allow editing, but if the PDF had Javascript that would add two fields and put the sum in a third field I had to shrug and say, "Oh well." The effort of hoisting a JavaScript interpreter/runtime was beyond my skillset anyway.

But then I kind of came to see our subset of PDF support as a kind of feature. It's true, we left out the kitchen sink. Adobe was/is clearly interested in putting everything into PDF.

And I mean, as pointed out here, at least you could open a PDF in Preview and not worry about any Javascript executing. ;-)

peter_l_downs
2 replies
1h40m

If it makes you feel any better, Preview is by far the best PDF viewer and editor (I use it for signatures and adding text) I've ever used. I like that the PDF previews in Finder are instant and accurate. I like that it shows as much PDF and as little UI/menubar as possible. I like that it never asks me to upgrade or log in. The search tools work well. I can stitch PDFs together (if I google how to, always forget) and pull certain pages out as their own files.

For all of the PDFs I've ever encountered, Preview has been sufficient and capable. Thank you for your hard work!

amluto
0 replies
28m

And somehow Acrobat (current paid version from Creative Cloud) is the worst PDF form-filling option.

B1FF_PSUVM
0 replies
13m

If it makes you feel any better, Preview is by far the best PDF viewer and editor

Seconded. At least most pleasant to use for most things, and never balked at anything I needed to see, fortunately.

gruturo
0 replies
43m

Thank you, thank you, THANK YOU for not having put all that cruft in, and by Apple's sheer size, effectively discouraging many from producing and circulating those abominations.

Adobe has an awful track record of security (how many exploits in the past 25 years were in Acrobat (not the PDF spec, the actual Acrobat software) and in Flash?) but PDF is an amazing gift to the world, and, thanks to people like you, effectively safer than how Adobe designed it :))

Unfortunately I have the full Acrobat on my work computer, mandated by my employer, sigh, but that's another story.

eirikbakke
2 replies
3h3m

When I ordered an official PDF copy of my college diploma, the order form had an option to enable "tracking" in the PDF file. Sure enough, when the recipient opened the PDF file (and when I tried it myself on a different machine), I got a notification from the company that generated the PDF...

SpaghettiCthulu
1 replies
1h34m

That's horrific! I had no idea that was even a feature of PDFs!

layer8
0 replies
6m

PDFs are roughly on par with web pages feature-wise, including JavaScript or other actions that execute on load. Adobe did this, of course, to stave off the competition from the early web. Nowadays, PDF readers disable most of that by default (if they even support it).

venusenvy47
1 replies
3h46m

Is it really executable on the OS? Doesn't it require a native application to run it on an OS?

afiori
0 replies
2h48m

No, they are not executable by the OS (generally).

Formats are on a gradient between "completely code" and "completely data" and PDFs are quite close to the "completely code" extreme; I guess this is what the parent meant.

rqtwteye
5 replies
5h11m

“PDF is a fabulous format”

I will never forgive the pain PDF caused me when I worked on a project to parse millions of PDF files from various sources. Just reconstructing paragraphs was a huge effort not even mentioning parsing tables. I think we should do better for something that’s basically a standard. PDF manuals also suck big time.

JKCalhoun
4 replies
4h53m

PDF is supposed to be a printer format, not a word processing document format. While I too would love to nail down a PDF subset to be a standard (for example requiring the accessibility tags that make text extraction easy), perhaps trying to create a hybrid format, one that satisfies both printers and resizable windows, is already an impossible goal.

(I've always had to keep my love of PDF a secret from fellow nerds. But here's another secret, I like printing documents out from time to time.)

da_chicken
2 replies
4h33m

I really appreciate what PDF can accomplish, but I also really dislike that it turns into a black box. There really ought to be something that can describe a document structure and also describe document layout in a durable and portable manner. In the range of XML/JSON <-> HTML+CSS <-> PDF <-> PS <-> RAW, it really does feel like there's something missing between HTML and PDF.

And it can't be LaTeX, because the document shouldn't be a programming language at all. "The document is a program" has proven itself to be a terrible scheme overall.

layer8
0 replies
2m

PDF includes optional document structure information. Most PDF creation software chooses to not generate it, though.

JKCalhoun
0 replies
2h0m

ePub is kind of trying to be that? Or maybe that hews too close to HTML.

It can reflow but tries to paginate HTML ... the way printing a web page tries to paginate HTML, ha ha.

grotorea
0 replies
4h8m

I wonder a bit if we wouldn't have an easier time extracting data, resizing pages, etc. if we sent HTML files instead of PDF. Are even half of PDFs printed at all?

ourmandave
2 replies
6h24m

The "in your custody" part is important, when Amazon starts yoinking books from your account.

https://www.nytimes.com/2009/07/18/technology/companies/18am...

financypants
1 replies
1h55m

I buy all my books paperback, even if I listen to them on audible, for posterity’s sake.

shiftpgdn
0 replies
1h2m

Do you buy them after you’ve finished listening?

throwaway290
1 replies
9h28m

Does it do anything else? Maybe pwn me via my PDF viewer? ;)

TeMPOraL
0 replies
8h41m

It contains Bitcoin hashes, rendering them one by one as it mines them.

shp0ngle
1 replies
7h58m

about pdf/a... until recently there was not even an easy way to figure out if a pdf is really pdf/a; now there is (verapdf) and it's a crazy complex piece of software

and maybe I'm wrong but the only way to convert arbitrary pdf to pdf/a with open source software is to convert it to postscript and back with ghostscript - which is affero licensed... with all the possible problems it entails. (there is old version that is just gpl, works on most pdfs but is 15 years old or such.)

i needed to deal with pdf/a in a previous job... was not fun.

Elzair
0 replies
2h18m

You could use the pdfium library as an alternative to Ghostscript.

martin_a
0 replies
9h16m

PDF/A is the true PDF

As someone working in the graphic industry, I'd say PDF/X is the true PDF, but ymmv. :-)

whartung
13 replies
17h34m

It seems germane at this point to paraphrase Steven Wright.

“I have a map of the United States. It’s actual size. “

galaxyLogic
9 replies
12h29m

I have the map of US in my cell-phone.

I'm somewhat confused by its directions however when I look at the map and want to go somewhere. Is the top-part of the map where I'm moving? Or is the top-part North?

Seems it is not North and that is confusing because maps I've seen before have North at the top always.

If I turn 90 degrees, the map turns around. But I thought it was I who turned around.

And if I stop, the map cannot know where I'm going because I'm not going anywhere. So it is almost like I have to start moving before the map can tell me where to turn.

Or if I hold the smart-phone in front of my eyes the top of the map is towards the sky. Am I supposed to look at the map from above?

What are some good tactics on how to use Google-map on your cell-phone?

TowerTall
4 replies
12h16m

I really hate that too. You are in an intersection and the voice says "Drive north for x miles/km". What is wrong with "turn right and drive for x miles/km"? I normally have zero clue in what direction north is, especially when I am in a location I have never been before. I ride a bike and have the phone in my pocket and therefore cannot see any arrow that the app might display. I only have the audio to navigate from.

thaumasiotes
0 replies
5h59m

You are in an intersection and the voice says "Drive north for x miles/km".

Does that really happen? I have never experienced it. How do they tell which way is north?

Highway 101 runs through San Jose pretty much due east/west, but because it also runs up to San Francisco, it is officially a north-south highway. So you check your position on the map and you're traveling due east along an east/west road. Is that "north"? (Of course not. It's "south".)

roxgib
0 replies
8h5m

It will do that if it doesn't already know what direction you're travelling, which is usually because you've just activated navigation and you aren't moving yet. Unless I happen to know which direction north is or which way is towards my destination, I'll just pick a random direction and it will adjust the route if I guessed wrong.

robertlagrant
0 replies
8h46m

That's odd. My Google Maps tells me to turn left or right. It doesn't use compass directions.

jeffhuys
0 replies
9h28m

That’s Google maps for you. Try another one, most have way better voice cues (amongst other things!).

flexagoon
1 replies
8h17m

There are two modes in Google Maps - one shows the map in a fixed rotation (north on top by default, but you can rotate the map with two fingers), the other mode automatically rotates the map based on what direction you're facing. *Facing*, not moving, so you don't actually have to walk for it to determine the direction.

You can switch between the modes by clicking a compass icon

wongarsu
0 replies
5h7m

Part of the confusion might be that it's pointing in the direction the phone is facing. Which is kind of obvious, but notably doesn't work if you put your phone in an upright phone holder, as many people do in their car.

p1mrx
0 replies
9h38m

If you want North to be up, tap the compass icon.

TeMPOraL
0 replies
8h32m

What are some good tactics on how to use Google-map on your cell-phone?

For navigation?

1. Don't activate navigation. It's broken six ways to Sunday, and burns through battery like there's no tomorrow. Use route preview instead (i.e. the step after searching, but before activating the voice nav proper).

2. Use your fingers to rotate the map so it always faces the same way you're going.

3. If confused, recenter and press the compass so it rotates to have North at the top, and continue from there.

Now FWIW, I use Google Maps when navigating on foot/scooter, or as a pilot in the car. If I were a driver... I'd probably buy TomTom or whatever nav that's not shit.

venusenvy47
0 replies
3h43m

I always liked his related joke: "I want to get a tattoo of myself on my entire body only 2" taller.".

https://scomedy.com/quotes/10779

iggldiggl
0 replies
10h22m

Umberto Eco also has something to say on that subject – On the Impossibility of Drawing a Map of the Empire on a Scale of 1 to 1:

https://s3.amazonaws.com/arena-attachments/881694/cb6119367b...

fuzztester
0 replies
17h2m

I have a map of the Universe. Dunno, it keeps expanding ...........................................,............................................................................................................................

mrb
11 replies
17h47m

Fun experiment alexwlchan! Two small mistakes in your post: you write "15,000,000,000.00 in" and "that the size of a page is 15 billion inches", but it should be 15 million.

You said you had difficulty formatting text. Here is a "hello world" pdf that just has these two words on a page: copy and paste this text (stripping leading spaces on each line) and save it in a .pdf file. Basically in order to write text you have to define a font (object 5) and then a stream with a Tf command to use the font, a Td command to position the text, and a Tj command to write it.

    %PDF-1.2
    1 0 obj
    <<
     /Type /Catalog
     /Pages 2 0 R
    >>
    endobj
    2 0 obj
    <<
     /Type /Pages
     /Kids [ 3 0 R ]
     /Count 1
     /MediaBox
     [ 0 0 612 792 ]
    >>
    endobj
    3 0 obj
    <<
     /Type /Page
     /Parent 2 0 R
     /Resources 4 0 R
     /Contents 6 0 R
    >>
    endobj
    4 0 obj
    <<
     /ProcSet[/PDF/Text]
     /Font <<
      /F1 5 0 R
     >>
    >>
    endobj
    5 0 obj
    <<
     /Type /Font
     /Subtype /Type1
     /BaseFont /Times-Roman
    >>
    endobj
    6 0 obj
    <<
     /Length 52
    >>
    stream
    BT
    /F1 48 Tf
    185 400 Td
    (Hello World)Tj
    ET
    endstream
    endobj
    trailer
    <<
     /Root 1 0 R
    >>

whartung
4 replies
17h42m

Is the xref at the end of a PDF required or not? Seems like it is in the spec.

jchw
2 replies
17h9m

By the spec, yes. Some PDF readers will parse it anyway, some will not. In my experience depending on the renderer the xref table can be varying degrees of malformed before things go wrong. Edge's old PDF reader (the one before Acrobat and after PDFium) for example seemed to tolerate just about anything, falling back to the latest version of objects if the xref table was broken. There's also other mistakes you can make, like for example, the xref table requires carriage returns (each entry in the table is supposed to be an exact number of bytes) but some PDF readers will still interpret the xref table even if the carriage returns are missing.

whartung
1 replies
17h1m

As I understand it, the xref entries don’t require a carriage return, but they require a fixed line length. If you don’t want to use a CR, you can pad with a space.

So CR/LF, space/LF, and space/CR are all valid endings.

jchw
0 replies
16h41m

Yep:[1]

The byte offset in the decoded stream shall be a 10-digit number, padded with leading zeros if necessary, giving the number of bytes from the beginning of the file to the beginning of the object. It shall be separated from the generation number by a single SPACE. The generation number shall be a 5-digit number, also padded with leading zeros if necessary. Following the generation number shall be a single SPACE, the keyword n, and a 2-character end-of-line sequence consisting of one of the following: SP CR, SP LF, or CR LF. Thus, the overall length of the entry shall always be exactly 20 bytes

This is interesting. Never actually saw anything other than CRLF in practice, even inside of PDF files that otherwise were LF-only.

[1]: https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandard... page 41
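
For anyone who wants a file that also satisfies the xref rules quoted above, here is a minimal sketch in Python. It is not from the article or from the hello-world example upthread: `build_pdf` is a hypothetical helper, the objects mirror that example, and the `/Length` and byte offsets are computed rather than typed by hand. Each xref entry uses the "SP LF" ending, so it comes out at exactly 20 bytes.

```python
def build_pdf(text="Hello World"):
    """Assemble a one-page PDF with a valid xref table and return it as bytes.

    Assumption: `text` contains no parentheses or backslashes, which would
    need escaping inside a PDF string literal.
    """
    content = f"BT\n/F1 48 Tf\n185 400 Td\n({text}) Tj\nET\n".encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 /MediaBox [0 0 612 792] >>",
        b"<< /Type /Page /Parent 2 0 R /Resources 4 0 R /Contents 6 0 R >>",
        b"<< /ProcSet [/PDF /Text] /Font << /F1 5 0 R >> >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Times-Roman >>",
        # /Length is the exact byte count of the stream data.
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for object 0: head of the free list
    for off in offsets:
        # 10-digit offset, SP, 5-digit generation, SP, 'n', then SP LF = 20 bytes
        out += b"%010d %05d n \n" % (off, 0)
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the returned bytes to a file in binary mode (say `hello.pdf`, a made-up name) should give a document that viewers open without a "damaged file" warning, though I have only reasoned through the layout, not tested it against every reader.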

mrb
0 replies
17h24m

It is required according to the standard. But in practice most PDF viewers don't care. They may complain the PDF is "damaged" or "no valid xref was found", but they will render it perfectly fine.

wodenokoto
3 replies
13h29m

"15,000,000,000.00 in" and "that the size of a page is 15 billion inches", but it should be 15 million.

Can you help me count zeroes? Why is it million and not billion?

mrb
0 replies
12h36m

He put too many zeroes. It should be "15,000,000.00 in" or 15 million.

jetrink
0 replies
13h4m

The numerical and word versions are equal, but they're both wrong. 15 billion inches is roughly the distance from the Earth to the Moon.

croes
0 replies
13h2m

If we crank it all the way up to the maximum of UserUnit 75000, Acrobat now reports the size of our page as 15,000,000,000.00 x 15,000,000,000.00 in – 381 km along both sides, matching the original claim. If you’re curious, you can download the PDF.

15 billion inches are 381,000km. The original claim is the limit is 15 million inches.
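
The numbers in this subthread are easy to check with a few lines of arithmetic. A minimal sketch, assuming Acrobat caps each side of the MediaBox at 14,400 default units (200 in); that figure is not stated in this thread, while the UserUnit maximum of 75,000 comes from the article quote above. The helper name is made up for illustration.

```python
POINTS_PER_INCH = 72          # default user space unit is 1/72 inch
CM_PER_INCH = 2.54
MAX_MEDIABOX_UNITS = 14_400   # assumption: Acrobat's per-side limit
MAX_USER_UNIT = 75_000        # maximum UserUnit quoted from the article


def side_in_inches(units, user_unit=1):
    """Convert a page side from user space units to inches."""
    return units * user_unit / POINTS_PER_INCH


max_inches = side_in_inches(MAX_MEDIABOX_UNITS, MAX_USER_UNIT)
max_km = max_inches * CM_PER_INCH / 100_000   # 100,000 cm per km

print(f"largest side: {max_inches:,.2f} in, about {max_km:,.0f} km")
print(f"300 units at the default scale: {side_in_inches(300):.2f} in")
```

Under those assumptions the largest side works out to 15 million inches (about 381 km), and 15 billion inches would be a thousand times that.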

alexwlchan
1 replies
11h51m

Argh! I knew I was going to make a numerical mistake somewhere, thanks for spotting it. Correction will be up shortly. Thanks for spotting it! :D

And thanks for the text example! This looks like what I was trying, but clearly I had a mistake somewhere.

dingensundso
0 replies
11h3m

Spotted another math mistake:

The default unit size is 1/72 inch, so the page is 300 × 72 = 4.17 inches.

anotheraccount9
9 replies
17h36m

For a moment I wasn't sure if I wanted to click on the link.

chaxor
4 replies
15h53m

On an only slightly related note: is there any good way to check PDFs for malware/executables?

If I'm stuck with an attempt at it, the best I can think of is opening in a new QEMU or docker with no Internet access, but that's 1) a fair bit of work to check something, and 2) probably not even that secure. Using some cli tool, like xxx, bat, or ranger, that does some processing to extract the text and looking at just that feels more secure - but I know it really isn't.

What is a simple tool to "clean" PDFs? An ML tool that does QEMU/docker/no-net to extract the content, turns that into game, and saves a typst/latex template with it would probably be the best possible outcome - but that's a decent (yet potentially very lucrative) task.

worewood
1 replies
14h43m

What do you mean by "PDFs with malware/executables"?

If you're talking about embedded active content within them, then a reader application can just ignore/not run it.

If you're talking about a crafted PDF that exploits, let's say, font rendering bugs inside the reader, then it's near impossible. Keep your applications updated.

arunsivadasan
0 replies
10h16m

There is a Chrome addon called SquareX https://chromewebstore.google.com/detail/kapjaoifikajdcdehfd... The founder is pretty reputed in the Cybersecurity field.

peddling-brink
0 replies
13h18m

For analysis, I’ve used Didier’s tools. If you just want a safe way to open it, upload it to a cloud storage provider which destructively renders the pdf. Box or Google drive should work.

https://blog.didierstevens.com/programs/pdf-tools/
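
As a rough illustration of what a first-pass triage can look like, here is a naive keyword counter in Python. It is only a sketch in the spirit of keyword-based scanners, not the code of any of the tools linked above; `SUSPICIOUS_NAMES` and `count_names` are names invented for this example, and the list of PDF names is my own selection.

```python
import re

# PDF names that tend to indicate active or embedded content.
SUSPICIOUS_NAMES = [
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
    b"/Launch", b"/EmbeddedFile", b"/URI",
]


def count_names(pdf_bytes):
    """Count occurrences of each suspicious name in the raw file bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # name so that /JS does not also match a longer name like /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, pdf_bytes))
    return counts


sample = b"<< /OpenAction << /S /JavaScript /JS (app.alert(1)) >> >>"
print(count_names(sample))
```

A zero count proves nothing: names can be written with `#xx` hex escapes or hidden inside compressed object streams, so a scan like this can flag a file for a closer look but cannot clear it.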

flexagoon
0 replies
8h15m

There are some pdf readers that protect you against those things.

On Android, for example, there is the GrapheneOS Pdf Viewer [1]. Its readme has a pretty good explanation of how it works.

1: https://github.com/GrapheneOS/PdfViewer

qwertox
3 replies
17h29m

It also screams buffer overflow.

maxerickson
2 replies
17h26m

PDF readers are probably mostly pretty hardened against "naive" non-conforming content.

kirubakaran
1 replies
16h44m

probably mostly pretty hardened

Quite possibly perhaps that might be true-ish to some extent, I think, but take that with a grain of salt, I'm not an expert, that's just my wild guess :-p

maxerickson
0 replies
16h19m

It's pretty ridiculous to peel that off the following qualifier.

Readers have been aggressively attacked for a long time. It's certainly not impossible that some basic demonstration PDF will cause an issue, but it's probably not reasonable to expect it.

0134340
6 replies
16h27m

Please don’t try to print it.

Sounds like a print bomb waiting to happen. Last time I had a printer it was next to impossible to cancel a print job on Windows. Back when people had wifi printers that were open or ill-secured, those were fun times.

matheusmoreira
2 replies
13h52m

it was next to impossible to cancel a print job on Windows

It's still impossible. The only reliable method I've found consists of turning the printer off and then deleting the print job in the queue. Only way to get Windows to actually delete it. Doesn't work unless the printer is sitting right next to me, of course. I have no idea why this is so hard.

tazjin
0 replies
4h21m

Some ~12 years ago, I was debugging POS integration with a receipt printer and accidentally sent garbage postscript to the receipt printer, which printed it out verbatim.

Stopping it was impossible. Power cycling that printer had absolutely no effect. It wrote the unfinished print job to some kind of persistent memory, and by god it was going to finish it.

It went through something like 2 1/2 rolls of receipt paper (yes it dutifully awaited the new rolls and then just continued) and due to the thermal printing process it smelled very odd, and I had quite a few metres of raw Postscript afterwards to decorate a wall with.

askvictor
0 replies
8h27m

And sometimes windows won't delete it from the print queue as it can't talk to the printer. Fun times.

kome
1 replies
8h36m

when i read

Please don’t try to print it.

my first reaction has been: you are not my mom. >:-)

bombcar
0 replies
6h7m

Just click “scale to fit on one page”.

foreigner
0 replies
10h42m

Only slightly more reliable method: unplug the printer and throw the computer out the window.

tremarley
4 replies
16h41m

“But unlike Acrobat, the Preview app doesn’t have an upper limit on what we can put in MediaBox. It’s perfectly happy for me to write a width which is a 1 followed by twelve 0s:

Screenshot of Preview’s Document inspector, showing the page size of 352777777777.78 x 10.59 cm. If you’re curious, that width is approximately the distance between the Earth and the Moon. I’d have to get my ruler to check, but I’m pretty sure that’s larger than Germany.”

The size of every planet in our solar system, put next to each other, can fit in this doc with room to spare

MichaelZuo
1 replies
15h54m

Now I wonder how large of a file size would such a PDF be if it wasn't empty space...

pas
0 replies
15h30m

pdf supports vector graphics! or it can be just a lot of "a" characters, it supports compression/repeat, right?

svantana
0 replies
56m

By my counting, that document is ~ 373 km^2, which is much smaller than germany. It turns out the ruler was needed after all

croes
0 replies
12h56m

You have one 7 too many. 352777777777.78cm are 3,527,777.7777778km.
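
A quick check of that, as a sketch: it assumes the MediaBox width really was a 1 followed by twelve 0s (as the quoted passage says) in default units of 1/72 inch. `units_to_cm` is a made-up helper name.

```python
def units_to_cm(units):
    """Convert default PDF user space units (1/72 inch) to centimetres."""
    return units / 72 * 2.54


width_cm = units_to_cm(10**12)
width_km = width_cm / 100_000  # 100,000 cm per km

print(f"{width_cm:,.2f} cm is about {width_km:,.0f} km")
```

Under that assumption the width comes out near 35.28 billion cm, or roughly 352,778 km, which is the right order of magnitude for the Earth to Moon distance.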

kepano
4 replies
18h43m

I cannot let this opportunity go by without quoting On Exactitude in Science by Borges in its entirety

". . . In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography."

https://en.wikipedia.org/wiki/On_Exactitude_in_Science

staplung
1 replies
18h10m

Or a portion of one of its inspirations: Lewis Carroll's Sylvie and Bruno Concluded

  "We actually made a map of the country, on the scale of a mile to the mile!"

  "Have you used it much?" I enquired.

  "It has never been spread out, yet," said Mein Herr: "the farmers objected: they said it would cover the whole country, and shut out the sunlight ! So we now use the country itself, as its own map, and I assure you it does nearly as well."

defrost
0 replies
15h44m

Also Carroll, from The Hunting of the Snark

    He had bought a large map representing the sea,

    Without the least vestige of land

    And the crew were much pleased when they found it to be

    A map they could all understand.


    “What’s the good of Mercator’s North Poles and Equators,

    Tropics, Zones, and Meridian Lines?”
    
    So the Bellman would cry

    and the crew would reply

    “They are merely conventional signs!


    “Other maps are such shapes, with their islands and capes!

    But we’ve got our brave Captain to thank

    (So the crew would protest) that he’s bought us the best

    A perfect and absolute blank!”

jancsika
0 replies
13h32m

There are some funny lines from They Might Be Giants' "Women and Men" that run along the same lines:

Women and men have crossed the ocean,

They now begin to pour

Out from the boat and up the shore.

Two by two they enter the jungle,

And soon they number more,

Three by three as well as four by four.

Soon the stream of people gets wider,

Then it becomes a river,

River becomes an ocean,

Carrying ships that bear

Women and men.

**

Borges: map of an area gets so detailed it becomes the same size as the area.

TMBG: creatures multiply and ultimately overrun an area so fully that their group behavior recreates the ecology of the area they took over

iggldiggl
0 replies
10h20m

And Umberto Eco expanded on that with On the Impossibility of Drawing a Map of the Empire on a Scale of 1 to 1:

https://s3.amazonaws.com/arena-attachments/881694/cb6119367b...

remoquete
3 replies
9h44m

This post reminds me of Umberto Eco's intellectual divertissements. More specifically, this fantastic piece, "On the Impossibility of Drawing a Map of the Empire on a Scale of 1 to 1."

https://s3.amazonaws.com/arena-attachments/881694/cb6119367b...

alberto_ol
2 replies
4h30m

Umberto Eco quoted Jorge Luis Borges:

https://en.wikipedia.org/wiki/On_Exactitude_in_Science

remoquete
0 replies
4h24m

Yes, though it'd be perhaps more accurate to say it expanded upon the theme, as Wikipedia says.

B1FF_PSUVM
0 replies
6m

Speaking of Borges, sometimes I'm sadly reminded of his spoof of categorization schemes ("Animals: those that belong to the Emperor, embalmed ones," etc. ) : https://en.wikipedia.org/wiki/Celestial_Emporium_of_Benevole...

jdlyga
3 replies
17h8m

You know you're reading a good technical article when it measures pdf width in kilometers

apapapa
2 replies
11h9m

microSD cards can contain millions of miles in a very small space...

geraldhh
1 replies
4h36m

or billions of feet

coldpie
0 replies
3h2m

A truly unfathomable quantity of toes on a single SD card!

macropin
2 replies
15h57m

Reminds me of this PDF I created more than a decade ago from a Postscript implementation of the game of life. Seems it still works, but causes MacOS preview to crash. https://andrewcutler.net/docs/joke/life.pdf

maleldil
0 replies
4h1m

It doesn't cause Preview to crash on Sonoma. FWIW, I can't see any animation, just the final state, while Firefox's PDF reader does show some animation. Skim has the same behaviour as Preview but doesn't show the grid.

JKCalhoun
0 replies
4h41m

Whew! Didn't crash on my Mac OS — just a static Game of Life render. This machine is still on Monterey FWIW.

codeflo
2 replies
16h18m

I take offense to that diagram; Germany should refuse to be covered by a PDF that's not in proper DIN format.

In theory, DIN paper sizes go all the way from subatomic to the size of the universe. It seems like A(-39) is barely too small to cover Germany's land mass, but A(-40) should be more than sufficient. That's 882 x 1247 km if I didn't miscalculate.
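
For anyone who wants to redo the calculation, a small sketch using the idealised A-series definition (A0 has an area of 1 m² and an aspect ratio of 1:√2, and each step halves or doubles the area). The function name is invented here, and the standard's millimetre rounding for the everyday sizes is ignored.

```python
def a_series_size_m(n):
    """Return (width, height) in metres of the idealised size A(n).

    Negative n gives sheets larger than A0.
    """
    width = 2 ** (-0.25 - n / 2)
    height = 2 ** (0.25 - n / 2)
    return width, height


for n in (-39, -40):
    w, h = a_series_size_m(n)
    print(f"A({n}): {w / 1000:.0f} x {h / 1000:.0f} km")
```

With this formula A(-40) comes out at about 882 x 1247 km, matching the figure above.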

Cacti
1 replies
11h40m

Oh here we go again.

groestl
0 replies
11h0m

That's actually quite funny, especially because Germany almost has portrait DIN format.

JackSlateur
2 replies
9h53m

Long story short: the original tweet confuses PDF (the file format) with Adobe Acrobat (the PDF reader): the 381 km × 381 km limit is an Acrobat limit, not a PDF limit

Funny document, still

mr_mitm
0 replies
9h7m

Interesting, I always thought Acrobat was the reference implementation of PDF.

latexr
0 replies
5h45m

Long story short: the original tweet

I don’t think the tweet is relevant at all and it’s a disservice to this post to feature it that prominently in a summary. A more interesting conclusion is that PDF files can have dimensions larger than the Universe, and an example is provided.

DontBreakAlex
2 replies
9h40m

Wait, pdf files aren't binary ?!

roxgib
0 replies
7h58m

I was surprised that the underlying format doesn't implement compression (though I assume objects can be compressed). Perhaps I shouldn't be surprised since I often get text only PDFs with unreasonably large sizes.

gaazoh
0 replies
8h44m

I just had the exact same reaction! So I opened a random PDF I had laying around, and yes, it's mostly a text format. Some (most) objects are binary data streams, but some are also text data. Likewise, objects may or may not be compressed, obviously compressed streams are binary data. But the file structure is text, some objects are xml, and you can figure out quite a lot of stuff just by looking at a pdf in a text editor, and it might not even be that long: the single page PDF I just looked at is just over 1500 lines long, I can definitely manually scroll through it (although offsets are in bytes, not lines, which make them not very useful for manual lookup).
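
Since byte offsets are awkward to follow by hand, a few lines of Python can list where each object starts. This is only a sketch: `list_objects` is a hypothetical helper that pattern-matches `N G obj` headers in the raw bytes, so it can be fooled by binary stream data that happens to contain the same pattern, and it will not see objects packed inside compressed object streams.

```python
import re

# "<object number> <generation> obj", not preceded by another digit.
OBJ_HEADER = re.compile(rb"(?<![0-9])(\d+)\s+(\d+)\s+obj\b")


def list_objects(pdf_bytes):
    """Return (object number, generation, byte offset) for each header found."""
    return [
        (int(m.group(1)), int(m.group(2)), m.start())
        for m in OBJ_HEADER.finditer(pdf_bytes)
    ]


sample = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n2 0 obj\n<< >>\nendobj\n"
for number, generation, offset in list_objects(sample):
    print(f"object {number} {generation} starts at byte {offset}")
```

The offsets it reports can be compared against the entries in the file's xref table, if it has one.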

wiradikusuma
1 replies
14h24m

I open the PDF in Google Chrome on a Mac. When I Ctrl+P, the dialog says it's 1 Page. I don't try to print it, but I think it will not consume more than 1 page?

Also, the PDF preview in Chrome simply shows it like a normal PDF, but Preview seems confused (gray background instead of white)?

justsomehnguy
0 replies
10h50m

I don't try to print it

Well, you can even without consuming a single sheet: just print to PDF.

Preview seems confused (gray background instead of white)?

It tries to render it and fit in the preview.

whoisthemachine
1 replies
18h11m

While the Germany PDF actually scrolls pretty quickly at 100% zoom (makes one realize just how much text is read in a day), the Universe one is pretty fun, Firefox's PDF reader at 100% zoom obviously doesn't budge the scrollbar at all.

pitherpather
0 replies
17h36m

Hackaday soon: Synchronizing a treadmill to a pdf the size of Germany.

Obligatory?: The pdf is not the territory.

poulpy123
1 replies
7h23m

What does "larger than Germany" mean for a document file?

654wak654
0 replies
7h20m

PDFs have physical dimensions in them (I think most document formats do), so you can literally define a square of 5x5cm in the document for example.
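
To make that concrete, here is a minimal Python sketch of the unit conversion, assuming the default PDF user space where one unit is 1/72 inch (no UserUnit scaling). The helper names `cm_to_points` and `square_media_box` are made up for illustration and are not part of any PDF library.

```python
# Default PDF user space: 1 unit = 1/72 inch (assumes no /UserUnit override).
POINTS_PER_INCH = 72
CM_PER_INCH = 2.54


def cm_to_points(cm: float) -> float:
    """Convert a length in centimetres to default PDF user-space units."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


def square_media_box(side_cm: float) -> str:
    """Return the /MediaBox entry text for a square page of the given side."""
    side = cm_to_points(side_cm)
    return f"/MediaBox [0 0 {side:.3f} {side:.3f}]"


# A 5x5 cm square page, as in the comment above.
print(square_media_box(5))
```

So a 5 cm side comes out at roughly 141.7 units, and that number is what actually gets written into the page dictionary.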

oneseven
1 replies
17h4m

Slightly tangential: if you are hacking on PDFs, manually or otherwise, this is an incredibly useful tool: https://pdfcpu.io/ (not the author, just a user)

vendiddy
0 replies
6h3m

Thanks for this. Any other tools that are useful when hacking on PDFs? I need to do a lot of programmatic PDF manipulation at work.

nayuki
1 replies
13h49m

I'll analyze PNG for comparison. The largest width and height is 2147483647 (2^31 - 1). Using the pHYs chunk (physical pixel dimensions), the lowest density we can specify is 1 pixel per metre. So, 2 billion metres (2 gigametres) is somewhat bigger than the diameter of the sun at 1.39 Gm. https://en.wikipedia.org/wiki/Orders_of_magnitude_(length)#g...

Using the sCAL chunk (physical scale) would allow extremely large dimensions because it uses ASCII floating-point.
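
The pHYs arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation in this comment: the limits (2^31 - 1 pixels, 1 pixel per metre) and the Sun's diameter are the figures quoted above, and the function name is hypothetical.

```python
# Figures taken from the comment above, not independently derived.
MAX_PNG_DIMENSION = 2**31 - 1      # largest width/height in pixels
MIN_PIXELS_PER_METRE = 1           # lowest density expressible via pHYs
SUN_DIAMETER_M = 1.39e9            # 1.39 Gm


def max_png_extent_metres(pixels_per_metre: int = MIN_PIXELS_PER_METRE) -> float:
    """Largest physical side length of a PNG at the given pixel density."""
    return MAX_PNG_DIMENSION / pixels_per_metre


extent = max_png_extent_metres()
print(f"max side: {extent / 1e9:.2f} Gm")
print(f"ratio to Sun diameter: {extent / SUN_DIAMETER_M:.2f}")
```

That works out to about 2.15 Gm per side, roughly one and a half Sun diameters.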

lifthrasiir
0 replies
11h44m

Using the sCAL chunk (physical scale) would allow extremely large dimensions because it uses ASCII floating-point.

AFAIK sCAL is more about the image's subject, not the image itself. A 1:10,000,000 scale world map would be < 10 m wide according to pHYs, but it will be ~40,000 km wide according to sCAL.

markussss
1 replies
18h59m

This was a fun read! Thank you

tkgally
0 replies
17h49m

I second that comment! That was the most enjoyably nerdy thing I’ve read in quite a while.

jakey_bakey
1 replies
8h0m

I love the texture of your website, it's like nice tactile wallpaper.

JKCalhoun
0 replies
4h38m
gr33nq
1 replies
18h49m

Coincidentally, I just finished watching a video that explored the same topic of massive and unique PDF files: https://www.youtube.com/watch?v=ZvVNRRQjDh8

NooneAtAll3
0 replies
17h43m

yeah, I went here to post it as well

It's not every day one gets to see a whole game integrated into a PDF!

denysvitali
1 replies
11h7m

I can't wait for people to start rendering their CVs with this trick >:)

roxgib
0 replies
7h57m

I'm a bit disappointed in myself that it didn't occur to me to submit my CV in A3.

_bax
1 replies
8h19m

An idea to send a DDOS attack to company LAN printers

mglz
0 replies
7h15m

Wouldn't they just check their available paper, notice there is no AGermany size and give up?

yannis
0 replies
14h33m

And of course you can try to produce this PDF using TeX. In this post https://tex.stackexchange.com/a/27482/963 I created a PDF of 15283 pages (letter size) filled with lorem ipsum text, without the program running out of memory.

xanth
0 replies
16h24m

Related CGP Grey video comparing metric paper sizes to comparable objects from the Planck scale to the galactic[1]; could have been an XKCD comic

[1]: https://www.youtube.com/watch?v=pUF5esTscZI

user2342
0 replies
5h59m

"Please don’t try to print it." :-)

poulsbohemian
0 replies
18h35m

About 30 years ago I interviewed to be a summer intern at Microsoft, and one of the interviewers asked a question very similar to this but regarding Excel. This is the kind of topic that never gets old for understanding a person’s curiosity and ability to dissect the potential issues.

msarris
0 replies
11h57m

So I guess the question is, how did she figure out the size of the entire universe?

lovegrenoble
0 replies
3h22m

How many viruses it could contain )

ipsum2
0 replies
15h41m

Chrome's PDF reader disappointingly reports the page size as 200.00 × 200.00 in (square)

drewcoo
0 replies
18h11m

Please, nobody tell Randall Munroe!

Because I would literally spend days' worth of time scrolling.

danbruc
0 replies
9h21m

So what is the actual limit, if any? I just had a quick look at ISO 32000-2:2020 [1] and think the answer is none, or implementation-dependent if you want. In the file format a media box is a rectangle, a rectangle is an array of four numbers, and a number is either an integer or a real. Numbers are represented as strings, so there is no a priori limit on their range, and there seem to be no requirements on the minimum or maximum range of values an implementation has to support. The appendix only says that IEEE 754 is a commonly used format to represent reals and that this might impose limits.

[1] https://developer.adobe.com/document-services/docs/assets/5b...
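
Since the numbers are just text in the file, it is easy to experiment. Below is a minimal Python sketch that writes a one-page blank PDF with an arbitrary MediaBox. `build_pdf` is a hypothetical helper, not the article author's method, and whether a given viewer accepts, clamps, or rejects a huge value is exactly the implementation-dependent part.

```python
def build_pdf(width: str, height: str) -> bytes:
    """Return a blank one-page PDF whose MediaBox is [0 0 width height].

    width and height are strings on purpose: PDF numbers are written as
    plain decimal text (exponent notation such as 1e40 is not valid syntax).
    """
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (
            "<< /Type /Page /Parent 2 0 R /Resources << >> "
            f"/MediaBox [0 0 {width} {height}] >>"
        ).encode("ascii"),
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# A page with a 40-digit side length, written as plain decimal text.
huge = "9" * 40
pdf = build_pdf(huge, huge)
print(len(pdf), "bytes")
```

Writing `pdf` to a file and opening it in a few different viewers is a quick way to see where each implementation's own limit sits.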

RecycledEle
0 replies
9h6m

"Your Scientists Were So Preoccupied With Whether Or Not They Could, They Didn’t Stop To Think If They Should."

--Ian Malcolm in the original Jurassic Park film