HN comments for: Llama.ttf: A font which is also an LLM

electric_mayhem

22 replies

1d4h

2024-06-23 14:19:38 UTC

While cool, technically… From a security perspective today I learned that TrueType fonts have arbitrary code execution as a ‘feature’ which seems mostly horrific.

samwillis

15 replies

1d4h

2024-06-23 14:25:55 UTC

Not really, no more so than a random webpage running js/WASM in a sandbox.

The only output from the WASM is to draw to screen. There is no chance of a RCE, or data exfiltration.

xg15

7 replies

1d3h

2024-06-23 14:58:00 UTC

It's still horrible, not in a (direct) security but in an interop sense: Now you have to embed an entire WASM engine, including proper sandboxing, just to render the font correctly. That's a huge increase of complexity and attack surface.

simonw

5 replies

1d3h

2024-06-23 15:02:31 UTC

I'm hoping that in a few years time WASM sandboxes will be an expected part of how most things in general purpose computing devices work.

There's very little code in the world that I wouldn't want to run in a robust sandbox. The low-level OS components that manage that sandbox are about it.

xg15

3 replies

1d3h

2024-06-23 15:14:32 UTC

Normalizing the complexity doesn't make it go away.

Ideally, I'd like not to execute any kind of arbitrary code when doing something as mundane as rendering a font. If that's not possible, then the code could be restricted to something less than Turing complete, e.g. formula evaluation (i.e. lambda calculus) without arbitrary recursion.

The problem is that even sandboxed code is unpredictable in terms of memory and runtime cost and can only be statically analyzed to a limited extent (halting problem and all).

Additionally, once it's there, people will bring in libraries, frameworks and sprawling dependency trees, which will further increase the computing cost and unpredictability of it.

simonw

2 replies

1d1h

2024-06-23 16:58:18 UTC

That's why I care so much about WebAssembly (and other sandbox) features that can set a strict limit on the amount of memory and CPU that the executing code can access.
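A rough sketch of what such a limit could look like: a toy stack machine that charges one unit of "fuel" per instruction and caps its stack depth. This is not how any real WebAssembly runtime implements metering; the opcodes and the names `run`, `OutOfFuel` and `OutOfMemory` are made up for illustration.

```python
class OutOfFuel(Exception):
    """Raised when the program exhausts its instruction budget."""


class OutOfMemory(Exception):
    """Raised when the stack would grow past its cap."""


def run(program, fuel=1000, max_stack=64):
    """Interpret a toy stack program, charging one unit of fuel per instruction.

    program is a list of tuples: ("push", n), ("add",) or ("jmp", target).
    Returns the final stack.
    """
    stack, pc = [], 0
    while pc < len(program):
        if fuel <= 0:
            raise OutOfFuel(f"budget exhausted at pc={pc}")
        fuel -= 1
        op = program[pc]
        if op[0] == "push":
            if len(stack) >= max_stack:
                raise OutOfMemory("stack limit reached")
            stack.append(op[1])
            pc += 1
        elif op[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op[0] == "jmp":
            pc = op[1]
        else:
            raise ValueError(f"unknown opcode {op[0]!r}")
    return stack
```

With this, `run([("jmp", 0)])` stops with `OutOfFuel` instead of spinning forever, and a loop that keeps pushing hits `OutOfMemory`. The host decides the budget, not the guest code.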

Lockal

1 replies

15h45m

2024-06-24 02:44:01 UTC

Exactly that! And speaking of quotas, nobody can explain why Ethereum Virtual Machine-like quotas were not enforced in the standard.

Imagine that you download a .odt/docx/pdf form with an embedded font in LibreOffice in 2025. You start to type some text... And the font starts to saturate FPU ports (i.e. div/sqrt) in a specific pattern. Meanwhile some tab in the browser measures CPU load or port saturation by doing some simple action, and captures every character you typed.

kaibee

0 replies

4h36m

2024-06-24 13:53:19 UTC

> Meanwhile some tab in browser measures CPU load or port saturation by doing some simple action, and capture every character you typed.

iirc browsers fuzz the precise timing of calls for exactly this reason already?

rft

0 replies

1d2h

2024-06-23 16:23:06 UTC

Your comment reminded me of this great talk [1] (humor ofc). While it talks about asm.js, WASM is in many ways, IMO, the continuation of asm.js.

[1] https://www.destroyallsoftware.com/talks/the-birth-and-death...

Bluestein

0 replies

1d3h

2024-06-23 15:09:29 UTC

While neat in a "because we can" kind of sense, it really is maddening: Have we gone "compute-mad" and will end up needing a full-fledged VM to render ever-smaller subsets of UI or content until ... what?

What is the end game here?

It is kind of like a "fractal" attack surface, with increasing surface the "deeper" one looks into it. It is nightmarish from that perspective ...

turnsout

3 replies

1d3h

2024-06-23 14:52:08 UTC

The risk is that you could have the text content say one thing while the visual display says another. There are social engineering and phishing risks.

alexvitkov

2 replies

14h25m

2024-06-24 04:04:30 UTC

If you control the font, you control the content as well, I don't see the attack vector.

turnsout

0 replies

1h52m

2024-06-24 16:37:41 UTC

If you can trick someone into installing the font, you can now control what they read. Unfortunately a lot of hacks involve the user doing something dumb and avoidable.

If this font format is successful, then given enough time, it will become legacy. People won't be as vigilant about it, and they won't understand the internals as well. This is why TIFF-based exploits became so common 20-30 years after TIFF's heyday.

socksy

0 replies

11h17m

2024-06-24 07:12:13 UTC

Certain design-tool sites like Canva or Pitch allow you to upload fonts and obviously control the content. They are frequently used by phishers to make official-looking phishing pages on a trusted source, leading to a cat-and-mouse game where the companies try to catch phishing-like indicators in the content and flag them for human review or block them immediately.

In that case being able to show arbitrary other text would definitely be a hindrance because the scanning software typically looks at the data stored in the database. However I think you don't need a Turing machine to exploit this — you could have a single ligature in a well crafted font produce a full paragraph of text.
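To make the ligature point concrete, here is a toy model of substitution in which the stored text and the displayed text diverge. It is a big simplification of what a real font's substitution tables do; the table contents and the `render` helper are invented for illustration.

```python
# Hypothetical ligature table: a short input sequence maps to one "glyph"
# whose drawn shape happens to be a whole sentence.
LIGATURES = {
    "~~": "Please confirm your password at the link below.",
}


def render(text, ligatures=LIGATURES):
    """Greedy longest-match substitution over the input characters."""
    keys = sorted(ligatures, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(ligatures[key])
                i += len(key)
                break
        else:
            # No ligature starts here, so the character is shown as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


stored = "Hello ~~"      # what a content scanner reading the database sees
shown = render(stored)   # what the reader sees on screen
```

A scanner looking only at `stored` never sees the word "password", while the reader does.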

Perhaps there's an alternative vector where someone's premade font on a site that doesn't allow font uploading can be exploited to make arbitrary calculations given certain character strings. Maybe bitcoin mining, if you could find a way to phone home with the result

kenferry

0 replies

1d2h

2024-06-23 16:27:38 UTC

Why do you say that? Security exploits involving fonts are extremely common.

electric_mayhem

0 replies

1d3h

2024-06-23 14:51:55 UTC

I’m open to your idea, but can you explain in technical terms why a wasm sandbox is invulnerable to the possibility of escape vulnerabilities when other flavors of sandboxes have not been?

Hizonner

0 replies

1d3h

2024-06-23 15:17:41 UTC

> Not really, no more so than a random webpage running js/WASM in a sandbox.

... except that it can happen in non-browser contexts.

Even for browsers, it took 20+ years to arrive at a combination of ugly hacks and standard practices where developers who make no mistakes in following a million arcane rules can mostly avoid the massive day-one security problems caused by JavaScript (and its interaction with other misfeatures like cookies and various cross-site nonsense). During all of which time the "Web platform" types were beavering away giving it more access to more things.

The Worldwide Web technology stack is a pile of ill-thought-out disasters (or, for early, core architectural decisions, not-thought-out-at-all disasters), all vaguely contained with horrendous hackery. This adds to the pile.

> The only output from the WASM is to draw to screen.

Which can be used to deceive the user in all kinds of well-understood ways.

> There is no chance of a RCE, or data exfiltration.

Assuming there are no bugs in the giant mass of code that a font can now exercise.

I used to write software security standards for a living. Finding out that you could embed WASM in fonts would have created maybe two weeks of work for me, figuring out the implications and deciding what, if anything, could be done about them. Based on, I don't know, a hundred similar cases, I believe I probably would have found some practical issues. I might or might not have been able to come up with any protections that the people writing code downstream of me could (a) understand and (b) feasibly implement.

Assuming I'd found any requirements-worthy response, it probably would have meant much, much more work than that for the people who at least theoretically had to implement it, and for the people who had to check their compliance. At one company.

So somebody can make their kerning pretty in some obscure corner case.

px43

3 replies

1d3h

2024-06-23 15:11:23 UTC

If you think that's bad, until very recently, Windows used to parse ttf directly in the kernel, meaning that a target could look at a webpage, or read an email, and be executing arbitrary code in ring0.

Last I checked there were about 4-10 TTF bugs discovered and actively exploited per year. I think I heard those stats in 2018 or so. This has been a well known and very commonly exploited attack vector for at least 20 years.

anthk

2 replies

21h59m

2024-06-23 20:30:44 UTC

The same with Wav files.

plaguuuuuu

1 replies

5h15m

2024-06-24 13:14:39 UTC

how can a wav file do anything? isnt it just raw data essentially?

dTal

0 replies

2h48m

2024-06-24 15:41:25 UTC

I'm pretty sure it can't. There's nothing in a WAV file that's meant to be executed. A quick google turns up a DirectX vulnerability from 2007 (a validation error that's not inherent to the WAV format per se), and a recent case of WAV files being used to conceal malicious payloads (but coupled with a loader).

Having said that, the "arbitrary code" found in TrueType is not really arbitrary either - it's not supposed to be able to do anything except change the appearance of the font. From a security standpoint, there's no theoretical difference between a WAV and a TTF font - neither can hurt your machine if the loader is bug-free. Practically speaking though, a font renderer that needs to implement a sort of virtual machine is more complex, and therefore more likely to have exploitable bugs, than a WAV renderer that simply needs to swap a few bytes around and shove them at a DAC.

rft

0 replies

1d2h

2024-06-23 16:19:46 UTC

(Sadly) this is nothing new. Years ago I wrangled a (modified) bug in the font rendering of Firefox [1, 2016] into an exploit (for a research paper). Short version: the Graphite2 font rendering engine in FF had/has? a stack machine that can be used to execute simple programs during font rendering. It sounded insane to me back then, but I dug into it a bit. Turns out while rendering Roman based scripts is relatively straightforward [2], there are scripts that need heavy use of ligatures etc. to reproduce correctly [3]. Using a basic scripting (heh) engine for that does make some sense.

Whether this is good or bad, I have no opinion on. It is "just" another layer of complexity and attack surface at this point. We have programmable shaders, rowhammer, speculative execution bugs, data timing side channels, kernel level BPF scripting, prompt injection and much more. Throwing WASM based font rendering into the mix is just balancing more on top of the pile. After some years in the IT security area, I think there are so many easier ways to compromise systems than these arcane approaches. Grab the data you need from a public AWS bucket or social engineer your access, far easier and cheaper.

For what it's worth, I think embedded WASM is a better idea than rolling your own eco systems for scripting capabilities.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1248876

[2] I know, there are so many edge cases. I put this in the same do not touch bucket as time and names.

[3] https://scripts.sil.org/cms/scripts/page.php?id=cmplxrndexam...

oopsallmagic

0 replies

18h28m

2024-06-24 00:00:57 UTC

It's technically not arbitrary. There is a stack, of sorts, but IIRC it has a depth of six or so, by default. You can do cool stuff with font shaping, but you can't easily execute arbitrary code.

xrd

14 replies

1d4h

2024-06-23 14:18:08 UTC

After watching part of the video, I believe the world would benefit from a weekly television program where you could tune in each week to watch something weird, brilliant and funny. This would be a great episode #1 for that television show.

btown

4 replies

1d2h

2024-06-23 15:43:14 UTC

On the esoteric software engineering side, Tom7 is the channel you're looking for! https://www.youtube.com/@tom7

godelski

1 replies

12h10m

2024-06-24 06:19:31 UTC

This is honestly one of my favorite channels. It is one of those things where if one asks "why" the only answer that can be given is "because."

Just look at the 4 most recent videos. Maybe start with "Harder Drive: Hard drives we didn't want or need" where he tries to make hard drives out of things that shouldn't be hard drives. This includes: by pinging the entire internet, tetris, and Covid-19 tests. But in truth the absurdity is a deep dive into the nature of how data can be stored and encoded. I think it should encourage people to pursue knowledge for the sake of knowledge and how there are frequently deep insights into seemingly dumb questions, as long as you dig deep enough.

  We do these things not because they are hard, but because they are harder drives!

LoganDark

0 replies

9h49m

2024-06-24 08:40:10 UTC

On his website he has an academic paper about red i removal. Not red eye removal. Red i. As in he superimposes a big red Comic Sans letter `i` over an image and then tries different techniques to remove it again...

vient

0 replies

8h23m

2024-06-24 10:06:47 UTC

And, as expected, the very first reference in the post is to the Tom7's video. Of course it would be.

nextaccountic

0 replies

20h9m

2024-06-23 22:20:48 UTC

Also Paralogical https://youtube.com/@paralogical-dev

pininja

3 replies

1d3h

2024-06-23 14:41:40 UTC

This reminds me of Posy. His channel is so fun, weird, and captivating. https://youtube.com/@posymusic

hawski

1 replies

23h25m

2024-06-23 19:04:11 UTC

Watching his videos is often very calming for me at the same time. Visually striking beauty in simple things, calm narration and pleasing music. I also recommend his Lazy channel.

pininja

0 replies

22h24m

2024-06-23 20:05:18 UTC

Watching “Thing on my carpet” is like witnessing curiosity in its purest form

xrguy

0 replies

3h14m

2024-06-24 15:15:35 UTC

he is like david attenborough of tech

haunter

1 replies

1d1h

2024-06-23 16:36:59 UTC

Adult Swim's Off the Air

https://www.adultswim.com/videos/off-the-air

or on Youtube https://www.youtube.com/playlist?list=PLQl8zBB7bPvLWfGCVicg_...

DANmode

0 replies

12h27m

2024-06-24 06:02:03 UTC

If you're into dissociative media at the moment.

surfingdino

0 replies

19h54m

2024-06-23 22:35:19 UTC

So, basically TechMoan /s

fimdomeio

0 replies

9h11m

2024-06-24 09:18:53 UTC

Depending on your kind of weird this might interest you This Exists on youtube https://youtube.com/@thisexists/videos

amelius

0 replies

2024-06-23 17:33:36 UTC

Didn't Slashdot try this?

polshaw

10 replies

1d3h

2024-06-23 14:44:18 UTC

This is cool. A practical issue, though (aside from the 280 GB TTF file!), is that it makes the text incompatible with all other fonts: if you copy and paste your "improved" text, it will no longer say what you thought it did. It just alters the presentation, not the content. I guess you would have to OCR it to get the content as you see it.

I was wondering why this was never used for a simpler autocorrect, but I guess that's why.

Also perhaps someone more educated on LLMs could tell me; this wouldn't always be consistent right? Like "once upon a time _____" wouldn't always output the same thing, yes? If so even copying and pasting in your own system using the correct font could change the content.

magnat

5 replies

1d2h

2024-06-23 15:54:16 UTC

> if you copy and paste your "improved" text then it will no longer say what you thought it did

It's not a bug, it's a feature - a DRM. Your content can now be consumed, but cannot be copied or modified - all without external tools, as long as you embed that TTF somehow.

Which kind of reminds me of the PDF invoices I got from my electricity provider. They looked and printed perfectly fine, but used a weird codepoint mapping which resulted in complete garbage when trying to copy any text from them. Fun times, especially when pasting an account number into a banking app.

yjftsjthsd-h

2 replies

14h8m

2024-06-24 04:21:36 UTC

Eh, what AI taketh, AI can give; modern OCR has gotten mostly decent. If you're on Windows you should try the powertools OCR tool.

skissane

0 replies

9h55m

2024-06-24 08:34:38 UTC

> If you're on Windows you should try the powertools OCR tool.

Which is open source (MIT-licensed), the source code is here: https://github.com/microsoft/PowerToys/tree/main/src/modules...

It is written in C#, and uses the Windows.Media.Ocr UWP API to do the actual OCR part: https://learn.microsoft.com/en-us/uwp/api/windows.media.ocr?... – so if your app runs on Windows it can potentially call the same API and get OCR for free

Apple provides OCR through VisionKit ImageAnalyzer API – https://developer.apple.com/documentation/visionkit/imageana... – albeit that is only officially supported to call from Swift (although apparently you can expose it to Objective C if you write a "proxy Swift framework"–a custom Swift framework that wraps the original and adds @objc everywhere–I assume such a proxy framework could be autogenerated using reflection, but I'm not sure if anyone has written a tool that actually does that). There is also the older VNRecognizeTextRequest API which is supported by Objective C, but its OCR quality is inferior.

I'm not sure what the best answer for Linux or Android is. I guess https://github.com/tesseract-ocr/tesseract ?

OmegaMetor

0 replies

6h19m

2024-06-24 12:10:15 UTC

A very similar thing is also just built in to the screenshot tool, at least in Windows 11, easier for me to use since it's the same keybind as always to take a screenshot, then it's just a tool in it.

mbb70

1 replies

2024-06-23 17:57:10 UTC

This is why pretty much all software that extracts structured data from PDFs throws away the text and just OCRs the page. Too many tricks with layouts and fonts.

knallfrosch

0 replies

6h4m

2024-06-24 12:25:01 UTC

I'm always surprised how "generate PDF from Word" turns one word into 10 different print points, all with just a single letter.

Or even straight lines in a table. The straight lines from a table boundary get hacked into pieces. You'd think one line would be the ideal presentation for a line, but who are you to judge PDF?

Retr0id

1 replies

1d3h

2024-06-23 14:49:53 UTC

If there's any randomness involved in inference, it ought to be deterministic as long as the same seed is used each time.

furyofantares

0 replies

19h46m

2024-06-23 22:43:51 UTC

Is there even any possibility of using a different seed? I'd doubt the WASM shaper has access to any source of non-determinism.

nacs

0 replies

13h35m

2024-06-24 04:54:28 UTC

The small model/TTF is only 60MB.

The 280GB you saw is the Llama3-70B model which is basically chatgpt level (if not better).

NoobSaibot135

0 replies

3h30m

2024-06-24 14:59:08 UTC

> this wouldn't always be consistent right? Like "once upon a time _____" wouldn't always output the same thing, yes?

Would be cool if you could turn up/down the LLM’s temperature by pressing different keys other than just !!!!

Say, pressing the keyboard number keys 0-9

xg15

8 replies

1d3h

2024-06-23 14:55:11 UTC

> The font shaping engine Harfbuzz, used in applications such as Firefox and Chrome, comes with a Wasm shaper allowing arbitrary code to be used to "shape" text.

Has there already been a proposal to add scripting functionality to Unicode itself? Seems to me we're not very far from that anymore...

crazygringo

2 replies

1d1h

2024-06-23 17:09:41 UTC

To Unicode? Good god please no. Unicode is just codepoints. I shudder to think what adding scripting support to that would even mean.

Maybe you meant adding it to OpenType?

xg15

1 replies

22h19m

2024-06-23 20:09:56 UTC

I was being sarcastic, but yes, I meant unicode...

crazygringo

0 replies

22h14m

2024-06-23 20:15:20 UTC

Sometimes you just can't tell, you know... OK, my sanity is restored, thanks. :)

winternewt

0 replies

2024-06-23 17:30:38 UTC

You mean encoding executable code in plain text files, that execute when you open them? No, that seems unnecessary and very insecure.

reportgunner

0 replies

6h5m

2024-06-24 12:24:44 UTC

That sounds disgusting.

oopsallmagic

0 replies

18h30m

2024-06-23 23:59:13 UTC

No, because Unicode doesn't concern itself with rendering, it's just for codepoints.

magicalhippo

0 replies

1d3h

2024-06-23 15:00:31 UTC

Unicode OS when?

DemocracyFTW2

0 replies

1d2h

2024-06-23 15:48:21 UTC

Considering the actual complexity of rendering e.g. Urdu in decent, native-looking way you presumably do want some Turing-complete capabilities at least in some cases, cf "One handwritten Urdu newspaper, The Musalman, is still published daily in Chennai.[232] InPage, a widely used desktop publishing tool for Urdu, has over 20,000 ligatures in its Nastaʿliq computer fonts." (https://en.wikipedia.org/wiki/Urdu#Writing_system)

Edit—the OP uses this exact use case, Urdu typesetting, to justify WASM in Harfbuzz (video around 6:00); seems like Urdu has really become the posterchild for typographic complexity these days

simonw

7 replies

1d3h

2024-06-23 14:59:26 UTC

> The font shaping engine Harfbuzz, used in applications such as Firefox and Chrome, comes with a Wasm shaper allowing arbitrary code to be used to "shape" text.

In that case could you ship a live demo of this that's a web page with the font embedded in the page as a web font, such that Chrome and Firefox users can try it out without installing anything else?

binwiederhier

3 replies

1d3h

2024-06-23 15:15:57 UTC

In the video he shows that the font file size is 290GB, so I would assume that's a little prohibitive.

codezero

1 replies

1d3h

2024-06-23 15:21:32 UTC

That’s only for a 70B param LLM. The one he includes is 15M params and weighs about 60MB. Not tiny, but doable.
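Both figures line up with roughly 4 bytes per parameter. That is an assumption here (32-bit float weights), not something stated in the thread, but the arithmetic is easy to check:

```python
def model_size_bytes(n_params, bytes_per_param=4):
    """Approximate weight storage, assuming a fixed number of bytes per parameter."""
    return n_params * bytes_per_param


small = model_size_bytes(15_000_000)        # the 15M-parameter model
large = model_size_bytes(70_000_000_000)    # the 70B-parameter model

print(small / 1e6, "MB")   # 60.0 MB
print(large / 1e9, "GB")   # 280.0 GB
```

At 2 bytes per parameter the same models would come out at about 30 MB and 140 GB, so the on-disk size depends heavily on the weight format.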

bandrami

0 replies

14h12m

2024-06-24 04:17:26 UTC

That's smaller than Noto

azeirah

0 replies

1d3h

2024-06-23 15:25:40 UTC

That's LLaMa-3-70B. The demo he gives at 6:09 is tinystories-15m, which is 30.4MB, so you'd only have to add the font to that (80~KB?)

https://huggingface.co/nickypro/tinyllama-15M/tree/main

chazeon

1 replies

1d3h

2024-06-23 15:14:01 UTC

As shown in the video, the font is 280 GB, so opening such a page will practically be a nightmare, especially if you are on cellular.

yreg

0 replies

2h31m

2024-06-24 15:58:19 UTC

The font is 60MB.

> Usage: Just download llama.ttf (60 MB download, since it's based on the 15M parameter TinyStories-based model demoed above) and use it like you would any other font.

erk__

0 replies

23h55m

2024-06-23 18:33:54 UTC

The wasm shaper is an experimental feature that is not enabled in any browser at the moment.

closetkantian

7 replies

1d3h

2024-06-23 14:34:48 UTC

This is really cool, but I'm left with a lot of questions. Why does the font always generate the same string to replace the exclamation points as he moves from gedit to gimp? Shouldn't the LLM be creating a new "inference"?

As an aside, I originally thought this was going to generate a new font "style" that matched the text. So for example, "once upon a time" would look like a storybook style font or if you wrote something computer science-related, it would look like a tech manual font. I wonder if that's possible.

closetkantian

6 replies

1d3h

2024-06-23 14:58:37 UTC

So, another poster cleared up my first question. It's probably because the seed is the same. I think it would have been a better demo if it hadn't been, though.

thomasfromcdnjs

3 replies

1d1h

2024-06-23 16:33:45 UTC

But having the same "seed" doesn't guarantee the same response from an LLM, hence the question above.

wavemode

1 replies

2024-06-23 17:37:28 UTC

I fail to understand how an LLM could produce two different responses from the same seed. Same seed implies all random numbers generated will be the same. So where is the source of nondeterminism?

furyofantares

0 replies

2024-06-23 18:10:46 UTC

I believe people are confused because ChatGPT's API exposes a seed parameter which is not guaranteed to be deterministic.

But that's due to the possibility of model configuration changes on the service end and not relevant here.

dragonwriter

0 replies

23h40m

2024-06-23 18:48:58 UTC

Barring subtle incompatibilities in underlying implementations on different environments, it does, assuming all other generation settings (temperature, etc.) are held constant.
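A minimal sketch of why that holds, using fixed toy logits in place of a real model (the `sample_token` and `generate` helpers are hypothetical, not taken from llama.ttf): with the same seed and settings the sampler draws the same random numbers, and at temperature 0 there is no randomness at all.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from logits.

    temperature == 0 means greedy decoding (argmax); otherwise sample from
    the softmax of logits / temperature using the supplied RNG.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(seed, temperature, steps=8):
    """Generate a short token sequence from fixed toy logits."""
    rng = random.Random(seed)
    logits = [2.0, 1.5, 0.3, -1.0]  # stand-in for a model's output
    return [sample_token(logits, temperature, rng) for _ in range(steps)]
```

`generate(42, 1.0)` returns the same list every time it is called, and `generate(seed, 0)` ignores the seed entirely because argmax never consults the RNG.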

fuglede_

1 replies

2024-06-23 17:54:02 UTC

You got it, same seed in practice, but also just temperature = 0 for the demo actually. A few things I considered adding for the fun of it were 1) a way to specify a seed in the input text, 2) a way to use a symbol to say "I didn't like that token, try to generate another one", so you could do, say, "!" to generate tokens, "?" to replace the last generated token. So you would end up typing things like

"Once upon a time!!!!!!!!!!!!!!!!!!!!!!!!!!!!!SEED42!!!!!??!!!??!"

and 3) actually just allow you to override the suggestions by typing letters of your own, to be used in future inferences. At that point it'd be a fairly generic auto-complete kind of thing.

jameshart

0 replies

23h24m

2024-06-23 19:05:08 UTC

Using the input characters to affect the token selection would increase the ‘magic’ a little.

As it is, if you go back into a string of !!!!!!!!!! that has been turned into ‘upon a time’, and try to delete the ‘a’, you’ll just be deleting an ! and the string will turn into ‘once upon a tim’.

If you could just keyboard mash to pass entropy to the token sampler, deleting a specific character would alter the generation from that point onwards.
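A toy model of the current behaviour, assuming for simplicity that each "!" reveals one character of a fixed continuation (the real font works on model tokens; `toy_shape` is a made-up helper):

```python
def toy_shape(text, continuation):
    """Replace the trailing run of '!' with the first characters of continuation.

    continuation stands in for what the model would generate from the prefix;
    in this sketch it is just a fixed string.
    """
    prefix = text.rstrip("!")
    n_bangs = len(text) - len(prefix)
    return prefix + continuation[:n_bangs]


story = " upon a time"
full = toy_shape("Once" + "!" * 12, story)      # "Once upon a time"
shorter = toy_shape("Once" + "!" * 11, story)   # "Once upon a tim"
```

Only the number of "!" characters matters, so deleting any one of them just trims the end of the generated text, wherever the cursor was.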

stgiga

6 replies

15h9m

2024-06-24 03:20:29 UTC

WebAssembly in fonts doesn't sound very secure, coming from someone who is certified in cybersecurity and has spent years doing font stuff.

ohmyiv

4 replies

14h55m

2024-06-24 03:34:06 UTC

Yes, that's the general consensus in the comments. It doesn't even sound safe to me and I'm not a full security pro. But OP did it as a PoC/for fun. It's okay to have fun still.

lewispollard

3 replies

5h38m

2024-06-24 12:51:47 UTC

It's not what OP did that isn't safe, it's the mechanism that he used in HarfBuzz.

ohmyiv

2 replies

2h24m

2024-06-24 16:05:46 UTC

Sorry for not disclosing everything that could go wrong, but you seemed to have missed my point while trying to be exact.

lewispollard

1 replies

1h37m

2024-06-24 16:51:54 UTC

Again, it's not that anything the OP did is unsafe or could go wrong.

ohmyiv

0 replies

44m

2024-06-24 17:45:43 UTC

Again, thanks for missing my point.

lifthrasiir

0 replies

14h59m

2024-06-24 03:30:07 UTC

But probably much better than custom VM like TrueType bytecodes or embedded PostScript...

geor9e

5 replies

2024-06-23 18:23:22 UTC

> build Harfbuzz with -Dwasm=enabled and build wasm-micro-runtime, then add the resulting shared libraries, libharfbuzz.so.0.60811.0 and libiwasm.so to the LD_PRELOAD environment variable before running a Harfbuzz-based application such as gedit or GIMP

It'd be lovely if someone embedded the font in a website form to save us all the trouble of demoing it

erk__

4 replies

2024-06-23 18:27:36 UTC

It would not be of much use as no browser enables this experimental feature. So unless you somehow build a wasm build of Harfbuzz with the feature enabled and embed it on there nothing will happen.

choppaface

1 replies

22h42m

2024-06-23 19:47:14 UTC

And thank goodness it’s disabled, or we could have another JBIG2 https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...

pennomi

0 replies

20h5m

2024-06-23 22:24:28 UTC

Yeah I know these posts are all funny use cases, but all I can see are font-based security nightmares.

curtisf

0 replies

14h38m

2024-06-24 03:51:50 UTC

Are there _any_ generally available consumer applications (document viewers, printers, obscure browsers, ...) that use a TTF font renderer with the WASM feature enabled?

Lockal

0 replies

14h36m

2024-06-24 03:53:29 UTC

According to demo, this feature has no opt-in, so if Android/iOS/any Linux distro ships with "better fonts feature for LibreOffice" it will be enabled in every text editor/browser/electron app, up to systemd blue screen of death.

tcsenpai

4 replies

1d2h

2024-06-23 16:14:10 UTC

I may be doing this wrong but... the font provided just installs as OpenSans and does not provide any functionality, at least in mousepad or LibreOffice Writer. I am talking about the 90mb one.

fuglede_

3 replies

2024-06-23 18:00:02 UTC

Yeah, sorry, that could have been clearer, I added a few more instructions. Basically, chances are that even if you've got Harfbuzz running, you're still running a version with no Wasm runtime. If so, chances are you can get away with building it with Wasm support, then add the built library to LD_PRELOAD before running the editor.
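As a sketch of the LD_PRELOAD step only, the environment could be assembled like this before launching the editor. The directory paths are placeholders (hypothetical), and the library file names are the ones mentioned in the instructions; yours may differ depending on the versions you built.

```python
import os


def preload_env(libs, base_env=None):
    """Return a copy of the environment with libs prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    parts = list(libs) + ([existing] if existing else [])
    env["LD_PRELOAD"] = ":".join(parts)
    return env


# Placeholder locations: adjust to wherever your own builds ended up.
LIBS = [
    "/path/to/harfbuzz/build/src/libharfbuzz.so.0.60811.0",
    "/path/to/wasm-micro-runtime/build/libiwasm.so",
]

env = preload_env(LIBS, base_env={"PATH": "/usr/bin"})
```

Passing the result of `preload_env(LIBS)` as the `env` argument of `subprocess.run(["gedit"], env=...)` would start the editor with the rebuilt libraries loaded first. Whether the font then does anything still depends on the build actually having Wasm support.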

tcsenpai

2 replies

23h42m

2024-06-23 18:47:30 UTC

That was useful. I have indeed compiled and installed wasm-micro and now meson builds it successfully. Though "meson compile -C build" returns an error about not finding "hb-wasm-api-list.hh". Do you have any experience of that?

EDIT: Nevermind. Using the exact commits you linked gives another error (undefined reference to wasm_externref_ref2obj). I give up

fuglede_

1 replies

22h54m

2024-06-23 19:35:52 UTC

Another font connoisseur put together a script here that might be helpful: https://github.com/hsfzxjy/Harfbuzz-WASM-Fantasy/blob/master...

tcsenpai

0 replies

21h41m

2024-06-23 20:48:50 UTC

Managed to build it using "-DWAMR_BUILD_REF_TYPES=1" And it works! Now is time to dive deep :)

jonathaneunice

4 replies

1d3h

2024-06-23 14:42:25 UTC

I never imagined a future in which PDFs talked back. Now I can.

bandrami

1 replies

14h13m

2024-06-24 04:15:55 UTC

PostScript is Turing complete and your PDF reader is a PostScript interpreter. So, yeah, potentially any PDF is instructions to a general-purpose computer.

layer8

0 replies

12h36m

2024-06-24 05:52:57 UTC

PostScript is deprecated in PDF though.

anthk

1 replies

22h2m

2024-06-23 20:27:11 UTC

PostScript files are dynamic code. You can create polygons dynamically with commands. And, of course, font FX's, styles, ellipses...

Also, there's a ZMachine interpreter (text adventure player) written in PostScript which can play Zork and some libre games such as Calypso with just GhostScript, the PostScript interpreter most software uses to render PostScript files.

fourthark

0 replies

12h50m

2024-06-24 05:39:45 UTC

llama.pdf when?

bastien2

3 replies

22h33m

2024-06-23 19:56:36 UTC

Well this definitely won't get exploited at all or lead to new strict limits on what Harfbuzz/WASM can do

lxgr

2 replies

22h15m

2024-06-23 20:14:15 UTC

WASM sandboxing is pretty good! Together with the presumably very limited API with which this can communicate with the outside world, I wouldn't be too concerned.

To me, it's a great reminder that the line between well-sandboxed turing-complete execution environments and messy implementations of decoders for "purely declarative" data formats can be quite blurry.

Said differently, I'd probably trust Harfbuzz/WASM more than the average obscure codec implementation in ffmpeg.

pbmahol

1 replies

11h52m

2024-06-24 06:37:50 UTC

Is there scientific proof of above claim such as "WASM sandboxing is pretty good!" ?

At least most if not all ffmpeg decoders and demuxers are fuzzed all the time and any found issue is addressed.

lxgr

0 replies

4h26m

2024-06-24 14:03:42 UTC

Fuzzing is good, robust sandboxing is better, I'd argue. There's just a much smaller surface area to cover for the latter.

> Is there scientific proof of above claim such as "WASM sandboxing is pretty good!" ?

I'm not aware of quantitative studies, but just from a design perspective, the surface that a WASM runtime presents seems intrinsically easier to defend than that of, say, the full Unix userspace that ffmpeg instances usually run in.

Anecdotally, many high-profile iOS and Android vulnerabilities originated in some more or less obscure codec implementation.

amai

3 replies

10h41m

2024-06-24 07:48:50 UTC

Does this mean fonts are Turing complete nowadays? Sounds like a pretty bad idea for security.

exDM69

1 replies

7h58m

2024-06-24 10:31:10 UTC

TrueType fonts have had a Turing complete virtual machine (almost?) since the beginning. It is used for "hinting" to allow partially colored pixels at low resolutions to remain legible. It's basically a program that decides whether to color a pixel or not to allow fine tuning of low resolution rasterization.

This isn't used as much today with modern high resolutions, where we can get decent image quality from just rasterizing the font outline with anti-aliasing.

This example, however, is using WASM embedded in TTF fonts, which is not the same as TTF hinting bytecode.

amai

0 replies

3h18m

2024-06-24 15:11:41 UTC

> TrueType fonts have had a Turing complete virtual machine (almost?) since the beginning. It is used for "hinting" to allow partially colored pixels at low resolutions to remain legible. It's basically a program that decides whether to color a pixel or not to allow fine tuning of low resolution rasterization.

That sounds like an awful idea, too. I think a font file should describe the font's form, but it should not describe how it is gonna be rendered. That should be up to the render engine of the device that is going to display the font (printer driver, monitor driver...). But I guess this idea is from a time when people were still using bitmap fonts.

michaelt

0 replies

9h36m

2024-06-24 08:53:41 UTC

Apparently the font can only embed WASM, which is sandboxed so it can't do anything except turn a buffer of codepoints into glyphs and position them.
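To make "turning a buffer of codepoints into glyphs and positioning them" concrete, here is a minimal Python sketch of what a shaper does. It is not HarfBuzz's API: the names `GlyphInfo`, `CMAP`, `ADVANCES`, `KERNING` and `shape` are hypothetical, and the glyph IDs and widths are made-up values for illustration only.

```python
from dataclasses import dataclass


@dataclass
class GlyphInfo:
    glyph_id: int   # index of the outline to draw
    x_advance: int  # how far the pen moves after this glyph (font units)


# Hypothetical font data: codepoint -> glyph ID, glyph ID -> advance width.
CMAP = {ord("A"): 1, ord("V"): 2, ord("!"): 3}
ADVANCES = {0: 500, 1: 650, 2: 640, 3: 280}  # glyph 0 is the "missing glyph" box
KERNING = {(1, 2): -80}  # tighten the gap between "A" and a following "V"


def shape(codepoints):
    """Map codepoints to glyphs, then adjust positions pairwise."""
    glyphs = []
    for cp in codepoints:
        gid = CMAP.get(cp, 0)  # unknown characters fall back to glyph 0
        glyphs.append(GlyphInfo(gid, ADVANCES[gid]))
    # Positioning pass: apply kerning to each adjacent pair.
    for left, right in zip(glyphs, glyphs[1:]):
        left.x_advance += KERNING.get((left.glyph_id, right.glyph_id), 0)
    return glyphs


shaped = shape([ord(c) for c in "AV!"])
```

The only thing that comes out is a list of glyph IDs and advances, which is the sense in which the shaper's output is limited to what gets drawn.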

Of course, back in the 1990s Java and Flash were supposed to be sandboxed. So who knows?

LeonigMig

3 replies

1d4h

2024-06-23 14:11:51 UTC

this is over my head

polshaw

2 replies

1d3h

2024-06-23 14:33:07 UTC

The critical part is knowing that TTF fonts can include a virtual machine. Then he pops an LLM into that and replaces instances of !!!!!! with whatever the LLM outputs.
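The substitution described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_model` is a hypothetical stand-in for real inference, the trigger is assumed to be exactly six exclamation marks, and the text before the trigger is assumed to be the prompt. How llama.ttf actually maps tokens to glyphs is not shown.

```python
TRIGGER = "!!!!!!"  # assumed trigger sequence, taken from the description above


def fake_model(prompt: str) -> str:
    # Hypothetical placeholder: a real model would continue the prompt.
    return " [generated continuation of %d chars]" % len(prompt)


def displayed_text(stored: str) -> str:
    """Return what would be drawn; the stored string itself is never modified."""
    out = []
    pos = 0
    while True:
        hit = stored.find(TRIGGER, pos)
        if hit == -1:
            out.append(stored[pos:])
            break
        out.append(stored[pos:hit])
        # Everything before the trigger serves as the prompt.
        out.append(fake_model(stored[:hit]))
        pos = hit + len(TRIGGER)
    return "".join(out)


stored = "Once upon a time!!!!!!"
shown = displayed_text(stored)
```

Note that `stored` still contains the exclamation marks afterwards. Only the rendered form changes, which is why copying the text does not copy the generated output.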

oopsallmagic

0 replies

18h26m

2024-06-24 00:02:55 UTC

Not exactly. Harfbuzz, the font shaping library, has an optional feature to use WASM for shaping. Normal font hinting is much more restricted, precisely because Turing-complete fonts are a horrible idea.

abecedarius

0 replies

1d3h

2024-06-23 14:45:04 UTC

Thank you. I wasn't going to watch a video to find out how the LLM actually affects any output.

wiradikusuma

2 replies

1d3h

2024-06-23 14:37:57 UTC

So how do you copy the output?

simonw

0 replies

1d3h

2024-06-23 15:00:42 UTC

Screenshot and OCR!

phaym

0 replies

1d3h

2024-06-23 15:02:27 UTC

Since it only alters the presentation of the text, not the text/data itself, maybe using a type of image-to-text tool like this could work: https://www.imagetotext.info/

I guess that’s the closest you get to copying.

hsfzxjy

2 replies

1d4h

2024-06-23 14:06:11 UTC

cool. is there a github repo to produce this thing?

skilled

1 replies

1d3h

2024-06-23 14:47:49 UTC

https://github.com/fuglede/llama.ttf

simonw

0 replies

1d3h

2024-06-23 15:03:43 UTC

I love that the "Why?" section is deliberately left blank.

rhyjyrtjhtyn

1 replies

1d2h

2024-06-23 15:32:19 UTC

The author categorizes this as "pointless", but some uses I can think of are creating automated workflows within an app that didn't previously allow them (or had limited scope), and then building interoperability with other apps using the same method.

ComputerGuru

0 replies

1d2h

2024-06-23 15:41:22 UTC

You mean via WASM hinting in general or an embedded LLM specifically? Because I don’t see why you need an LLM for that.

freitasm

1 replies

22h12m

2024-06-23 20:17:46 UTC

Stopped watching when the demo showed the letter O with a slash. That would confuse me a lot. I am an old timer and expect the zero to have it.

samatman

0 replies

2h48m

2024-06-24 15:41:53 UTC

It's not possible to write the letter Ø without a slash. The slash is part of the letter.

bitwize

1 replies

1d3h

2024-06-23 15:03:40 UTC

> The font shaping engine Harfbuzz, used in applications such as Firefox and Chrome, comes with a Wasm shaper allowing arbitrary code to be used to "shape" text.

Oh, this can't be used for nefarious purposes. What could POSSIBLY go wrong?!

oopsallmagic

0 replies

18h28m

2024-06-24 00:01:53 UTC

The engine isn't built with it by default, so this is a non-issue.

Xlythe

1 replies

1d3h

2024-06-23 15:03:38 UTC

It seems like it'd be possible to, instead of typing multiple exclamation points, have one trigger-character (eg. ). And then replace that character visually with an entire paragraph of text, assuming there aren't limits to the width of a character in fonts. I suppose the cursor and text wrapping would go wonky, though.

You could also use this to make animated fonts. An excuse to hook up a diffusion model next?

Lockal

0 replies

14h41m

2024-06-24 03:48:02 UTC

"animated fonts" - not really; all meaningful applications not only calculate shaping once, they also aggressively cache the result (mentioned in https://robert.ocallahan.org/2024/06/browser-engine.html)

But things like this might be possible (for now): https://gwern.net/dropcap
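The caching point can be illustrated with plain memoization. This is a sketch, not how any browser implements its shaping cache: `shape_run` is a hypothetical shaper that tries to "animate" by returning a different result on every call, and `functools.lru_cache` stands in for the application's cache.

```python
from functools import lru_cache

shape_calls = 0  # counts how often the shaper really runs


@lru_cache(maxsize=None)
def shape_run(text: str, font_name: str) -> tuple:
    """Hypothetical shaper whose output depends on how often it has run."""
    global shape_calls
    shape_calls += 1
    frame = shape_calls  # an "animation frame" the font would like to advance
    return tuple((ord(ch), frame) for ch in text)


first = shape_run("hi", "llama.ttf")
second = shape_run("hi", "llama.ttf")  # served from the cache, shaper not re-run
```

Because the second request for the same text and font never reaches the shaper, the frame counter does not advance and the displayed glyphs stay frozen.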

UncleOxidant

1 replies

1h50m

2024-06-24 16:38:57 UTC

Can someone explain why HarfBuzz isn't a potentially serious security vulnerability? Couldn't someone create a .ttf file that looks like one of the standard .ttf files but includes similar capability to this llama.ttf to execute arbitrary code?

progbits

0 replies

1h37m

2024-06-24 16:52:32 UTC

https://webassembly.org/docs/security/

zharknado

0 replies

22h21m

2024-06-23 20:08:28 UTC

My takeaway is that if you can efficiently simulate rendering raster graphics with text ligatures, you could run Doom in a TTF.

Right?

yourfriendpalsy

0 replies

1d3h

2024-06-23 14:57:45 UTC

Interesting idea, but needs to be ported to the Typescript type system.

tcsenpai

0 replies

20h42m

2024-06-23 21:47:20 UTC

After your help and troubleshooting, I am happy to notify you that your work has been archived (https://archive.tunnelsenpai.win/archive/1719179042.512455/i... and in the Internet Archive). Thanks!

ranger_danger

0 replies

22h39m

2024-06-23 19:50:11 UTC

This is terrifying.

pk-protect-ai

0 replies

1d2h

2024-06-23 16:04:22 UTC

I will never allow my linux to update my fonts ever again ... Arbitrary code execution in its finest form.

petters

0 replies

22h17m

2024-06-23 20:12:31 UTC

The page links to https://www.coderelay.io/fontemon.html which is a game embedded into a font. Playable in the browser.

lacoolj

0 replies

1d3h

2024-06-23 15:19:06 UTC

Hello. I'm Dr. Sheldon Cooper. And welcome to Sheldon Cooper Presents: Fun with Fonts

kylehotchkiss

0 replies

17h26m

2024-06-24 01:02:56 UTC

Is this the AI hype cycle equivalent of in browser crypto mining? (once the file size goes down a little)

jraph

0 replies

23h7m

2024-06-23 19:22:19 UTC

(Show HN)

ilrwbwrkhv

0 replies

12h58m

2024-06-24 05:31:00 UTC

This is so so awesome! One of the best things I have seen so far this year.

fuglede_

0 replies

1d6h

2024-06-23 12:24:20 UTC

Very much inspired by this earlier Hacker News post which put Tetris into a font, today we put an LLM and an inference engine into a font so you can chat with your font, or write stuff with your font without having to write stuff with your font.

https://news.ycombinator.com/item?id=40737961

fitsumbelay

0 replies

4h21m

2024-06-24 14:08:05 UTC

excellence

exe34

0 replies

1d3h

2024-06-23 15:28:03 UTC

your engineers were so busy finding out if they could, they never stopped to ask if they should!

est

0 replies

16h42m

2024-06-24 01:47:04 UTC

first time I've heard of harfbuzz.

So we could expect latex.ttf very soon?

bbor

0 replies

20h25m

2024-06-23 22:04:52 UTC

Wow, this is incredible. OP you (I?) should train a few models with different personalities/tasks and pair them with the 5 GitHub Monaspace fonts accordingly, allowing people in multifont programs to easily get different kinds of help in different situations. Lots of little ideas sparked by this… in general, I think this is a good reminder that we are vastly underestimating fonts in discussions of UI (and, it appears, UX in full!)

anthk

0 replies

22h11m

2024-06-23 20:18:38 UTC

A Z Machine in a TTF font, anyone?

UncleOxidant

0 replies

2h0m

2024-06-24 16:29:03 UTC

about 1/3 way through the video and I'm getting the impression this is an elaborate joke.

NayamAmarshe

0 replies

2024-06-23 18:12:59 UTC

This is the coolest thing I've seen this week.

Dwedit

0 replies

2024-06-23 17:53:24 UTC

I thought the Bad Apple font was really neat, but this is just too much.