One of the reasons is to make the papers more accessible to people with disabilities, especially the blind. I participated in a conference they hosted on this a few months ago; I recommend taking a look at the recordings if you're interested in the topic.
Blind person here, can confirm this. Reading PDFs with a screen reader is bad, reading PDFs that come from LaTeX is worse, reading LaTeX math is pretty much impossible. All the semantic info you need is just thrown away.
You can make decently accessible PDFs, but it's a lot of work: you need Acrobat on the producer's side and might also need it on the consumer's side. Free tools don't even come close. There's also the fact that the process of making accessible PDFs in Acrobat isn't itself accessible.
With that said, the way screen readers treat HTML math certainly isn't perfect, it's geared more towards school children than anything above calculus. I'm probably going to stay with my LaTeX source files for now. At least ArXiv offers those, not many sites do. To be fair, that approach also has its own set of problems (particularly when people use some extra fancy formatting in their math equations, making the markup hard to read), but I find this to be the best approach for me so far, at least on AI/ML papers.
Huh. It would seem like, of all the things which should make it easy to generate the correct accessibility information, the pipeline of compiling a paper from source code in LaTeX should nail it... maybe we should all pitch in to some pool to pay someone to put in the required effort to connect all the dots?
Kind of tangential, but it's also kind of surprising how difficult it is in LaTeX to make a plot of an equation.
Say I have Equation \ref{eq}. Why can't I just say "plot \ref{eq} for x from -6 to 11" and get my graph?
And yes, I know about pgfplots, PSTricks, TikZ etc. But in all those cases, I need to define the same equation twice, in different syntax to boot. It's kind of unsatisfying.
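To illustrate the duplication, here's a minimal pgfplots sketch. The equation y = x^2 - 3x is made up for the example, and the compat level is an assumption about your installed pgfplots version. Note that the formula is written once in math markup and then again in pgfplots' expression syntax:

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16} % assumption: pgfplots 1.16 or newer is installed

\begin{document}

% First copy of the formula: typeset math, purely visual markup.
\begin{equation}\label{eq}
  y = x^2 - 3x
\end{equation}

% Second copy of the same formula: pgfplots expression syntax
% (explicit * for multiplication, no reference to \ref{eq} possible).
\begin{tikzpicture}
  \begin{axis}[xlabel={$x$}, ylabel={$y$}]
    \addplot[domain=-6:11, samples=100] {x^2 - 3*x};
  \end{axis}
\end{tikzpicture}

\end{document}
```

If the equation changes, both copies have to be edited by hand, which is exactly the annoying part.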
TeX is a very arcane language, and it doesn't support floating point numbers. Few languages would be less suited for making a plotting library.
pgfplots, PSTricks and TikZ are all plotting libraries. It seems like it shouldn't be that hard to let them plot an equation written elsewhere in a different syntax.
Pretty much for the same reason you cannot press a word and get a pop-up dictionary definition in a paper book.
To be clear, I meant in the LaTeX source code. And there I can already write code that plots equations, I just have to re-type the equation in a new syntax.
Surprisingly it’s not easy, and depending on the field it can be quite challenging. The reason for this is that TeX captures the visual aspects of typesetting, not the semantic meaning of the mathematics.
A simple example is ‘\sum’ which provides no way to capture the expression being summed over - because that’s not necessary for typesetting. That’s not the case in, say, MathML.
Writing MathML is no fun though because mathematical formulae are visually ambiguous and we rely on the context to know how to read them, e.g. does ‘f(x - 1)’ mean function f called with argument x - 1, or does it mean variable f multiplied by x - 1?
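A small sketch of what that looks like in practice (the symbols a_i, b, f and x are just placeholders). Every line below is fine as far as TeX is concerned, and nothing in the markup records which reading the author meant:

```latex
% \sum takes limits but no "body" argument, so its scope is never stated.
\[ \sum_{i=1}^{n} a_i + b \]    % is b inside the sum or added afterwards?
\[ \sum_{i=1}^{n} (a_i + b) \]  % parentheses help the eye, but they are only glyphs

% Function application and multiplication are typeset identically.
\[ f(x - 1) \]                  % f applied to x - 1, or f times (x - 1)?
```

A semantic format has to commit to one reading. In Content MathML, for example, the summand is an explicit child of the element that applies the sum, so a screen reader or converter does not have to guess.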
Hold on... Are you telling me that all these complex sentences are being typed out based on your voice alone? That's insane.
There are braille keyboards too
Or normal keyboards? Many people can type blind. Some learned to do so while born blind, others became blind after they had already learned this skill.
I would assume that the majority of persons on HN are not looking at their keyboard as they type.
I was just giving an additional way to use a computer not known by many. Either way, we shouldn't rely on the skills of a few to interact with a computer.
I'd say it would be simple to voice-type these using Windows 11's revamped voice typing. It's pretty damn accurate, and it's easy to modify text and options. I use it all the time to write tech/engineering blog posts; it's typically faster and more organic than typing, and it learns your tech acronyms. Combined with Voice Access, it makes it trivial to fully operate your computer (well, at least to browse the web, email, and media apps) from across the room. For anyone who hasn't tried the updated version, I highly suggest hitting Windows key + H and giving it a shot.
Hm tangential question but shouldn't touch typing be well accessible for many blind computer users?
? blind people can use keyboards
I wrote an app called PDF Reflow that reflows the original PDF using image processing to cut out words into tiles so you see the reflowed version of the text in their original look.
https://www.appblit.com/pdfreflow
Any chance of releasing an Android version?
+1
It’s using web technologies so yes it could also be on Android. I’ll see what can be done.
Gv (part of ghostscript) used to do a good job of this for two column documents. When zoomed in to show one column width of text, the spacebar ran through the top of column 1, then the bottom of column 1, then the top of column 2 and so on.
The amount it scrolled probably depended on the aspect ratio of the window, so it might be multiple key presses to scroll an entire column.
+1
I teach math at a university. A couple years ago I had two blind students in my section of first-year calculus, and I really struggled with the tooling. Using latexml, I could produce documents that one of the students could use with a screen reader, but the other student never managed to make it work on their machine. Both students prefer braille but I didn't find anything open source that could typeset mathematical braille easily. Our disability resource office sends things out to a contractor to typeset into braille; the turn-around is measured in weeks.
Anyway, if you (or anyone else reading this) have suggestions, I'd really appreciate it!
This seems a massive gap in the market - many institutions have funding earmarked for such things.
I wonder if this is a useful service that an llm could actually outperform humans on.
Interesting! I never thought about this, thank you for sharing.
What kind of turn-around time would be practical? Could you point me to any typeset mathematical braille that would be an example of a solution to your problem? Is Nemeth the only important standard, or are others important for you too?
I'm wondering if it's practical to set this up as back-office work here in Vietnam. There are some outlying provinces here where there are very few job opportunities. Job opportunities for the blind also round down to zero here (e.g. I could hire for proofreading). Maybe there's room to do something cool here.
What's English proficiency (and American braille code proficiency) like in Vietnam?
Keep in mind that most blind people who speak English fluently but don't live in an English-speaking country (myself included) can't read English braille, or at least not well. Because of how voluminous Braille is, it uses contractions, single characters that replace common words and character combinations like "the", "would", "ing" or "ed". Those tend to be language specific, never taught outside their country or countries of use, and hard to get accessible electronic materials for. The math codes are completely different too, we use something derived from Marburg, while English-speaking countries use Nemeth. Even basic characters like + and - differ between those two, not to mention more complicated structures. It's not just the dot patterns that are different but also the design principles, like where you put spaces or when you can omit "begin fraction" / "end fraction" characters.
I learned (the basics of) LaTeX in my last year of middle school, and stuck with it ever since. To be fair, I was into computers since I was a child, played with Rockbox at the age of 10, started to dabble in programming shortly after, so this was a lot less scary than most of the things I was doing already. I took my middle and high school finals (they're kind of like SAT but matter a lot more) by producing LaTeX output, which I then compiled to PDF and printed. The test itself was in braille, as that was all that our government could do.
Throughout college, my first question to most of my professors of math subjects was "do you do LaTeX, and can you give me your source code." Most said yes, and that's how we worked. LaTeX in, LaTeX or PDF out, depending on what the professor preferred.
The amount of LaTeX you need for calculus 1 isn't that great. You could probably teach it to a relatively bright student if you had an hour or two to spare, and then give them the source files. If you have the time, I'd suggest producing "stripped" versions of your files, with as little markup as possible to get your point across and no fancy formatting unless absolutely necessary. The number of hoops some books and papers jump through to "look nice" drives me crazy.
You could also consider producing, teaching and consuming ASCII math, which seems like an even simpler and friendlier format. I couldn't really use it much in my school career for boring technical reasons, but it looks like a promising option.
Do you think there's potential for language models to play a role here? I know that AI can get tossed around as a buzzword, but hasn't it proved quite successful in fields like computer vision?
I'm not deeply familiar with the state of that art, but it seems like recovering the metadata from a PDF generated by LaTeX would be no more impressive than many other things we're currently seeing language models achieve?
I'm absolutely positive a few million dollars could get you a system that can "read aloud" pdf math papers in no time. I guess people will wait for it to become cheaper though.
You can also have that cheaper already. But having it stable and reliable - will take some time and possibly more money, depending on your definition of reliable.
You wouldn't need to use computer vision on a picture of the PDF. arXiv has the tex source for most of the papers. An LLM trained on code could do a pretty good job of translating tex to readable html with a bit of effort.
Mathpix is trying to achieve something like this, and they do consider the visually impaired market AFAIK, but it's pretty expensive and I have no experience with it personally, so I can't say how good it is.
Two decades ago, when I was still in university, I argued that PDF is a horrible format because it's purely presentational, especially for people with disabilities whose software relies on semantic information. Last time I used LaTeX, it didn't even have a different symbol for uppercase Alpha and A because the glyphs are indistinguishable.
They argued that PDF was superior because the publisher could control how it looked and it looked the same everywhere, but the point is that it should not. Things such as font size and line spacing should be under the control of the consumer, not the publisher. This isn't just about blind people but also, for instance, people with dyslexia who use particular fonts to make reading easier. Or, in my case, someone who simply gets a headache from fonts and line spacing that are too big. I've also been using dark mode everywhere for so long now that reading black text on a white surface on a screen gives me a headache.
For scientific articles pagination is still important, because it's how you refer to a particular part of a paper. If things like font size and line spacing are under the control of the consumer, pagination is not preserved.
This problem is harder than one would naively think.
Seems like they should use detailed section numbering like military documents and laws. Referring by page number seems very coarse by comparison.
To write uppercase Alpha you need a modern LaTeX engine (i.e. XeLaTeX or LuaLaTeX) and to load the unicode-math package.
https://tex.stackexchange.com/questions/485593/how-to-write-...
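A minimal sketch of that setup, assuming the default math font that unicode-math loads (Latin Modern Math) and that `\Alpha` is the command you want. I haven't checked this against the linked answer, so treat it as a starting point:

```latex
% Compile with xelatex or lualatex; this will not work with pdflatex.
\documentclass{article}
\usepackage{unicode-math} % assumption: default math font, no \setmathfont call

\begin{document}

% \Alpha is the Greek capital letter, A is the Latin one.
% The glyphs look alike, but they are different characters
% in the output, which matters for copy-paste and screen readers.
$\Alpha \neq A$

\end{document}
```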
For the math equations, I'm curious: does MathML do any better for you than LaTeX?
Not the person you’re asking the question to, but it’s worth noting (if you don’t already know) that MathML is really not designed at all as an input language for practitioners who just want to write a few equations in some document. It’s designed as an output/presentation language so that devices that want to render some maths can do so faithfully[1]. As such, if you’re a human being who wants to typeset some equation, you’ll want to go to latex every single time rather than mathml and then someone else has to figure out the conversion.
[1] Great explanation here https://tex.stackexchange.com/questions/57717/relationship-b...
On the other hand, the "semantic" flavor of MathML (Content MathML, as opposed to Presentation MathML) is much easier than TeX for things like screen readers, both conceptually and in practice.
Yup LaTeX math doesn't make sense. I've been trying to hack my way into getting a voice model to read it but no real progress.
LaTeX is a programming language for generating beautiful pages, basically a typesetting system. It serves this purpose fantastically well.
It was not designed to provide semantic information, unfortunately. So getting anything other than visual representation out of it is hard.
Emacs with Emacspeak has a math reading module.
For accessibility purposes (and regular reading), it would be so much better to drop the justified text. Ragged edge is the way to go!
https://www.boia.org/blog/why-justified-or-centered-text-is-...
Not necessarily:
https://heyman.info/2023/fill-justified-text-on-the-web
Perhaps someone can publish a paper to arXiv that provides a meta-analysis. But still there doesn't seem to be a clear reason to justify it, given that almost all internet text is not justified.
To me one of the exciting aspects of HTML is that we can theme the same article in different ways, tailored to individual preferences - just swap in a different CSS file.
Having a two-column theme, or left-aligned vs justified themes, could be workable in the long run. I hope that we get to see some browser extensions modding the pages before too long.
The reason for the current justified text is that it is the default aesthetic for a LaTeX-based article, and a lot of authors expect it.