
Show HN: An open-source, self-hostable synced narration platform for ebooks

r4victor
7 replies
2d5h

Amazing! I made a similar ebook-audiobook aligner years ago: https://github.com/r4victor/syncabook. At that time, I chose to synthesize the text and align two audio sequences because I found text-alignment approaches (including ML-based ones) too compute-intensive and inadequate for long texts. I see Storyteller works by aligning the texts. Could you give some idea of how long it takes to sync a book?

Also, my experience was that audio and text versions are often very different (e.g. the audio having an intro missing from the text). It'd be very interesting to know how well Storyteller handles such cases. Does it require manual audio/text editing or handle the differences automatically?

NoahKAndrews
5 replies
2d1h

The docs say it's usually 1-4 hours depending on the book and the hardware: https://smoores.gitlab.io/storyteller/docs/syncing-books

The docs also have a detailed section about the algorithm that goes into how it auto-handles differences between the audio and the text.

cyberax
4 replies
1d20h

One obvious optimization is to sample the audio file at regular intervals and transcribe only a part of the text. Then just interpolate the locations. This can speed it up by a couple of orders of magnitude.
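
A rough sketch of the interpolation idea, assuming you already have a few anchor points that map a character offset in the book text to a timestamp in the audio (for example, from transcribing short samples). The function name `estimate_time` and the anchor values are made up for illustration:

```python
from bisect import bisect_right

def estimate_time(anchors, offset):
    """Linearly interpolate an audio timestamp (seconds) for a text offset.

    anchors: list of (char_offset, seconds) pairs, sorted by char_offset
    with strictly increasing offsets.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchors to interpolate")
    offsets = [o for o, _ in anchors]
    # Index of the right-hand anchor of the segment containing `offset`,
    # clamped so offsets outside the sampled range extrapolate from the
    # nearest segment.
    i = min(max(bisect_right(offsets, offset), 1), len(anchors) - 1)
    (o0, t0), (o1, t1) = anchors[i - 1], anchors[i]
    return t0 + (offset - o0) * (t1 - t0) / (o1 - o0)

anchors = [(0, 0.0), (1000, 60.0), (3000, 200.0)]
print(estimate_time(anchors, 500))   # 30.0
print(estimate_time(anchors, 2000))  # 130.0
```

The estimate is only as good as the assumption that the narrator reads at a steady pace between anchors, so it gives approximate positions rather than sentence-level timing.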

smoores
3 replies
1d16h

This is true, but it really limits the ability to highlight the current sentence visually while it's being read, which is great for language learning and for reducing cognitive load. I actually spent a lot of time trying to get the timing as precise as I could to make this feel as natural as possible, and I think the effect is really nice!

cyberax
2 replies
1d14h

Ah, makes sense. Maybe have it as an option?

And I hadn't realized that you can actually see sentences highlighted as they are being read. I'd love that for Chinese (I'm learning it, so it'd help me a lot). I'll try and see if it "just works", and contribute a patch if it doesn't.

yorwba
0 replies
1d5h

A few years ago, I made that as a YouTube channel based on LibriVox audiobooks, maybe you'll enjoy it:

Simplified Mandarin: https://youtube.com/playlist?list=PLVlVz7EDz7fprPeVpqQCvlkMI...

Traditional Mandarin: https://youtube.com/playlist?list=PLVlVz7EDz7fpQZr29P5hqVveL...

... also 33 other languages https://youtube.com/@literature_for_eyes_and_ears/playlists

I left it to languish once I discovered the demand wasn't that great and I was spending more time making the videos than people ultimately spent watching them.

smoores
0 replies
1d13h

There's an open ticket for languages other than English! https://gitlab.com/smoores/storyteller/-/issues/10. If you want to take a look, please do! I don't have any contributing guidelines yet, unfortunately, but they'll probably come soon. I think Whisper does have Cantonese and Mandarin support, so it should be possible to add support for those languages, though we'll have to look into nltk support for sentence tokenization as well!
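
As a rough illustration of what sentence tokenization for Chinese could look like without a tokenizer library, here is a minimal sketch that splits on sentence-ending punctuation. The function name `split_sentences` and the punctuation set are assumptions for illustration; this is not what Storyteller or nltk does:

```python
import re

# Sentence-ending punctuation (full-width and ASCII), optionally followed
# by closing quotation marks. The capturing group keeps the delimiters.
_SENTENCE_END = re.compile(r'([。！？!?]+[」』”’"]*)')

def split_sentences(text):
    """Split text into sentences, keeping the ending punctuation attached."""
    parts = _SENTENCE_END.split(text)
    sentences = []
    # re.split with a capturing group alternates: text, delimiter, text, ...
    for i in range(0, len(parts), 2):
        body = parts[i]
        end = parts[i + 1] if i + 1 < len(parts) else ""
        sentence = (body + end).strip()
        if sentence:
            sentences.append(sentence)
    return sentences

print(split_sentences("你好！今天天气很好。我们去散步吧？"))
```

A punctuation-based split like this ignores edge cases such as ellipses and quoted dialogue, so it is only a starting point.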

smoores
0 replies
1d16h

Hello! syncabook is awesome, and indeed Storyteller does take "the opposite" approach when it comes to forced alignment.

Others have linked to the docs, where I go into detail about the syncing algorithm, but at a high level:

Storyteller uses Whisper to transcribe the audio to text (this is the most computationally expensive part of the process)

Then we use a Levenshtein-distance-based fuzzy search algorithm to find each chapter in the text (this is attempting to account for the difference between audio and text versions, as you said!)

Then for each chapter, we find the start and end timestamp of each sentence, again using a fuzzy search across the transcription.

In general, Storyteller does a pretty good job; it treats the ebook as the source of truth, which means that at the moment it sometimes misses introductory and ending pieces of the audiobook, though it's on the roadmap to have some support for explicitly triggering those when that happens.
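
To make the fuzzy search step more concrete, here is a minimal sketch of approximate substring matching with Levenshtein distance (the Sellers variant of the usual dynamic program, where a match may start anywhere in the text). It is a generic illustration, not Storyteller's actual code; the name `fuzzy_find` is hypothetical:

```python
def fuzzy_find(needle, haystack):
    """Find the substring of `haystack` closest to `needle`.

    Returns (start, end, distance) such that haystack[start:end] is a
    best match under Levenshtein distance.
    """
    n = len(haystack)
    # prev[j] = (distance, start) for the best match of the needle prefix
    # processed so far against a substring of haystack ending at index j.
    # An empty needle matches anywhere at cost 0.
    prev = [(0, j) for j in range(n + 1)]
    for i, ch in enumerate(needle, start=1):
        cur = [(i, 0)]  # i needle chars against nothing: i edits
        for j in range(1, n + 1):
            sub_cost = 0 if haystack[j - 1] == ch else 1
            candidates = (
                (prev[j - 1][0] + sub_cost, prev[j - 1][1]),  # match/substitute
                (prev[j][0] + 1, prev[j][1]),        # needle char missing from haystack
                (cur[j - 1][0] + 1, cur[j - 1][1]),  # extra char in haystack
            )
            cur.append(min(candidates))
        prev = cur
    end = min(range(n + 1), key=lambda j: prev[j][0])
    distance, start = prev[end]
    return start, end, distance

transcript = "um so the quikc brown fox jumped"
start, end, distance = fuzzy_find("the quick brown fox", transcript)
print(transcript[start:end], distance)
```

This runs in time proportional to `len(needle) * len(haystack)`, so a real implementation would presumably restrict the search to a window of the transcription rather than scanning the whole book for every sentence.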

rpxio
5 replies
2d2h

I absolutely love this. However, my wife and kids all read EPUBs on Kobo e-readers, so I wish we could somehow sync the last page read from Kobo to Storyteller so that we could pick up in the audiobook later. I’m not opposed to installing KOReader on all of our Kobos either if that would be required for syncing… it does look like KOReader doesn’t support EPUB 3 media overlays, but it does have a sync feature.

whycome
3 replies
1d16h

Amazon now has the text (books), and the audio (audible), and it’s absurd that there’s not some sort of sync feature. It would actually encourage people to cross-purchase books. There are so many times that I’m reading an ebook and I want to continue while driving and wish there was some sort of obvious and seamless “handoff” to continue with the audio version.

Infinitesimus
2 replies
1d14h

This has existed for a while, it's called Whispersync for Voice. Not available for all titles but it's there

whycome
1 replies
1d12h

TIL!

It looks like the feature is only available if you “add on” the audible version when you’re making your ebook purchase? And, for limited titles.

If I just bought a book in audible, there should be a “buy ebook” button in that app! And if I have the book in kindle, it should give me the option to add on the audio book after purchase. Seems like a missed opportunity— there must be a reason for it being so clunky.

Edit: I have not been able to find a single whispersync title. Looks like it’s not enabled at all in Canada? And the US books that have the feature don’t even follow the setup (eg icon) as described on the website (https://www.audible.com/ep/wfs)

Uvix
0 replies
1d3h

You don’t need to buy ebook and audiobook at the same time.

The icon appears when looking at the audiobook’s product page (both on Amazon and Audible). Unfortunately the ebook page doesn’t appear to have something similar.

smoores
0 replies
1d16h

Thanks! Server-side position syncing and integration with KOReader are both on the roadmap, actually; you're not the first one to bring this up!

sandreas
3 replies
2d5h

This is pretty interesting...

I once wrote a similar thing for building a custom LJSPEECH dataset out of ebook/audiobook combinations to synthesize my favorite narrator voices using coqui-tts and the VITS model and make them "publish" books that never came out as audiobooks.

It was able to synchronize the book contents to timestamps, split the spoken word into sentences and create LJSPEECH datasets out of the combinations. I used aeneas[1]; it was a bit finicky to set up, but after a while it was even able to map non-English languages (in my case German) with more than 80% accuracy. Worked out pretty well, the LJSPEECH datasets were good (I still have them here), but the TTS tech was not there yet :-) Maybe it's time to revive this project using newer modelling approaches like XTTS or something...

[1]: https://www.readbeyond.it/aeneas/

vagrantJin
2 replies
2d

I thought about exactly this a few years back but lacked the technical skills to implement it. There are some great books out there, as you mentioned, but even worse are great books with mediocre narration/production. E.g., A Song of Ice and Fire on Audible is absolutely horrid. The Martian by Andy Weir is fantastic. Can I transplant Wil Wheaton or Greg Tremblay into GOT? Can I have multiple characters narrated by different voices?

please revisit it if you can.

monkeywork
0 replies
1d23h

IMHO the original narration on The Martian by RC Bray is better than Wil's. I enjoyed Wil's work on Ernest Cline's books but RC Bray and Dennis E. Taylor are (for me) top of the mountain when it comes to SF narration.

aedocw
0 replies
1d13h

You can do this today, though you would definitely be breaking copyright (you need to strip the DRM from the epub), and if you're cloning someone's voice without their permission you're probably breaking some more laws. You're pretty safe though assuming you don't distribute it or try to make money.

Check out https://github.com/aedocw/epub2tts for creating an audiobook from epub. Take a look in the utils directory for notes about fine-tuning a voice clone. I can tell you I've done some voices that are close enough to the original to be pretty shocking.

Feel free to get in touch if you have any questions, it's pretty fun making your own audiobooks with the reader of your choice!

roywashere
3 replies
2d8h

How does the narration work? Is it automatically generated? For a year now I've had a long commute and have been listening to audiobooks. However, I find the narration varies wildly in quality, and I think text-to-speech might often actually be better.

tschumacher
0 replies
2d8h

You provide an audiobook and an ebook and it syncs them.

smoores
0 replies
1d16h

As others have said, you provide the audiobook (which could technically be something you generated yourself with TTS!) and Storyteller syncs it. However, I've added an issue on GitLab to investigate building TTS directly into Storyteller, because not all books have audiobooks, and it would be cool to fill that gap!

DecoPerson
0 replies
2d8h

Once we have individual tracks to work with, we begin transcription. This is the most resource intensive part of the process. We rely on the Whisper AI transcription model from OpenAI, via WhisperX. The WhisperX project also uses wav2vec2 to provide accurate word-level timestamps, which is important for sentence-level synchronization. The transcription process is fairly standard; the only interesting addition to the process that Storyteller makes is to supply an "initial prompt" to the transcription model, outlining its task as transcribing an audiobook chapter and providing a list of words from the book that don't exist in the English dictionary as hints.

https://smoores.gitlab.io/storyteller/docs/how-it-works/the-...
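
A minimal sketch of how such an "initial prompt" might be assembled, assuming you have the book text and a set of known dictionary words. The function name `build_initial_prompt`, the prompt wording, and the `max_words` cap are all assumptions for illustration, not taken from the Storyteller docs; how the string is handed to the transcription model depends on the library and is not shown:

```python
import re

def build_initial_prompt(book_text, dictionary, max_words=50):
    """Build a transcription hint listing book words absent from `dictionary`.

    dictionary: a set of lowercase known words (hypothetical input).
    """
    seen = set()
    unusual = []
    for word in re.findall(r"[A-Za-z][A-Za-z'-]*", book_text):
        key = word.lower()
        if key not in dictionary and key not in seen:
            seen.add(key)
            unusual.append(word)  # keep the book's own capitalization
    hint = ", ".join(unusual[:max_words])
    prompt = "The following is a chapter from an audiobook."
    if hint:
        prompt += f" It may contain these words: {hint}."
    return prompt

dictionary = {"the", "wizard", "met", "in"}
text = "Gandalf met the wizard Saruman in Isengard. Gandalf"
print(build_initial_prompt(text, dictionary))
```

Names and invented words are exactly what a speech model is most likely to misspell, which is presumably why they are worth passing along as hints.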

cyberax
3 replies
2d9h

You didn't include the link: https://smoores.gitlab.io/storyteller/

Looks super nice, the next step is to build a fully synced ecosystem for book management.

AnyTimeTraveler
1 replies
1d17h

You mean a system like Audiobookshelf[0]? I can highly recommend this, by the way. Works more reliably than any paid service I've ever tried.

[0] https://www.audiobookshelf.org/

cyberax
0 replies
1d14h

I'm more interested in something that would unite audiobooks and textual books.

I love to jump between listening (in my car or while walking) and reading. Right now, only Amazon Kindle + Audible provides a good experience, but it's impossible to import your own audiobooks into Audible.

smoores
0 replies
1d13h

Yup, this is the goal! Library management and a reader app already exist, though there's definitely work to be done, especially on the library management front.

bberenberg
3 replies
2d10h

Amazing, I’ve been wanting something like this for years. If only Libby would integrate this so it could be used with rented books.

It would be great if you could add a link to the app on the App Store.

ck_one
1 replies
2d7h

What’s your use case for it?

bberenberg
0 replies
2d6h

Audiobooks while running / cooking / other activity where reading doesn’t make sense.

Ebook elsewhere.

smoores
0 replies
1d16h

I keep forgetting to do this! Here's a link: https://apps.apple.com/us/app/storyteller-reader/id647446772... and I'll push up a change to the docs right now that includes that.

sphars
2 replies
2d5h

This is really neat, it's something I hadn't thought about before. I've started listening to audiobooks on my commute, but I read at night. I currently use audiobookshelf[0] to listen to my audiobooks, and it has support for ebooks as well. I've added a comment[1] on a discussion asking if audiobookshelf could read the EPUBs your tool creates.

[0]: https://www.audiobookshelf.org/

[1]: https://github.com/advplyr/audiobookshelf/issues/189#issueco...

smoores
1 replies
1d16h

> I've started listening to audiobooks on my commute, but I read at night.

This is basically exactly why I started down this road over two years ago. I really wanted to be able to switch back and forth between my audiobooks and their text representations!

Thanks for mentioning Storyteller in that discussion, I'll have to hop in there!

sphars
0 replies
1d4h

Looking forward to trying out the Android app when it's available!

klakierr
2 replies
2d5h

This works only for drm-free ebooks and audiobooks?

smoores
0 replies
1d16h

That's correct; it requires the ability to analyze the actual contents of the ebook and audiobook, so they can't be locked down with DRM. This is unfortunately pretty limiting, but at least for audiobooks, online stores like libro.fm have pretty massive catalogues of DRM-free audiobooks!

NoahKAndrews
0 replies
2d1h

That's what the docs say, yes

atmosx
2 replies
2d9h

Good job! I'm probably going to use this. Would love to have my collection accessible from mobile. A small "nit". Would be great to have non-docker installation instructions readily available.

smoores
1 replies
1d16h

Thanks! What's your preferred installation/setup? I started with docker for simplicity/ubiquity, but I know it's not everyone's cup of tea. The API server in particular can be a bit challenging to set up properly, but if I know what folks are looking for I can try to provide some guidance!

atmosx
0 replies
1d6h

Most likely will try this on an RPi, as a dedicated machine, running Linux or FreeBSD.

I did not go through the minimum requirements, but I have a LAMP stack running RADIUS for a small shop and so far it runs happily on an RPi 2 featuring 1 GB of RAM. I have daily backups and everything required to spin up a clone if the SD card becomes faulty.

0x073
2 replies
2d9h

More information would be nice: a link to the iOS app, screenshots, or a list of the features the project has.

Is it an ebook/audiobook library like audiobookshelf with sync, or just sync? ( https://www.audiobookshelf.org/ )

smoores
0 replies
1d16h

That's a great point; I'll try to add some more of these. I definitely meant to link to the app store page from the docs; I actually just updated them to include that.

It's a full ebook/audiobook library, with sync, though I've focused much more on the reader experience so far than the library management experience. Improving the library management experience is on the horizon, though!

joshstrange
0 replies
2d5h

Finding the app wasn’t super easy; I do wish they’d link to it from the mobile apps page.

https://apps.apple.com/us/app/storyteller-reader/id647446772...

zachlatta
1 replies
2d4h

This looks absolutely incredible, and like something I’ve been trying to find for years! Thank you so much for building this!

smoores
0 replies
1d16h

Thank you, that's so exciting! Please let me know if you try it out, I'd really love to get your feedback!

mosselman
1 replies
2d9h

Is there a demo of the narration? I couldn’t find any

danparsonson
0 replies
2d4h

It doesn't generate narration, it syncs existing audio books with their written counterparts by transcribing the audio.

mike986
1 replies
2d1h

Super cool project!

even though there's still a lot more to do

A few have asked on this thread already, but since you're already using AI to transcribe, it would be super cool if we could use AI to generate audio using TTS.

I quit Audible (after signing up a few times) because there are very few high-quality audiobooks; even those read by the authors are bad (most of them are not professional narrators).

A good AI would be amazing, as they never get tired speaking for hours, yet maintaining the same energetic voice, intonation and pace.

smoores
0 replies
1d16h

jupiter909
1 replies
2d8h

Looks like an interesting project.

I do highly suggest that a quick intro demo video and/or screen shots of a tool like this would be beneficial to the project.

smoores
0 replies
1d16h

Thanks! I think you're probably right. In the meantime, there are some screenshots on the App Store page for the reader app: https://apps.apple.com/us/app/storyteller-reader/id647446772...

joshstrange
1 replies
2d5h

This is super cool, I love my audiobook app (Prologue) but this could tempt me away. Looking forward to setting this up and trying it out!

smoores
0 replies
1d16h

Prologue is also my go-to audiobook app, and I really do love it, too. It's a significant inspiration for me for the reader apps; hopefully one day soon they will have feature parity!

grigio
1 replies
2d7h

Does it sync the reading progress of the ebook among clients?

smoores
0 replies
1d16h

Not yet, but it's on the roadmap (https://gitlab.com/smoores/storyteller/-/issues/13)! This one actually ought to be pretty straightforward; the only trick to it is making sure that it doesn't interfere with the local-first goals of the apps.

chrisweekly
1 replies
1d23h

Awesome! Thanks for sharing and working on this! WhisperSync functionality is a game-changer; it's one of the main reasons I'm able to read so much (switching modalities several times per day). I'd love to see this featureset become ubiquitous instead of being so tightly coupled to proprietary, DRM'd Amazon / Audible.

smoores
0 replies
1d16h

Thanks, I'm so glad folks seem to like it! Agreed, I remain astonished how limited support is for synced narration (forget non-DRM; really only Amazon even provides this feature!). It's totally changed how I consume books. Hopefully EPUB's Media Overlay spec (which Storyteller uses) will become more widespread!
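
For anyone curious what the Media Overlay format looks like: it is a SMIL document that pairs an element in the book's XHTML with a clip of an audio file. Below is a minimal sketch that renders one from sentence timestamps. The function name `media_overlay`, the file names, and the element ids are hypothetical, and this is not Storyteller's actual output:

```python
from xml.sax.saxutils import quoteattr

def media_overlay(text_href, audio_href, sentences):
    """Render a minimal EPUB 3 Media Overlay (SMIL) document.

    sentences: list of (element_id, start_seconds, end_seconds), where
    element_id is the id of the sentence's element in the XHTML file.
    """
    pars = []
    for n, (element_id, start, end) in enumerate(sentences, start=1):
        text_src = quoteattr(f"{text_href}#{element_id}")
        pars.append(
            f'    <par id="par{n}">\n'
            f'      <text src={text_src}/>\n'
            f'      <audio src={quoteattr(audio_href)} '
            f'clipBegin="{start:.3f}s" clipEnd="{end:.3f}s"/>\n'
            f'    </par>'
        )
    body = "\n".join(pars)
    return (
        '<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">\n'
        '  <body>\n'
        f'{body}\n'
        '  </body>\n'
        '</smil>\n'
    )

print(media_overlay("chapter1.xhtml", "chapter1.mp3",
                    [("s1", 0.0, 2.5), ("s2", 2.5, 6.125)]))
```

A reading system that supports Media Overlays plays each audio clip while highlighting the referenced text element, which is what produces the sentence-by-sentence highlighting.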

causality0
1 replies
2d1h

Can this function as "Plex for audiobooks"? I don't really have a need for synced books but it would be nice to keep fewer audiobooks on my phone.

orand
0 replies
16h1m

As noted elsewhere in the comments, you can use Plex with the Prologue app to literally have Plex for audiobooks. But yes, it seems this will also do what you want.

timmb
0 replies
1d9h

Looks great! Is there an e-ink e-reader it’s compatible with? Would love to abandon the Amazon castle but could not go back to reading on a screen.

t0mk
0 replies
1h57m

Is there a tool that would convert an ebook to a single MP3 (or a set of MP3s)?

snapplebobapple
0 replies
1d22h

man.. if someone could hook the creation service into audiobookshelf this could be an extremely potent combination..

smoores
0 replies
1d16h

Wow, this really blew up while I wasn't looking! Thank you everyone who's popped in here to ask questions and give feedback. If anyone does spend some time trying to set this up, please don't hesitate to hop into our Gitter channel (https://smoores.gitlab.io/storyteller/docs/say-hi) and say hi or ask for support or give feedback.

pseufaux
0 replies
4h14m

This looks awesome. I might be missing it somewhere, but what's the minimum required hardware to run something like this locally?

majora2007
0 replies
2d3h

Looks really nice. I wanted to do exactly this with my project Kavita, but have been distracted with other things. I've heard Whisper has great potential and a few of my users have been doing something similar with it.

Looking forward to seeing how this project matures. We need more options in the book reading scene that are self-hosted and not Calibre.

ZunarJ5
0 replies
1d20h

Thank you for your hard work!!