
Autogenerating a Book Series from Three Years of iMessages

yaky
15 replies
1d3h

Perhaps an unpopular opinion, but this is slightly creepy.

I never understood why people care to keep their private conversation history in the first place. IMO private messages (as opposed to public posts, blogs, etc) are supposed to be temporary ("ephemeral") - one does not record every face-to-face conversation or phone call after all.

tivert
1 replies
21h50m

I never understood why people care to keep their private conversation history in the first place.

One reason that's understandable without relying on sentimentality is that they're a record of what you were doing or thinking at a particular time, much like a private diary.

There have been a few times where I've gone back through stuff like chat history to better understand something that I didn't realize the significance of at the time.

hn_user82179
0 replies
18h22m

I never was disciplined enough to keep a diary when I was younger, but I started using messenger when I was about 14. It’s pretty amazing to be able to go back and see my interactions, the way I communicated, the way I saw the world (am now 30). I feel lucky to be able to have that window into my past life.

jakespencer
1 replies
1d3h

I think this is interesting, and not necessarily unpopular. It seems different people just think about this issue differently. I do everything I can to preserve every single chat history that I can. And I would like to have every face-to-face conversation and phone call recorded and easily accessible for that matter. I have a sense that I am the sum of my experiences and I don't want to forget those experiences - it feels like I am somehow less than myself if I don't remember them.

But I've seen that episode of Black Mirror, too. So I wrestle with the desire to perfectly remember everything that I've ever experienced vs the mental and emotional health benefits that clearly come from being able to forget things.

yaky
0 replies
15h24m

I read your reply a while ago, but still can't wrap my head around "it feels like I am somehow less than myself if I don't remember them". I forgot many things, and it's quite ok with me, so I'm trying to understand your view.

Are you trying to remember everything all of the time? Including all of the new memories?

Suppose you are able to record calls and face-to-face interactions. Are you going to spend hours of your life re-watching or fast-forwarding through mundane everyday things?

fennecfoxy
1 replies
7h54m

People used to write letters to each other and older folks usually have a drawer full of all of their old letters.

We have learned a lot of history by reading famous peoples' letters long after their deaths.

yaky
0 replies
4h43m

Unless you write your text messages deliberately composed, in multiple paragraphs, and it takes days to send and receive, the comparison is very flawed. More long-form digital communication, like email, is a bit closer to letters, though.

Which brings me back to my point that text chats are equivalent to a spoken conversation and should be treated as such, and not be kept forever. Especially not printed out as a gift. You wouldn't give someone close a video of them sleeping or leaving for work over last 3 years (even if you saw them do it), nor a map of their movements from a GPS tracker (even if they told you where they are going), because that does not respect their boundaries nor privacy.

Although after reading some responses, no doubt some people will think of those as "cherished memories".

digging
1 replies
1d

IMO private messages (as opposed to public posts, blogs, etc) are supposed to be temporary

Why? Privacy and permanency are orthogonal axes. You've never kept a cherished letter or re-read a thoughtful text message?

yaky
0 replies
20h32m

If the message was something I found interesting, important, or funny, I would usually copy or screenshot it. Or remember it. Although I don't proactively delete old messages, I never intentionally backed up or transferred message history between devices either.

As for privacy and permanency - if data stops existing, it is definitely private now :)

vanjajaja1
0 replies
1d3h

private messages are not face to face conversations nor phone calls. letters last a long time and would be a more accurate comparison

the-grump
0 replies
1d3h

Because they're cherished memories for one.

npteljes
0 replies
19h15m

Really depends on the mindset when creating the message. If I message on a platform that keeps history, then I write with that expectation, or at least possibility, in mind. Now, this begs the question - is this modification of behavior problematic, does it detract perhaps from the meaning of the communication itself? Maybe.

The barrier, to me, is broken the moment we use technology. I work with this shit, I know how the sausage is made. I know that the phone calls are as encrypted as HTTP, that everyone can always keep a record without you knowing, that even if they promise that something will go away, it maybe won't, especially since that promise alone makes it a juicier target. As soon as something is electronic, then it's a record.

nonameiguess
0 replies
21h26m

I can see both sides. I did actually correspond with people using written letters up until maybe 2004 or so, and in many of those cases, especially old girlfriends and letters from my little sisters when I first went to college, reading them years later was intensely nostalgic.

On the other hand, when I left the Army I moved into a much smaller place, put most of my stuff in storage, then three years later figured anything I hadn't touched or used for three years was something I didn't actually need, and let the facility have all of it. That seems to have included both all those old letters and all of my old photographs. I can't say I actually miss those things. People in here are saying they don't want to forget the past but the reality of forgetting is you don't know you forgot it so it has no perceivable effect once it happens.

To be honest, I'm nostalgic enough as is and don't think I need even more things to hold onto. I already don't watch new television or listen to new music. I'm mentally stuck in 1999 and not sure that's healthy.

m463
0 replies
16h46m

are supposed to be temporary

However, this is just an opinion.

Thinking more carefully, I think my opinion goes more like this:

my data collected by others should be temporary, but I should be able to maintain my own data forever if I want.

famahar
0 replies
1d3h

I agree. But it's more to do with the part of me that cringes at messages I sent that exist forever. The person I was 10 years ago is so different. It feels so jarring reading old messages.

Sigliotio
0 replies
1d3h

I don't assume those are private or ephemeral.

It tells a story and it's the zeitgeist of our generation.

People haven't thought too much about how to preserve something like this.

I personally like the idea and I can imagine exporting this with all / a few messages of my mother and having a memory of that time.

helboi4
12 replies
1d6h

Now to make this work for Whatsapp for the brits... Got excited at the idea of a project and then realised I will have to learn Rust if I was to fork this haha.

Anyway, this is definitely a cool idea. Reading my chat history with friends is actually very nostalgic.

netsharc
4 replies
1d5h

WhatsApp lets you export chats as txt, but I guess that's lossy.. e.g. I'm not sure the emojis will be there. Surely no attachments or voice messages.

As for extracting from the backup DB, they'll be encrypted..

andyjohnson0
2 replies
1d5h

Whatsapp on Android lets you export a chat with or without media (images etc), but it limits the number of messages. With media you get the last 10k messages, and without you get the last 40k. Emojis are preserved though.

See https://faq.whatsapp.com/1180414079177245/?cms_platform=andr...

The limits are supposedly due to email size limits but, as they also apply when exporting to non-email endpoints like Google Drive, I suspect they're more to do with preventing people from moving their chats to other services.

And yes, the db is encrypted.
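
For anyone who wants to feed the plain-text export mentioned above into a book generator, here is a minimal parsing sketch. It assumes each message starts with a line shaped like `12/31/23, 10:15 PM - Alice: Hello`; that shape is an assumption, since the exported timestamp format varies with locale and platform, so the regex will likely need adjusting. The names `Message` and `parse_export` are made up for this example.

```python
import re
from typing import Iterable, List, NamedTuple


class Message(NamedTuple):
    timestamp: str  # kept as raw text, because the format is locale-dependent
    sender: str
    text: str


# Assumed header shape: "<date>, <time> - <rest of line>".
HEADER_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?:\s?[AP]M)?) - (.*)$"
)


def parse_export(lines: Iterable[str]) -> List[Message]:
    """Group exported lines into messages, joining multi-line messages."""
    messages: List[Message] = []
    in_message = False
    for raw in lines:
        line = raw.rstrip("\n")
        header = HEADER_RE.match(line)
        if header:
            timestamp, rest = header.groups()
            sender, sep, text = rest.partition(": ")
            if sep:
                messages.append(Message(timestamp, sender, text))
                in_message = True
            else:
                # No "sender: " part, so treat it as a system notice and skip it.
                in_message = False
        elif in_message:
            # A line without a timestamp continues the previous message.
            last = messages[-1]
            messages[-1] = last._replace(text=last.text + "\n" + line)
    return messages
```

A system notice that happens to contain ": " would be misread as a message here, so real exports deserve a manual spot check.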

nolongerthere
1 replies
1d4h

I wonder if those limits will be eased with the new EU regulations…

Rygian
0 replies
19h30m

GDPR is already 6 years old.

adastral
0 replies
23h2m

they'll be encrypted

Since some months (years?) ago, WhatsApp lets you set up your own encryption password for the DB backup. I set one up and used https://github.com/ElDavoo/wa-crypt-tools to get access to the decrypted SQLite and run some analytics over my messages :)

westernpopular
1 replies
1d5h

It also links to a Telegram exporter.

Telegram has this natively already

hu3
0 replies
1d5h

Indeed! That was unexpected for me.

dav43
2 replies
1d5h

For the rest of the world…

helboi4
1 replies
1d3h

I was going to say that but I then remembered all the many many other apps that a lot of other countries used, and therefore I didn't want to act like I wasn't aware of those. For example, WeChat, Line, KakaoTalk, and more. Whatsapp is not at all universal, even if it might be the most common in many European countries.

solstice
0 replies
1d3h

There is a python library for Wechat, but I'm having problems using it. Also, 1) getting and then 2) decrypting the Wechat database isn't easy. Because my phone is not rooted, I had to use an android emulator on my PC, transfer all my chats over there, extract the DB and all media. After installing all the fickle dependencies of the bruteforce decrypter, it took three days straight on my laptop from 2014 to decrypt, and now I can finally open it in an SQLite viewer. But that still leaves the major step of getting formatted messages out of there like in the OP. The HTML-conversion script that I used produced half-decent results, but hasn't been maintained for a while and thus chokes on certain messages so that the conversion of large chats invariably breaks down before being finished. Anyway. Maybe it is time to learn Python...

VikingIV
0 replies
21h20m

This desperately needs to happen — in a way that all messages and media can be exported in a sensibly reviewable format. Heck, I'd just like to be able to archive a backup that I know can be restored in the future on another device.

throwuwu
11 replies
1d5h

I like this a lot. We need more hard records of personal correspondence. It would be cool to do this as a service.

Honestly when I read the title I thought it was going to be about using message history as a basis for generating a narrative account of the events using an LLM.

thefourthchime
4 replies
23h30m

I love it as art, but its usefulness is questionable. If you want a hard copy, just copy it to a microsd. Or three if you are worried about losing it.

tivert
2 replies
21h54m

I love it as art, but its usefulness is questionable. If you want a hard copy, just copy it to a microsd. Or three if you are worried about losing it.

Hard copy means paper. Also microsd is terrible for long term storage.

henryaj
1 replies
2h21m

Also microsd is terrible for long term storage.

How come? I would have thought flash would be better than e.g. a regular hard drive.

Eiim
0 replies
1h54m

Flash memory relies on cells keeping charged, but the electrons can slowly leak and discharge the cells over time. It looks like the commonly claimed number is 10 years, but there's no clear answer. Hard drives also aren't great as a "set and forget" method. In either case you should refresh the data regularly (~yearly). Optical media is a great option for digital long-term storage, but paper is a very tried-and-true method, if stored in the right conditions.

refulgentis
0 replies
22h39m

"Hard copy" means "a collection of paper sheets bound in some fashion" in the context of books. So they probably didn't mean it in a way where microSD is equivalent.

demondemidi
1 replies
16h40m

One of the great disadvantages of private emails (& texts) is the massive amount of correspondence that is lost to future historians. I have books of letters published by Feynman, Feyerabend, Einstein, etc. Everything is now email that is behind a password, which means we'll likely never have troves of personal letters from which to contextualize modern people who become historical figures in the future.

toomuchtodo
0 replies
15h19m

How would you solve for this?

trizoza
0 replies
1d5h

The same, I expected a whole criminal AI generated novel based on a history of a chat.

shawnc
0 replies
1d4h

Ditto. Exactly what I pictured from the title and I was already thinking how interesting that would be. I’m curious to try something like it now.

darkwater
0 replies
1d3h

Me too. But the real story is way better! Now I want to do the same with my Telegram chat history.

Cthulhu_
0 replies
1d4h

With the EU's DMA law and the preceding GDPR, some services have to offer an API so that your hypothetical service can pull this data. However, iMessage was notably excluded from this law, and then there's the encryption thing where you can't just pull data from e.g. whatsapp.

kirmerzlikin
7 replies
1d3h

Am I the only one who finds the idea of sending a full history of your private messages to some publisher for printing a little bit unsettling?

janfoeh
2 replies
1d1h

No, I do too. I've been planning on doing what the author did for quite some time now, and this is one of the unsolved stumbling blocks.

Printing and binding at home is probably the only option. All that's left to figure out is how to make the end result durable and haptically pleasant...

dogline
1 replies
23h22m

You can always print it out, then run to Kinko's and use their comb binder that they usually have out. Not as elegant as real binding, but enough to make it work on my shelves.

janfoeh
0 replies
21h52m

For the cost of the plane ticket, I could probably hire a retired book binder ;)

red-iron-pine
1 replies
21h55m

the publisher is in the business of spitting ink on paper. you should be more unsettled by being MITM'd by data mining companies whose job it is to change behavior via ads or other consensus-building tools.

ghaff
0 replies
20h48m

I might not do this if I were a high public official or celebrity on the off-chance that someone in the printing and packaging chain might happen to notice. But, for an average person, it seems pretty harmless. (Personally, the last thing I need is more paper but I get the attraction.)

russfink
0 replies
1d2h

Maybe try the --rot13 switch.

:-)

cooper_ganglia
0 replies
1d2h

I think if I sent my full history of private messages to a publisher, the most unsettled one would be the publisher!

quyleanh
5 replies
17h55m

I'm not sure if anyone knows, but I would like to ask about Signal.

I have an Android backup of Signal messages from around 2020. Of course I have the decryption key. Since I can't restore it with the current version, or with the 2020 version of Signal (from the GitHub releases), how can I decrypt and extract all the messages? Thank you.

stavros
2 replies
17h48m

Hm, why can't you restore it? AFAIK you should be able to.

quyleanh
1 replies
17h26m

Because my backup comes from an old version of the Android Signal app. I tried to restore it with the current version of Signal for Android but it doesn't work, even with the old version.

stavros
0 replies
10h3m

That's odd, I'm not aware of them changing the backup format, and it definitely should work with the old version. How does it fail?

DesertVarnish
1 replies
17h45m

I remember looking into a similar problem and learned that on desktop it was just an encrypted SQLite DB. It was readable with the standard SQLite library.

Not sure of the situation for the mobile backups though!

bkettle
1 replies
1d2h

Author of the post here, thanks so much for making it available! It’s an excellent library and I was thrilled to find it.

jxramos
0 replies
1d2h

All of my friends, for putting up with me sending them random messages to test things

Good sports taking one for the team! Thank you

jborichevskiy
0 replies
23h53m

Big fan of this library. Thanks for making it!

alchemist1e9
0 replies
22h7m

Thank you for this! I recently was digging into the sqlite files with an idea to monitor them for changes indicating new messages and then extract them. My initial prototype seemed to mostly work, with a few hacks. Next time I look at that idea I’ll switch to your library. Any suggestions or tips around near-time accessing?
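
The polling idea described above can be sketched with nothing but the standard `sqlite3` module: remember the highest row id seen so far and ask only for rows beyond it. This is a hedged sketch, not the library's approach. The table name `message` and column `text` are assumptions for illustration, and `fetch_new_rows` is a made-up helper name.

```python
import sqlite3
from typing import List, Tuple


def fetch_new_rows(
    conn: sqlite3.Connection, last_rowid: int
) -> Tuple[List[Tuple[int, str]], int]:
    """Return rows added since last_rowid, plus the new high-water mark.

    Assumes a table named `message` with a `text` column (hypothetical
    schema for this sketch).
    """
    cur = conn.execute(
        "SELECT rowid, text FROM message WHERE rowid > ? ORDER BY rowid",
        (last_rowid,),
    )
    rows = cur.fetchall()
    # If nothing new arrived, keep the previous high-water mark.
    new_last = rows[-1][0] if rows else last_rowid
    return rows, new_last
```

Calling this in a loop with a short sleep gives near-real-time behaviour. Opening the database read-only (for example through a `file:` URI with `mode=ro` and `uri=True`) avoids interfering with the app that owns the file.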

pimlottc
3 replies
1d1h

Doesn’t seem like this includes images, which for some people could be a significant part of their conversations: photos, memes, reaction gifs, etc

bm-rf
1 replies
1d

Maybe you could use something like GPT 4 vision to include a text description of the image in the transcript

samatman
0 replies
21h33m

Filtering full-color images down to a halftone suitable for book publishing is a mature technology, setting up an ImageMagick pipeline to do so would not be among the hard parts of preparing a book like this. Picking the right still frame out of gifs and video is a bit trickier, but not by much.

janfoeh
0 replies
1d1h

I've been thinking about doing just this for a bit now. I plan on showing thumbnails plus a QR code for animations and videos; I have yet to figure out how to make the files accessible in a private and durable manner though.

ThePowerOfFuet
3 replies
1d5h

My first approach at this LaTeX generation is quite simple: align left if the message is from me and right otherwise

Isn't that backwards?
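
To make the alignment question concrete, here is a small sketch of the rule being discussed. It is not the author's code: `render_message` and `latex_escape` are hypothetical helpers, and the only LaTeX relied on is the standard `flushleft` and `flushright` environments. The `mine_on_right` flag switches between the chat-app convention (own messages on the right) and the one quoted from the post (own messages on the left).

```python
# Characters that need escaping in ordinary LaTeX text.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def latex_escape(text: str) -> str:
    """Escape one character at a time so replacements never get re-escaped."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def render_message(text: str, is_from_me: bool, mine_on_right: bool = True) -> str:
    """Wrap a message in flushright or flushleft depending on who sent it."""
    env = "flushright" if is_from_me == mine_on_right else "flushleft"
    return f"\\begin{{{env}}}\n{latex_escape(text)}\n\\end{{{env}}}"
```

With `mine_on_right=False` the output matches the quoted description, which would suit a book meant to be read from the recipient's point of view.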

wantlotsofcurry
0 replies
1d5h

Maybe they wanted it from/for the other person's perspective for some reason?

suddenclarity
0 replies
1d5h

It is. I thought the author just mistyped in the post but looking at the repository example it's indeed backwards.

g4zj
0 replies
1d5h

Perhaps the book is a gift for the person on the other side of the messages in question.

nkko
2 replies
1d4h

Imagine having GPT generate a haiku from each unique correspondence.

risenshinetech
1 replies
1d3h

Why?

velcrovan
0 replies
1d3h

I’ll bite.

A book full of years of texts would be an interesting artifact, but how often would you pick it up? How interesting could it really be? Would you even want anyone else to see it?

Now suppose each exchange comes with a haiku summary, a fresh high-level look at the conversation that condenses its vague essence into a little linguistic locket, portable and easily recalled. The interplay between the mundane raw material and the poetic take would render it much more interesting, and tend to reward repeated examination.

A human poet would undoubtedly do a better job at this than an LLM, if any human poet could be persuaded to start and complete such a project. But having an LLM do it would be a very low-cost, low-effort way to at least try for interesting results.

mif
2 replies
1d6h

That’s awesome. I would like to do that with Telegram or WhatsApp.

ivanjermakov
0 replies
1d3h

Telegram gives you a JSON export of a full conversation, should not be complicated to adapt the input.
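
A rough sketch of what adapting the input could look like. The field names used here (`messages`, `type`, `from`, `date`, `text`, and `text` sometimes being a list of strings and formatted pieces) are assumptions about the export layout and should be checked against a real export. `load_messages` and `flatten_text` are made-up names.

```python
import json
from typing import Any, Dict, List


def flatten_text(text: Any) -> str:
    """Collapse a text field that may be a plain string or a list of pieces."""
    if isinstance(text, str):
        return text
    parts = []
    for piece in text:
        # Assumed: formatted pieces are dicts carrying their own "text" key.
        parts.append(piece if isinstance(piece, str) else piece.get("text", ""))
    return "".join(parts)


def load_messages(export_json: str) -> List[Dict[str, Any]]:
    """Turn an exported conversation into simple date/sender/text records."""
    data = json.loads(export_json)
    records = []
    for msg in data.get("messages", []):
        if msg.get("type") != "message":
            continue  # skip service entries such as joins or pinned messages
        records.append(
            {
                "date": msg.get("date"),
                "sender": msg.get("from"),
                "text": flatten_text(msg.get("text", "")),
            }
        )
    return records
```

From there each record only needs mapping onto whatever message type the book generator already consumes.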

gavmor
2 replies
1d1h

I like to listen to blogs through Pocket's TTS mode, but this one made me laugh because I couldn't easily skip these sections:

/.../00008120-001854410CEB401E >>> cd 3d /.../00008120-001854410CEB401E/3d >>> ls 3d0292d3fe90e1e22c247403c0e9105ea0f9ff44 3d8830b71e98aae80b6eaf8bdd5500d79ce74946 3d02fe309afa7de839822d6f1b8433aa90090d17 3d88cdc16ff2b5231e5ea4b52271ee195a6f4b96 3d072c4fca5db4a5678fa10b137435f757e98492 3d8a425d70f4049417e855d273c44d8199de30c9 3d0739c90579fa907246d5c21bd8d8ebaa2d9d6b 3d8a43a1921f504bb4393250f75b24bfc2c5cedb 3d0798b3cc4d2f5ad347ffb8bc5a0f9d8c82cfb9 3d8a7c0460aadabf1b7fc9adea9e6a2a6e7bc73b 3d07a0adc5c5c22dc525ccd3a93fb05a50ef1ac5 3d8b6ad12c7617b3d783790a457b0aa19b193b68 3d0880f091c51ddc145e17c78d8e6f9a3e7e20c8 3d8b82abe05a9d697102d8b665c9d499e07492ea 3d093e92cf03abf3650411e09a647630a1e0c478 3d8ba897240ad32580bf8dfd00db8f181658cdfd 3d095e908ff898be3b3ffd64a75db959a58ac70a 3d8bc227d67ec4944df8e75291102367034d7214 3d09d5dcd5a9bdad67a80cd83201a9e1fb75aada 3d8c722f1d92f7cd6f90c936c14f60f51aad128b 3d0abb83123be82abf43ce20118e72fea06023c5 3d8ca6eeabeb1c01fae05bb20f08dedf734cfd04 3d0b246304c42d2ab1eb1892d629fcdfde689cb7 3d8d0c6b1bf7946c6bef91d60cccb32207b7bc01 3d0bb5f49e6f0e31348ef8feb9a38d4ce71f5ec7 3d8fd2fbcaf3079a683a8e486ecde8875f0a591d 3d0c1283936c45fec533a507b78558b5aa3159fa 3d8ff93bd94b3ea14edc77d1e677cf4ee4306e4e 3d0cb8e28462780bb9af1440e297ecd8224c70ff 3d90ea8bfbf62feda080cd0ccbd12fa5c8673993 3d0ce10de5f69606c52882215b99ebab259dc194 3d932638fe8ed669725b7a143c6a8b02b8959923 3d0d7e5fb2ce288813306e4d4636395e047a3d28 3d93c92679aa9d398331e27fdeed64b5094e68d1 ...

All I could think was, "Oh no, the nam-shub of Enki!"

firewolf34
0 replies
23h37m

I often listen to Pocket TTS on the train or when I can't access my device to skip or do much other than play/pause, and oh my god this gets me every time haha. I am actually thinking of DIY'ing my own web-scraper thing to do a better job at it because especially for scientific articles, it's really rough when it gets to any LaTeX. And then I'm sitting there listening to some very automated sounding voice read off cryptic numbers and Greek letters and code and math notation like some kind of Soviet numbers station (which is kinda cool at first, but gets annoying haha).

I want some kind of local document host that I can run a summarization or filtering script over to extract the portions that are legible to TTS, pipe it into something nice like ElevenLabs (if I was rich) or whatever, and then host an OGG for me to listen to on the go...

egypturnash
0 replies
1d

are you posting this while having a drink at the Black Sun

btbuildem
2 replies
1d4h

Would the next level be to use an LLM to take the essence of each set of exchanges and present it in a format of a play or movie script? Perhaps novelize it?

risenshinetech
0 replies
1d3h

Ya bro and then the next level would be to like feed that AI generated play or script and feed it into an AI movie maker. It would totally be a game changer! I'm super stoked and pumped about it

input_sh
0 replies
1d4h

Or, hear me out: you keep the personal communication personal instead of feeding it through a randomness machine.

frankfrank13
1 replies
1d3h

My Aunt has done a wonderful job at preserving the letters and diary entries between my grandfather + grandmother during WWII. My immediate thought is how our children and grandchildren will not have the same joy!

[Here is the blog for those interested](http://www.honeylightsletters.com/)

CSSer
0 replies
20h16m

Ha, I’m not sure it’s quite the same. If my understanding of most couples is correct, this would be more akin to preserving all of their sticky notes e.g. “Pick up milk on the way home”, “You’re getting the kids, right?”, “see you in 20”, etc.

Yet I suppose there’s a certain charm to that, so I hope I don’t sound like too much of a wet blanket.

fragmede
1 replies
1d3h

People will pay money for this! What a heartfelt, sentimental thing to be able to give to someone on an anniversary or birthday or something.

russfink
0 replies
1d2h

Or keep it to remind yourself of what a toxic relationship looks like, be that the case.

sciencesama
0 replies
1d3h

We need bubbles as well !!

roland35
0 replies
1d5h

I love this idea! I think this would be a fun idea, except 1) not sure how it would handle pictures, and 2) there are probably some texts which should not be published!

Also - noto emoji is great. It is also nice to use for 3d printing/laser cutting

nico
0 replies
23h41m

Really cool, looks great

Also before going to the article, I thought it was about using an LLM to write a book with stories and characters inspired by the message history

mym1990
0 replies
1d2h

Very cool. A while ago I took a trip down memory lane with my partner to take a look at the first messages we sent each other, it was very neat and the memories definitely came back, even though it has been years since we met! A little bit like looking at a photograph and remembering the location and feeling in that moment.

larodi
0 replies
22h34m

I somehow initially thought that the iMessages went through some LLM which retold them in nice Brothers Grimm style. But from another perspective it also makes sense to have the originals, although the author is perhaps much better than me at writing messages which may one day be worth reading…

johncalvinyoung
0 replies
1d1h

This is fantastic. I've done some semi-similar things, have a tool I wrote for generating nice documents from Facebook Messenger conversations, for archiving important personal conversations. But I didn't take it so far as to generate a book yet! What a great idea!

jll29
0 replies
1d2h

In 2000 years, your books may be the only thing left to study how we lived in the 21st century, because all ephemeral information (tweets, chat, SMS, emails, digital photos on people's devices) may have vanished.

jjice
0 replies
1d3h

I wasn't familiar with BN Press for personal use. I've done some research into KDP and Lulu, but I've decided that ebooks would be my main focus after getting a Kindle and loving it. For a limited/test run, BN Press seems fantastic. $30 for 1300 pages is fantastic.

j1elo
0 replies
1d3h

I love the idea!

The thing I like the least though is the table of contents, it's so dry with just the months and years. Despite the skepticism I have about latest AI use and abuse, generating a one-liner from the contents of each month seems like it would be a fitting usage for it.

gregorymichael
0 replies
1d4h

Great project and a great writeup

federalbob
0 replies
1d3h

A French company does this: https://www.monlivresms.com/ (warning: the website is annoying)

demondemidi
0 replies
1d3h

I did this for my partner on valentines day back in 2015. 20,000+ messages in one HTML page. I never thought of binding a book, though.

I suspect this person's project will become very popular as a service. This is a great idea.

balu_
0 replies
1d4h

Nice idea, tried to use the source and build a quick example myself... Now I'm reminded of me disliking LaTeX (it just doesn't work)

astiela
0 replies
3h16m

This is amazing and I plan on doing this and giving it to my wife :)

achristmascarl
0 replies
1d2h

This is really cool, and also seems like it could be a great gift to a loved one.

I was playing around with Nomic Atlas (https://docs.nomic.ai/) recently and dumped a bunch of my chat history in there, and it was pretty interesting to visualize and browse my messages as clusters around topics.

Which leads me to think that you could bring the searchability of digital to the physical format by generating embeddings for the messages and running topic modeling on them; then, you could create an index of topics at the end of the physical book with page number references to messages about that topic.
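
The back-of-book index part can be sketched independently of how topics get assigned. The example below deliberately swaps in plain keyword matching where the embeddings and topic modeling would go, just to show the shape of the data. `build_topic_index`, `format_index`, and the sample topics are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_topic_index(
    messages: Iterable[Tuple[int, str]],
    topics: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Map each topic to the sorted page numbers where it comes up.

    `messages` holds (page_number, text) pairs. `topics` maps a topic name
    to keywords; a real pipeline would replace this keyword check with
    cluster labels derived from embeddings.
    """
    index = defaultdict(set)
    for page, text in messages:
        words = set(re.findall(r"\w+", text.lower()))
        for topic, keywords in topics.items():
            if words & {k.lower() for k in keywords}:
                index[topic].add(page)
    # Alphabetical topics and ascending pages, like a printed index.
    return {topic: sorted(pages) for topic, pages in sorted(index.items())}


def format_index(index: Dict[str, List[int]]) -> str:
    """Render one 'topic: page, page' line per topic."""
    return "\n".join(
        f"{topic}: {', '.join(str(p) for p in pages)}"
        for topic, pages in index.items()
    )
```

The page numbers would have to come from the typesetting step, for instance from labels written out while the book is compiled.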