I see from the comments that I'm far from the only one using AI to summarise videos before deciding whether to watch them.
Reminds me of the meme "why spend 10 minutes doing something when you can spend a week automating it" - i.e. "why spend an hour watching a talk when you can spend 5 hours summarising it with AI and debating the summary's accuracy".
This sounds silly, but the potential gains from learning AI summarisation tooling/flows are large, which is why it warrants discussion. Learning how to summarise effectively might save hours per week and improve decisions about which sources deserve our limited time/attention.
I feel like I'm missing some boat, but I'm not sure what boat it is. These "AI" systems seem very superficial to me, and they give me the same feeling as VR does. When I see VR be some terrible approximation of reality, it just makes me feel like I'm wasting my time in it when I could go experience the real thing. Same with AI "augmentation" tooling. Why don't I just read a book instead of getting some unpredictable (or predictably unpredictable) synopsis? It's not like there's too much specific information there. These tools are just exploding the amount of unspecific information. Who has ever said: "hey, I have too much information for building this system or learning this topic"? Basically no one.
It's just going to move everything to the middle of the Bell curve, leaving the wings to die in obscurity.
If you know a book's worth reading, going ahead and reading it works well. But for a lot of books/talks there's competition for time - e.g. my bookshelf has 20 half-read books (this is after triaging out the ones that aren't worthy of my time) - so any tooling that can help better determine where to invest tens or hundreds of hours of my time is a win.
Regarding accuracy, I think we're at a tipping point where ease of use and accuracy are starting to make it worth the effort. For example, Bard seems to know about YouTube videos (just a couple of months ago you'd have to download the video -> audio to text -> feed into an LLM). The combination of greater accuracy and much greater ease of use makes it worth considering.
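For anyone curious, that older pipeline is simple enough to sketch. This is a minimal illustration assuming yt-dlp, the open-source whisper package, and the OpenAI Python client; the URL, file names, and model choice are all placeholders, not a recommendation:

    # Sketch of the old "download -> audio to text -> feed into an LLM" pipeline.
    # Assumes yt-dlp, openai-whisper, and openai are installed; names are examples.
    import whisper
    from yt_dlp import YoutubeDL
    from openai import OpenAI

    URL = "https://www.youtube.com/watch?v=VIDEO_ID"  # talk to summarise

    # 1. Grab just the audio track as mp3 (needs ffmpeg on PATH).
    opts = {
        "format": "bestaudio",
        "outtmpl": "talk.%(ext)s",
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
    }
    with YoutubeDL(opts) as ydl:
        ydl.download([URL])

    # 2. Transcribe locally with Whisper.
    transcript = whisper.load_model("base").transcribe("talk.mp3")["text"]

    # 3. Hand the transcript to an LLM for the summary.
    # (Long talks may need the transcript chunked to fit the context window.)
    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; any capable chat model works
        messages=[
            {"role": "system", "content": "Summarise this talk transcript as bullet points."},
            {"role": "user", "content": transcript},
        ],
    )
    print(resp.choices[0].message.content)

Bard/Gemini handling the video natively collapses all of that into a single prompt, which is the ease-of-use jump I mean.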
LLM accuracy is so bad, especially in summarization, that I now have to fact-check Google search results, because they've been repeatedly wrong about things like the hours restaurants are open.
There's a huge difference between summarizing a stable document that was part of the training data or the prompt, and knowing ephemeral facts like restaurant hours.
Technically true statement. If you're offering it to imply that the GP bears responsibility for knowing what document was in the training data and what's not, I have to quibble with you.
Knowing its shortcomings should be the responsibility of the search app, which is currently designed to give screen real estate to the wrong summary of an ephemeral fact. Otherwise, users will start to lose trust.
It's because they don't understand language. You may have been misled by their ability to generate language.
Is it that hard to determine that a book is worth reading where worth is measured from your perspective? It's usually pretty easy, at least for technical books. Fiction books are another story, but that's life. Having some unknown stochastic system giving me a decision based upon some unknown statistical data is not something I'm particularly interested in. I'm interested in my stochastic system and decision making. Trying to automate life away is a fool's errand.
I'm a huge believer in doing plenty of research about what to read. The simple rationale: it takes a tiny amount of time to learn about a book relative to the time it takes to read it. Even when I get a sense a book is bad, I still tend to spend at least a couple of hours before making the tough call not to bother reading further (I handled one literally 5 minutes ago that wasted a good few hours of my life). I'm not saying AI summaries solve this problem entirely, but they're just one additional avenue for consultation that might only take a minute or two and potentially save hours. It might improve my hit rate from - I dunno - 70% to 80%. Same idea for videos/articles/other media.
I think the more you outsource "what is worth my time" the less you're actually getting an answer about what's worth YOUR time. The more you rule out the possibility of surprise up front, the less well-informed your assumption about worth can possibly be.
There are FAR too many dimensions, like word choice, sentence style, allusion, etc., that resist effective summarization.
I get where you're coming from and definitely vet books in similar ways depending on the subject, but I also feel like this process is pretty limited in some ways, and appeals to some sort of objective third party that just doesn't exist. If you really want to know or have an opinion on a work/theory/book, at the end of the day you have to engage with it yourself on some level.
In graduate school for example, it was pretty painfully obvious that most people didn't actually read a book and come to their own conclusions, but rather read summaries from people they already agreed with and worked backwards from there, especially on more theoretical matters.
I feel like in the long term this just leads to a person superficially knowing a lot about a wide variety of topics, but never truly going deep and gaining real understanding of any of them - it's less "knowing" and more the feeling of knowing.
Again, not saying this in an accusatory way, because I totally do engage in this behavior too - I think everyone does to some degree - but I just feel the older I get, the less valuable this sort of information is. It's great for broad context and certain situations, I suppose, but in a lot of areas where I consider myself an expert, I would probably strongly disagree with the summaries given on those subjects, and they also tend to miss finer details or qualifying points that are addressed in the proper context.
IMHO, the good old method of skimming through the table of contents, reading the preface and perhaps the first couple of chapters is going to be a much higher fidelity indicator of whether a book is worth your time than reading an AI generated summary.
I'm trying to understand this comment, because I couldn't disagree more. It is the absolute explosion of available data sources that has me wanting to be much more judicious with where I spend my time reading/watching in the first place.
Your comment was interesting to me because I feel like I agree with one of its main sentiments: that AI-generated content all kinda "sounds the same" and gives a superficial-feeling analysis. But that is why I think AI is a fantastic tool for summarizing existing information sources, so I can see if I want to spend any more time digging in to begin with.
Relying on glorified matrices (that's what machine learning is) for world data curation is just begging to handicap yourself into a cyborg dependent on a mainframe computer's implementation. An implementation and design that is rarely scrutinized for safety and alignment features.
Why not just make your brain smarter, instead of trying to cram foreign silicon components into your skull?
Why not both?
Because maximizing both the biological vectors of self-improvement and the computing-based avenues of skill acquisition is a multi-objective optimization problem: optimizing one de-optimizes the other. Biology and computers conflict with each other, in fact. So, at best, you have to reach for a Pareto frontier.
And, it turns out, technology can't be trusted, as there is always some sort of black box associated with its employment. Formally, there is always a comprehension involved when it comes to the development and integration of technology into human life. You can't really trust this stubborn built-in feature of technological and economic success if you don't pierce through its secrets (knowledge is the power to counteract cryptographic objects). After all, it could be a malicious trojan horse that "basic common sense" insists on us all using for "bettering" our daily lives.
A very unfriendly artificial intelligence is trying to sneak through civilization for its own desires. And you're letting it just pass on by, as a result of your compliance with the dominant narrative and philosophy of capitalist economics.
I was thinking the other day: Star Trek computers make a lot of sense if they are working with our current level of AI.
You can talk to it, it can give you back answers that are mostly correct to many questions, but you don't really trust it. You have real people pilot the ship, aim and fire weapons, and anything else important.
And nobody in Star Trek thinks the ship computer is sentient. On the other hand, the holodeck sometimes malfunctions and some holodeck character (like Moriarty) becomes sentient running on just a subset of the ship computer. That suggests sentience (in the Star Trek universe) is a property of the software architecture, not hardware.
Firstly, they had unlimited energy and replicators - which means they could make whatever hardware they wanted.
And they also had bio-neural circuits. And photonic chips.
So, hardware was already way ahead of software.
All this goes to show that, in the real world, the actual science (and fiction) around material sciences was already quite advanced compared to software.
I had a conversation with a friend where he suggested that he had had a broad range of experiences just from gaming. I think the context was how experiences in life can expand you - something like that.
The whole premise bothered me though.
I can remember a bike ride where I was experiencing the onset of heat stroke and had to make quick decisions to perhaps save my life.
I remember, decades ago, being lost in Michigan's Upper Peninsula with the wife, on apparently some logging road, the truck getting into deeper and deeper snow as we proceeded, until I made the decision to turn around and go back the way we came lest we become stranded in the middle of nowhere.
I remember having to use my wits, make difficult decisions while hitchhiking from Anchorage, Alaska to the lower 48 when I was in my early twenties....
The actual world, the chance of actual death, strangers, serendipity ... no amount of VR or AI really compares.
You're not wrong, but I also think the problem predates video games. Films, novels and even religious texts all are scrutinized for changing people's perspective on life. Fiction has a longstanding hold on society, but it inherently coexists with the "harsh reality" of survival and resource competition. Introducing video games into the equation is like re-hashing the centuries old Alice in Wonderland debate.
Playing video games all day isn't an enriching or well-rounded use of time, but neither is throwing yourself into danger and risk all the time. The real world is a game of carefully-considered strategy, where our response to hypothetical situations informs our performance during real ones. Careful reflection on fiction can be a philosophically powerful tool, for good or bad.
100 percent on things moving to the center of the curve.
For now, that’s not a bad thing if you need to know what the average information is.
As time goes by it might not be a good thing.
I just read the book "Robust Python". My overall reaction is that the book could have been written at half the length and still be valuable for me. I can't stop thinking that if I could ask an LLM to summarize each chapter for me, I could still "read" the whole book in the manner the author outlines but save a ton of time.
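To make that concrete, here's a rough sketch of chapter-by-chapter summarising; the directory layout, model name, and prompt are all illustrative assumptions, not a tested workflow:

    # Sketch: "read" a book via per-chapter LLM summaries.
    # Assumes each chapter is a plain-text file; all names here are illustrative.
    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    for chapter in sorted(Path("chapters").glob("*.txt")):
        # (Very long chapters may need splitting to fit the context window.)
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model choice
            messages=[
                {"role": "system",
                 "content": "Summarize this book chapter in about ten bullet "
                            "points, keeping any code advice concrete."},
                {"role": "user", "content": chapter.read_text()},
            ],
        )
        print(f"== {chapter.stem} ==\n{resp.choices[0].message.content}\n")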
What is your workflow for this, if you don't mind me asking?
If you’re interested I did a YouTube video and short blog post about it
https://www.jerpint.io/blog/yougptube/
https://www.youtube.com/watch?v=WtMrp2hp94E
This is pretty cool. Would it be possible to just stream the audio directly into Whisper, maybe using something like VLC, at 2x play speed to get the summary faster?
Probably. The OpenAI API has gotten a lot better since I made that post, though if you stream audio at 2x speed you have to expect a drop in quality, since on average most clips Whisper is trained on are not at 2x.
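If you want to test whether the quality drop is acceptable, the speed-up itself is just one ffmpeg filter. A sketch (file names are illustrative):

    # Sketch: speed audio up 2x, then transcribe, to roughly halve Whisper's work.
    # Caveat as noted above: Whisper saw little 2x-speed audio in training.
    import subprocess
    import whisper

    # ffmpeg's atempo filter changes tempo without shifting pitch.
    subprocess.run(
        ["ffmpeg", "-y", "-i", "talk.mp3", "-filter:a", "atempo=2.0", "talk_2x.mp3"],
        check=True,
    )
    print(whisper.load_model("base").transcribe("talk_2x.mp3")["text"])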
Try out www.askyoutube.ai!
My approach wasn't fancy: I just asked Bard (aka Gemini). I was drawn to Bard/Gemini for this since the source video is on YouTube, so I figured Google would better support its related service (although that was an arbitrary hunch).
https://imgur.com/a/psb64IP
This is exactly why I built https://www.askyoutube.ai. It helps you figure out if a video has the answer you want before you spend time watching it. It does this by aggregating information from multiple videos in one go.
I don't think it completely replaces watching videos in some cases but it definitely helps you skip the fluff and even guides you to the right point in the video.
Do you transcribe the videos or use the captions? I ask because GPT-4 can already do the latter.
It can be either, depending on the mode. I don't think GPT-4 can already do the latter, though.
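If you want the captions route without any product in between, you can pull the transcript yourself and paste it into whatever model you like. A sketch using the youtube-transcript-api package (a real library, though treat the exact call shape as an assumption; it has changed across versions):

    # Sketch: use YouTube's own captions instead of transcribing the audio.
    # Assumes the youtube-transcript-api package; its API varies by version.
    from youtube_transcript_api import YouTubeTranscriptApi

    video_id = "oSCRZkSQ1CE"  # the ID from a youtube.com/watch?v=... URL
    entries = YouTubeTranscriptApi.get_transcript(video_id)  # [{"text", "start", "duration"}, ...]
    transcript = " ".join(e["text"] for e in entries)
    print(transcript[:500])  # then feed `transcript` to your LLM of choice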
What tool do you use to summarize video?
(Since it's a YouTube video) I used Bard/Gemini: https://imgur.com/a/psb64IP
I have no idea if it's the best (or even a good) tool. Other commenters suggest some other tools (for both text summaries and condensed video summaries - a sort of 'highlights reel'):
https://news.ycombinator.com/item?id=39435930
https://news.ycombinator.com/item?id=39435964
(Little self-plug) I made a tool that’s pretty relevant
https://www.platoedu.org/videos/oSCRZkSQ1CE/watch
It's not really giving summaries, but it gives topic/section timestamps and highlights what was discussed.
(Main focus is actually making mini-courses off of YouTube videos but I found the section summaries really useful for figuring out which parts to watch)
Perhaps consider simply reading the description for an accurate summary.
From the description:
Sure, it's an accurate summary, but is it at the granularity or specificity that you want? LLM summaries let you move around the latent space of summaries, and you probably don't agree with the one chosen for the YouTube description.
In this case the video description contains a useful Abstract. AI summaries can offer additional value though, going into more or less detail (as desired) and allowing you to ask follow-up questions to drill into anything potentially of interest.
What AI tool do you use to summarize?
I've A/B tested this with webinars, and the tools I've tried tend to miss some really valuable/interesting stuff even when I give them the full transcript. Same goes for when I try to use ChatGPT or other tools for full interactive analysis: even when I basically hand it what I'm looking for (as if I hadn't watched the video), it will leave out the critical information.
1. Author spends a week producing a video when writing an article would have taken a day.
2. Viewer spends hours summarizing the video to an article so they don't have to watch it.
P R O G R E S S
I made a tool that might be interesting for people here!
https://www.platoedu.org/videos/oSCRZkSQ1CE/watch
It's not really giving summaries, but it gives topic/section timestamps and highlights what was discussed.
(for example: The Transformer Model (21:06 - 24:48) - Introduction of the Transformer model as a more efficient alternative to recurrent models for language processing)
The main focus is actually creating Anki-like spaced-repetition questions/flashcards for videos and lectures you watch, to retain knowledge, but I found the section information quite helpful for finding which parts of the video contain the info relating to specific topics/concepts.
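Not my actual implementation, but the general idea is simple enough to sketch: give a model the timestamped captions and ask it to carve out sections. Everything here (library call shape, model, prompt) is illustrative:

    # Sketch of the general idea (not the tool's real implementation):
    # turn timestamped captions into topic sections via one LLM call.
    from openai import OpenAI
    from youtube_transcript_api import YouTubeTranscriptApi

    entries = YouTubeTranscriptApi.get_transcript("oSCRZkSQ1CE")
    stamped = "\n".join(
        f"[{int(e['start']) // 60}:{int(e['start']) % 60:02d}] {e['text']}"
        for e in entries
    )

    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model choice
        messages=[
            {"role": "system",
             "content": "Split this timestamped transcript into topic sections. "
                        "For each: a title, start-end times, and a one-line summary."},
            {"role": "user", "content": stamped},
        ],
    )
    print(resp.choices[0].message.content)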
If you like summaries, you'll probably love de-summaries (WIP): https://socontextual.com/