Video quality seems really good, but the limitations are quite restrictive: "Our model encounters challenges when processing extremely long videos (e.g. 200 frames or more)".
I'd say most videos in practice are longer than 200 frames, so a lot more research is still needed.
At 24fps that's not even 10 seconds. Calling it extremely long is kinda defensive.
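For reference, the arithmetic behind that, as a quick sketch. Only the 200-frame figure comes from the paper's limitations quote; the frame rates are just common values and `clip_seconds` is a made-up helper name.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Duration in seconds of a clip with `frames` frames played back at `fps`."""
    return frames / fps


# 200 frames is the limit quoted from the paper; the frame rates are assumptions.
for fps in (24, 30, 60):
    print(f"{fps} fps: {clip_seconds(200, fps):.2f} s")
# 24 fps: 8.33 s
# 30 fps: 6.67 s
# 60 fps: 3.33 s
```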
The average shot length in a modern movie is around 2.5 seconds (down from 12 seconds in the 1930s).
For animations it's around 15 seconds.
Huh, I thought this couldn't be true, but it is. The first time I noticed annoyingly fast cuts was World War Z, for me it was unwatchable with tons of shots around 1 second each.
So sad they didn’t keep to the idea of the book. Anyone who hasn’t read this book should; it bears no resemblance to the movie aside from the name.
It's offtopic, but this is very good advice. As near as I can tell, there aren't any real similarities between the book and the movie; they're two separate zombie stories with the same name, and honestly I would recommend them both for wildly different reasons.
Funny - this is also a good description of I Am Legend.
And similarly, I, Robot, which is much more enjoyable when you realize it started as an independent murder-mystery screenplay that had Asimov’s works shoehorned in when both rights were bought in quick succession. I love both the movie and the collection of short stories, for vastly different reasons.
https://www.cbr.com/i-robot-original-screenplay-isaac-asimov...
Will Smith, a strange commonality in this tiny subgenre.
I didn’t rate the film really, but loved the book. Apparently it is based on / takes style inspiration from real first-hand accounts of WW2.
Its style is based on the oral history approach used by Studs Terkel to document aspects of WW2 - building a big picture by interleaving lots of individual interviews.
Making the movie or a documentary series like that would have been awesome.
I know two movies where the book is way better, Jurassic Park and Fight Club. I thought about putting spoilers in a comment to this one but I won't.
The Lost World is also a great book. It explores a lot of interesting stuff the film completely ignores. Like that the raptors are only rampaging monsters because they had no proper upbringing, having been born in the lab with no mama or papa raptor to teach them social skills.
But hey, at least we finally got the motorcycle chase (kind of) in "Jurassic World"! (It's my favourite entry in the series, BTW.)
Disagree, Jurassic Park was an amazing movie on multiple levels, the book was just differently good, and adapting it to film in the exact format would have been less interesting (though the ending was better in the book.)
I totally forgot the book ending! So much better.
I think, like the motorcycle chase that they borrowed from The Lost World in Jurassic World, they also have a scene with those tiny dinosaurs pecking someone to death.
Also The Godfather. No Country for Old Men I wouldn’t say is better but is fantastic.
Loved the audiobook
Batman Begins was already in 2005 basically just a feature-length trailer - all the pacing was completely cut out.
Yes, Nolan improves on that in later movies but he used to abuse it.
Another movie of his that commits this crime nonstop is The Prestige.
Yeah, the average may also be getting driven (e: down) by the basketball scene in Catwoman
[watches scene] I think you mean the average shot length is driven down.
The first time I noticed how bad the fast cuts we see in most movies are was when I watched Children of Men by Alfonso Cuarón, who often uses very long takes for action scenes:
https://en.wikipedia.org/wiki/Children_of_Men#Single-shot_se...
The textures of objects need to maintain consistency across much larger time frames, especially at 4k where you can see the pores on someone's face in a closeup.
I'm sure if you really want to burn money on compute you can do some smart windowing in the processing and use it on overlapping chunks and do an OK job.
Off topic: the clarity of pores and fine facial hair on Vision Pro when watching on a virtual 120-foot screen is mindblowing.
People won't be upscaling modern movies though.
Sure, but that represents a lot of fast cuts balanced out by a selection of significantly longer cuts.
Also, it's less likely that you'd want to upscale a modern movie, which is more likely to be higher resolution already, as opposed to an older movie which was recorded on older media or encoded in a lower-resolution format.
I believe the relevant data point when considering applicability is the median shot length to give an idea of the length of the majority of shots, not the average.
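To make the mean-vs-median point concrete, here is a small sketch. The shot lengths are invented for illustration (lots of quick cuts plus a couple of long takes) and are not measurements from any film; the 24 fps rate is also an assumption.

```python
from statistics import mean, median

# Hypothetical shot lengths in seconds, NOT real data.
shot_lengths = [1.0, 1.2, 1.5, 1.5, 2.0, 2.0, 2.5, 3.0, 20.0, 45.0]

avg = mean(shot_lengths)    # pulled up by the two long takes
med = median(shot_lengths)  # what a "typical" shot looks like

# Share of shots that fit in 200 frames, assuming 24 fps playback.
limit_s = 200 / 24
fitting = sum(1 for s in shot_lengths if s <= limit_s) / len(shot_lengths)

print(f"mean={avg:.2f}s median={med:.2f}s fit={fitting:.0%}")
# mean=7.97s median=2.00s fit=80%
```

With numbers like these the mean says almost nothing about how many individual shots fit under the limit; the median and the share of shots below the cutoff are the more useful figures.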
It reminds me of the story about the Air Force making cockpits to fit the elusive average pilot, which in reality fit none of their pilots...
10 seconds is what, about a dozen cuts in a modern movie? Much longer has people pulling out their phones.
:( "Our model encounters challenges when processing >200 frame videos"
:) "Our model is proven production-ready using real-world footage from Taken 3"
https://www.youtube.com/watch?v=gCKhktcbfQM
Freal. To the degree that I compulsively count seconds on shots until a show/movie has a few shots over 9 seconds; then they "earn my trust" and I can let it go. I'm fine.
I guess one can break videos into 200-frame chunks and process them independent of each other.
Not if there isn't coherency between those chunks
Easily solved, just overlap by ~40 frames and fade the upscaled last frames of chunk A into the start of chunk B before processing. Editors do tricks like this all the time.
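The windowing half of that is easy to sketch. This only computes which frame ranges go into each chunk; it says nothing about how the model or the fade would behave. `chunk_ranges` is a made-up helper, and the 200/40 defaults are the numbers from this thread, not anything the paper recommends.

```python
def chunk_ranges(total_frames: int, chunk_len: int = 200, overlap: int = 40):
    """Return (start, end) frame ranges, end exclusive.

    Consecutive ranges share `overlap` frames, and no range is longer
    than `chunk_len` frames.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be >= 0 and smaller than chunk_len")
    if total_frames <= 0:
        return []

    step = chunk_len - overlap  # how far each new chunk advances
    ranges = []
    start = 0
    while True:
        end = min(start + chunk_len, total_frames)
        ranges.append((start, end))
        if end == total_frames:
            break
        start += step
    return ranges


print(chunk_ranges(500))
# [(0, 200), (160, 360), (320, 500)]
```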
And now you end up with 40 blurred frames for each transition.
'before processing'
Decent editors may try that once, but they will give up right away because it will only work by coincidence.
There has to be a way where you can do it intelligently in chunks and reduce noise along the chunk borders.
Moreover, I imagine that further research and more compute will do a lot more, smarter and quicker.
Don't forget people had Toy Story-comparable games a decade or so after it was originally rendered at 1536x922.
Or upscale every 4th frame for consistency. Upscaling in between frames should be much easier.
At 30fps, which is not high, that would mean chunks of less than 7 seconds. Doable but highly impractical to say the least.
7s is pretty alright, I've seen HLS chunks of 6 seconds, that's pretty common I think.
6s was adopted as the "standard" by Apple [0].
For live streaming it's pretty common to see 2 or 3 seconds (reduces broadcast delay, but with some caveats).
[0] https://dev.to/100mslive/introduction-to-low-latency-streami...
The Wright Brothers' first powered flight lasted 12 seconds
Source: https://www.nasa.gov/history/115-years-ago-wright-brothers-m....
Our invention works best except for extremely long flight times of 13 seconds
Fascinating how researchers put out amazing work and then claim that videos consisting of more than 200 frames are "extremely long".
Would it kill them to say that the method works best on short videos/scenes?
Tale as old as time, in graphics papers it's "our technique achieves realtime speeds" and then 8 pages down they clarify that they mean 30fps at 640x480 on an RTX 4090.
I think it encounters memory leaks and the memory usage goes through the roof.
If I'm understanding the limitations section of the paper correctly, the 200-frame figure depends on the scene; it may be worse or better.
Break into chunks that overlap by, say, a second, upscale separately and then blend to reduce sudden transitions in the generated details to gradual morphing.
The details changing every ten seconds or so is actually a good thing; the viewer is reminded that what they are seeing is not real, yet still enjoying a high resolution video full of high frequency content that their eyes crave.
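The overlap-and-blend step described above could look roughly like this. It is a minimal sketch in plain Python: frames are stand-ins (flat lists of pixel values), `crossfade` and `stitch` are made-up names, and the linear ramp is just one possible blend, not something taken from the paper.

```python
def crossfade(tail_a, head_b):
    """Blend two equally long runs of frames covering the same source frames.

    The weight shifts linearly from chunk A to chunk B across the overlap.
    """
    if len(tail_a) != len(head_b):
        raise ValueError("overlap regions must have the same number of frames")
    n = len(tail_a)
    blended = []
    for i, (frame_a, frame_b) in enumerate(zip(tail_a, head_b)):
        w = (i + 1) / (n + 1)  # weight of chunk B, strictly between 0 and 1
        blended.append([(1 - w) * a + w * b for a, b in zip(frame_a, frame_b)])
    return blended


def stitch(chunk_a, chunk_b, overlap):
    """Join two upscaled chunks whose last/first `overlap` frames coincide."""
    if overlap == 0:
        return chunk_a + chunk_b
    return (
        chunk_a[:-overlap]
        + crossfade(chunk_a[-overlap:], chunk_b[:overlap])
        + chunk_b[overlap:]
    )


# Toy example: one-pixel frames, chunk A all 0.0 and chunk B all 1.0.
a = [[0.0] for _ in range(5)]
b = [[1.0] for _ in range(5)]
print([f[0] for f in stitch(a, b, overlap=3)])
# [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
```

Blending pixel values like this only hides the seam; if the two chunks hallucinate different textures, the overlap will show a gradual morph between them, which is the effect the comment describes.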
Wonder what happens if you run it piece-wise on every 200 frames. Perhaps it glitches in the interface.
It's good enough for "enhance, enhance, enhance" situations.
Well, there go my dreams of making my own Deep Space Nine remaster from DVDs.
If you're using this for existing material you just cut into <=8 second chunks, no big deal. Could be an absolute boon for filmmakers, otoh a nightmare for privacy because this will be applied to surveillance footage.