What is a good GUI interface for FFmpeg?
FFmpeg is wonderful software. Growing up as a Windows user in the early 2000s, devices were far more picky than they are today about which video codecs they'd support. It was a non-trivial task as an 11yo trying to convert DivX .avis into an MP4 my old iPod Video could understand. Discovering ffmpeg and finding that someone was offering for free what I otherwise could only find under mountains of crappy shareware was a real watershed moment.
20 years later it's still a go-to. Great tool.
Oh god, DivX.. I had almost forgotten.
Let's write a new video compression algorithm that is super efficient - great
This lets us compress movies so they can fit on cheap CDs, instead of DVDs. - great
We can now give those CDs away with movies on them - great
And then every time someone puts one in a DivX player they can pay us to watch/rent it, instead of having to drive to Blockbuster - wait, what?
It's easy, we'll just use a phone line that everyone has right near their entertainment center in their living room to phone home at night and send the data of what movies you watched and how many times. - what are they smoking?
I think you're confusing DivX the codec and DIVX the disc media platform.
Afaict these aren't related.
Let's not forget to add Xvid to this circus, having X in your name was all the rage back in the early 2000s.
The MPEG-4 Visual space was kind of a mess.
Compared to 2024, when of course nobody would throw away an enormous amount of existing brand awareness to have an X for their product name
I prefer extortion, the X makes it sound cool
What a throwback! Only now am I realizing that Xvid is the OSS counterpart to Divx.
This is one part of 2000s tech I'm happy to have mostly forgotten about.
That's just DivX backward.
Xtreme marketing was absolutely going on back then - remember the Mountain Dew ad campaigns? Yeah....
How in the world were they both able to use that name?
DIVX the physical disc distribution company was a famous flop.
DivX the video codec started out as an unlicensed hacked version of Microsoft’s MPEG-4 v3 codec binary. Since it wasn’t a commercial product and was legally dubious, the author called it DivX ;-) with the smiley in the name.
When it became unexpectedly popular during the dot-com boom time, someone of course set up a DivX company that dropped the smiley, eventually rewrote the codec, and presumably acquired the trademark from the defunct DIVX (or just took it over if the registration expired, I don’t know).
> and presumably acquired the trademark from the defunct DIVX (or just took it over if the registration expired, I don’t know).
And then, iirc, this is where Xvid came into being. I think it was the same codec, just rewritten and given back to the open source world, hence the name: "DivX" spelled backwards.
They are somewhat related... the codec's title styling included a winking smiley face emoji -- "DivX ;-)" -- as a tongue-in-cheek nod to the failed video disk technology.
To make the waters murkier, there were also DVD players with DivX (MPEG4) codec support.
You could encode a CD sized video file, burn it, and watch it.
They're not related directly, but the codec was specifically named in reference to the media format.
DivX disks were DVDs.
I've always said ffmpeg is one of the new wonders of the world. It powers so much, is so complex, so irreplaceable. It's crazy we get to enjoy it for free. I use it to encode my movies and tv shows to AV1/720p.
& $ffmpegPath -i $_.FullName -r 23.976 -vf scale=1280:720 -c:v libsvtav1 -pix_fmt yuv420p10le -crf 30 -preset 10 -g 300 -c:a libopus -b:a 96k -ac 2 -c:s copy -map 0 $destPath
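For readers who want to run the same encode over a whole folder, here is a minimal sketch in Python that rebuilds the argument list from the command above. The function names, the `.mkv` output container, and the list of accepted file extensions are assumptions made for illustration; only the ffmpeg flags come from the comment.

```python
from pathlib import Path


def av1_720p_command(src: Path, dest: Path, ffmpeg: str = "ffmpeg") -> list[str]:
    """Build the argument list for one file. Nothing is executed here."""
    return [
        ffmpeg, "-i", str(src),
        "-r", "23.976",                      # output frame rate
        "-vf", "scale=1280:720",             # resize to 720p
        "-c:v", "libsvtav1", "-pix_fmt", "yuv420p10le",
        "-crf", "30", "-preset", "10", "-g", "300",
        "-c:a", "libopus", "-b:a", "96k", "-ac", "2",
        "-c:s", "copy",                      # keep subtitle streams as they are
        "-map", "0",                         # carry over every stream of input 0
        str(dest),
    ]


def plan_batch(src_dir: Path, dest_dir: Path,
               suffixes: tuple[str, ...] = (".mkv", ".mp4")) -> list[list[str]]:
    """Return one command per video file in src_dir (hypothetical helper)."""
    jobs = []
    for src in sorted(src_dir.iterdir()):
        if src.suffix.lower() in suffixes:
            # Assumption: always write Matroska so copied subtitles fit.
            jobs.append(av1_720p_command(src, dest_dir / (src.stem + ".mkv")))
    return jobs
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. Keeping the command as a list avoids shell quoting problems with file names that contain spaces.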
Why AV1?
I convert much to 720p PS3 compliant H264, for maximum device compatibility. I take an external drive with these files with me when I travel, and in 99% of the cases plugging it into hotel TVs just works.
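A rough sketch of what such a compatibility-first encode could look like. The comment does not give its actual settings, so the profile, level, CRF, and audio bitrate below are assumptions picked as commonly used conservative values, not the poster's recipe.

```python
def h264_compat_command(src: str, dest: str, height: int = 720, crf: int = 20) -> list[str]:
    """Build an ffmpeg command for a widely playable H.264/AAC MP4 (assumed settings)."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",          # keep aspect ratio, force an even width
        "-c:v", "libx264",
        "-profile:v", "high", "-level", "4.1",  # assumption: conservative profile/level
        "-pix_fmt", "yuv420p",                # 8-bit 4:2:0, which old decoders expect
        "-crf", str(crf), "-preset", "medium",
        "-c:a", "aac", "-b:a", "160k", "-ac", "2",
        "-movflags", "+faststart",            # move the index to the front of the file
        dest,
    ]
```

The trade-off discussed in the thread shows up here directly: H.264 at 8-bit 4:2:0 plays almost anywhere, while AV1 would give smaller files at the cost of device support.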
I only play these files on my second monitor as background noise when I'm coding boring stuff. Day of The Dead is 696MB, and looks great on my monitor. https://files.catbox.moe/cm88w0.jpg I don't need more for this use case. 577 movies so far, at 466GB! God bless the AV1 team!
He he, I have a secret aversion to movies exceeding 1GB :) I know what you mean! I wonder what video fidelity I would gain moving to something more modern. It's just really convenient being able to playback anywhere.
> I wonder what video fidelity I would gain moving to something more modern.
At the very least, 1080p instead of 720p.
I've found av1 and h265 handle some things better than h264 at almost any bitrate. Usually around film grain compression.
Why AV1 in 720p? Is it just for watching on smaller devices, or for archiving too? Aren't AV1 encoders still very slow compared to H.265?
It's also a testament to the power of open source software. ffmpeg is in a significant amount of software touching audio/video in some way, and we are all enriched because everyone is free to use it.
I remember the default media player that shipped on Windows was absolutely terrible because it could only play a very limited number of file formats, none of which were actually used much by movie files found in the wild. If you wanted to actually play a video, you had to try your luck and choose among several 3p "codec packs" half of which were probably loaded with malware.
People who have always lived in a world with great software like VLC and MPV and ffmpeg underestimate how hard it was to actually play a video file on your computer back in 2000.
K-Lite Mega Codec Pack!
I remember always installing that. GOM Player was also a thing.
That and Media Player Classic
Or ffdshow
Codecs packs... Ugh.
We had old software at work that would only take a specific file format, and of course the files came from multiple sources, which made it really painful for the users.
We transformed a relatively decent desktop into a ffmpeg transcoding machine, which would monitor files incoming from a samba share and it would output the converted file into another samba share.
It was just a bunch of scripts and cron jobs but it worked much better than I anticipated and it was mostly maintenance-free.
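The scripts themselves are not shown, so the following is only a sketch of how a cron-driven watch folder like this might work. The directory layout, the target extension, and the "skip files modified very recently" rule are all assumptions.

```python
import subprocess
import time
from pathlib import Path


def pending_jobs(inbox: Path, outbox: Path, target_ext: str = ".mp4",
                 min_age_s: float = 60.0) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs that still need converting."""
    jobs = []
    now = time.time()
    for src in sorted(inbox.iterdir()):
        if not src.is_file():
            continue
        # Assumption: a file touched very recently may still be uploading.
        if min_age_s > 0 and now - src.stat().st_mtime < min_age_s:
            continue
        dest = outbox / (src.stem + target_ext)
        if not dest.exists():
            jobs.append((src, dest))
    return jobs


def transcode_command(src: Path, dest: Path) -> list[str]:
    # -n makes ffmpeg refuse to overwrite an existing output file.
    return ["ffmpeg", "-n", "-i", str(src), str(dest)]


def run_once(inbox: Path, outbox: Path) -> None:
    """One pass over the inbox; meant to be called from cron."""
    for src, dest in pending_jobs(inbox, outbox):
        subprocess.run(transcode_command(src, dest), check=False)
```

Because each pass only looks at what is missing in the output share, the job is safe to rerun after a crash, which may be part of why such a setup ends up nearly maintenance-free.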
I am always surprised how much it's taken over video processing tasks (and maybe how Apple lost the plot with Quicktime and such).
Would def be interesting if someone could write up a history of the project so far. I wonder how much industry input into the OSS commits there are (like MS/IBM into linux, postgres, etc)
I remember back in 2007, during a far away vacation, using ffmpeg on a Windows netbook to convert Star Trek episodes into a format my little mp3 player could understand and play on its little screen (320x240?). It was amazing that even on the 600-900 MHz CPU, those videos transcoded in a matter of minutes.
The greatest addition to FFmpeg in the recent past was the addition of large language models translating my "ffmpeg command to mix audio file onto video file" into actually executable FFmpeg commands.
Being cheeky of course here. FFmpeg is great. An AI assistant was what I needed to execute my ~12 FFmpeg commands per year though, with ease and speed.
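For reference, one common shape of the "mix an audio file onto a video file" command, sketched as a Python helper. It assumes the video already has its own audio track and that the extra track should be quieter than the original; the function name and the default volume are illustrative.

```python
def mix_audio_command(video: str, audio: str, output: str,
                      music_volume: float = 0.5) -> list[str]:
    """Blend an extra audio file into a video's existing audio track."""
    graph = (
        f"[1:a]volume={music_volume}[bg];"             # turn the added track down
        "[0:a][bg]amix=inputs=2:duration=first[aout]"  # stop when the video's audio ends
    )
    return [
        "ffmpeg", "-i", video, "-i", audio,
        "-filter_complex", graph,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy",        # video is untouched, only audio is re-encoded
        "-c:a", "aac",
        output,
    ]
```

If the video has no audio stream, `[0:a]` will not resolve and ffmpeg will report an error; in that case mapping `1:a` directly is the simpler route.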
Oh, you nailed it. I was so excited the moment I realized I didn't have to struggle with that complicated CLI
I mean I've only done it once with ffmpeg but it felt so good
I did it just today to convert a Filipino movie I yt-dlp'd from WebM to MP4. Teen me would be so excited by this excellent new world of ours.
But but, yt-dlp can download it to .MP4 or .mkv already? It invokes ffmpeg during its final steps.
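A small sketch of asking yt-dlp for an MP4 directly, so no separate ffmpeg step is needed afterwards. The helper name and the generic format selector are assumptions; `--merge-output-format` is the yt-dlp option that sets the container used when video and audio are merged.

```python
def ytdlp_mp4_command(url: str, selector: str = "bestvideo+bestaudio/best") -> list[str]:
    """Build a yt-dlp command that merges the chosen streams into an MP4 file."""
    return [
        "yt-dlp",
        "-f", selector,                   # which video/audio formats to fetch
        "--merge-output-format", "mp4",   # container for the merged result
        url,
    ]
```

Whether the merge succeeds without re-encoding depends on the codecs the site serves, so a selector that prefers MP4-friendly streams can still be useful.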
I always do
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best"
Not sure if there is a better way, ideally I would just like it to always default to that.
LLMs are the best thing to happen to command line tools. I get so much productivity out of them now.
It's frustrating when they give you a plausible answer that doesn't exist though.
it is, but typing "no, `--do-x` is not a valid option, please only give me real commands" back into a chat window is a hell of a lot less frustrating than sifting through docs that may or may not even exist.
Seems risky to start running commands on your machine that a hallucinating AI spit out at you without understanding them, but I guess they can at least let you know what to look up so you can double check what the command will do.
ChatGPT / LLMs really are great with ffmpeg. Not too long ago I wrote a blog post outlining an ffmpeg command I needed (mostly for my own future reference). It almost feels quaint now, given that you can get the same kind of explanation from GPT. I bet if I pasted the command in my article into ChatGPT I would get an explainer that's basically identical to what I wrote.
I did this yesterday and first the command it gave me was wrong.
GPT4?
Paste the man page into the prompt
Makes you wonder why developers had to invent such shitty CLI UX in the first place instead of natural language syntax, keywords, and arguments.
Probably because:
1. ffmpeg exposes all of its options through the CLI, and there are a lot of options. So it's probably always going to be completely undiscoverable. It really needs a GUI to be usable, but that's a project in itself (I guess the project is Handbrake).
2. They probably didn't put a lot of work into the UX of the CLI since it's an open source project.
3. Backwards compatibility.
> So it's probably always going to be completely undiscoverable. It really needs a GUI to be usable, but that's a project in itself (I guess the project is Handbrake).
I don't buy this. A TUI is completely possible, e.g. LazyGit, Htop, etc, countless tools indicate you can get pane-based UIs going in the terminal; the FFmpeg team has simply never made such a thing a priority.
But even without a TUI, the most basic use-cases are well-known after over 15+ years of existence, simple prompt-based wizards, i.e. "git add -p", should be offered; again, a matter of priority rather than being intractable.
A year ago or so someone posted a subscription service they made for composing the pipeline graph visually
Natural language syntaxes universally suck. A faux English syntax isn't easier to use if you don't know which English in particular will be accepted. A complex CLI interface fundamentally can't be easy to use. A GUI can fix that by making things discoverable and by integrating the documentation into the UI, but the ffmpeg devs presumably see that as someone else's job (and there have been people to step up).
I feel the same about sed and awk :)
There's a low-hanging fruit that I think would make ffmpeg more helpful for regular people.
There's a million terrible websites that offer file conversion services. They're ad-ridden, with god-knows-what privacy/security postures. There's little reason for users to need to upload their files to a third-party when they can do it locally. But getting them to download fiddly technical software is tough - and they're right to mistrust it.
So, there's a WASM version of ffmpeg, already working and hosted at Netlify [1]. It downloads the WASM bundle to your browser and you can run conversions/transformations as you wish, in your browser. Sandboxed and pretty performant too!
If this tool a) was updated regularly b) had a nicer, non-CLI UI for everyday users and c) was available at an easily-Googlable domain name - it would solve all the problems I mentioned above.
Or you could download HandBrake?
Browsers are annoying. They constrict the designer, they constrict the user, they're overly complicated, slow, bloated... I don't know why people keep pushing them to do things they are bad at.
I wish 20 years ago we'd made a concerted effort to make Java suck less. We'd have the universal applications everyone wants but nobody wants to put effort into. But the web was new-ish, and people didn't realize that hypertext document viewers would become an entire application platform and mini OS.
What I'd really like to see is something like FlatPak, but for all platforms. Basically it would be containerized GUI apps, but one repository for one app that serves all platforms. On Android, MacOS, Windows, etc, you would run your "flatpak add https://some/repository/ my-app && flatpak pull my-app && flatpak run my-app" (but in a GUI, like an App Store you control). And that would pull the image for your platform and run it. Since it's containerized, you get all the dependencies, it's multi-arch, & you control how it executes in a sandbox. You could use the same programming language per platform, or different languages; same widgets, different widgets; it wouldn't matter because each platform just downloads and runs an image for that platform. This wouldn't stop us from having/making "a better Java", but it would make it easier to support all platforms, distribute applications securely, update them, run them in a sandbox, etc. Imagine being able to ship a single app to Windows and iOS users that's just a shell script and 'xdialog'. Or if you prefer, a single Go app. Or a sprawling Python or Node.js app. Whatever you want. The user gets a single way to install and run an app on any platform, and developers can support multiple platforms any way they want. No more "how do I develop for iOS vs Windows"; just write your app and push your container.
This proposal is to offer competition to web-based conversion websites. If users are willing and able to find Handbrake and download it, it can work for them. But everyday users are right to distrust software downloaded from the Internet.
Many users are in environments where it's not possible to download new software (schools, work places, universities).
The browser has its disadvantages, but it is the most widely-deployed sandboxed execution environment providing incredibly easy distribution of software.
I'm quite sure that the reason conversion websites are popular is Google. I don't know if they're just very good at SEO or search engines have a specific policy to favor web-based solutions.
If you search for "mov to mp4", Handbrake is NOWHERE to be seen. The 10th result for me is a Cloudflare article explaining the difference between mov and mp4. The ~20th result is a book called Business Funding For Dummies (no shit). Handbrake is after these. The legend says I'm still scrolling trying to find where it is.
How is an average user supposed to know Handbrake or this FFmpeg WASM site?
> But everyday users are right to distrust software downloaded from the Internet.
They're right to distrust apps that run in their browsers too, but that hasn't stopped anybody. These days everyone is scared to death of an .exe but will happily execute whatever random code a stranger on the internet comes up with if they only have to click on a link to run it on their devices. Warnings that WASM is a malware author's dream weren't enough (https://www.crowdstrike.com/blog/ecriminals-increasingly-use...) and browser sandbox escapes happen all the time but nobody seems to care. I can't even just pick on WASM, JS isn't much better and even CSS/HTML alone is getting complex enough that it can be used maliciously.
If Java had won, we'd be complaining about Java instead of web technologies. It doesn't ultimately matter, when you have a platform as large as the web is, it's going to be complicated and bloated.
HandBrake suffers from the same problems as FFmpeg to a lesser extent. I have no idea what options I should be selecting to get the max quality, smallest file size for a conversion.
The presets are useful but when I'm converting an old WMV or some other ancient format I want to know that I'm not leaving anything behind.
> "If this tool a) was updated regularly b) had a nicer, non-CLI UI for everyday users and c) was available at an easily-Googlable domain name - it would solve all the problems I mentioned above."
No matter how nice you make it, it will probably still lose the SEO battle against the shitty ad-laden sites fighting to win top place for Google searches of "convert X to mp3"/etc.
That's probably true. In my ideal world, the OSS community comes together in a suite of offline-first, in-browser alternatives to common user needs. We need a site for document conversion (using WASM Pandoc?), PDF merging, image conversion, text utils (lower case, spell check). All these would live on one trusted site. Hopefully users can discover the site and be done searching, and spread its name by word of mouth.
I had to merge a bunch of PDFs for a rental application recently and it was painful. Having to upload very sensitive docs, every site being a funnel to their paid version, etc.
In the past I've used ghostscript to merge PDFs, but it's not user friendly of course. I do like your idea of one site hosting these sort of FOSS utilities with nice wrappers.
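For anyone who wants the Ghostscript route without remembering the flags, here is a minimal sketch of the usual merge invocation. The wrapper function is hypothetical, and the `gs` binary name is an assumption that varies by platform (on Windows it is typically `gswin64c`).

```python
def merge_pdfs_command(inputs: list[str], output: str, gs: str = "gs") -> list[str]:
    """Build a Ghostscript command that concatenates PDFs in the given order."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        gs,
        "-dBATCH", "-dNOPAUSE", "-q",   # run non-interactively and quietly
        "-sDEVICE=pdfwrite",            # write a PDF rather than rendering pages
        f"-sOutputFile={output}",
        *inputs,
    ]
```

Everything runs locally, so sensitive documents never leave the machine, which is the point made in the comment above.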
How is performance? Right now I use Handbrake to do video optimization. However, I'm not a video expert and all the options are pretty daunting to me. I don't know how to get the right mix of optimization without having the video encoding take forever. I would guess running the conversion in browser would just make everything slower. I wish there was something simpler for video, like ImageOptim. Just drag the video in and it compresses it using the best options based on compatibility needs.
If it's 2-3x of native code, that's plenty good enough. Everyone uses Handbrake in an async mode - just set up the conversion and let it run overnight.
Of course, a website would be better for smaller conversion jobs (in case you have browser restarts or whatnot). Desktop apps can block computer restarts to a greater degree than websites.
Handbrake uses ffmpeg under the hood for much of its activities:
https://handbrake.fr/docs/en/1.3.0/technical/source-formats....
"One of HandBrake’s strengths is its ability to open a wide variety of video formats. HandBrake uses FFmpeg under the hood and generally can open whatever FFmpeg will, in addition to disc-based formats like DVD and Blu-ray."
The options Handbrake exposes are essentially the ffmpeg flags. The built in presets in Handbrake are generally pretty sensible IMO and I've rarely had to deviate.
I don't think users can distinguish a "local" website from a public one. So this software should just come with the OS as a "movie maker" package. As mentioned, Handbrake UIs are a decent candidate.
I think OSs are generally over bundling helpful software, unless it's a funnel to a paid version. The security risk is too high, and so is the legal risk of anti-monopoly action ("why are you killing independent movie editor businesses?").
"Regular" people don't really need FFMPEG. Regular people need tools with GUIs that have a non-generic purpose. So stuff like https://kdenlive.org/en/ that are backed by ffmpeg are (imo) superior "regular" person tools.
FFMPEG isn't complicated (it's as complicated as any other CLI tool), it's that video encoding/decoding specifically is a hard problem space that you have to explicitly learn to better understand what ffmpeg can do. I think if someone spent an hour learning about video codecs, bitrates, and container formats, they would immediately feel "better" at ffmpeg despite not learning more about the tool itself.
I mean, we have, what, 15+ years of StackOverflow posts to tell us what the common use cases are, e.g. "how do I make a GIF from this short screen cap?"; surely FFmpeg could offer prompt-based wizards for that kind of low hanging fruit.
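To make the idea concrete, here is a sketch of what a prompt-based wizard for the GIF case could look like as a thin wrapper around the CLI. FFmpeg ships nothing like this; the prompts, defaults, and function names are invented for illustration. The filter chain is the widely used two-step palette approach (`palettegen` plus `paletteuse`).

```python
def gif_command(src: str, dest: str, start: str = "0", duration: str = "5",
                fps: int = 10, width: int = 480) -> list[str]:
    """Build an ffmpeg command that turns a clip into a palette-optimized GIF."""
    vf = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"
        "split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse"
    )
    return ["ffmpeg", "-ss", str(start), "-t", str(duration),
            "-i", src, "-vf", vf, dest]


def ask(question: str, default: str, reader=input) -> str:
    """Prompt once; an empty answer keeps the default."""
    answer = reader(f"{question} [{default}]: ").strip()
    return answer or default


def gif_wizard(reader=input) -> list[str]:
    """Ask a handful of questions and return the command to run."""
    src = ask("Input video", "capture.mp4", reader)
    dest = ask("Output GIF", "capture.gif", reader)
    start = ask("Start time (seconds or HH:MM:SS)", "0", reader)
    duration = ask("Duration in seconds", "5", reader)
    fps = int(ask("Frames per second", "10", reader))
    width = int(ask("Width in pixels", "480", reader))
    return gif_command(src, dest, start, duration, fps, width)
```

The wizard only returns the argument list, so a caller can print it for review before running it, which also doubles as a way to learn the underlying flags.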
ffmpeg wasm is very slow in the browser. Users would still use those terrible websites because they wouldn't be terribly slow
rust-ffmpeg already seems to have support for 7.0: https://github.com/zmwangx/rust-ffmpeg/pull/178
At first I thought this was a full rewrite in Rust, but it's not
Why would one want a "Safe FFMpeg Wrapper"?
Edit: Got it, provides a rust API library to ffmpeg. Thanks @CJefferson.
Basically no one rewrites FFmpeg in recent years, in any language, at least not in the open source scene (and judging from the known usage of FFmpeg in world’s premier providers of multimedia content, probably not in the commercial scene either). It’s both too good and too daunting.
I'm not sure I'd describe FFmpeg's CLI as "too good"
I’m talking about the underlying libav*. There are plenty of frontends in all sorts of languages, although ffmpeg(1) itself is obviously the most versatile. Also, CLI UX is highly subjective (hence all the different frontends by people with differing opinions), I personally find it more than acceptable for the immense complexity it encapsulates.
The parent comment was not referring to the CLI exclusively.
So, the next candidate for backdooring? :)
Bah, don't give them ideas! Honestly, codecs are a worrying target for supply chain attacks because they're complex and use a lot of memory-unsafe code. Just look at all the image format attacks throughout history (a memorable recent one being the libwebp vulnerability.)
I use it — because I am writing a rust program, and want to use ffmpeg functionality.
What’s the alternative? I could wrap the C API, and then try to make a nice rust interface from that, but then that’s exactly what this package does, so I don’t want to repeat the work.
> What’s the alternative?
I often just exec ffmpeg from whatever language I'm using (as a command line thing). Not very ergonomic, but the nice thing is that it's 1:1 with all examples and other uses of ffmpeg. But I guess it depends on how deep into ffmpeg you're doing stuff. Mine is mostly to point it at doing something non advanced with a file and that's it.
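A minimal sketch of the "just exec it" approach, using ffprobe's JSON output so the calling program gets structured data back instead of parsing log text. The helper names and the shape of the returned summary are assumptions.

```python
import json
import subprocess


def ffprobe_command(path: str) -> list[str]:
    """Ask ffprobe for container and stream info as JSON."""
    return ["ffprobe", "-v", "error", "-print_format", "json",
            "-show_format", "-show_streams", path]


def summarize(probe_json: str) -> dict:
    """Reduce ffprobe's JSON to the few fields a simple tool might need."""
    data = json.loads(probe_json)
    streams = data.get("streams", [])
    return {
        # ffprobe reports the duration as a string, so convert it.
        "duration_s": float(data["format"]["duration"]),
        "video_codecs": [s["codec_name"] for s in streams
                         if s.get("codec_type") == "video"],
        "audio_codecs": [s["codec_name"] for s in streams
                         if s.get("codec_type") == "audio"],
    }


def probe(path: str) -> dict:
    """Run ffprobe (must be on PATH) and summarize the result."""
    result = subprocess.run(ffprobe_command(path), capture_output=True,
                            text=True, check=True)
    return summarize(result.stdout)
```

Splitting command building and output parsing from the actual process call keeps the pieces testable without ffmpeg installed, and the command stays 1:1 with what you would type in a terminal.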
I used this wrapper to implement an opening and ending detection tool for “fun” [1].
However, it seems that many programs opt to instead shell out to the ffmpeg CLI. I think it’s usually simpler than linking against the library and to avoid licensing issues. But there are some cases where the CLI doesn’t cut it.
Surprised even MPEG-5 EVC made it. Unfortunately the VVC decoder didn't quite make it (Edit: officially). I guess we will have to wait until version 7.1. Still waiting for x266.
VVC decoder is available, but it's flagged as experimental, so you have to prefix `-strict experimental` before the VVC input `-i`.
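The ordering matters because ffmpeg applies an option to the next input or output that follows it. A small sketch, assuming you want to transcode a VVC file to H.264; the output codec settings are arbitrary choices for the example.

```python
def vvc_transcode_command(src: str, dest: str) -> list[str]:
    """Decode a VVC input with the experimental native decoder, then re-encode."""
    return [
        "ffmpeg",
        "-strict", "experimental",  # placed before -i so it applies to the input
        "-i", src,
        "-c:v", "libx264", "-crf", "20",  # assumption: any output codec would do
        "-c:a", "copy",
        dest,
    ]
```

Putting `-strict experimental` after `-i` would attach it to the output instead, which is a different setting and would not unlock the decoder.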
Thanks. Yes, I meant officially. Was hoping we could set the stage for VVC a little earlier. I know VVC is not popular on HN or literally anywhere on the Internet, but I do hope to see it moving forward instead of something like MPEG-5 EVC, which is somewhat dead in the water.
I don't know that having so many codecs is a good thing unless they really add something. How does it compare to av1 (which I was under the impression is coming to be the natural successor of hevc, with hardware support)?
Compared to AV1, VVC / H.266 is expected to offer a 20-30% reduction in bitrate at similar quality and a similar level of computational complexity. And it is already deployed and used in the real world in China and India. I believe Brazil is looking to use it as their next generation codec for broadcasting along with LCEVC.
Here's the Brazil website.
https://forumsbtvd.org.br/tv3_0/
And their video codec testing.
https://forumsbtvd.org.br/wp-content/uploads/2024/03/SBTVD-T...
And ATSC 3.0 will also be using VVC.
https://prdatsc.wpenginepowered.com/wp-content/uploads/2024/...
VVC is marked as experimental as fuzzing continues on it.
The built-in VVC decoder is dreadfully slow (a ton of optimizations are missing), VVdec is at least 2-3 times faster on anything having AVX2/SSE4.
If you really want to give VVC a try, better stay with version 6.1.1 as it's the last one which has patches for enabling VVdec. You won't be able to apply them to version 7.0/git master:
https://github.com/fraunhoferhhi/vvenc/wiki/FFmpeg-Integrati...
So I'm trying to build ffmpeg via vcpkg today, and it turned out multiple of its dependencies are transitively depending on liblzma, but the downloading of liblzma source has been disabled by GitHub in light of the recent xz backdoor.
Blocking downloads of liblzma seems to me to be an ill-advised decision. Now that the mechanism is known, the dangers are limited, but the educational value of being able to study what has been done is real.
While the dangers are limited, they certainly aren't zero. Even if the original attacker(s) have entirely gone to ground, others may be scanning for hosts that managed to get compromised by following the bleeding edge, and more could get compromised if downloads from primary sources are kept open.
Keeping the affected code visible somewhere could be useful for research purposes, but you don't want it where people or automations might unwittingly use it. If the official sources were the only place this could be found then it might be reasonable to expect them to put up a side copy for this reason, but given how many forks and other copies there will be out there I don't think this is necessary and they are better off working on removing known compromises (and attempting to verify there are no others that were slipped in) to return things to a good state.
Maybe someone needs a year to audit the history and find all the other backdoors. Who's going to work on it for a year for free or without being in on it, I don't know.
Who knows what automated systems could pull code and integrate it into builds, otherwise.
It just takes time to act/react.
A week or so is not a lot of time.
I worry that this is just the beginning.
Right now I'm sure it's a temporary measure, to limit the downloading of sources.
But I really worry that later this will become normalized: first, after every exposed hack, withdraw source availability for a little bit afterwards, just while 'they' check for other attacks or whatever.
Later on, it'll take longer and longer to put the source back up. But let's hope this is merely my overactive paranoia and everything will be fine and open source is still ok.
The obvious solution seems to be adding an extra hurdle, where it warns you the source may be compromised, so you can still get it, but aren't going to just grab it without knowing something happened.
There is value in making sure (potentially) compromised code doesn't just get used normally, but I agree that shouldn't mean totally blocking access to it in most cases.
Internet Archive to the rescue! It's still possible to download the liblzma source code from the Internet Archive's Github snapshot of the 5.4.6 stable release: https://web.archive.org/web/20240329182145/https://github.co...
Changelog:
- DXV DXT1 encoder
- LEAD MCMP decoder
- EVC decoding using external library libxevd
- EVC encoding using external library libxeve
- QOA decoder and demuxer
- aap filter
- demuxing, decoding, filtering, encoding, and muxing in the ffmpeg CLI now all run in parallel
- enable gdigrab device to grab a window using the hwnd=HANDLER syntax
- IAMF raw demuxer and muxer
- D3D12VA hardware accelerated H264, HEVC, VP9, AV1, MPEG-2 and VC1 decoding
- tiltandshift filter
- qrencode filter and qrencodesrc source
- quirc filter
- lavu/eval: introduce randomi() function in expressions
- VVC decoder (experimental)
- fsync filter
- Raw Captions with Time (RCWT) closed caption muxer
- ffmpeg CLI -bsf option may now be used for input as well as output
- ffmpeg CLI options may now be used as -/opt <path>, which is equivalent to -opt <contents of file <path>>
- showinfo bitstream filter
- a C11-compliant compiler is now required; note that this requirement will be bumped to C17 in the near future, so consider updating your build environment if it lacks C17 support
- Change the default bitrate control method from VBR to CQP for QSV encoders.
- removed deprecated ffmpeg CLI options -psnr and -map_channel
- DVD-Video demuxer, powered by libdvdnav and libdvdread
- ffprobe -show_stream_groups option
- ffprobe (with -export_side_data film_grain) now prints film grain metadata
- AEA muxer
- ffmpeg CLI loopback decoders
- Support PacketTypeMetadata of PacketType in enhanced flv format
- ffplay with hwaccel decoding support (depends on vulkan renderer via libplacebo)
- dnn filter libtorch backend
- Android content URIs protocol
- AOMedia Film Grain Synthesis 1 (AFGS1)
- RISC-V optimizations for AAC, FLAC, JPEG-2000, LPC, RV4.0, SVQ, VC1, VP8, and more
- Loongarch optimizations for HEVC decoding
- Important AArch64 optimizations for HEVC
- IAMF support inside MP4/ISOBMFF
- Support for HEIF/AVIF still images and tiled still images
- Dolby Vision profile 10 support in AV1
- Support for Ambient Viewing Environment metadata in MP4/ISOBMFF
- HDR10 metadata passthrough when encoding with libx264, libx265, and libsvtav1
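One practical use of the `-/opt <path>` entry in the changelog above is keeping a long filter graph in a file instead of on the command line. The sketch below follows the changelog's description; the file name, the example graph, and the `[v]` output label are assumptions, and it needs a build that includes this 7.0 feature.

```python
from pathlib import Path


def command_with_graph_file(src: str, dest: str, graph: str,
                            graph_path: Path) -> list[str]:
    """Write the filter graph to a file and have ffmpeg read it from there."""
    graph_path.write_text(graph, encoding="utf-8")
    return [
        "ffmpeg", "-i", src,
        # Per the changelog: -/opt <path> acts like -opt <contents of file>.
        "-/filter_complex", str(graph_path),
        "-map", "[v]",   # assumption: the graph labels its video output [v]
        dest,
    ]
```

For example, `command_with_graph_file("in.mp4", "out.mp4", "[0:v]scale=1280:-2,fps=30[v]", Path("graph.txt"))` keeps the shell free of quoting issues around brackets and semicolons. Audio is not mapped in this sketch.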
> - ffmpeg CLI now all run in parallel
I think I read about this a few months ago but don't remember the details. What exactly does this do? Does it result in faster encoding/decoding if you have multiple filter graphs (for example a single cmd line that transcodes to new audio, extracts image, creates a low res)
> - ffmpeg CLI loopback decoders
No idea what this is...
Edit: threading => https://ffmpeg.org//index.html#cli_threading, loopback => https://ffmpeg.org/ffmpeg.html#Loopback-decoders
Loopback decoders are a nice concept. So could I use this to create a single ffmpeg command to extract images periodically (say 1/s) and then merge them into a horizontal strip (using the loopback decoder for this part)?
You don't need a loopback decoder for that. The periodic extraction will depend on a filter, and you can just clone and send the output of that filter to the tiling filter.
Had to go to ChatGPT for help. It appears that you need to know how many tiles to stitch. I was hoping to have that dynamically determined. Not sure if loopback will help.
CGPT said: ffmpeg -i input.mp4 -vf "fps=1,tile=3x1" -frames:v 1 output_stitched.png
Gemini: ffmpeg -i input_video.mp4 -vf "fps=1,scale=220:-1" -c:v png output.png
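Since the `tile` filter needs its grid size up front, one workaround is to compute the tile count from the clip duration before building the command. This is only a sketch: the duration would come from a separate ffprobe call, the thumbnail height is arbitrary, and the frame count is an estimate, so the strip may end with an empty slot or drop a trailing frame.

```python
import math


def strip_filter(duration_s: float, every_s: int = 1, thumb_height: int = 120) -> str:
    """Filter chain: one frame every `every_s` seconds, tiled into a single row."""
    tiles = max(1, math.ceil(duration_s / every_s))
    return f"fps=1/{every_s},scale=-1:{thumb_height},tile={tiles}x1"


def strip_command(src: str, dest: str, duration_s: float, every_s: int = 1) -> list[str]:
    """Build the ffmpeg command that writes the strip as a single image."""
    return ["ffmpeg", "-i", src,
            "-vf", strip_filter(duration_s, every_s),
            "-frames:v", "1",   # the whole strip is one output frame
            dest]
```

Long videos produce very wide images, so a larger `every_s` or a multi-row grid is likely needed beyond a few minutes of footage.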
And also, thank you for packaging.
First off, ffmpeg is amazing, I'm very thankful to everyone involved in it.
> dnn filter libtorch backend
What's ffmpeg's plan regarding ML based filters? When looking through the filter documentation it seems like filters use three different backends: tensorflow, torch, and openvino. Doesn't seem optimal, is there any discussion about consolidating on one backend?
ML filters need model files, and the filters take a path to a model file as one of their arguments. This makes them really difficult to use, if you're lucky you can find a suitable model and download somewhere, otherwise you need to find a separate model training project and dataset and run that first. Are there any plans on streamlining ML filters and model handling for ffmpeg? Maybe a model file repository with an option of installing these in an official models path on the system?
Most image and video research use ML now, but I don't get the impression that ffmpeg tries to integrate the modern technologies well yet. Being able to do for instance spatial and temporal super resolution using standard ffmpeg filters would be a big improvement, and I think things like automatic subtitles using whisper would be a good fit too. But it should start with a coherent ML strategy regarding inference backend and model management.
> - D3D12VA hardware accelerated H264, HEVC, VP9, AV1, MPEG-2 and VC1 decoding
I wonder if this also means that Chrome and Edge will be able to use this acceleration for their ffmpeg backend (instead of relying on MediaFoundation)?
Meanwhile, the default ffmpeg is at version 4.4.4 [1] on MacPorts. There's ffmpeg6, which is at version 6.1.1. [2]
ffmpeg updates their API very liberally, even in minor version bumps. Combine that with many packages depending on ffmpeg, and it isn't easy to always have the latest ffmpeg version. They are working on it though: https://trac.macports.org/ticket/65623
(I'm not a MacPorts maintainer, but I've been burnt by ffmpeg API changes a couple times myself before).
API users shouldn't pay attention to the ffmpeg version but those of the libraries. API and ABI breaks only happen at major version bumps of those.
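To check the library versions rather than the release number, one option is to read them from the `ffmpeg -version` banner. A rough sketch, assuming the usual `libname  major. minor.micro / ...` layout of those lines; the function name is made up.

```python
import re

# Matches lines such as "libavcodec     61.  3.100 / 61.  3.100"
_LIB_LINE = re.compile(r"^\s*(lib\w+)\s+(\d+)\.\s*(\d+)\.\s*(\d+)\s*/", re.MULTILINE)


def library_versions(version_output: str) -> dict[str, tuple[int, int, int]]:
    """Map each libav* library to its (major, minor, micro) version."""
    return {
        name: (int(major), int(minor), int(micro))
        for name, major, minor, micro in _LIB_LINE.findall(version_output)
    }
```

A program linking against the libraries would compare the major numbers (for example the first element for `libavcodec`) to decide whether an API break is possible.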
Same for Homebrew. I just updated it and yes, there was an update for ffmpeg, but it only updated it to 6.1.1_7
6.1.1 is the latest build from the 6.1 release channel, which was the latest up until about 4 or 5 hours ago when a new major release was cut.
Not at all the same as running a 4.x release.
Static builds of nightlies and releases are available
I was curious if it was just some kind of horrific backlog but there didn't seem to be an oppressive number of PRs open, although it seems the new submissions drag their feet quite a bit https://github.com/macports/macports-ports/pulls?q=is%3Apr+i...
Also, it seems there is currently one in progress to drop the "6" qualifier on the ffmpeg binaries <https://github.com/macports/macports-ports/pull/23315/files> so it'll be fascinating to see if any new ffmpeg7 then subsequently puts the "7" back, beginning the cycle again
The winget version is still stuck on v6.1.1. Valve pls fix.
Winget could be so good. It's annoying that there are so many little things that make it total rubbish.
A small delay in getting the latest version isn't enough to warrant such a reaction... Do you think every single distro out there already ships 7.0?
The entitlement in tech is sometimes baffling. People sometimes expect updates the moment they snap their fingers. There's Arch for that.
The winget community repo is open source: https://github.com/microsoft/winget-pkgs/tree/master/manifes...
My upload for the source pkg only finished 20 mins. ago. The Winget maintainer should update their manifest shortly.
Critical Microsoft customers will be so relieved ;)
Can someone explain?
I lol’d
Please vouch for gyan's changelog comment below. It's flagged/dead for some reason?
Same question.
ffmpeg is such a joy to use, once you make it over the very steep learning curve.
I'm making some youtube videos where I play through Demon's Souls flipping a coin to decide to equip items or not, and I wanted to have an onscreen coin flip animation and sound effect. With some effort, I created a transparent set of frames for the animation. Then with ffmpeg's filter_complex I was able to add the image sequence as a video stream, overlay it over the original video, and add a sound effect. That's on top of the existing subtitles, audio channel merging, and video resizing/compression. All in a single (long!) ffmpeg cli command.
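For anyone curious what that kind of command can look like, here is a minimal sketch that only builds the argument list (it does not run ffmpeg). It is not the poster's actual command: the file names, the 5-second start time, the corner position and the helper name `overlay_command` are all assumptions for illustration. It uses the `setpts`, `overlay`, `adelay` and `amix` filters.

```python
def overlay_command(video, frames, sound, out, start=5.0, fps=30):
    """Build an ffmpeg argv that overlays a PNG sequence and mixes in a sound effect.

    All paths and the start time are hypothetical placeholders.
    """
    delay_ms = int(start * 1000)
    graph = (
        # Shift the animation so it begins `start` seconds into the video.
        f"[1:v]setpts=PTS+{start}/TB[coin];"
        # Draw it 20 px from the top-right corner; keep the main video
        # running once the animation has ended.
        f"[0:v][coin]overlay=x=W-w-20:y=20:eof_action=pass[v];"
        # Delay every channel of the sound effect by the same amount.
        f"[2:a]adelay={delay_ms}:all=1[sfx];"
        # Mix the effect with the original audio, ending with the original.
        f"[0:a][sfx]amix=inputs=2:duration=first[a]"
    )
    return [
        "ffmpeg",
        "-i", video,
        "-framerate", str(fps), "-i", frames,  # image sequence, e.g. coin_%03d.png
        "-i", sound,
        "-filter_complex", graph,
        "-map", "[v]", "-map", "[a]",
        out,
    ]


cmd = overlay_command("gameplay.mp4", "coin_%03d.png", "flip.wav", "out.mp4")
print(" ".join(cmd))
```

Passing the list to `subprocess.run` avoids shell quoting problems with the brackets and semicolons in the filtergraph, which is one reason to build it this way rather than as a single string.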
ffmpeg is one of the true wonders of FOSS.
and every other non-sadist would've done this in 2 seconds in <insert 500+ FOSS video editors>. not sure how wrangling a byzantine CLI is a joy to use.
In my case I'm working with very large uncompressed video. Files that are 1 hour and 300+ GB in size. I tried using Kdenlive but it choked.
I'm not saying the CLI is easy to learn, but once you do learn it, you have a lot of power at your fingertips.
IAMF/ambisonics looks so cool, but it's so unclear how us plebes would play around with it & explore its use.
Crazy how much DirectX (DXVA) support got added.
Might look into ambisonic support in Unreal/Unity.
MultiThreading! Finally! \o/
FFmpeg has had multithreading for codecs for years. What's been added is multithreading in the transcode pipeline of the command line application. See https://ffmpeg.org//index.html#cli_threading and https://fosdem.org/2024/schedule/event/fosdem-2024-2423-mult... for details.
Moving to C11 is bad, really bad. This is a dangerous road to follow. Don't trust ISO on that matter: it is literally doing planned obsolescence on a 5-10 year cycle through language feature creep. C has to be simplified and then move towards eternal stability, not the other way around. I suspect some toxic/scammy people got in (or were brainwashed).
I think I did update my code for the new channel layout API, but that was at least a year ago. There is another API which is supposed to change, the seeking API, but I wonder if it is now stable enough to be used.
I have been using the xstack filter for several years now.
What I do is take several diverse short video segments, like 100, concatenate them into 4 segments (example 23+24+26+27 since they have diverse lengths) and then xstack them into a 2-by-2 mosaic video.
Before, I was doing it in a single stage, but now, after some advice, I do it in 5 stages: 4 concatenate stages and 1 xstack stage.
I have not profiled/timed it to see which is faster, but it works pretty well, although I often get a lot of different weird warnings.
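A rough sketch of that staged approach, as I understand it: one concat step per quadrant, then a single xstack step. The helper names and file names are hypothetical, and the concat step below assumes the concat demuxer with stream copy, which only works when all segments share the same codec parameters; otherwise that stage would need to re-encode. The commands are only built, not executed.

```python
def concat_list(paths):
    """Text for a concat-demuxer list file: one `file '<path>'` line per segment.

    Assumes the paths contain no single quotes (those would need escaping).
    """
    return "".join(f"file '{p}'\n" for p in paths)


def concat_command(list_file, out):
    """Stage 1-4: join the segments named in `list_file` without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", out]


def xstack_command(inputs, out):
    """Stage 5: tile four videos into a 2-by-2 mosaic with the xstack filter."""
    if len(inputs) != 4:
        raise ValueError("a 2-by-2 mosaic needs exactly four inputs")
    cmd = ["ffmpeg"]
    for path in inputs:
        cmd += ["-i", path]
    # Positions: top-left, top-right, bottom-left, bottom-right.
    # w0/h0 are the width and height of the first input.
    graph = "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]"
    cmd += ["-filter_complex", graph, "-map", "[v]", out]
    return cmd


print(concat_list(["clip01.mp4", "clip02.mp4"]), end="")
print(" ".join(xstack_command(["q1.mp4", "q2.mp4", "q3.mp4", "q4.mp4"], "mosaic.mp4")))
```

The mosaic only lines up cleanly if the four quadrants have the same frame size, so segments with different dimensions would probably need a scale step before stacking.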
Use the terminal. There does not exist a single GUI interface that does everything ffmpeg does.
Use ChatGPT to help you find the right command for your need.
That would give you hallucinated commands, not commands that actually exist or make sense. Better read documentation or ask experienced humans.
Can't confirm. LLMs tend to be oddly good at FFmpeg and give you a good starting point you can work with.
probably because the documentation (of which there's a lot) is very good.
I don't know if I agree with that. The official docs do list a lot but often you get something like compression_level takes an integer from 1-8. But doesn't tell you if 1 is smaller output or 8 is to work the hardest.
However the wiki pages can be quite good if they cover your use case and the real strength is so many examples online. Even if many of the example command lines feel like they have been cargo-culted through the years and no one actually understands what exactly they do.
No, that's bad advice imho. ChatGPT (and Claude etc.) are all pretty damned good at this. Strongly recommend using them over reading the docs or asking other people if you're getting started.
So this is a good use case for it: you will most likely get immediate feedback if it's wrong, and a bit delayed feedback if it achieved the wrong thing; and you can prod it to try harder. LLMs are best used when you can easily verify the results.
StackOverflow is a much more reliable version of ChatGPT :)
That's such a clunky workflow and takes so long, sometimes you want to drag and drop a file, tick a couple of checkboxes and click a button
But which checkboxes? Ffmpeg would probably need a hundred thousand.
categorize, add tooltips, show defaults... this is some basic UI design stuff, should be possible.
But all of them have parameters. And depend on each other. And the order of filters matters. There is no way to make a simple interface for all ffmpeg can do.
Why don't you keep a text file with the 3 commands you want to regularly run in the ffmpeg directory? fast. easy. cheap.
That still feels like a chore, opening some text file, copying the right command, opening a terminal, messing around with input and output file paths..
I don't like interacting with command line parameters in general, it feels clunky to me, but I don't think there's a point in arguing about it since it is more of a personal preference
My experience is that ChatGPT is dreadful at everything but the simplest ffmpeg invocations, and will often produce command lines with subtle quirks (such as "only works if the input is an even number of pixels wide").
But then again, my experience is that ChatGPT is dreadful at everything but the simplest anything.
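The even-width quirk is a good example of something worth checking in any generated command: encoding to yuv420p with libx264 fails when the width or height is odd. A common workaround is to round both dimensions down to an even number with the scale filter. A minimal sketch, with hypothetical helper names and file names, that only builds the argument list:

```python
def round_down_even(n):
    """Largest even integer not greater than n (what trunc(n/2)*2 computes)."""
    return n - (n % 2)


def even_scale_filter():
    """scale expression that rounds input width and height down to even values."""
    return "scale=trunc(iw/2)*2:trunc(ih/2)*2"


def h264_command(src, out):
    """ffmpeg argv for an H.264 encode that tolerates odd input dimensions."""
    return [
        "ffmpeg", "-i", src,
        "-vf", even_scale_filter(),
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        out,
    ]


print(round_down_even(1365), round_down_even(768))
print(" ".join(h264_command("odd_width.mkv", "out.mp4")))
```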
My experience is similar, but I find it can still be useful if you go step by step, check its work and explain errors and corrections as you go. Admittedly when I describe it like that it doesn't seem very useful, but if I'm figuring things out by myself then most of that work is a given anyway and the bot helps that process along.
Sometimes... but most often, after I explain in detail an error it did, it will just say "I apologize for the mistake, you are correct, <re-iteration of my explanation of the error>. Here is a version with the error fixed:", followed by either the exact same output or an equally wrong alternate output.
The fact that these glorified Markov chains manage to fool people into thinking they possess some kind of actual intelligence or ability to reason baffles me.
Whenever I search for ffmpeg commands, there is always some person suggesting an 8 liner with 25 arguments, and next to that someone suggesting a command with two -i and one -o argument. On visual inspection both do the same, but I’m always left with the feeling that I did something wrong or just “got lucky” with the shorter command.
I would love a “unless you’re a pro with hyper specific needs, forget these 90% of arguments and only use this 10% in this way” type of guide.
That’s tame for FFmpeg, likely just specifying a bunch of encoder parameters the simpler command left out for defaults, maybe with some input/output streams explicitly spelled out. If you want to look at really incomprehensible FFmpeg commands, try anything with filtergraphs.
Too true! For a recent video project I wrote a 300 line bash script to generate the ffmpeg arguments needed to accomplish what I was after.
use ffprobe on the file first then paste that output and what you want done to it. Helps if there’s something about the file/codec that GPT would have to work around. Had better success that way.
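A small sketch of that workflow: ask ffprobe for JSON, then boil it down to a few lines you can paste alongside your question. The `summarize` helper and the sample JSON are made up for illustration; the JSON keys used are ones ffprobe emits for streams (`index`, `codec_type`, `codec_name`, `width`, `height`, `sample_rate`, `channels`).

```python
import json


def ffprobe_command(path):
    """ffprobe argv that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def summarize(probe_json):
    """One short line per stream, built from ffprobe's JSON output."""
    data = json.loads(probe_json)
    lines = []
    for stream in data.get("streams", []):
        kind = stream.get("codec_type", "?")
        parts = [f"#{stream.get('index')}", kind, stream.get("codec_name", "?")]
        if kind == "video":
            parts.append(f"{stream.get('width')}x{stream.get('height')}")
        elif kind == "audio":
            parts.append(f"{stream.get('sample_rate')} Hz, {stream.get('channels')} ch")
        lines.append(" ".join(parts))
    return lines


# Hypothetical ffprobe output, trimmed to the fields used above.
sample = json.dumps({
    "streams": [
        {"index": 0, "codec_type": "video", "codec_name": "h264", "width": 1920, "height": 1080},
        {"index": 1, "codec_type": "audio", "codec_name": "aac", "sample_rate": "48000", "channels": 2},
    ]
})
for line in summarize(sample):
    print(line)
```

In practice the JSON would come from something like `subprocess.run(ffprobe_command(path), capture_output=True, text=True).stdout`.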
https://handbrake.fr/ has been around for a long time.
But that is not ffmpeg, is it?
I know they use some of the same encoding libraries.
ffmpeg is, mainly, a bunch of libraries. libavcodec, libavformat, libswresample, etc, is almost all of ffmpeg. If a project is using those libraries, it's using ffmpeg.
The ffmpeg command line utility is "just" an interface to those libraries.
ffmpeg is a lot more than just a wrapper on libraries. It can do a lot of filtering and rewiring of audio channels, video channels and subtitles.
Handbrake doesn't do any of that. You can't even drag a bunch of audio files on to handbrake because handbrake doesn't do audio, while ffmpeg is great for encoding audio.
What I'm saying is that ffmpeg is the libraries. When you use the ffmpeg CLI to filter and rewrite audio, the ffmpeg command is just decoding it using a decoder in libavcodec, filtering using a filter from libavfilter, and encoding again using an encoder from libavcodec. The ffmpeg CLI is "just" an interface for the libraries.
The fact that Handbrake doesn't expose the same features as the ffmpeg CLI tool is frankly irrelevant.
"The fact that Handbrake doesn't expose the same features as the ffmpeg CLI tool is frankly irrelevant."
It's not just relevant, it's the whole thing. They asked for an ffmpeg GUI and someone recommended a GUI that doesn't use ffmpeg and doesn't do what ffmpeg can do. ffmpeg can not only rewire channels, it can stream video, capture video from the screen, capture video from a tv tuner, overlay text etc.
Also the libraries you listed are part of the ffmpeg project. They come from ffmpeg.
https://ffmpeg.org/about.html
Also they asked for a GUI to ffmpeg, not necessarily a GUI to the command line tool.
Handbrake does use ffmpeg. My whole point is that the libraries are ffmpeg, so I don't understand why you're saying it back to me.
Gah. I give up.
Handbrake isn't a GUI for all of ffmpeg, it does a very limited set of what ffmpeg can do.
Never argued otherwise. Specifically, the thing I'm saying is wrong is:
Handbrake uses ffmpeg.
But I recognize your username. I don't remember from where but I remember reading or having a conversation with you which went nowhere. I think I'm done.
Using ffmpeg for one thing doesn't mean a GUI is a GUI for all of ffmpeg.
To be fair, I have never seen a GUI that could do everything its complex command-line equivalent could do without ending up so complicated that the command line would have been simpler and easier to use in the first place. So when people ask for "a GUI", I think a lot more information is needed about what it should be able to do.
There is an MR to update it to FFMPEG 7: https://github.com/HandBrake/HandBrake/pull/5884
Because HandBrake uses some parts of the FFmpeg libraries, but HandBrake's scope is much smaller than FFmpeg's, and while it uses some parts, it's definitely not a GUI for the FFmpeg CLI.
There isn’t one. Handbrake is very weird about interpreting what you want, you’ll find yourself having to double check dimensions and stuff every time, and the queue is very fiddly. On the Mac version at least.
Don’t think anything exists like XLD for FFMPEG video where you can just drop a file in, set the quality and codec, and get a file with the exact same dimensions out every time.
Agreed with other posters: CLI is the best way to go…
ChatGPT can help with learning a lot now but the mailing lists are incredible sources of kind and wonderful (and incredibly knowledgeable) people… go there!
Handbrake and Permute are super, as mentioned… I’ve put down a couple to add to the list.
Helpful: https://ffmpeg.guide/ https://www.ffworks.net/
Not a gui interface for ffmpeg per se, but a lightweight gui app I like to use that uses ffmpeg's libavcodec library is avidemux.
XMedia Recode is pretty good. https://www.xmedia-recode.de/en/index.php
Permute for Mac is an elite app.
Depends on your needs. If you just want to cut and trim videos, LosslessCut[1] is great and simple to use.
[1] https://github.com/mifi/lossless-cut
It's not quite a GUI, but I usually refer to https://alfg.dev/ffmpeg-commander/.
I was trying to build one, but haven't progressed much lately. https://github.com/NoamRa/alpha-badger