What you need to know before touching a video file (gist.github.com)
373 points by qbow883 5 days ago | hide | past | favorite | 245 comments




Nearly this entire HN comment section is upset about VLC being mentioned once and not recommended. If you cannot understand why this very minor (but loud?) note was made, then you probably do not do any serious video encoding, or you would know why it sucks today and is well past its prime. VLC is glorified because it was a video player that used to be amazing back in the day, but hasn't been for several years now. It is the Firefox of media players.

There is a reason why the Anime community has collectively ditched VLC in favor of MPV and MPC-HC. Color reproduction, modern codec support, ASS subtitle rendering, and even audio codecs are janky or even broken on VLC. 98% of all Anime encode release playback problems are caused by the user using VLC.

We even have a dedicated pastebin with a quick rundown of what is wrong: https://rentry.co/vee-ell-cee

And this pastebin doesn't even have all the issues. VLC has a long-standing issue of not playing back 5.1 surround-sound Opus correctly, or at all. VLC is still using FFmpeg 4.x; we're on FFmpeg 8.x these days.

I cannot even use VLC to take screenshots of videos I encode, because the color rendering on everything is wrong. BT.709 is very much NOT new and predates VLC itself.
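To make the color complaint concrete, here's a rough sketch (not VLC's actual code, and with made-up pixel values) of why picking the wrong YCbCr-to-RGB matrix visibly shifts colors. BT.601 (SD) and BT.709 (HD) use different luma coefficients, so a player that decodes HD video with the SD matrix renders every pixel slightly off:

```python
# Illustrative only: converting the same YCbCr pixel to RGB with
# BT.601 vs BT.709 coefficients gives visibly different colors.

def ycbcr_to_rgb(y, cb, cr, kr, kb):
    """Convert full-range normalized YCbCr to RGB given Kr/Kb constants."""
    kg = 1.0 - kr - kb
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return (r, g, b)

# One arbitrary pixel: Y in [0, 1], Cb/Cr in [-0.5, 0.5]
pixel = (0.5, -0.2, 0.3)

bt601 = ycbcr_to_rgb(*pixel, kr=0.299, kb=0.114)    # SD matrix
bt709 = ycbcr_to_rgb(*pixel, kr=0.2126, kb=0.0722)  # HD matrix

# The channels differ by several percent, enough to notice on skin
# tones and saturated colors:
print(bt601)
print(bt709)
```

Same input, two plausibly "correct-looking" outputs: that's why a wrong-matrix screenshot doesn't look obviously broken, just wrong.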

And you can say "VLC is easy to install and the UI is easy." Yeah, so are IINA for macOS, Celluloid for Linux, and MPV.net for Windows, which all use MPV underneath. Other better, equally easy video players exist today.

We are not in 2012 anymore. We are no longer just using AVC/H264 + AAC or AC-3 (Dolby Audio) MP4s for every video. We are playing back HEVC, VP9, and AV1 with HDR metadata in MKV/WebM containers, with audio codecs like Opus or HE-AACv3 or TrueHD in surround channels, and BT.2020 colorspaces. VLC's current release is made of libraries and FFmpeg versions that predate some of these codecs/formats/metadata types. Even the VLC 4.0 nightly alpha is not keeping up. 4.0 is several years late, and when it does release, it may not even matter.


I'm also surprised by people's defense of VLC. It's a nice project, especially for its time, but the bugs I regularly encountered were numerous and hit seemingly common use cases.

Here's a post I made 4 years ago describing each bug, shortly before switching to MPV: https://www.reddit.com/r/VLC/comments/pm6y1n/too_many_bugs_o...


My main problem with VLC is that when I accidentally hit the wrong key on my keyboard (usually in the dark, because that's how I watch movies), it is quite often almost impossible to get the settings back to what they were without restarting the player.

Keyboard shortcuts with no modifier key involved are evil. Even Gmail has those.

Funny how you say "evil" but all I can hear is "vi".

Oh it's fair game there. I love vi/vim

Thunderbird has no modifier shortcuts too.

Honestly, I'm absolutely not. I still vividly remember those times when we had to install codecs separately. And every month something new and incompatible popped up on the radar, which sent all users on a wild hunt for that exact codec and instructions on how to tweak it so the funny clip could play. Oh dear, I'm not looking back at the times of all those versions of DivX, Xvid, Matroska, MKV, AVI, WMA, MP4, MP3 VBR, Ogg and everything else, all those cryptic incantations to summon a non-broken video frame on modern hardware, for everyone but a few people in the anime community who drove that insanity onto everyone else. I'll die on the hill of VLC, despite all its flaws, because it gave an escape route for everyone else: if you don't give a F about "pixel perfect lowest overhead most progressive compression that is still a scientific experiment but we want to encode a clip with it" and simply want to view a video, VLC was the way. Nothing else did so much good for users who simply want to watch a video and not be ecstatic about its perfect colour profile, lossless sound, and smallest size possible.

All other players lost the plot when they tried to steer users into some madness pit of a million tweaks and configurations that somehow excites the authors of those players and some cohort of people who encode videos that way.

I install VLC every single time, because it is the blunt answer to all video-playing problems, even if it's imperfect. And I walked away from every single player that tried to sell me something better while asking me to configure 100 parameters I've no idea about. Hope this answers the question of why VLC won.


> It is the Firefox of media players.

So... the better option?



Unofficial third-party builds from unknown GitHub accounts; I think that you are really brave if you install them.

And the first party ones available there are for testing, with missing features :/

We do not have this kind of problems with VLC.


Did you miss the GitHub builds, or are you just discounting them?

What happened to downloading an installer from the official website? Are we sending grandma to GitHub now?

Things are complicated. As a policy, I wouldn’t want to encourage grandma to be going to any web site to download software. Grandma should probably stick to the App Store. And personally, I would way rather install github builds than downloads from ‘official’/independently maintained web sites. Especially in the case of free / open source projects, sometimes cash constrained. Security is hard.

I’m not super knowledgeable about modern video players- I do like Infuse, which is in the App Store.


> So... the better option?

Depends on what you care about.

For me, Firefox really lacks in handling of very large amounts of tabs and a lot of features that I specifically use Vivaldi for. Does that mean Vivaldi is the best? Yes and No, it depends on what you care about.

Is Firefox still a good browser? As far as I know, yes. But I don't use it much at all because it doesn't give _me_ what I want and need.

And yes, I do actually need a large amount of tabs open at the same time very regularly due to the depth of references I work against in my line of work. That's on top of saving lots of bookmarks and syncing them via nextCloud.

You like Firefox? Great, keep at it.

You want to see features that aren't necessarily elsewhere? Consider trying Vivaldi and seeing if it's great for you or not.

Let's not act like browser selection is binary, because it isn't, and it really hasn't been since netscape navigator was new. And even then it's up for debate.


This kind of insulting quip, refusing to engage with the body of the post, is really inappropriate. Can you please not behave like an arse?

IDK where you have been for the last decade, but Firefox has not been the better option since Chromium was made

Disliking Google Chrome proper is one thing, but Chromium is superior in every way. Rendering, features, speed, memory management


Chromium has more than a few flaws that I'm sure you can discover if you choose to. Here's an incident that I cannot forgive:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786909


Show me a piece of software without flaws and I'll show you either a liar, or perhaps the program "ping".

> Chromium is superior in every way. Rendering, features, speed, memory management

Being faster, prettier and using less memory[1] is pointless if the browser won't let me block all ads.

I mean, it's like comparing a turd sandwich made with expensive exotic bread, and a cheese sandwich made with cheap grocery store bread.

Sure, the one has great exotic bread, but I don't want the turd it comes with.

So, yeah, it actually doesn't matter how much prettier, faster or smaller web pages are with Chrome, at least FF lets me (currently) block almost anything.

---------------------------------------

[1] Chrome beats out FF in exactly one of those, and it's not the memory or speed. Turns out ads take up a lot of RAM, and slow down pages considerably.


The person is asking for the better option.

I agree with them: at the moment Firefox is the better option, so the comparative form is confusing.

-1 tab containers

Please elaborate on ”features”.

Does chromium have non-google sync?


Chromium based browsers have non-google sync. Vivaldi implements their own encrypted sync service and I believe Brave does as well.

But I am talking about browser feature support, not stuff that can be supplemented with an extension, like a password manager.

Firefox has poor support for modern web features including video processing and encoding which makes it very bad at web conferencing/video calls or in-page streaming.

Firefox's developer tools and console are also much worse and missing important features.

Other features Firefox is missing or has poor support for compared to Chromium are WebGPU, WebTransport, Periodic Background Sync, and parts of WebRTC. Plus various APIs such as Web Serial, Badging, and Web Share have only partial support or are missing entirely.

Firefox still doesn't have functional HDR for images and videos including AV1.


Oh I thought you meant actual chromium browser.

Those seem rather marginal features from my pov but of course once you need them, you need them, I guess.


Also, for context: "Some truth here, but it's overstated.

Firefox does WebRTC fine. AV1 works, simulcast works, calls and streaming work. Chrome still leads on performance tweaks and extra APIs, but “very bad” is just wrong.

DevTools aren’t “much worse.” Different, less popular, sometimes better (CSS, network). Chrome wins mainly because everyone targets it first.

API gaps are real but the list is sloppy. WebGPU and WebTransport exist in Firefox now, just behind on advanced bits. Periodic Background Sync barely matters. WebRTC support keeps closing the gap.

Missing stuff like Web Serial, Badging, fuller Web Share? True, and mostly intentional.

HDR is the weakest claim that actually holds. AV1 decode exists, but HDR support still feels half-done.

TL;DR: Firefox lags Chromium in breadth and polish, not in core modern web capability. Calling it bad for video or modern apps doesn't match reality."


MPC-HC is still a thing? I remember installing that (and K-Lite Codec Pack) on Windows, back in the day. Haven't used, or even thought about MPC-HC in years.

I still use K-Lite Codec Pack on all of my Windows systems: https://github.com/Microsoft/winget-pkgs/tree/master/manifes...


Is anyone else annoyed about how this is not very discoverable? The first Google hit for ”MPC-HC” is the web site saying ”MPC-HC is not under development since 2017. Please switch to something else.” What happened? Has the maintainer refused to hand over the project, or something?

Nobody took over maintenance at the time. Eventually clsid2 picked it up, and it has been maintained by him ever since.

It still is, though it's not recommended as strongly as MPV. I'm not as familiar with what it decodes and renders wrong in comparison, but it is still suggested over VLC in anime circles.

I've really felt gaslit over the last decade by people continuing to promote VLC as such a great thing, when I've had nothing but bugs, crashes, glitches, and issues with it for a full decade now (on Linux). From 10-25 years ago I definitely used it for everything, all the time, but now even the default Ubuntu Totem video player (or whatever it's called) seems 2-3 times as likely to play a random video file without an issue as VLC is.

Thanks, I didn't realize the situation was so dire.

MPV is not user friendly, but I was very impressed by the gapless playback.

I did recently see someone compare mpv and VLC on an 8K HDR @ 60fps file, with mpv really lagging while VLC handled it fine. I could confirm the mpv lag but don't have VLC, so I'm not sure if it's just better in that specific case or did something like skipping actual HDR.

This may just be because mpv has higher-quality default settings for scaling and tonemapping. Try mpv with profile=fast, maybe. To properly compare mpv's and VLC's performance you'd need to fully match all settings across both players.
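If the fast profile helps, it can also be made persistent instead of passed on the command line each time; a minimal sketch of an mpv config (option names as documented in mpv's manual, values adjustable per system):

```ini
# ~/.config/mpv/mpv.conf
profile=fast      # cheaper scaling/tonemapping defaults for weak hardware
hwdec=auto-safe   # use hardware decoding where it's known to be reliable
```

Whether this closes the gap on 8K HDR material depends heavily on the GPU and the decoder in use.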

It was with the fast profile, using both software and hardware decoding; an important detail I forgot is that the video was AV1. I don't have the link to it now, but it was from Jellyfin's test files.

Honestly the pastebin link needs to be re-submitted and frontpaged.

I even encounter this in professional a/v contexts! If VLC can read and decode your stream, that's a good sign that most things should be able to view it, but it absolutely should not be trusted as any measure of doing things correctly/to spec.


> It is the Firefox of media players.

Ironically, my main gripe about Firefox is that it has no support for HDR content and its colour management is disabled by default… and buggy when enabled.


Lack of HDR is my second favorite feature

Knowing nothing about video stuff my only question from this is: what's wrong with Firefox?

Many HN readers won't be familiar with the fansub culture that this writeup originates from, so sharing a helpful resource in case anyone is interested in learning more:

ENTRY LEVEL FANSUBBERS' BEGINNERS GUIDE:

https://github.com/zeriyu/fansub-guide

Hope this helps anyone interested in the ancient art of subbing Japanese animes!

Be sure to read every link thoroughly, and don't worry, there are more link lists linked from the above link list.

Arigatou gomenasai!


Wait, so this categorical dismissal of VLC is just coming from a specific fandom community?

To be fair, it's a fandom community with high requirements and standards toward video players and that really knows its stuff.

VLC also fails at playing live-action media, if that clears anything up.

I generally dislike anime and tend to reflexively roll my eyes when someone suggests I watch it, but I've been complaining about VLC for at least 15 years.

Its main claim to fame is that it "plays everything," and it rose to prominence in the P2P file sharing era. During this time, Windows users often installed so many "codec packs" that DirectShow would eventually just have an aneurysm any time you tried to play something. VLC's media stack ignored DirectShow, and would still play media on systems where it was broken.

We're past that problem, but the solution has stuck around because "installing codecs will break my computer, but installing VLC won't" is the zombie that just won't die.


The only one who cares, apparently...

It works well enough, and I doubt the majority of VLC users are watching anime with it.

It seems really weirdly written. It's written with a lot of authority, like saying "Don't use VLC" and "Don't use Y", yet provides no reasoning for those things. Just putting "Trust me, just don't" doesn't suddenly mean I trust the author more; it probably has the opposite effect. Some sections seem to differ based on whether the reader knows something or not, but I thought the article was supposed to be for the latter.

Would have been nice if this "MUST KNOW BEFORE" advice were structured in a way so one could easily come back and use it as a reference, like just a list. Instead it's like an over-dinner conversation with your "expert and correct but socially annoying" work colleague, who refuses to elaborate on the hows and whys but still has very strong opinions.


Yeah, as far as I know, to understand video formats you need to understand the encode-decode process, how film/video editors normally operate (keeping in mind film/video editing has levels from $100s to way beyond me), history, how optics and cameras work, etc. Then particular choices and confusions can be understood.

This indeed just seems to jump in in the middle and give a bunch of very specific recommendations. I have no idea if they're good or bad recommendations, but this doesn't seem like the way to teach good procedures.


There are not very many recommendations in this article, but they're good.

Wish I knew which ones were applicable to me, but without any reasoning, these "recommendations" are as good as random tweets with factoids.

Well, I just told you they're good, so now you know.

Yeah, based on your comment I learnt as much as the original article, which is basically nothing. So thank you :)

And? It’s a GitHub gist not an oreilly book. Context.

So? This was just a HN comment, not a review of a paper.

Exactly, very hard to take the rest of it seriously after the VLC bit. VLC has literally never left me hanging, across I don't know how many decades. It's gonna take more than a trust me bro to challenge that.

You're talking about VLC for video playback; TFA is talking about video editing.

VLC ignores a lot for its outstanding video playback support, which is great if you want playback to just work... But that's the player perspective, not the editing/encoding one.


While VLC is excellent at playing every format under the sun, it's not good at playing all those formats correctly. Off the top of my head:

- Transfer functions are just generally a mess but are close enough that most people don't notice they're wrong. Changing the render engine option will often change how the video looks.

- Some DV profiles cause videos to turn purple or green.

- h.264 left and right crops are both applied as left crops that are summed together which completely breaks many videos. They could just ignore this metadata but from what I've heard their attitude is that a broken implementation is better than no implementation.
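The crop bug in that last point is easiest to see with numbers. This is an illustrative sketch of the behavior as described above (made-up values, not VLC's actual code): summing both crops into a left crop keeps a frame of the right width but shifts which columns are shown.

```python
# For a hypothetical 1920-px-wide coded frame whose metadata says
# "crop 4 px from the left, 12 px from the right":
coded_width, crop_left, crop_right = 1920, 4, 12

# Correct handling: trim each side independently.
correct = (crop_left, coded_width - crop_right)   # keep columns 4..1908

# Buggy handling as described: both values summed into one left crop,
# so the visible window is the right width but in the wrong place.
buggy = (crop_left + crop_right, coded_width)     # keep columns 16..1920

print(correct)  # (4, 1908)
print(buggy)    # (16, 1920)
```

Both windows are 1904 px wide, which is why the bug can survive casual testing: the output resolution looks right even though the picture is shifted.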


The author did mention to use MPV, which is much more lightweight than VLC. Been using it as my default for quite some time now.

What are you talking about? Of course it's only about playback, just like the other 2 alternatives.

> single best media player out there ... VLC is not recommended.


Literally the first sentence

> Hanging out in subtitling and video re-editing communities, I see my fair share of novice video editors and video encoders, and see plenty of them make the classic beginner mistakes when it comes to working with videos.

Seriously, you quoted pretty much the only sentence in the whole article that's about plain playback, and even in that bullet point, the following sentence mentions hardcoding subtitles.


Literally don't stop at the first sentence!!!

> It turns out that reading the (f.) manual actually helps a lot!

The non-recommendation of VLC vs MPC/mpv is literally about playback, as I quoted! MPC also doesn't do any encoding, yet it's recommended.

> the following sentence mentions hardcoding subtitles.

And that sentence starts with "Apart from simply watching the video" to tell you the same thing the previous sentence told you: that the comparison where VLC was not recommended was about playback, not editing.


Yes, I think everyone, including the author of the article, will agree that you've quoted the one sentence that's about playback. I agreed on that from the beginning as well. After all, after you've modified a video file, you will want to check if it looks as desired. You want a video player for that, ideally one that doesn't ignore things for improved compatibility.

The point was that the rest of the article wasn't and if you unironically can't tell that, then you should seriously train your reading comprehension.


VLC is great for playing stuff back, but can produce some horribly incorrect video files especially if you're dealing with stuff for editing.

There's a reason why VLC isn't used in broadcast stuff and ffmpeg is.


VLC has always caused problems for me when seeking backwards (graphical glitches). mpv has never caused any issues in this regard.

VLC and mpv literally use the same underlying codec library. (As well as ffmpeg.)

Have you tried both? mpv is able to play high resolution HEVC videos backwards at real time by holding the "previous frame" key. VLC can't reliably jump backwards even at second intervals, forget about reverse playback.

So? Here is the post where a VLC dev explains why you can't seek 1 frame back (you can do that in MPC and mpv): https://forum.videolan.org/viewtopic.php?f=7&t=126609&start=...

VLC makes a choice not to seek backwards to keyframes, which means you get video corruption.

Seeking is surprisingly difficult. Many container formats don't support it at all, because they don't have indexes, and so it's easy to mess up playback or lose A/V sync by trying it. Constructing the index is about as hard as decoding the entire file too.
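The keyframe constraint behind all of this can be sketched in a few lines. This is a toy model (illustrative timestamps, not a real demuxer): a decoder can only start at a keyframe, so a correct seek to time t means finding the last keyframe at or before t and decoding forward from there.

```python
import bisect

# Hypothetical keyframe timestamps, in seconds. A real player gets
# these from a container index, or has to scan packets to build one.
keyframes = [0.0, 2.0, 4.0, 6.0, 8.0]

def seek_start(t):
    """Return the keyframe a decoder must start from to display time t."""
    i = bisect.bisect_right(keyframes, t) - 1
    return keyframes[max(i, 0)]

print(seek_start(5.3))  # 4.0 -- then decode 1.3 s of frames to reach 5.3
print(seek_start(6.0))  # 6.0 -- landing exactly on a keyframe is cheap
```

Starting mid-GOP instead, without going back to a keyframe, means decoding frames whose reference frames were never decoded, which is exactly the smearing corruption people describe.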


libav{format,codec,...} are just libraries for demuxing and decoding video. There is huge variability in how those libraries are used, let alone how the video is displayed (which needs scaling, color space conversions, tonemapping, subtitle rendering, handling playback timing, etc. etc.). mpv also has its own demuxer for matroska files, since libavformat's is very limited [1].

[1] https://github.com/mpv-player/mpv/wiki/libavformat-mkv-check...


IIRC VLC used the wrong primaries for converting to RGB for a long time (years), even after it was reported to them as wrong.

>even after it being reported to them as wrong

Source?


I'm not the OP of the claim (and I love VLC), but maybe they're referring to this early 2018 issue: https://trac.videolan.org/vlc/ticket/19723 which seems to be actively worked on.

There's also https://code.videolan.org/videolan/vlc/-/issues/25651 but that's an off by one error so likely not really relevant to video playback for the average user.


VLC has left me hanging many times. It plays a file wrong, or doesn't play it at all, while mpv plays it no problem. Do not use VLC.

technically correct is the best kind. who cares if it's obnoxious? take the opinions and agree or disagree with them.

How do you know it is technically correct without an explanation? It's not much different from someone getting blown off as annoying because they constantly question simple answers while seeking better understanding. I was fortunate to work with a group of engineers when I was very young who accepted my constant use of "why?" not as disrespectful questioning; they realized I was actually learning, so they naturally provided more details, leading to fewer "why?"s being asked. This eventually got to the point where I would ask a question and the answer would be to read a specific book on the shelf. This was way before the internet. I received a better education on the job than I ever was going to get in school.

So no, I'm not just going to take an opinion without more information. I don't change my mind just on say so.


"Why?" is the simplest test of a valid explanation. If you don't need to ask why any more, you've answered the question. Sometimes it takes 3 or 5 whys in a row!

It works if you know the person and have a baseline for how much confidence you give their opinions. If it's just a random person on the internet, they need to support their argument.

I mean, they can. They don't need to give more than they're already giving us anonymous strangers for free. For all we know, this person wrote this for people they encounter personally or professionally, and we're just incidentally benefitting.

We as readers should gauge their credibility for ourselves, whether by reputation or by checking the claims. I don’t know who wrote it but it seems basically correct, consistent, and concisely argued to me.


How can I disagree when they don't provide a reasoning behind why something is the better option?

When we switched from x264 to hardware based encoders it saved something like 90% on our customers' power and cooling bills.

So while this essay might be "technically correct" in some very narrow sense the author is speaking with far more authority than they have the experience to justify, which is what makes it obnoxious in the first place.


The author is directing this at complete noobs who are subbing their first anime and you are complaining that it is not applicable to running a datacenter?

The author never talked about power savings or cooling bills; they talked about quality, so they are still correct.

This is already mentioned in the article. Software vs. hardware is a tradeoff. x264 produces higher quality (perceptual or compression efficiency) video, at the expense of latency.

Interesting read, it’s a shame the ranty format makes it 3x longer than necessary.

Not sure why it takes a dump on VLC - it’s been the most stable and friendly video player for Windows for a long time (it matters that ordinary users, like school teachers, can use it without special training. I don’t care how ideological you are about Linux or video players or whatever lol).


VLC works great on Linux too! It's one of the few programs where I expect the exact same look and feel regardless of the underlying OS.

mpv is okay but its complete reliance on command line flags and manually written config files makes it a bore.


> where I expect the exact same look and feel regardless of the underlying OS

Slightly ironic, as I think a new UI is underway (and coming soon?). Not sure what version it's planned for, but I think some beta has it enabled by default already, was surprised when I saw it. So the consistent UI is here today, and will be in the future, but there will be a slice of time where different users will run different versions where some switched to the new UI, and some haven't. But it'll hopefully be a brief period, and of course it's still cross-platform :)


No it doesn't, its Wayland support is a mess, its codec support is lackluster, and somehow the experience is worse when you use VA-API hardware decoding.

VLC is pretty much one of the default things I download on any of my computers. Right now I use mac and it's my default video player here too!

I don't believe it's the case anymore, but it was very common for VLC to cause video corruption (see [1] for example of what it looked like) in the past, the hate just stuck around and I don't think it's ever going away.

[1] https://www.reddit.com/r/glitch_art/comments/144vjl/vlc_star...


13 years since that post and this is the first time I’m hearing of this long-past issue.

Haters gonna hate I guess.


I still have this problem every day.

It has never been very common for VLC to cause video corruption.

In the anime fan subbing community (which this document is likely from), it's very common to hate on VLC for a variety of imagined (and occasionally real but marginal) issues.

Why is that?

At least for the real part, there was the great 10-bit encoding switchover around 2012, when it seemed like the whole anime encoding scene decided to move to encoding just about everything with "10-bit h264" in order to preserve more detail at the same bitrate. VLC didn't have support for it, and for a long time (5+ years?) it remained without proper support. Every time you tried playing such files they would exhibit corruption at some interval. It was like watching a scrambled cable channel with brief moments of respite.

The kicker is that many, many other players broke too. Very few hardware decoders could deal with this format, so it was fairly common to get dropped frames due to software decoding fallback even if your device or player could play it. And, about devices: if you were previously playing h264 anime stuff on your nice pre-smart TV, forget about doing so with the 10-bit stuff.

Years passed, most players learned to deal with 10-bit encoding, people bought newer devices that could hardware-decode it, and so on, but afaik VLC remained incompatible for a while longer.

Eventually it all became moot because the anime scene switched to h265...


8-bit and 10-bit almost give digital video too much credit. Because of analog backwards compatibility, 8-bit video only uses values 16-235, so it's actually like… 7.8 bit.

It's nowhere near enough codes, especially in darker regions. That's one reason 10-bit is so important, another is that h264 had unnecessary rounding issues and adding bit depth hid them.


Mostly that VLC has had noticeable issues with displaying some kinds of subtitles made with Advanced SubStation (especially ones taking up much of the frame, or that pan/zoom), which MPV-based players handle better.

If you want a MPV-based player GUI on macOS, https://github.com/iina/iina is quite good.


Note that, while I haven't had time to investigate them myself yet, IINA is known to have problems with color spaces (and also uses libmpv, which is quite limited at the moment and does not support mpv's new gpu-next renderer). Nowadays mpv has first-party builds for macOS, which work very well in my opinion, so I'd recommend using those directly.

He's talking about using VLC for transcoding or encoding, where the functionality has lots of issues and is kind of bolted on the side. VLC for playing is totally fine.

No it isn't, VLC plays everything back slightly incorrectly in all sorts of ways, the subtitle rendering and colorspace handling isn't compliant at all.

VLC uses libass for .ass subtitle rendering as far as I can see, shouldn't it be the same?

One difference I can immediately point to is that VLC always renders subtitles at the video's storage resolution and then up/downscales all bitmaps returned by libass individually before blending them. This can create ugly ringing artifacts on text.

I've also seen many reports of it lagging or choking on complex subtitles, though I haven't had the time to investigate that myself yet.

Either way, it's not as simple as "both players use libass." Libass handles the rasterization and layout of subtitles, but players need to handle the color space mangling and blending, and there can be big differences there.


Follow-up comment: I love how the author’s one brief take-down shot at VLC is currently the dominant criticism in the HN comments (inc. mine). 10,000+ words and the entire lot is being questioned because of one dubious throwaway comment about VLC.

A lesson to learn in that.

Lol


There was an article[1] yesterday where a single poor word choice derailed much of the comment section with rat-holing and nitpicking until the author revised the article. HN's gotta HN.

1: https://news.ycombinator.com/item?id=46413256


If you're talking about the use of the word "hover", I think that was quite justified, given that it was a critical element of the claim, and the poor wording made it nigh impossible to reproduce the author's claim.

It seems to me he’s talking about using it for re-encoding/conversion as part of your editing workflow and is not really talking about its media playback capabilities. In that sense he is very much correct.

I'm always amazed when I see how many people are unfamiliar with VLC hate. It was notorious (to the point of it being a popular meme topic) for video artifacts, slow/buggy seeking, bloated/clumsy UI/menus, having very little format support out of the box, and buggy subtitles. I assume nowadays it's much better, since it seems popular, but its reputation will stick with me forever.

>It was notorious (to the point of it being a popular meme topic) for [...] having very little format support out of the box

???

I thought the meme was that it played basically everything? At least compared to windows media player or whatever.

The other items I can't say I've noticed, but then again I only play the most common of files (eg. h.264/h.265 with english subtitles in a mkv) so maybe it's something that only happens with unusual formats/encodes.

edit: based on other comments (eg. https://news.ycombinator.com/item?id=46465349), it looks like it might indeed be caused by uncommon files that I haven't encountered.


> I thought the meme was that it played basically everything? At least compared to windows media player or whatever.

Yes, that was in the 2000s though. During the 2010s VLC started falling behind because its shortcomings outweighed its capabilities.


Pretty sure VLC can play /dev/urandom

What year was this? I don't know there has ever been a normal format it doesn't support, and I think this has been the case for at least 15 years.

Up until just last month I had never had a problem with VLC. But I don't pirate content, so maybe I just hadn't encountered the problematic files. However, recording voice notes in Opus format on my phone, it turns out that VLC has a bug playing Opus files at certain bit rates. For me this is easily worked around by just using MPV.

I dropped VLC circa 2019 for all the reasons mentioned and ever since I use exclusively MPV, both on Windows and Linux.

So at least from those times


The only current VLC issue I encounter regularly is the subtitles one. I'm a lite video user, so I'm not sure the technical details, but some subtitle formats render black-on-black in VLC, but work fine on Plex/TV.

I've also encountered the odd "this video is corrupted" error that persists even after re-encoding. But I've never thought to troubleshoot to see if it's a VLC issue, and instead just get a different version.


I've never had problems with VLC, and I've used it off and on for 20 years.

I don't doubt that there's some obscure, elite videophile hate towards it, but I'm hardly going to stop using it because a few random internet strangers hate on it.


Kinda the problem with anecdotes isn't it? :)

My own anecdotal experience with VLC was that while every update fixed something, they also broke something in return - and these updates were common. This got annoying enough at some point for me to hop ships, and so I switched to mpc-hc and never looked back.

I've since also tried (and keep trying) the to-me still newfangled mpv, but I'm not a fan of the GUI or the keybinds. I know it can be customized, but that's not something I'm interested in doing. I know there are alternative frontends as well, but I checked and they're not to my liking either. So I mostly just use it for when mpc-hc gives out, such as really badly broken media files, or anything HDR.


Out of pure curiosity, what kind of things were you VLC using for, for it to break so often? I'm almost never doing anything with video, so I'm completely clueless in this field.

I don't recall my issues being media file or workload specific [0]. It was specifically just general frontend stuff I believe [1]. Although I should probably also mention that I don't remember much to begin with, other than my decidedly negative conclusion that made me switch players, and the overall personal narrative around that. It's been quite a few years if not a whole decade.

[0] Doesn't mean there weren't any, but then I was not doing anything special. Just watched anime, listened to music, streamed YouTube. Hardly an extraordinary workload for VLC, or indeed any media player in general.

[1] I remember them changing around the volume slider widget back and forth ad nauseam for example, and that becoming in some particular way defective that I cannot recall.


Thanks for the reply, yeah that's interesting. I remember also having issues with VLC as a kid (I'm in my late 20's now), but I always chalked it up to me being a noob and not knowledgeable; I wonder retrospectively how much of that were actual me-issues and how much VLC-issues. Recently one of my parents wanted to play a DVD via VLC, which did not work, where the issue ultimately was missing DVD libraries which apparently can't be shipped in Ubuntu due to licensing issues (libdvdcss). 12-year-old me would not have been able to debug this issue, to be honest.

And you think everybody else should stop using it because you had problems?

I'll make up my own mind on it.


Do you think everyone else should start or continue using it because you never had problems?

Let's be kind. Clearly not what either of us were thinking or intended to convey.


You should make up your own mind. Why bother reading comments at all? Why bother reading reviews? I bet you watch every movie and never comment on how good or bad they are, because you'd be telling everybody else what to do and that would be hypocritical. I wonder how you stumbled upon VLC in the first place - perhaps you read about it somewhere and followed that advice?

Where did perching_aix say that?

For a long time it was the only graphical user-friendly option for non-technical Windows users that had decent support for a wide range of formats. I don’t know about its early years, but friends, family and I have been using it for a good 15+ years without encountering the issues folks are describing in these comments.

It seems there are a lot of open-source lovers that haven't accepted that bugs can get fixed, projects can improve, etc. They'd rather treat a project as though it were stuck at version 0 from 20-something years ago. Deeply ironic.


Agree. Never mind how far they were behind on the more power-user options like scaling, dealing with mismatches between video framerate and monitor refresh rate, etc.

Haven't used it in ages, but a decade ago it felt like a joke for all the video artifacts and subtitle glitches.

The one part that does get to me about people who still blindly praise it as THE video player, at least outside of more technically inclined spaces like this, is that so many assume it exists as some monolith. Clearly library-free, entirely the original work of VideoLAN, gracious they be that they give it all away for free.


> I assume nowadays it's much better...

It's not, but the Linux weenies won't hear of it. Maybe it's a great choice on Linux, but on Windows, it often renders things much worse than stock WMP (both legacy and modern). Videos with a lot of motion play especially poorly.

But, yeah, it opens everything.


Do you have sources for that? As far as I know VLC has actually always been famous for supporting basically every format.

ffmpeg supports every format. ffmpeg wrappers don't need to add much value in order to also do this.

(Lots of corner cases apply and VLC developers do assist ffmpeg, host a conference, etc.)


Original post author here.

It seems like the main criticisms I am getting for this article are because it's escaped past its main target audience, so let me clarify a few things.

This post was born out of me hanging out in communities where people would make their own shortened edits of TV series and, in particular, anime, often to cut out filler or padding. Many people there would make many of the mistakes mentioned in the post, in particular reencoding at every step without knowing how to actually control efficiency/quality. I spent a lot of time helping out individual people one-on-one, but eventually wrote the linked article to collect all of my advice in one place. That way I (or other people I know) can just link to it like "Read the section on containers here," and then answer any follow-up questions, instead of having to explain from scratch each time.

> It seems really weirdly written. / ranty format

So, yes, it does. It was born out of one-to-one explanations on Discord. I wouldn't be surprised if it may seem condescending to a more advanced reader, but if I rant about some point to hammer it down, it's because it's a mistake I've seen people make often enough that it has to be reinforced this much. I wouldn't write a professional article this way.

The other point many people seem to get hung up about is the "hate" on VLC. Let me clarify that I do not "hate" VLC at all, I just don't recommend it. VLC is only mentioned once in the entire page, exactly because I didn't want to slot in an intermission purely to list a bunch of VLC issues. I felt like that would qualify more as "hate."

That said, yes, pretty much anyone I know in the fansubbing or encoding community does not recommend VLC because of various assorted issues. The rentry post [1] is often shared to list those, though I don't like how it does not give sources or reproducible examples for the issues it lists. I really do want to go through it and make proper samples and bug reports for all of these issues, I just didn't have the time yet.

Let me also clarify that I have nothing against the VLC developers. VideoLan does great work even outside of VLC, and every interaction I've had with their developers has been great. I just do not recommend the tool.

[1] https://rentry.co/vee-ell-cee


remux vs reencode itself is a big point for video noobs such as myself.

in the past, cropping out a part of a video would mean reencoding it with some random preset. this would often take longer than required. however, i accidentally realized the difference when trying out avidemux [1] and clipping together videos blazing fast (provided they're in the same container and format)!

[1] http://fixounet.free.fr/avidemux/


Shameless plug for anyone that wants to go deeper on specific video topics: I've been organizing a conference for video devs for 11 years now and there's a wealth of info in the recordings. A talk from the most recent one on hacking a Sega Genesis to stream video might not seem that practical, but there were some fascinating bits on compression (or, rather, not being able to use actual compression). https://www.youtube.com/watch?v=GZdxdpw-3nI

If folks want to get involved, there's also a chat community that's pretty active: https://video-dev.org.


MPV plugins can actually do frame-perfect cuts and crops for you (+ whatever ffmpeg filters you want), something that would generally require the hassle of opening editing software. And those cuts can be done in h264 lossless (for additional processing later at no additional quality loss from this step).

https://github.com/occivink/mpv-scripts

There is also a way to losslessly cut preserving the original encoding but you give up the precision of the cuts due to keyframes. The MPV script above can do that too: script-opts/encode_slice.conf


"Don't use Topaz AI, Anime4k, RealESRGAN, RIFE, etc. Trust me, just don't."

Why? I only know Topaz and I always thought it had its narrow but legitimate uses cases for upscaling and equalizing quality?


Can't mind read the guy obviously, but the usual motivation that I'm aware of is that you pretty much fuck over everyone else that comes later. Upscalers improve over time, but in terms of distribution, recency bias is strong and visual treats are inviting. So when those much better upscalers eventually come around, what's more likely to still be available is the secondary source you distributed, which is already upscaled once with a then-inferior upscaler. This leads to a form of generational rot.

Other likely explanations are:

- them not liking how these upscalers look: you can imagine if they can nitpick minor differences between different encodes that most people don't notice, they'll hate the glaring artifacts these filters usually produce

- boycotting AI


You should read the article.

The reasons stated against upscaling were that (re-)encoding video files should generally be done in a way that preserves as much of the original information and intent as possible. AI upscalers add information where there is none, thus modifying the video in a way that goes against that goal.


Topaz looks bloody awful. Instead of big blocky upscaled pixels you've got weird artifacty "oil painting effect" smeary blobs.

[flagged]


OP makes zero comments about content generation, and the complaint is about upscaling introducing artifacts not in the original source. No different than hating a bad 4k remaster / sharpening.

That advice is not universal, and without context it's simply wrong.

You wouldn't upscale a classic film in this way, but there are plenty of low-resolution shots that benefit. Especially with VGA resolution renders and modern AI workflows.

Just looking at the Topaz marketing, you can see a lot of places where it indeed does work. And 20-year industry professionals are using it today for their day jobs.

If you want to say "don't upscale a classic film in Topaz", say that. Because context makes the advice correct. This blanket "do not use" statement is flat out wrong.


In any case, blindly applying upscaling is just wrong.

You apply it where it is needed. Not every scene needs the same treatment.


The article simply says "Trust me, just don't."

No it doesn't. It says “don't, unless you're extremely surgical with it and know exactly what you're doing”.

Which is a sensible piece of advice.


> Don't use Topaz AI, Anime4k, RealESRGAN, RIFE, etc. Trust me, just don't.

Is what the submission says about Topaz and similar.

> Applying any kind of post-processing[4]

Is what the footnote you quoted is linked to.


There's no mistake, “Topaz and similar” == “any kind of postprocessing”.

Right, but since the "AI upscaling" "advice" is more specific, doesn't that take precedence?

> Do post-processing if you're surgical about it

> Don't do AI upscaling regardless, never do it, and don't even ask why

Is the impression I get from this article.


I wish he talked about avidemux.

It's a simple tool which is great for many things; it has filters and supports most formats. I think it uses ffmpeg under the hood.

It's an old tool but it's fine for most things, when ffmpeg is to fastidious to use. ffmpeg is still what I use, but some more complex tasks are just more comfortable with avidemux.


Some of the simple tools might not be using ffmpeg per se, but using the libav or similar libraries. ffmpeg is just a tool built to utilize the functionality of multiple libraries like this.

ffmpeg and libavcodec are the same project.

yes. but do you have to include the ffmpeg part to include just the other libraries?

No, ffmpeg itself is a pretty small wrapper around them. But the rest are true libraries and not frameworks, so you have to make all kinds of playback decisions yourself. A/V sync is hard.

> to fastidious

Do you mean "too fussy?"


using a sledge to drive a finishing nail? yes, it's still a hammer and it's still a nail, but still the wrong tool for the job

I remember using Gordian Knot to create avi files from my DVDs back when XviD was the pragmatic method for encoding videos, and the whole goal was to get movies under 700mb so that you could write them to a CD. Avisynth and community filters were largely geared towards undoing all sorts of crap done to an image because artifacts that were relatively unnoticeable on a general CRT television were quite apparent on a computer monitor, as well as to then prepare the video to look good once it has been highly compressed with XviD or DivX.

These days I'm much more inclined to try and transparently encode the source material, tag it appropriately in the media container, and let the player adjust the image on the fly. Though I admit, I still spend hours playing around with Vapoursynth filter settings and AV1 parameters to try and get a good quality/compression ratio.

I have to say that the biggest improvement to the experience of watching my videos was when I got an OLED TV. Even some garbage VHS rip can look interesting when the night sky has been adjusted to true black.

Given the increasing abilities of TVs and processing abilities and feature sets of players, I'm not much persuaded to upgrade my DVD collection to Blu-Ray. Though I admit some of that is that I enjoy the challenge of getting a good video file out of my DVDs.

I partially disagree with the use of ASS subtitles. For a lot of traditional movies, using SRT files is sensible because more players support it, and because it's often sensible to give the player the option of how to render the text (because the viewing environment informs what is e.g. the appropriate font size).


Something I've never been able to find satisfactory information on (and unfortunately this article also declares it out of scope) is: what are the actual hard on-the-wire and on-disk differences between SDR and HDR? Like yes, I know HDR = high dynamic range = bigger difference between light and dark, but what technical changes were needed to accomplish this?

The way I understand it, we've got the YCbCr that is being converted to an RGB value which directly corresponds to how bright we drive the R, G, and B subpixels. So wouldn't the entire range already be available? As in, post-conversion to RGB you've got 256 levels for each channel which can be anywhere from 0 to 255 or 0% to 100%? We could go to 10-bit color which would then give you finer control with 1024 levels per channel instead of 256, but you still have the same range of 0% to 100%. Does the YCbCr -> RGB conversion not use the full 0-255 range in RGB?

Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant, but that wouldn't change the on-disk or on-the-wire formats. Those formats have changed (video files are specifically HDR or SDR and operating systems need to support HDR to drive HDR monitors), so clearly I am missing something but all of my searches only find people comparing the final image without digging into the technical details behind the shift. Anyone care to explain or have links to a good source of information on the topic?
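To make my mental model concrete, here's roughly how I picture the limited-range BT.709 conversion - illustrative only, using the standard rounded coefficients, not any player's actual implementation:

```python
# Limited ("studio") range BT.709 YCbCr -> 8-bit RGB, as an illustration.
# Y is coded 16..235 and Cb/Cr 16..240; the 1.164 (= 255/219) factor
# stretches that limited range back out to the full 0..255 RGB range.
def bt709_limited_to_rgb(y, cb, cr):
    r = 1.164 * (y - 16) + 1.793 * (cr - 128)
    g = 1.164 * (y - 16) - 0.213 * (cb - 128) - 0.533 * (cr - 128)
    b = 1.164 * (y - 16) + 2.112 * (cb - 128)
    clamp = lambda v: max(0, min(255, round(v)))
    return clamp(r), clamp(g), clamp(b)

print(bt709_limited_to_rgb(16, 128, 128))   # video black -> (0, 0, 0)
print(bt709_limited_to_rgb(235, 128, 128))  # video white -> (255, 255, 255)
```

So even "limited range" Y (16-235) gets stretched back to the full 0-255 RGB output by the 255/219 factor, which is exactly why I don't see where the extra HDR range would come from.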


The keywords you're missing are color spaces and gamma curves. For a given bandwidth, we want to efficiently allocate color encoding as well as brightness (logarithmically to capture the huge dynamic range of perceptible light). sRGB is one such standard that we've all agreed upon, and output devices all ostensibly shoot for the sRGB target, but may also interpret the signal however they'd like. This is inevitable, to account for the fact that not all output devices are equally capable. HDR is another set of standards that aims to expand the dynamic range, while also pinning those values to actual real-life brightness values. But again, TVs and such may interpret those signals in wildly different ways, as evidenced by the wide range of TVs that claim to have "HDR" support.

This was probably not the most accurate explanation, but hopefully it's enough to point you in the right direction.


> Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant,

It's actually the opposite that makes the biggest difference with the physical monitor. CRTs always had a residual glow that caused blacks to be grays. It was very hard to get true black on a CRT unless it was off and had been for some time. It wasn't until you could actually have no light from a pixel where black was actually black.

Sony did a demo when they released their OLED monitors where they had the top of each monitor type side by side: CRT, LCD, OLED. The CRT was just gray while the OLED was actually black. To the point that I was thinking in my head that surely this is a joke and the OLED wasn't actually on. That's precisely when the narrator said "and just to show that the monitors are all on" as the video switched to a test pattern.

As for the true question you're getting at, TFA mentions things like color matrix, primaries, and transfer settings in the file. Depending on the values, the decoder makes decision on the math used to calculate the values. You can use any of the values on the same video and arrive at different results. Using the wrong ones will make your video look bad, so ensuring your file has the correct values is important.

From TFA: https://gist.github.com/arch1t3cht/b5b9552633567fa7658deee5a...


Note that CRTs did not have bad blacks; they were far better than LCD displays. I am currently using an IPS display and it has pretty good blacks, notably better than a normal LCD display. But I remember CRTs being even better (probably just me being nostalgic for the good ol' days when we were staring straight into an electron beam with only an inch of leaded glass to protect us). I don't think they were lying - OLEDs are very very good (except for the burn-in issue, but that's solvable) - but I would be wary about the conclusions of a demo designed to sell something.

For what it's worth, the display I liked best was a monochrome terminal, a VT220. Let me explain: a CRT does not really have pixels as we think of them on a modern display, but it does have a shadow mask, which is nearly the same thing. However, a monochrome CRT (as found in a terminal or oscilloscope) has no shadow mask. The text on that VT220 was tight; it was a surprisingly good reading experience.


Do you have a device that supports HDR such as any MacBook in the last 10+ years, any iPhone since iPhone 4, or most high end Androids?

If so, try this: https://gregbenzphotography.com/hdr-gain-map-gallery/

Clicking the "Limit to SDR" and "Allow Full HDR (as supported)" should show a significant difference if you device supports HDR. If you don't see a difference then your device doesn't support HDR (or your browser)

For these images, there's a specific extension to JPEG where they store the original JPEG like you've always seen, and then a separate embedded gain map to add brightness if the device supports it. That's for stills (JPEGs) though, not video but the "on the wire difference" is that gain map

I'm not an expert but for videos, ATM, afaict, they switched them to 10bits (SDR is 8bits), and added metadata to map that 10 bits to values > "white" where white = 100nits. This metadata (PQ or HLG) can map those 10 bits up to 10000 nits.


HDR is nothing more than metadata about the color spaces. The way the underlying pixel data is encoded does not change. HDR consists of

1. A larger color space, allowing for more colors (through different color primaries) and a higher brightness range (through a different gamma function)

2. Metadata (either static or per-scene or per-frame), like a scene's peak brightness or concrete tonemapping settings, which can help players and displays map the video's colors to the set of colors they can display.

I actually have a more advanced but more compact "list of resources" on video stuff in another gist; that has a section on color spaces and HDR:

https://gist.github.com/arch1t3cht/ef5ec3fe0e2e8ae58fcbae903...
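To illustrate point 1, here's a sketch of the PQ (SMPTE ST 2084) transfer function, which maps a normalized signal value to an absolute brightness in nits. The constants come from the spec, but treat the code as a sketch rather than a reference implementation:

```python
# SMPTE ST 2084 (PQ) EOTF: maps a normalized signal E' in [0, 1]
# to an absolute luminance in cd/m^2 (nits), up to 10000 nits.
m1 = 2610 / 16384        # 0.1593017578125
m2 = 2523 / 4096 * 128   # 78.84375
c1 = 3424 / 4096         # 0.8359375
c2 = 2413 / 4096 * 32    # 18.8515625
c3 = 2392 / 4096 * 32    # 18.6875

def pq_eotf(signal: float) -> float:
    e = signal ** (1 / m2)
    return 10000 * (max(e - c1, 0) / (c2 - c3 * e)) ** (1 / m1)

print(pq_eotf(0.0))  # -> 0.0 nits
print(pq_eotf(1.0))  # -> 10000.0 nits (the format's ceiling)
```

This is also why PQ HDR values are "absolute": a signal of 1.0 always means 10000 nits, unlike an SDR gamma curve where "white" is whatever the display makes of it.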


YCbCr subsampling also matters for HDR or not - 4:2:2 vs 4:4:4.

if you expand limited YCbCr to a large HDR range you'll get a "blurred" output.

Imagine converting a 1-bit image (0 or 1, black or white pixels) to full-range HDR RGB - it's still black and white


here you go

> 10 bits per sample Rec. 2020 uses video levels where the black level is defined as code 64 and the nominal peak is defined as code 940. Codes 0–3 and 1,020–1,023 are used for the timing reference. Codes 4 through 63 provide video data below the black level while codes 941 through 1,019 provide video data above the nominal peak.

https://en.wikipedia.org/wiki/Rec._2020

Compare to

https://en.wikipedia.org/wiki/Rec._709



Hah, that's exactly how it feels!

Tangential but, at least for me, I find lots of video creators making 2-3 gig videos that I can re-encode to 1/4th the size or less with no noticeable difference in quality.

My impression is, their audience equates file size with quality, so the bigger the file the more "value" they got from the creator. This is frustrating because bigger files mean hitting transfer limits, slower downloads, slower copies, more space taken, etc...


Do youngins even know why AVI files are under 700mb? I think obsessions over quality/compression are the concern of us aging hobbyist encoders.

Unless one lives in a country where the internet is slow and/or hard drives are expensive, I think the audience does not care.


Yeah, similarly, my DSLR makes some huge video files, but they aren't that much better quality than my phone's. Of course, the sensor is massively better, and that makes a difference, but I don't know why the files are so much bigger.

My hypothesis is that they use a really high quality value, and that there are diminishing returns there.


video format world is one where you nope out pretty quick once you realize how many moving pieces there are.

ffmpeg seems ridiculously complicated, but in fact it's amazing the amount of work that happens under the hood when you do

    ffmpeg -i input.mp4 output.webm
and tbh they've made the interface about as smooth as can be given the scope of the problem.

For all the hate Handbrake gets, it does the job of simplifying video encoding enough for casual users while still allowing for plenty of functionality to be leveraged.

this complication causing people to nope out has made my career. for everyone that decides it is too complicated and is only the realm of experts, my career has been made that much more secure. sadly, i've worked with plenty of video that has clearly been made by someone that should have "noped out"

I edit videos on a hobbyist level (mostly using davinci resolve to edit clips of me dying in video games to upload to a shareX host to show to friends). The big takeaway for me was reading that for quality/efficiency libx264 is better than nvenc for rendering h264 video. All this time I’ve assumed nvenc is better because it used shiny GPU technology! Is libx264 better for recording high quality videos too? I know it will run on CPU unlike NVENC but I doubt that’s an issue for my use case.

Edit: from some googling it looks like encoding is encoding, whether it’s used for recording or rendering footage. In that case the same quality arguments the article is making should apply for recording too. I only did a cursory search though and have not had a chance to test so if anyone knows better feel free to respond


Yeah, this is a very common misconception. There are hardware encoders that might be distribution quality, but these are (to my knowledge) expensive ASICs that Netflix, Amazon, Google, etc. use to accelerate encode (not necessarily to realtime) and improve power efficiency.

GPU acceleration could be used to accelerate a CPU encode in a quality-neutral way, but NVENC and the various other HW accelerators available to end users are designed for realtime encoding for broadcast or for immediate storage (for example, to an SD card).

For distribution, you can either distribute the original source (if bandwidth and space are no concern), or you can ideally encode in a modern, efficient codec like HEVC (via x265) or AV1. AV1 might be particularly useful if you have a noisy source, since denoising and classification of the noise is part of the algorithm. The reference software encoders are considered the best quality, but often the slowest, options.

GPU is best if you need to temporarily transcode (for Plex), or you want to make a working copy for temporary distribution before a final encode.


You might want to try dumping your work from Resolve out in ProRes 422 HQ or DNxHR HQ and then encoding to h264/h265 with Compressor (costs $; it's the encoder part of Final Cut as a separate piece of software) on a Mac or Shutter Encoder. Also, I'm making a big assumption that you're using the paid version of Resolve; disregard otherwise. It might not be worth it if your input material is video game capture but if you have something like a camera that records h264 4:2:2 10bit in a log gamma then it can help preserve some quality.

Pretty good writeup but not sure why VLC is not recommended...?

I bought a new dashcam. It generates .mp4 files. I tried to play them back with my new Roku media player, and it says invalid format. (They will play with Windows media player.)

Grump grump grumpity grump. Same experience with every dashcam I've bought over the years.


What's wrong with VLC?

Making such a bold, unsubstantiated claim is a curious item in an otherwise detailed document. I went looking for other explanations and found this gem: https://www.reddit.com/r/mpv/comments/m1sxjo/it_is_better_mp...

I think it might be one of those classic “everyone should just get good like me” style opinions you find polluting some subject matter communities.


Yes, absolutely. The top answer on that Reddit link starts with: "MPV is the ultimate video player on planet earth, all the others are junk in comparison" and doesn't mention VLC at all. That's not a helpful answer, it's just signalling that they're a huge fan of MPV, with nothing to suggest they've ever even tried anything else.

In the olden times of non-working/non-playing movies (the 00's), being clueless tech support for ppl even more clueless about them computers,

VLC was how you could get any movie to work (instead of messing with all these codecs, which apparently, per another comment in this thread, aren't really codecs).


my biggest pet peeve was that VLC was always considered a streamer and treated local files as streams as well. for the longest time, stepping within the video was not possible. reverse play was also a bane as well, even with i-frame only content. i have long found players that are better for me, but still find myself using VLC frequently because it still has features these other players do not.

This matches with my observation, VLC tends to be more tolerant of slightly broken files or random issues that you encounter when streaming. Especially for hls streams, vlc often works when ffplay refuses to play it, I believe because vlc uses their own demuxer (instead of relying on libavformat).

I would disagree somewhat with his stance that video quality is not affected by container format (especially the part "Here is a list of things that people commonly associate with a video's quality"). Different container formats have different limitations regarding what video (and audio) formats they support. And while subtitle support doesn't directly affect video quality, it does so indirectly: if you cannot add subtitles without hardsubbing, or subtitle formats are so limited that you end up needing hardsubbing anyway, then the choice of container format ends up affecting the video quality.

I thought it was a good read, although with a couple of mistakes and a somewhat (IMO) childish sense of entitlement. This reads a bit like something a young teen who is heavy into tech wrote. I'm sure I could have authored something with the same overall tone and vibe when I was younger (perhaps not same quality, though!). Either way, it's a very decent read!

The idea that YCbCr is only here because of "legacy reasons", and that we discard half of chrominance because of equally "legacy reasons", is bonkers, though.


The core idea of YCbCr - decoupling chrominance and luminance information - definitely has merit, but the fact that we are still using YCbCr specifically is definitely for historical reasons. BT.601 comes directly from analog television. If you want to truly decouple chrominance from luminance, there are better color spaces (opponent color spaces or ICtCp, depending on your use case) you could choose.

Similarly, chroma subsampling is motivated by psychovisual aspects, but I truly believe that enforcing it on a format level is just no longer necessary. Modern video encoders are much better at encoding low-frequency content at high resolutions than they used to be, so keeping chroma at full resolution with a lower bitrate would get you very similar quality but give much more freedom to the encoder (not to mention getting rid of all the headaches regarding chroma location and having to up- and downscale chroma whenever needing to process something in RGB).

Regarding the tone of the article, I address that in my top-level comment here.


I would have also expected at least a passing mention of chroma subsampling beyond 4:2:0 if only just to have an excuse to give the "4:2:0" label to the usual case they mention. And you might run across stuff in 4:2:2 or 4:4:4 not all that rarely.
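As a back-of-the-envelope illustration (my numbers, not from the article), the raw-sample cost of the different subsampling schemes at 8-bit 1080p:

```python
# Raw bytes per 8-bit 1920x1080 frame under 4:4:4 vs 4:2:0.
# In 4:2:0, each chroma plane is halved in both dimensions, so the
# two chroma planes together cost only half of one luma plane.
w, h = 1920, 1080
luma = w * h
chroma_444 = 2 * w * h                # Cb + Cr at full resolution
chroma_420 = 2 * (w // 2) * (h // 2)  # Cb + Cr at quarter size

print(luma + chroma_444)  # 4:4:4 -> 6220800
print(luma + chroma_420)  # 4:2:0 -> 3110400, exactly half the raw data
```

4:2:2 lands in between (two thirds of 4:4:4), which is part of why it shows up in production and capture pipelines rather than distribution.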

Really good quickstart guide

>Really good quickstart guide

It really isn't. You have to scroll 75% of the way through the document before it tells you what to actually type in. Everything before that (9000+ words) is just ranty exposition that might be relevant, but is hardly "quick".


Nah, see, I maintain a commercial video platform, and half the battle is people typing things in before they understand what a codec is. Theory first, practice after.

that's not a quick start guide. not once has the quick start guide for a printer explained the workings of the ink jet nozzle and the ability to precisely control the position of the head. it just says plug it in, hit this button to join wifi, open this app on your device, hit print.

The discussions in this thread are amusing. It’s a pretty great beginner guide. Almost a parallel to “how to ask questions the smart way” applied to videos.

Do you have any recommendations for literature on the subject of video encoding etc? I really want to learn more theory.

I've had a lot of misconceptions that I had to contend with over the years myself as well. Maybe this thread is a good opportunity to air the biggest one of those. Additionally, I'll touch on subbing at the end, since the post specifically calls it out.

My biggest misconception, bar none, was around what a codec is exactly, and how well specified they are. I'd keep hearing downright mythical sounding claims, such as how different hardware and software encoders, and even decoders, produce different quality outputs.

This sounded absolutely mental to me. I thought that when someone said AVC / H.264, then there was some specification somewhere, that was then implemented, and that's it. I could not for the life of me even begin to fathom where differences in quality might seep in. Chief among these was when somebody claimed using single-threaded encoding instead of multi-threaded encoding was superior. I legitimately considered I was being messed with, or that the person I was talking to simply didn't know what they were talking about.

My initial thoughts on this were that okay, maybe there's a specification, and the various codec implementations just "creatively interpret" these. This made intuitive sense to me because "de jure" and "de facto" distinctions are immensely common in the real world, be it for laws, standards, what have you. So I'd start differentiating and going "okay so this is H.264 but <implementation name>". I was pretty happy with this, but eventually, something felt off enough to make me start digging again.

And then, not even a very long time ago, the mystery unraveled. What the various codec specifications actually describe, and what these codecs actually "are", is the on-disk bitstream format, and how to decode it. Just the decode. Never the encode. This applies to video, image, and sound formats; all lossy media formats. Except for telephony, all these codecs only ever specify the end result and how to decode that, but not the way to get there.

And so suddenly, the differences between implementations made sense. It isn't that they're flouting the standard: for the encoding step, there simply isn't one. The various codec implementations are to compete on finding the "best" way to compress information to the same cross-compatibly decodable bitstream. It is the individual encoders' responsibility to craft a so-called psychovisual or psychoacoustic model, and then build a compute-efficient encoder that can get you the most bang for the buck. This is how you get differences between different hardware and software encoders, and how you can get differences even between single- and multi-threaded codepaths of the same encoder. Some of the approaches they chose might simply not work, or not work well, with multi-threading.

One question that escaped me then was how can e.g. "HEVC / H.265" be "more optimal" than "AVC / H.264" if all these standards define is the end result and how to decode that end result. The answer is actually kinda trivial: more features. Literally just more knobs to tweak. These of course introduce some overhead, so the question becomes, can you reliably beat this overhead to achieve parity, or gain efficiency. The OP claims this is not a foregone conclusion, but doesn't substantiate. In my anecdotal experience, it is: parity or even efficiency gain is pretty much guaranteed.

Finally, I mentioned differences between decoder output quality. That is a bit more boring. It is usually a matter of fault tolerance, and indeed, standards violations, such as supporting a 10 bit format in H.264 when the standard (supposedly, never checked) only specifies 8-bit. And of course, just basic incorrectness / bugs.

Regarding subbing then, unless you're burning in subs (called hard-subs), all this malarkey about encoding doesn't actually matter. The only thing you really need to know about is subtitle formats and media containers. OP's writing is not really for you.


I was a DVD programmer for 10 years. There was a defined DVD spec. The problem is that not every DVD device adhered to the spec. Specs contain words like shall/must and other words that can be misinterpreted, and then you have people that build MVP as a product that do not worry about the more advanced portion of the spec.

As a specific example, the DVD software had a random feature that could be used. There was one brand of player that had a preset list of random numbers so that every time you played a disc that used random, the random would be the exact same every time. This made designing DVD-Video games "interesting" as not all players behaved the same.

This was when I first became aware that just because there's a spec doesn't mean you can count on the spec being followed in the same way everywhere. As you mentioned, video decoders also play fast and loose with specs. That's why some players cannot decode the 10-bit encodes as that's an "advanced" feature. Some players could not decode all of the profiles/levels a codec could use according to the spec. Apple's QTPlayer could not decode the more advanced profiles/levels just to show that it's not "small" devs making limited decoders.


The issue is that encoding is an art, especially as it's lossy. You choose how much data to throw away (kind of like when you pick a quality in JPEG). Further, for video, you generally try to encode the differences between 2 frames. Again, because it's a lossy difference, it's up to the creator of the encoder to decide how to compute the difference. Different algorithms come up with different answers. The result still fits the spec.

Let's just say we were encoding a list of numbers. So we get a keyframe (an exact number) and then all frames after that until the next keyframe are just deltas. How much to add to that keyframe number is up to the encoder:

    keyframe = 123
    nextFrame += 2   // result = 125
    nextFrame += 3   // result = 128
    nextFrame -= 1   // result = 127
etc... A different encoder might have different deltas. When it comes to video, those differences are likely relatively subtle, tho some definitely look better than others.

The "spec" or "codec" only defines that each frame is encoded as a delta. it doesn't say what those detlas are or how they are computed, only how they are applied.

This is also why most video encoding software has quality settings, and those settings often note that higher quality is slower. Some of those settings are about bitrate or bit depth or other things, but others are about how much time is spent searching for the perfect or better delta values to get closer to the original image, as searching for better matches takes time. Especially because it's lossy, there is no "correct" answer. There is just opinion.
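
To make that concrete, here's a toy Python version where the "spec" is only the decoder, and two hypothetical encoders make different lossy choices that both decode by the same rule:

```python
# the "spec": apply each delta to the previous value. that's all it defines.
def decode(keyframe, deltas):
    out, value = [keyframe], keyframe
    for d in deltas:
        value += d
        out.append(value)
    return out

# hypothetical encoder 1: exact deltas (no loss)
def encode_exact(samples):
    return [b - a for a, b in zip(samples, samples[1:])]

# hypothetical encoder 2: quantizes deltas to multiples of `step`
# (cheaper symbols, more loss) - a different opinion, same bitstream rules
def encode_coarse(samples, step=4):
    deltas, value = [], samples[0]
    for s in samples[1:]:
        d = round((s - value) / step) * step
        deltas.append(d)
        value += d
    return deltas

samples = [123, 125, 128, 127, 140]
a = decode(samples[0], encode_exact(samples))
b = decode(samples[0], encode_coarse(samples))
# both streams decode by the same rule; only the quality differs
```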


> And then, not even a very long time ago, the mystery unraveled. What the various codec specifications actually describe, and what these codecs actually "are", is the on-disk bitstream format, and how to decode it. Just the decode. Never the encode.

Soooo with everyone getting used to creative names instead of descriptive names over the past decade or two, I guess "codec" just became a blob and it never crosses people's minds that this is right there in the name: COding/DECoding. No ENCoding.


There's a term overload involved. In implementation terms, codec stands for coder/decoder, with "coder" referring exactly to an encoding capability: https://en.wikipedia.org/wiki/Codec

So that's a swing and a miss I'm afraid. But I'm very interested to hear what you think a "coder" library does in this context if not encode, and why it is juxtaposed with "decoder" if not for doing the exact opposite.


Thanks for bringing this up, since I'm realizing that I did not explicitly spell this out in the post. I'll add a paragraph making this even clearer.

what if I told you the same issue is true for lossless plain compression like .zip files

the compressor (encoder) decides exactly how to pack the data, it's not deterministic, you can do a better job at it or a worse one

which is why we have "better" zlib implementations which compress more tightly
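
You can see this directly with Python's zlib: different effort levels produce different compressed streams, but both decompress to identical data - same deal as spec-compliant video encoders:

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 200

fast = zlib.compress(data, level=1)  # minimal search effort
best = zlib.compress(data, level=9)  # tries much harder

print(len(fast), len(best))          # stream sizes differ...
assert zlib.decompress(fast) == zlib.decompress(best) == data  # ...output doesn't
```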


Drives me crazy but I'm glad to learn of it :D

Makes a lot of sense in retrospect, to the extent it bothers me I haven't figured it out myself earlier.


this is exactly what "higher" compression levels do (among other things like bigger dictionary) - they try harder, more iterations, to find the optimum combination of available knobs for a particular chunk of data.

Yes, that much was always clear. I just always thought the way these software go about finding those combinations was also standardized on a high level rather than proprietary to each implementation. It is a fairly recent development for me to realize that the various encoder options and presets are specific to the encoder, not the format (and now, that the same is true for lossless formats too).

for video there is another constraint - time

hardware encoders (like the ones in GPUs) typically work realtime-ish, so they do minimal exploration of encoding space

you also have the one-pass/two-pass thing which is key for unlocking high quality compression
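
For reference, the classic two-pass pattern with ffmpeg/x264 looks roughly like this (filenames and bitrate are placeholders; on Windows the null sink is NUL rather than /dev/null):

```shell
# pass 1: analyze only, write stats, discard the output
ffmpeg -y -i input.mp4 -c:v libx264 -b:v 2M -pass 1 -an -f null /dev/null
# pass 2: encode using the stats to spend bits where they matter most
ffmpeg -i input.mp4 -c:v libx264 -b:v 2M -pass 2 -c:a copy output.mp4
```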


Sometimes I would want to convert from MPEG-TS H.264 to DVD video format, or other conversions, so there are reasons to do so. However, once I had got desynchronized audio, and I don't know if that is because of the original source, because of the conversion, or because some segments have not been recorded. (Also, it could not retain the EIA-608 captions, but that seems to be a limitation with FFmpeg, rather than something I did.)

Did you use a decent piece of DVD authoring software? Here are some options: https://www.videohelp.com/software/sections/authoring-dvd

I used the "ffmpeg" to convert the video/audio, and then I used the "dvdauthor" to make the files into the video DVD format, and then I used the "genisoimage" to make the disk image file in the DVD format.

This is a great write up. Thank you for sharing.

x264 is an encoder for AVC/H.264; the resulting streams are stored in container formats like .mkv or .mp4, which are not themselves codecs.

[1] Technically the term codec refers to a specific program that can encode and decode a certain format.


I'm curious what the issue is with using Handbrake? I use it all the time on macOS and it's generally a simple and effective tool for my purposes.

Handbrake is fine if you truly need to reencode (aka “transcode”) your video, but if you find yourself with a video that your player can’t read, you might be able to just change the container format (remux it) using ffmpeg, copying the video and audio streams directly across.

With video there are 3 formats: the video stream itself, the audio stream itself, and the container (only the container is knowable from the extension). Formats could technically be combined in any combination.

The video stream especially is costly in CPU to encode, and can degrade quality significantly to transcode so it’s just a shame to re-encode if the original codec is usable.

Container format mkv is notorious for not being supported out of the box on lots of consumer devices, even if they have codecs for the audio and video streams it typically contains. (It has cool features geeks like, but for some reason it gets less support.)
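
A remux with ffmpeg is a one-liner, assuming the streams inside are actually compatible with the target container:

```shell
# change only the container; streams are copied bit-for-bit, so it's
# fast and lossless (fails if the target container can't hold a stream)
ffmpeg -i input.mkv -c copy output.mp4
```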


Subtitles are another kind of stream aside from video/audio.

Also there's one user-level aspect of MKV that makes it not too surprising to me: It can contain any number of video/audio/subtitle streams and the interface needs some way of allowing the user to pick between them. Easier to just skip that complexity, I guess.


If you search the page you'll find a reference to having “numerous foot guns”.

I can't say I've experienced either of the ones mentioned, but I have had trouble in the past with output resolution selection (ending up with a larger file than expected with the encoding resolution much larger than the intended display resolution). User error, of course, but that tab is a bit non-obvious so it might be fair to call it a footgun.


the short version is there's nothing wrong with it for your use case.

The author's POV is that Handbrake always does a lossy conversion, and people often use it in cases where they could have used a different tool that is lossless.

My uses of Handbrake are cases where I always want a lossy conversion, so no issue. A good example is anytime I make a screen capture and want to post it on GitHub. I want it to be under the 10 meg limit (or whatever it is), so I want it re-encoded to be smaller. I don't mind the loss in quality.


the author can't stand how it simply re-encodes videos instead of extracting the video tracks and putting them in new containers.

Could have used this in the nineties, when hunting down a specific codec to play that video you downloaded off a BBS was an actual thing.

Oh man it extended well past the 90’s. Finding some weird windows video codec in a dodgy .ru domain was a time honored tradition for quite some time.

I remember all the weird repackaged video codec installers that put mystery goo all over the machine.

The article bashes VLC but I tell you what… VLC plays just about everything you feed it without complaint. Even horribly corrupt files it will attempt to handle. It might not be perfect by any means but it does almost always work.


I have found that VLC does not play MPEG-TS files very well (it recognizes and plays them, but playback is janky); converting them to another format first will make them play better, in my experience.

MPEG-TS files are just a pain to play in general. It was never intended as a file format for prerecorded video, and lacks some features (like seek indexes) which are required for reasonable playback.

In most circumstances, a MPEG-TS file can be remuxed (without reencoding) to a more reasonable container format like MP4, and it'll play better that way. In some cases, it'll even be a smaller file.


I used mplayer ( the ancestor to mpv as I’ve just realized ) in the early 2000s which I think could handle everything under the sun back then.

That's ffmpeg. Neither mplayer nor VLC were doing anything special that let them "play everything". They just used ffmpeg.

(nb they did often use their own demuxers instead of libavformat)


would be nice if someone like epic spaceman actually broke down, visually, how videos are encoded, stored, and processed, and how encoding algorithms work. i am bad at understanding things by reading about them

The article talks about image comparisons but does not say what the best way to extract an image is.

If I want the best possible quality image at a precisely specified time, what would I do?

Can I increase quality if I have some leeway regarding the time (to use the closest keyframe)?

Is there a way to "undo" motion blur and get a sharp picture?


I usually use a shortcut in mpv to extract the screenshot. If I want to do it via the command-line:

  ffmpeg -ss 00:00:12.435 -i '/Users/weinzieri/videofile.mp4' -vframes 1 '/Users/weinzieri/image.png'
That means “go to 00:00:12.435 in the file /Users/weinzieri/videofile.mp4 and extract one frame to the file /Users/weinzieri/image.png”.

> Is there a way to "undo" motion blur and get a sharp picture?

Not really, no, any more than there is a way to unblur something that was shot out of focus.

You can play clever tricks with motion estimation and neural networks but really all you're getting is a prediction of what it might have been like if the data had really been present.

Once the information is gone, it's gone.


If the estimation is good it might be enough for some use cases. Is there any software out there that specializes in this? Similarly to maybe AI colorizing or upscaling, which both guess information that is not there anymore.

it's not gone, just more difficult to extract

video has certain temporal statistics which can allow you to fit the missing information

only true blurred white noise is impossible to recover


It really is gone. You can predict what you think it might have been, but you can't know what it was.

it's gone in a single still frame

but across many consecutive frames, the information is spread out temporally and can be recovered (partially)

the same principle of how you can get a high resolution image from a short video, by extracting the same patch from multiple frames

https://en.wikipedia.org/wiki/Video_super-resolution


No, it's not "restoring detail". The information is gone.

It is predicting what the information might maybe have been like.


you are arguing with math proofs here, the information is not gone, if it was a real video (as opposed to adversarially generated video)

I'm struggling with the idea that you can use maths to recover information from a video that simply was not present in the video.

I get that what you're describing can statistically "unblur" stuff you've blurred with overly-simplistic algorithms.

I can provide you with real-world footage that has "natural" motion blur in it, if you can demonstrate this technique working? I'd really like to see how it's done.


That looks interesting. Is there ready-made software that can do this? Doesn't have to be easy to use just useable with a time commitment of a few days.

> Not really, no, any more than there is a way to unblur something that was shot out of focus.

This is actually possible:

https://en.wikipedia.org/wiki/Deconvolution

If you have a high-quality image (before any compression) with a consistent blur, you can actually remove blur surprisingly well. Not completely perfectly, but often to a surprising degree that defies intuition.

And it's not a prediction -- it's recovering the actual data. Just because it's blurred doesn't mean it's gone -- it's just smeared across pixels, but clever math can be use to recover it. It's used widely in certain types of scientific imaging.

For photographers, it's most useful in removing motion blur from accidentally moving the camera while snapping a photo.
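
A 1-D numpy sketch of the idea (assuming the blur kernel, i.e. the PSF, is known exactly and there's no noise or compression - which is precisely where real-world use gets hard):

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.random(256)                        # the "sharp" original
kernel = np.exp(-np.arange(-2, 3) ** 2 / 2.0)   # small Gaussian blur (the PSF)
kernel /= kernel.sum()

# blurring is convolution, i.e. multiplication in the frequency domain
K = np.fft.fft(kernel, 256)
blurred = np.real(np.fft.ifft(np.fft.fft(signal) * K))

# Wiener-style deconvolution: regularized division in the frequency domain,
# so frequencies the blur nearly killed don't blow up
eps = 1e-6
restored = np.real(np.fft.ifft(np.fft.fft(blurred) * np.conj(K) / (np.abs(K) ** 2 + eps)))

print(np.max(np.abs(restored - signal)))        # small residual error
```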


You'll need to settle on a decoder. I personally just use my video player for this, mpc-hc.

In mpc-hc, you can framestep using CTRL+LeftArrow (steps a frame backward) or CTRL+RightArrow (steps a frame forward). This lets you select the frame you want to capture. You do not need to be on a keyframe. These keybinds are configurable and may be different on the latest version.

Then in the File menu, there's an export image option. It directly exports the frame you're currently on, to disk. Make sure to use a lossless format for comparisons (e.g. PNG).

I'm aware this can be done in other players - like mpv - as well, although there I believe no keybinds are set up for this by default, and the default export format is JPEG.


Nowadays, I just ask an LLM to give me the ffmpeg command that I need.

No need to know anything about the video file anymore.

(Of course if you're hosting billions of videos on a website like YouTube it is a different story, but at that point you need to learn a _lot_ more e.g. about hardware accelerators, etc.)


Just writing off AI upscaling completely is bs. It's not some magic bullet to use on every video and there is a learning curve on how to apply but there are absolutely scenarios where you can get shockingly good results. I think a lot of people make judgements on it based on super small sample sizes.

On a separate note, also not mentioned: LLMs are really good at generating ffmpeg commands. Just discuss with ChatGPT your source file and goals for a video, and you can typically one-shot a targeted command even if you aren't familiar with the ffmpeg CLI.


The article defines quality as being as close to the source as possible. AI upscaling adds detail that wasn't there in the first place.

AI upscaling can also be done during playback, to the benefit of lower file sizes, but at the cost of higher processor utilisation. So it is a trade-off.

> I would recommend you to just learn basic ffmpeg usage instead

> but ffmpeg is fine for beginners

No, that's just nonsense for any guide targeting beginners. It's not fine; it's too error-prone and complicated, and requires entering the whole unfriendly land of the CLI!

> If you must use a GUI

Of course you must! It's much better to provide beginners with your presets in Handbrake that avoid the footguns you mention (or teach them how to avoid them on their own) rather than ask them to descend into the dark pit of ffmpeg "basics"

> Before you start complaining about how complicated ffmpeg is and how arcane its syntax is, do yourself a favor and read the start of its documentation. It turns out that reading the (f.) manual actually helps a lot!

It turns out that wrapping the bad UI in a simpler typed GUI interface wastes less of the collective time than asking everyone to read dozens of pages of documentation!


The second technical definition in this document is wrong. Great way to put the "the author is opinionated but is clueless" marker right near the top.

> Actual video coding formats are formats like H.264 (also known as AVC) or H.265 (also known as HEVC). Sometimes they're also called codecs, short for "encode, decode".

Codec is coder/decoder. It's not the format.

There's a footnote claiming people mix the 2 terms up (a video format is apparently equal to a video codec according to this "expert") but apparently acknowledging the difference is seemingly only what nitpickers do. Sheesh. If you want to educate, educate with precision, and don't spread your misinformation!


> If you want to educate, educate with precision, and don't spread your misinformation!

I would assert that the author was already being precise. A statement that X is "sometimes called" Y already conventionally carries the subtext that Y isn't actually the correct term; that Y is instead some kind of colloquial variant or corrupted layman's coinage for the more generally-agreed-upon term X.

Why mention the incorrect terminology Y at all, then?

Specifically in the case that pertains here, where far more laymen are already familiar with the Y term than the X term, giving Y as a synonym in the definition of X is a way to give people who are already familiar with this concept — but who only know it as Y — an immediate understanding of what is being discussed, by connecting their knowledge of "Y" over to the X term the author is defining. This is an extremely common practice in academic writing, especially in textbooks.


Maybe it's my bubble, but a video format is a name given to the description of how the video is stored as a stream of bytes/how to get the video from the stream of bytes, and "codec" is the piece of software (more generally, logic) that handles that work.

It's not fine to say "oh those 2 things are the same", especially in an introduction, because then you're leading people astray instead of educating them.


That is a bubble, yes. And a bit of a generational divide, as well.

I think people of a certain age picked up that the things inside video containers were "codecs" when we all had to install "codec packs." The things inside those packs were literally codecs — encoder/decoder libraries.

But when AV software of the time (media players, metadata taggers, library software, format conversion software, etc) needed to let you choose which audio or video bitstream format to use for encoding [i.e. pick a codec plug-in library to pass the bitstream through]; or needed to tell you that you needed to install codec plug-in X to play file Y, they would often conflate the codec with the AV bitstream format(s) they enabled the encoding/decoding of. Especially when the software also had a separate step or field that made reference to the media container format. It was not uncommon to see a converter with a chooser for "output format" and another chooser for "output codec" — where the "codec" was not a choice of what library plug-in to use, but a choice of which AV bitstream format to target. (Of course, relying on the assumption that you'd only ever have one such codec library installed for any given AV bitstream format.)

---

Heck, this is still true even today. Go open a media file in VLC and pop open the Media Information palette window. There's a "Codec Details" tab... and under each stream, the key labelled "Codec" has as its value the name of the AV bitstream format of the stream. I just opened an M4A file I had laying around, and it says "MPEG AAC audio (mp4a)" there.

My understanding of why VLC does it that way [despite probably decades of pedants pointing out this nitpick], is that there's just one codec library backing pretty much all of VLC's support for AV bitstream formats — "libavcodec". Players that rely upon plug-in codec libraries might instead have something like "MPEG AAC audio (libavcodec)" in this field. But since VLC only has the one codec, it's implicit.

Even though, to be pedantic, a media-info "Codec" field should just contain e.g. "libavcodec"; and then there should be a whole second field, subordinate to that, to describe which of the AV bitstream formats supported by the codec is being used.

---

I also recall some old versions of iTunes using the word "Codec" to refer to AV bitstream formats in metadata and in ripping settings. Not sure if that memory is accurate; current versions seem to skirt the problematic nomenclature entirely by making up their own schema. Modern Music.app, on a song's "File" tab, gives the container format as the file's "kind", and then gives some weird hybrid of the AV bitstream format, the encode-time parameters, and the originating software, as an "encoded with" field.


Thanks for the "encoder/decoder" correction.

But yes, as the other reply says, I am aware of this distinction, and I make a point not to use the word "codec" at any other point in the article, and explain in a lot of detail how much the encoder matters when it comes to encoding in some format. I mention the term to make people aware that it exists.

But, you're right, I will clarify this a bit more.


They are very hot


