
> it would probably be the simplest part of whatever you were actually building

Unless you've written a Widevine client, downloaded from DASH, parsed MP4, decrypted MP4 samples, and then reassembled the decrypted fragments, you're really not in a position to be making this claim. I have done all of the above, and the MP4 parsing was by far the most difficult part of the process, and that includes parsing Protocol Buffers for use with Widevine. The sheer volume of different box types is what makes it difficult. Over 100 types, see for yourself:

https://godocs.io/github.com/edgeware/mp4ff/mp4



I haven't done anything with Widevine, but I have written multiple BMFF parsers, and I'm intimately familiar with how many different boxes/atoms there are. Luckily you can implement them incrementally because the box hierarchy is so normalized.

It's actually my go-to project when I'm trying to learn a new language, because the problem itself is simple enough to understand, but it forces you to learn the idioms of the language you're learning. What is the idiomatic way to represent different box types? How do you read values with specific endianness from a buffer? How do you seek through a file's contents without loading the whole 10GB movie into memory?
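
To make those questions concrete, here is a rough sketch of what the first pass might look like in Python. It only reads each box header (big-endian size plus four-character type) and then seeks past the payload, so the movie never has to fit in memory. The function name `iter_boxes` and its return shape are made up for illustration, and it doesn't descend into container boxes or handle `uuid` extended types.

```python
import io
import struct

def iter_boxes(f, end=None):
    """Yield (type, payload_offset, payload_size) for each box in f.

    f is any seekable binary file object. Only headers are read;
    payloads are skipped with seek(). Hypothetical helper, not from a library.
    """
    while end is None or f.tell() < end:
        start = f.tell()
        header = f.read(8)
        if len(header) < 8:
            return  # end of file (or a truncated header)
        # 32-bit big-endian size followed by a 4-byte type code
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # size == 1 means a 64-bit "largesize" follows the type
            large = f.read(8)
            if len(large) < 8:
                return
            size = struct.unpack(">Q", large)[0]
            header_len = 16
        elif size == 0:
            # size == 0 means the box runs to the end of the file
            f.seek(0, io.SEEK_END)
            size = f.tell() - start
        if size < header_len:
            raise ValueError(f"invalid box size {size} at offset {start}")
        yield box_type.decode("latin-1"), start + header_len, size - header_len
        # Absolute seek, so it doesn't matter if the caller moved the cursor
        f.seek(start + size)
```

Calling it with `open(path, "rb")` gives you the top-level boxes; to walk the children of a container like `moov` you could seek to its payload offset and call it again with `end` set to the payload's end. Individual box types can then be added one at a time, as needed.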


I’ve written a (partial) zero-alloc MP4 decoder and it was definitely not as easy as the other poster makes it sound.


A while back I tried to implement an MP4 demuxer, and I can kind of relate to that. The mdat box is sometimes an opaque blob and you need to parse the codec framing to split packets (fMP4 helps with this a bit), each codec has its own set of boxes, and the specs for each of them are paywalled...

Matroska/WebM is so much simpler and easier to parse: you can essentially abstract it away in a JSON-like DOM (obviously without loading 1GB of data into memory) and just get what you want. It's great.
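
For a sense of why it's simpler, here is a minimal sketch of the EBML layer underneath Matroska: every element is a variable-length ID, a variable-length size, then the payload. The names `read_vint` and `iter_elements` are hypothetical, it works on an in-memory buffer for brevity (an `mmap` object should index the same way), and it doesn't handle unknown-size elements.

```python
def read_vint(buf, pos, keep_marker=False):
    """Decode one EBML variable-length integer starting at buf[pos].

    The number of leading zero bits in the first byte gives the total
    length. Element IDs keep the marker bit; sizes drop it.
    Returns (value, next_pos).
    """
    first = buf[pos]
    if first == 0:
        raise ValueError("vints longer than 8 bytes are not supported")
    length = 8 - first.bit_length() + 1
    if pos + length > len(buf):
        raise ValueError("truncated vint")
    if keep_marker:
        value = first
    else:
        value = first & ((1 << (8 - length)) - 1)  # clear the marker bit
    for b in buf[pos + 1:pos + length]:
        value = (value << 8) | b
    return value, pos + length

def iter_elements(buf, pos=0, end=None):
    """Yield (element_id, payload_offset, payload_size) between pos and end."""
    end = len(buf) if end is None else end
    while pos < end:
        elem_id, pos = read_vint(buf, pos, keep_marker=True)
        size, pos = read_vint(buf, pos)
        yield elem_id, pos, size
        pos += size  # skip the payload; recurse into it only if wanted
```

Since master elements just contain more elements, calling `iter_elements` again on a payload range is enough to build the DOM-like tree lazily.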



