You don’t need a Weissman score to know that today’s video encoders are incredibly good at what they do and will continue to get better as hardware power increases. So how exactly does a codec such as H.264 compress gigabytes of video into megabytes? It’s complex, sure, but very explainable.
Developer and former Microsoft software engineer Sid Bala has written a detailed post on the inner workings of H.264, appropriately titled “H.264 is Magic”. While the codec does use some sophisticated algorithms to do its work, most of the space-saving is done by simply discarding information:
In a TV signal, R+G+B color data gets transformed to Y+Cb+Cr. The Y is the luminance (essentially black and white brightness) and the Cb and Cr are the chrominance (color) components … But check out the trick: the Y component gets encoded at full resolution. The C components only at a quarter resolution. Since the eye/brain is terrible at detecting color variations, you can get away with this. By doing this, you reduce total bandwidth by one half, with very little visual difference. Half!
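The arithmetic behind that “half” is easy to check. The sketch below is only an illustration, not anything taken from Bala’s post or from an H.264 implementation: it assumes 8 bits per sample, a 1920×1080 frame, and chroma reduced by averaging each 2×2 block of pixels, and the helper names (`subsample_2x2`, `bytes_per_frame`) are made up for the example.

```python
def subsample_2x2(plane):
    """Average each 2x2 block of a chroma plane.

    `plane` is a list of rows with even width and height (an assumption
    made to keep the sketch short). The result has a quarter of the samples.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out


def bytes_per_frame(width, height, chroma_divisor=1):
    """Bytes for one frame at 8 bits per sample (assumed).

    chroma_divisor is how many times fewer samples each of the two
    chroma planes carries compared with the luma plane.
    """
    luma = width * height
    chroma = 2 * (width * height // chroma_divisor)  # Cb plane + Cr plane
    return luma + chroma


full = bytes_per_frame(1920, 1080)                        # Y, Cb, Cr all full size
subsampled = bytes_per_frame(1920, 1080, chroma_divisor=4)  # Cb, Cr at a quarter
ratio = subsampled / full

tiny_cb = subsample_2x2([[10, 20], [30, 40]])  # one 2x2 block becomes one sample
```

The luma plane is untouched and the two chroma planes shrink to a quarter each, so the total goes from three planes’ worth of data to one and a half.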
Given that you’re storing moving images, it’s possible to extrapolate patterns by analysing each frame and making compression decisions based on this information. As Bala explains:
Imagine you’re watching a tennis match … the court, the net, the crowds all are static. The only thing moving really is the ball. What if you could just have one static image of everything on the background, and then one moving image of just the ball. Wouldn’t that save a lot of space?
The solution is to store only the changes between frames, called the delta. This not only reduces the amount of data, but also leaves what remains more favourable to compression.
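To see why a delta compresses so well, here is a toy sketch in Python. It is not how H.264 itself works (the real codec predicts blocks of pixels with motion compensation rather than subtracting whole frames), and everything in it is an assumption made for illustration: a 64×64 greyscale frame, a noisy static background, a 4×4 square standing in for the ball, and zlib standing in for the codec’s entropy coder.

```python
import random
import zlib

WIDTH, HEIGHT = 64, 64  # assumed toy frame size, one byte per pixel


def make_frame(background, ball_x, ball_y, size=4):
    """Copy the static background and draw a bright square 'ball' on it."""
    frame = bytearray(background)
    for y in range(ball_y, ball_y + size):
        for x in range(ball_x, ball_x + size):
            frame[y * WIDTH + x] = 255
    return bytes(frame)


def frame_delta(prev, curr):
    """Per-pixel difference, wrapped modulo 256 so it still fits in a byte."""
    return bytes((c - p) % 256 for p, c in zip(prev, curr))


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    return bytes((p + d) % 256 for p, d in zip(prev, delta))


# A noisy background is hard to compress on its own, like a crowd or grass.
rng = random.Random(0)
background = bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))

prev_frame = make_frame(background, ball_x=10, ball_y=20)
curr_frame = make_frame(background, ball_x=14, ball_y=20)  # ball moved right

delta = frame_delta(prev_frame, curr_frame)
changed_pixels = sum(1 for d in delta if d != 0)

full_size = len(zlib.compress(curr_frame))   # storing the whole frame
delta_size = len(zlib.compress(delta))       # storing only what changed
```

Because only the pixels under the ball’s old and new positions differ, the delta is almost entirely zeros, and a general-purpose compressor shrinks that far more than it can shrink the full frame. Calling `apply_delta(prev_frame, delta)` gives back the current frame exactly.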
This is only scratching the surface — hit up Bala’s article below for the full explanation.
H.264 is Magic [Sid Bala]