Everything you need to know about video encoding

Variable-sized macroblocks in H.264 onward enable compression to be better targeted.

MPEG-4: Zoom and enhance

MPEG-4 started off with the aim of enhancing the standard for digital streaming: getting better image quality from half the bitrate. But with the advent of Blu-ray and HD DVD (remember that?), the extension MPEG-4 Part 10, aka MPEG-4 Advanced Video Coding (AVC) or H.264, was formed.

Work started on AVC in 1998, and the first ratification appeared in 2003, 10 years after MPEG-1 was ratified in 1993 (its development had begun in 1988). As you can imagine, some serious enhancements were implemented, taking advantage of the vast increases in processor speeds and system resources available to both encoding systems and decoding devices, alongside enhancements to displays. We’ve gone from Video CD days to Full HD Blu-ray in just a couple of codecs.

We’re not going to go into the depth of detail we’ve used for the various aspects of MPEG-1, partly as some are refinements on the basic techniques we’ve looked at, and largely as the complexity of implementation is so much greater, and we’re running out of room!

To kick off, inter-frame prediction was vastly improved by increasing its complexity. The major move was variable block size for motion compensation in the macroblocks, from 16x16 down to 4x4, with variations in between such as 8x16 or 8x4 (image above). A macroblock can now carry multiple motion vectors, and the number of available reference frames was increased to 16, with a buffering requirement of four or five versus just one or two previously. Motion compensation works at quarter-pixel precision. A built-in deblocking filter helps to eliminate those blocky 16x16 artifacts.
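
To get a feel for what variable block sizes and quarter-pixel vectors mean in practice, here's a minimal Python sketch. It assumes the partition shapes H.264 allows for motion compensation and counts one vector per partition; the function names are our own, and a real encoder mixes shapes within a macroblock and may carry more than one vector per partition.

```python
# Block shapes H.264 allows for motion compensation (width, height in luma pixels).
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def vectors_per_macroblock(width, height, mb_size=16):
    """Motion vectors needed if one 16x16 macroblock is tiled with a single shape."""
    return (mb_size // width) * (mb_size // height)

def quarter_pel_to_pixels(mv_units):
    """Convert a vector component stored in quarter-pixel units to pixels."""
    return mv_units / 4

for w, h in PARTITIONS:
    print(f"{w}x{h}: {vectors_per_macroblock(w, h)} vector(s)")

# A stored value of -13 means a shift of 3.25 pixels in the negative direction.
print(quarter_pel_to_pixels(-13))
```

Smaller partitions follow motion more closely, but every extra vector costs bits, so the encoder has to weigh one against the other.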

This is just a tiny portion of the improvements—for instance, there’s a whole section on loss resilience to improve performance with corrupt data streams, plus enhanced protocol compression, lossless features, and feature-specific options to enhance black and white footage, visual transitions, or fades.

The abilities of x265 shine through over H.264 at lower bitrates.

MPEG-5: Patent wars

As with the previous generation, a core design target for H.265—aka High Efficiency Video Coding (HEVC)—was to attain the same level of picture quality as H.264, but at half the bitrate (image above). Subjective testing rates it able to achieve at least a 56 percent reduction in bitrates at 720p, which increases to 64 percent at 2160p. Mission accomplished!
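
As a rough illustration of what those percentages mean, this small Python sketch applies the reductions quoted above to some H.264 bitrates. The starting bitrates are made-up examples, not measurements, and the function name is ours.

```python
# Reduction figures quoted above; the H.264 starting bitrates below are made up.
REDUCTION = {"720p": 0.56, "2160p": 0.64}

def hevc_bitrate_kbps(h264_kbps, resolution):
    """Estimate the HEVC bitrate (kbit/s) for similar subjective quality."""
    return round(h264_kbps * (1 - REDUCTION[resolution]))

print(hevc_bitrate_kbps(5000, "720p"))    # hypothetical 5 Mbit/s H.264 stream
print(hevc_bitrate_kbps(40000, "2160p"))  # hypothetical 40 Mbit/s H.264 stream
```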

So, what’s changed? To start with, fixed 16x16 macroblocks are out, and in their place are coding tree blocks (CTB). These can be 64x64, 32x32, or 16x16 pixels. A CTB can then be subdivided recursively into coding unit (CU) blocks, from 32x32 in size down to 8x8, stored as a “quadtree,” as each CU subdivision is a quad of smaller squares.
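
The recursive quadtree split can be sketched in a few lines of Python. This is only an illustration: the `needs_split` callback is a stand-in we've invented for the encoder's real decision (which weighs quality against bits), and `detail_in_corner` is a toy rule that keeps splitting only the top-left block.

```python
def split_ctb(x, y, size, needs_split, min_size=8):
    """Return the (x, y, size) coding units that cover one CTB."""
    if size > min_size and needs_split(x, y, size):
        half = size // 2
        units = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                units += split_ctb(x + dx, y + dy, half, needs_split, min_size)
        return units
    return [(x, y, size)]

def detail_in_corner(x, y, size):
    """Hypothetical rule: only the block touching pixel (0, 0) is 'busy'."""
    return x == 0 and y == 0

cus = split_ctb(0, 0, 64, detail_in_corner)
for cu in cus:
    print(cu)
```

With this toy rule, the 64x64 CTB ends up as four 8x8 units in the busy corner, surrounded by three 16x16 and three 32x32 units, so small blocks are only spent where they are needed.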

Each CU is recursively split into transform units (TU), which again can be 32x32 down to 4x4, and which are processed with DCT. On top of this, the alternative discrete sine transform (DST) is used on 4x4 intra-predicted luma blocks, where it is more efficient. This flexibility and an increase in accuracy improve H.265’s efficiency enormously.
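
To show what the transform step does to a block, here's a minimal pure-Python sketch of a textbook floating-point 2D DCT on a 4x4 block. It is not the integer approximation the codec itself specifies, and the helper names are ours.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(block):
    """Apply the DCT to the columns and rows of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

# A perfectly flat 4x4 block: all the energy lands in the top-left coefficient.
flat = [[10] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
print([[round(v, 6) for v in row] for row in coeffs])
```

A flat block collapses to a single non-zero value, which is exactly why smooth areas compress so well after the transform.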

The H.265 codec takes the chance to overhaul motion vectors in a big way, and it starts with prediction units (PU). Yes, everything’s a unit! A CU can be split into one of eight PU types: the whole unit, half either horizontally or vertically, quartered, or a quarter strip at the top/bottom or on the left/right. We’ll skip intra-prediction (in-frame) that’s used to improve “flat” areas, but we will say that HEVC has 35 modes, versus the nine of H.264.
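
The eight PU shapes are easier to picture with numbers, so here's a short Python sketch that lists the (width, height) of each piece for a given CU size. The mode labels follow the usual HEVC naming, the function is our own, and the sketch ignores the standard's rules about which modes are allowed at which CU sizes.

```python
def pu_partitions(cu_size):
    """Map each PU mode to the (width, height) of its pieces for one CU."""
    full = cu_size
    half, quarter = full // 2, full // 4
    rest = full - quarter  # the three-quarter side of an asymmetric split
    return {
        "2Nx2N": [(full, full)],                   # whole unit
        "2NxN":  [(full, half)] * 2,               # halved horizontally
        "Nx2N":  [(half, full)] * 2,               # halved vertically
        "NxN":   [(half, half)] * 4,               # quartered
        "2NxnU": [(full, quarter), (full, rest)],  # quarter strip at the top
        "2NxnD": [(full, rest), (full, quarter)],  # quarter strip at the bottom
        "nLx2N": [(quarter, full), (rest, full)],  # quarter strip on the left
        "nRx2N": [(rest, full), (quarter, full)],  # quarter strip on the right
    }

for mode, pieces in pu_partitions(32).items():
    print(mode, pieces)
```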

The real muscle is with the inter prediction motion vector system. It still has two reference lists with 16 entries each, but it can buffer eight pictures that can include the same frame multiple times for weighted predictions. There are improved motion compensation filters on both the luma and chroma channels, and an optimized deblocking filter that can be processed in parallel.

Finally, H.265 has been designed with multithreaded processing in mind for modern systems. The CTB design produces a grid of independently decodable blocks.

We’re done. Phew! We’ve gained a lot of respect for codec designers and the people who have to implement these standards in smartphones and TVs. It’s mind-boggling stuff, yet it all just works.

Can you hear me?

Conspicuous by its absence so far in this feature has been talk about audio. We’re going to go out on a limb here, and suggest that no one really cares any longer when it comes to standalone audio encoding. We’re sure there’s the odd reader who still has a multigigabyte collection, but they’re like the vinyl hoarder—more an exception than a rule. It feels as though the way people consume their audio has changed drastically, from the old owner/collector system, to a new streamer/subscriber model.

Even taking that into account, if you are going to be storing your own audio, if you care in any way, you’re going to be using a lossless codec. Because storage is cheap, and FLAC should be your preference, there’s not too much to cover. If you don’t care, you’re likely happy to keep on using MP3 encoding, though you should be using Ogg Vorbis.

Both of these standards are open (actually, MP3 is also open as the patents have expired), so technically can be supported by anyone who cares to. FLAC has wide industry backing, including Windows, macOS, iOS, Android, and something called Linux. As important, its device support is top-notch, including Sonos, a host of in-car systems, Denon, Synology, and Plex, to name a few. Ogg has more limited support, but can be supported by Windows, macOS, and iOS.

Some might point out that a newer format, Opus, offers even greater quality with its lossy format, but its youthfulness means hardware support is more limited than Ogg’s.

MeGUI, it’s all about me, me, me, me, me!

Encoding with MeGUI

If you’re at all interested in quality video encoding, then there are a host of solutions offering their services out there. HandBrake (handbrake.fr) is often suggested as a good open-source option, but it appears to be looked down upon by many of those in the know. A better suggestion appears to be MeGUI (see image above), which has been developed by the brainiacs on this ancient site.

MeGUI is open source, licensed under GPL v2. It’s still under active development, it supports x264 and x265 out of the box, and it automatically updates all the many required components as it ticks along. This means that the first time you encode something, there are a number of additional waits while it downloads the necessary components. It’s all very slick.

Plenty of handy documentation and guides can be found here, while you’ll be pointed toward this link for the main download. Just grab it, extract, and you’re ready to go!

Originally developed with DVD ripping in mind, it’s now able to handle Blu-ray files amongst other tricks. We’re not even going to start trying to go into the settings on this bad boy here, but we will point out that it has a one-click mode. Drag and drop a file you want to re-encode on to it, and select “One Click.” Then, in the dialog that appears, click “Config” in the “Output” section.

You may want to select “Keep Input Resolution,” otherwise choose a standard 1280 or 1920 width. Use the encoder to select x265, and “Config” to adjust the bitrate and number of passes. OK everything, and when you click “Queue,” off it goes to do its thing!
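
If you’re unsure what bitrate to enter, one common approach is to work backward from the file size you want. The Python sketch below is just that arithmetic, not anything MeGUI does for you: the function name and the 128 kbit/s audio default are our own assumptions, and container overhead is ignored.

```python
def video_bitrate_kbps(target_mb, duration_s, audio_kbps=128):
    """Video bitrate (kbit/s) that fits target_mb megabytes (1 MB = 10**6 bytes)."""
    total_kbps = target_mb * 8000 / duration_s  # 1 MB is 8,000 kilobits
    return total_kbps - audio_kbps

# Hypothetical example: a 90-minute film squeezed into 700 MB.
print(round(video_bitrate_kbps(700, 90 * 60)))
```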

Next-gen HEVC

Codec designers have done a staggering job of reducing bitrates whilst improving image quality, and not drastically increasing decode complexity. A less palatable aspect of the codec world is software patents, which impose licenses.

Licenses are fine, if fair, but the word on the street is there was a marked increase for HEVC, which ensured the industry started looking for alternatives.

Google, with its video platform YouTube, has an obvious interest in not paying licenses for its growing user base. It bought in the VP8 codec tech, and now uses the royalty-free VP9, its rough equivalent to H.265, for YouTube streaming. Netflix, which was heavily using H.264 AVC, switched to VP9 for compatible devices.

So, you can understand why the Alliance for Open Media was formed by Google, Mozilla, Microsoft, Netflix, Amazon, Intel, AMD, ARM, and Nvidia. AV1 is the codec it has been developing, released at the end of March 2018. Tests by Netflix show a rough 25 percent bitrate reduction over HEVC and VP9, but encoding is up to 10 times as complex, and a Moscow State University project revealed an unoptimized encoder was up to 3,500 times slower. A big hurdle for AV1 is hardware support; none is due until the end of 2019.

The HEVC people aren’t sitting still—a new Joint Video Exploration Team has been formed to work on a next-gen codec for late 2020. It’ll be up to 50 percent more efficient, and support next-gen display tech.

This article was originally published in Maximum PC's August issue. For more quality articles about all things PC hardware, you can subscribe to Maximum PC now.