Decode

Avivo-supporting GPUs purport to have a programmable video decode engine paired with fixed-function paths for decoding H.262 (MPEG-2), VC-1 and the about-to-be-very-venerable H.264 AVC. That's pretty much all there is to explaining the Decode step in Avivo, really.

Obviously the GPU takes input video, be that from the Encode stage (for real-time previews, where it needs displaying straight after encode), from the network (network transmission is only going to get more important), from disk (the same as decoding from the network, but with a better guarantee of the video data actually arriving) or wherever, and passes it through the fixed-function decode hardware. The programmable graphics hardware and some programmable silicon inside the decoder can then be used to support other video formats or implement quality tweaks.

The flow looks something like the following image. The interconnects might not run as strictly from one stage to the next as drawn, but the stages do exist in Avivo silicon for decoding video.

(mostly) Fixed-function decode for H.264

ATI's documentation for the fixed-function Decode stage in Avivo is thin on the ground. It mentions "comprehensive decode support" for the codecs listed above, but that's about it. However, their publicly available whitepaper on H.264 goes into a bit more detail. The decode stages for that video format are, in order: reverse entropy decode, iDCT, motion compensation and in-loop deblocking.

Reverse entropy means rebuilding the larger dataset that the entropy coding stage of the encode, outlined on the previous page, squeezed down. I mentioned CABAC on the previous page too, so a little explanation is worth it. Context-adaptive binary arithmetic coding (lossless compression, basically), to give it its full sexy name, allows higher compression rates than 'basic' H.264 entropy coding, but at a computational cost. CABAC works by continually adapting its probability estimates to the data it has already coded, picking a model based on context, so that likely symbols cost fewer bits. The decoder has to track that same adaptation as it goes, which is a nice added cost when every frame has to be done faster than real-time so you don't drop any. R5-series GPUs from ATI have dedicated silicon for that.
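To get a feel for why an adaptive model pays off, here's a minimal sketch in Python. It is not the CABAC engine from the H.264 specification (that uses table-driven probability states and many context models); it simply totals the ideal arithmetic-coding cost of a bit stream under one adaptive estimate. The function name and the simple counting model are my own assumptions for illustration.

```python
import math

def adaptive_cost_bits(bits):
    """Ideal coding cost, in bits, of a 0/1 sequence under an adaptive model.

    Hypothetical helper: the model is just a running count of the zeros and
    ones seen so far, which is far simpler than real CABAC.
    """
    counts = [1, 1]  # start by assuming 0 and 1 are equally likely
    total = 0.0
    for b in bits:
        p = counts[b] / (counts[0] + counts[1])  # current estimate for this bit
        total += -math.log2(p)                   # likely bits cost less than 1 bit
        counts[b] += 1                           # adapt the model to what was seen
    return total

# A heavily skewed stream: 90 zeros followed by 10 ones.
skewed = [0] * 90 + [1] * 10
cost = adaptive_cost_bits(skewed)
print(f"{len(skewed)} raw bits, about {cost:.1f} bits with the adaptive model")
```

The skewed stream comes out well under its 100 raw bits. Note that the estimate for each bit depends on every bit before it, and a decoder has to mirror that update step by step, which hints at why this stage is awkward to spread across parallel hardware.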

The inverse discrete cosine transform (iDCT) is next. As a single function it's computationally cheap (pretty much the same step as the iDCT in MPEG-2/H.262 decode), but applied to H.264 it consumes more CPU cycles. Motion compensation is the most computationally expensive task in H.264 decode. It's not a fixed cost either: the work varies with the content, so your decoder is doing varying amounts of analysis in order to present the motion video. That's CPU burn, so get the GPU to help.
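A rough Python sketch of how those two steps fit together: the inverse transform turns coefficients back into a residual, motion compensation fetches a predicted block from a reference frame, and the two are added. Everything here is simplified and assumed for illustration: the transform is the textbook floating-point inverse DCT rather than the integer transform H.264 specifies, the motion vector is whole-pixel only (H.264 also allows fractional positions, which need interpolation), pixels are assumed 8-bit, and all function names are made up.

```python
import math

def idct_2d(coeffs):
    """Textbook floating-point 2D inverse DCT of a square block."""
    n = len(coeffs)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            acc = 0.0
            for v in range(n):
                for u in range(n):
                    acc += (scale(u) * scale(v) * coeffs[v][u]
                            * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                            * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[y][x] = acc
    return out

def predict_block(ref, top, left, mv_y, mv_x, size):
    """Copy a block from the reference frame, displaced by the motion vector.

    Coordinates are clamped so vectors pointing off-frame reuse edge pixels.
    """
    h, w = len(ref), len(ref[0])
    block = []
    for dy in range(size):
        row = []
        for dx in range(size):
            y = min(max(top + dy + mv_y, 0), h - 1)
            x = min(max(left + dx + mv_x, 0), w - 1)
            row.append(ref[y][x])
        block.append(row)
    return block

def reconstruct(pred, residual):
    """Add the residual to the prediction and clip to the 8-bit range."""
    return [[min(max(int(round(p + r)), 0), 255) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]

# An 8x8 reference frame with a simple gradient, and a DC-only 4x4 block.
ref = [[x + 8 * y for x in range(8)] for y in range(8)]
coeffs = [[40.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
residual = idct_2d(coeffs)                    # flat residual of 10 per pixel
pred = predict_block(ref, 2, 2, 1, -1, 4)     # block at (2, 2), vector (+1, -1)
recon = reconstruct(pred, residual)
```

The per-pixel arithmetic is trivial; the cost in a real decoder comes from doing it for every block of every frame, with the memory fetches landing wherever the motion vectors point.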

Finally, in-loop deblocking smooths the visible steps at block boundaries as part of the decode loop itself, so the deblocked frame is what gets used as reference data for the frames that follow. Stepping transition between video blocks bad, deblocking good. GPU help!
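The basic idea of a deblocking filter can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the filter H.264 defines (which picks its strength per edge from how the neighbouring blocks were coded): it looks at one row of pixels across a block boundary and smooths the step only if it's small enough to be an artefact rather than a real edge in the picture. The function name and the threshold values are assumptions for illustration.

```python
def deblock_edge(row, edge, alpha=12, beta=4):
    """Smooth the boundary between row[edge - 1] and row[edge] if it looks blocky.

    alpha limits the step across the boundary; beta limits the variation on
    either side of it. Both values here are made up for the example.
    """
    p1, p0, q0, q1 = row[edge - 2], row[edge - 1], row[edge], row[edge + 1]
    out = list(row)
    # A large jump is probably genuine picture detail, so leave it alone.
    if abs(p0 - q0) < alpha and abs(p1 - p0) < beta and abs(q1 - q0) < beta:
        out[edge - 1] = (p1 + 2 * p0 + q0 + 2) >> 2  # simple 1-2-1 low-pass
        out[edge] = (p0 + 2 * q0 + q1 + 2) >> 2
    return out

blocky = [100, 100, 100, 100, 108, 108, 108, 108]     # small step at a block edge
real_edge = [100, 100, 100, 100, 200, 200, 200, 200]  # genuine edge in the picture

print(deblock_edge(blocky, 4))     # step softened to 102 | 106
print(deblock_edge(real_edge, 4))  # left untouched
```

In a real decoder this runs along every block edge, horizontally and vertically, and the filtered result is stored as the reference for later frames, which is what makes it "in-loop".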

Similar stuff in the rest of the decoder; support for more formats

Effectively the same stuff happens for the other supported video formats in the fixed-function paths, be it VC-1 or H.262 (although that problem was solved some years back), for accurate, high-quality decode of popular video formats. Don't forget the programmable side of the video silicon either, or the shader hardware in the 3D core, which can also be used to assist in video processing. While many of the video tasks (encode or decode) aren't really mappable to what's essentially an array of vector stream processors, the GPU does what it can. ATI are confident that Avivo does it better on-GPU than anyone else.

Decode Points

The biggest thing to take from the Decode stage is that Avivo-capable products have dedicated gates for video decode. According to the Avivo literature, computationally expensive formats like H.264 get significant assistance from the GPU, via fixed-function hardware, programmable video-only silicon and, if needed, the 3D shader hardware.

Review: ATI Avivo Video and Display Engine - Technology Discussion
