Windows Vista: DirectX10 D3D Intro

by Ryszard Sommefeldt on 12 September 2006, 08:36

The D3D9 Programmable Graphics Pipe: Basics

Parts of this article assume you know the basics of how a modern D3D9 GPU works in terms of vertex shading, pixel shading and raster output.

Essentially, keep in mind that in the D3D9 pipe the CPU starts a frame by sending the GPU the geometry to work on, along with any other data relevant to rendering, such as render state. The vertex shader hardware then processes that geometry before passing it on to the rasteriser, which turns the processed triangles into fragments.

Those are then fed to the pixel shader hardware for processing, before shaded fragments are output to the ROP hardware for final pixel-level ops to determine output colour, at which point finished pixels are written to the output framebuffer by the ROP for display on your screen.
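To make that stage order concrete, here's a minimal software sketch of the flow described above: vertex shading, then rasterisation into fragments, then pixel shading and a final write to the framebuffer. It's a toy model and not D3D9 code. Every name in it (`Vertex`, `Fragment`, `Framebuffer`, `rasterise`, `drawTriangle`) is made up for illustration, the 'ROP' is just a plain write with no blending or depth test, and the coverage test ignores the fill rules real hardware uses.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// All types and functions here are hypothetical, for illustration only.
struct Vertex   { float x, y; };            // position, already in pixel units
struct Fragment { int x, y; };              // one covered pixel
struct Colour   { std::uint8_t r, g, b; };

using VertexShader = std::function<Vertex(const Vertex&)>;
using PixelShader  = std::function<Colour(const Fragment&)>;

struct Framebuffer {
    int width, height;
    std::vector<Colour> pixels;             // starts out black
    Framebuffer(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * h, Colour{0, 0, 0}) {}
    Colour& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Signed test for which side of the edge a->b the point (px, py) lies on.
inline float edgeSide(const Vertex& a, const Vertex& b, float px, float py) {
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

// Rasteriser: emit a fragment for every pixel whose centre is inside the
// triangle. Simplified: no fill rules, accepts either winding order.
std::vector<Fragment> rasterise(const std::array<Vertex, 3>& t,
                                int width, int height) {
    std::vector<Fragment> out;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const float px = x + 0.5f, py = y + 0.5f;
            const float e0 = edgeSide(t[0], t[1], px, py);
            const float e1 = edgeSide(t[1], t[2], px, py);
            const float e2 = edgeSide(t[2], t[0], px, py);
            const bool inside = (e0 >= 0 && e1 >= 0 && e2 >= 0) ||
                                (e0 <= 0 && e1 <= 0 && e2 <= 0);
            if (inside) out.push_back(Fragment{x, y});
        }
    }
    return out;
}

// Runs one triangle through the stages in order and returns the number of
// fragments that were shaded and written.
std::size_t drawTriangle(Framebuffer& fb, std::array<Vertex, 3> tri,
                         const VertexShader& vs, const PixelShader& ps) {
    for (Vertex& v : tri) v = vs(v);                          // vertex shading
    const std::vector<Fragment> frags = rasterise(tri, fb.width, fb.height);
    for (const Fragment& f : frags) fb.at(f.x, f.y) = ps(f);  // pixel shading, then 'ROP' write
    return frags.size();
}
```

Calling `drawTriangle` with an identity vertex shader and a pixel shader that returns a constant colour fills the covered pixels and leaves the rest of the framebuffer black. Swap in a vertex shader that moves the vertices and the set of covered pixels changes without the rasteriser or pixel shader knowing anything about it.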

The CPU may or may not get involved in every frame, depending on geometry needed and state changes required in the driver.

'Shading' in this sense just means using provided instructions and data input in order to change the attributes of a single vertex or pixel fragment output by the shader hardware. A D3D9 GPU can sample data from a variety of data surfaces in a variety of data types, providing flexible input data sources for the shading process.

The GPU's (and programmer's) view of the data can be multi-dimensional if needed. Output data from pixel and vertex shading can also be held in intermediate buffers for reuse elsewhere in the pipeline (with some rules), so verts and fragments can carry a 'memory' of their prior processing into whatever processing comes next.

All of that general programmability and somewhat general access to memory is (in a nutshell of course) how a D3D9 GPU does what it does. As far as the D3D9 runtime and API are concerned, they're there to facilitate a standard way to program the pipeline (provision of the API), and marshal the way the GPU interacts with the OS (the runtime layer between CPU, driver and GPU) so as to get pixels drawn on the screen.

The D3D9 API effectively turns the GPU into a state machine. Certain parameters (states) can be set per object being shaded and rendered, and globally on the GPU to affect processing (whether colour data should be written, for example), and the runtime, with the help of the driver, sets those states using the CPU.
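As a rough illustration of the state machine idea, the sketch below models a device where a state stays set until something changes it, and every draw simply picks up whatever is current. It loosely mimics the shape of calls like `IDirect3DDevice9::SetRenderState` with `D3DRS_COLORWRITEENABLE`, but `ToyDevice`, `RenderState`, `DrawRecord` and the default values are all invented for this example and are not the real runtime's behaviour.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical state names for this sketch; real D3D9 uses D3DRS_* enumerants.
enum class RenderState { ColourWriteEnable, ZEnable };

// What a draw 'saw' when it was issued.
struct DrawRecord {
    std::string object;
    bool colourWritten;
    bool depthTested;
};

class ToyDevice {
public:
    ToyDevice() {
        // Defaults chosen for the sketch, not taken from the D3D9 spec.
        states_[RenderState::ColourWriteEnable] = 1;
        states_[RenderState::ZEnable] = 1;
    }

    // Setting a state changes the device globally until it is set again.
    void setRenderState(RenderState state, unsigned value) {
        states_[state] = value;
        ++setCalls_;
    }

    // A draw takes no state arguments: it uses whatever is currently set.
    void draw(const std::string& object) {
        log_.push_back(DrawRecord{
            object,
            states_.at(RenderState::ColourWriteEnable) != 0,
            states_.at(RenderState::ZEnable) != 0});
    }

    const std::vector<DrawRecord>& log() const { return log_; }
    int setCalls() const { return setCalls_; }

private:
    std::map<RenderState, unsigned> states_;
    std::vector<DrawRecord> log_;
    int setCalls_ = 0;
};
```

Disable colour writes, draw once, re-enable them and draw twice more: the first draw writes no colour, and the last two both do, even though the state was only set once for the pair. That persistence is what makes the order of Set and Draw calls matter so much.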

Set/Draw call overhead

One of the reasons D3D9 is suboptimal for really efficient rendering is the state change overhead in the runtime layer (and driver to some extent). Having to batch render to minimise changes is realistically a barrier to performance and true open programmability of the graphics pipeline.

And while that's just one of many things that could be improved upon, it's what you see dominate programmers' manuals and developer documentation for D3D9 rendering systems. DrawIndexedPrimitive is a common D3D9 function call to trigger rasterisation of geometry, but its performance tanks with fewer than a few hundred triangles per call. Set*/Draw* call overhead is the big nasty in D3D9.
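The usual workaround is batching: sort what you want to draw by the state it needs, then merge neighbours so each state is set once and each draw call carries more triangles. Here's a minimal sketch of that idea which only counts calls and never touches a GPU. `DrawItem`, `SubmitStats`, `submit` and `batchByMaterial` are hypothetical names, a 'material' stands in for any bundle of state, and the merge step assumes the merged geometry could really go out in a single call (same buffers, same transform), which isn't always true in practice.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical description of one piece of geometry to draw.
struct DrawItem {
    int materialId;     // stands in for the state this item needs (must be >= 0)
    int triangleCount;
};

struct SubmitStats {
    int stateChanges;   // how many times a Set*-style call would be needed
    int drawCalls;      // how many Draw*-style calls would be issued
    int triangles;      // total triangles drawn
};

// Walks the list in order, 'setting state' only when the material changes.
SubmitStats submit(const std::vector<DrawItem>& items) {
    SubmitStats stats{0, 0, 0};
    int bound = -1;     // nothing bound yet
    for (const DrawItem& item : items) {
        if (item.materialId != bound) {
            bound = item.materialId;
            ++stats.stateChanges;
        }
        ++stats.drawCalls;
        stats.triangles += item.triangleCount;
    }
    return stats;
}

// Groups items by material, then merges each group into a single item so it
// can be drawn with one call. Assumes merging is legal for this geometry.
std::vector<DrawItem> batchByMaterial(std::vector<DrawItem> items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.materialId < b.materialId;
                     });
    std::vector<DrawItem> merged;
    for (const DrawItem& item : items) {
        if (!merged.empty() && merged.back().materialId == item.materialId) {
            merged.back().triangleCount += item.triangleCount;
        } else {
            merged.push_back(item);
        }
    }
    return merged;
}
```

Four 50-triangle items that alternate between two materials cost four state changes and four draw calls when submitted as-is. After `batchByMaterial` the same 200 triangles go out in two calls with two state changes. The catch is that the application now has to organise its rendering around the API's costs instead of around what's convenient to draw.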

Simply getting data onto and off the GPU, from and to the CPU, is also a struggle at times, and how much of a struggle depends on the hardware (so it's not consistent). There's decent scope for processing overheads to be reduced, while improving programmability and use of the API, and that's what D3D10 seeks to do.

On to what D3D10 brings to the table, versus what's available with D3D9.