Shader Model 3.0
Much has been made of NV40's Shader Model 3.0 compatibility, so a page on what it means is worth it, despite the lack of any real means to test its performance at the time of writing.
I've already talked about branching and flow control in the vertex shader, so I won't cover that here. DirectX 9.0c exposes the ability to write vertex shaders which use that hardware, so it's a pretty simple extension to things.
Centroid Sampling
Centroid sampling allows the hardware to ensure that it always samples inside a triangle's coverage area, avoiding situations with multisampling where the sample can lie outside of the relevant area for that triangle.
Arbitrary Swizzle
Arbitrary swizzle is a pixel shader feature that lets it do something like this:
mov r1.xyzw, r1.xxzw
That copies the x component of the register holding the pixel data into the y component, preserving everything else. The hardware should optimise for that and do it in one instruction.
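As a rough illustration of what that swizzle does to the register contents, here is a small CPU-side sketch in Python. It is not shader code, and the `swizzle` helper is a hypothetical name invented for this example; it only mimics how the source components are reordered before being written back.

```python
def swizzle(src, pattern):
    """Reorder the components of a 4-component register.

    `pattern` names, for each destination slot in x, y, z, w order,
    which source component to read (for example "xxzw").
    This helper is hypothetical and only models the data movement.
    """
    index = {"x": 0, "y": 1, "z": 2, "w": 3}
    return tuple(src[index[c]] for c in pattern)


# Model of: mov r1.xyzw, r1.xxzw
r1 = (1.0, 2.0, 3.0, 4.0)
r1 = swizzle(r1, "xxzw")
print(r1)  # (1.0, 1.0, 3.0, 4.0)
```

The y slot now holds the old x value, while x, z and w are unchanged, which matches the description above.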
Vertex Texturing
Shader Model 3.0 allows vertex shader hardware to do texture lookups, using them as a source of data. Displacement mapping is one product of this ability, and vertex programs can do arbitrary lookups into any texture surface the developer's application makes available.
Allowing the vertex shader to use the texture unit as a source of data is probably the biggest of all the new Shader Model 3.0 features.
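To make the displacement mapping idea a little more concrete, the following is a minimal CPU-side sketch in Python of what a vertex program might do per vertex: look up a height value from a texture and push the vertex along its normal. The function names, the nearest-texel lookup, the `scale` parameter and the tiny 2x2 height map are all assumptions made for illustration; none of them come from NV40 or the DirectX API.

```python
def sample_nearest(texture, u, v):
    """Nearest-texel lookup into a 2D list; u and v are assumed to be in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def displace(position, normal, uv, height_map, scale=1.0):
    """Move a vertex along its normal by the height fetched at its texture coordinate."""
    h = sample_nearest(height_map, uv[0], uv[1])
    return tuple(p + n * h * scale for p, n in zip(position, normal))


# Hypothetical 2x2 height map, indexed as height_map[row][column].
height_map = [
    [0.0, 0.5],
    [1.0, 0.25],
]

# A vertex at the origin, facing up, sampling the top-right texel (height 0.5).
new_position = displace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.75, 0.25), height_map, scale=2.0)
print(new_position)  # (0.0, 1.0, 0.0)
```

On real hardware this lookup would happen inside the vertex shader for every vertex, so the mesh changes shape without the application touching the vertex data itself.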
Vertex Stream Instancing
Stream instancing lets a developer define an object once and then apply a stream of vertex modifiers to it, so that each instance differs from the base object. This avoids the need to create multiple copies of the object and run a vertex shader on each vertex every time a copy is created, which may be thousands of times per frame for things like grass, or leaves on trees. You also save geometry bandwidth, because the streams carry only the minimum modifications to the vertices needed to make each object unique. For objects with lots of vertices, the savings can be significant.
Other Differences compared to Shader Model 2.0
Feature | Shader Model 2.0 | Shader Model 3.0 |
Pixel Shader Differences | | |
Instruction Count Limit | 96 (ps_2_0) | No practical limit (65535+) |
Colour format | 8-bit per component | 32-bit per component |
Multiple render targets | Optional | 4 required |
Vertex Shader Differences | | |
Instruction Count Limit | 256 | 65535 |