Philip Taylor
Microsoft Corporation
February 19, 2001
Download Base.exe and Basevs.exe.
Welcome to Driving DirectX. Last month I presented an overview of programmable vertex and pixel shaders. This month I want to explore vertex shaders in more detail. Vertex shaders give developers fine-grained control over the vertex transformation and lighting pipeline and can be used as a substitute in DirectX® 8 for the fixed-function transform and lighting pipeline. First, though, we need to examine the new sample framework so we have a basis to begin developing vertex shader samples and vertex shaders.
DirectX 8 Graphics Sample Framework
While the DirectX 8 Direct3D SDK graphics sample framework is an evolution from the DirectX 7 graphics sample framework (D3DFrame, covered in the May 2000 issue of Driving DirectX), it has changed enough that some small additional coverage is warranted to update the particulars.
First, the SDK layout has changed. The SDK samples, by default, are installed in \Mssdk\Samples\Multimedia. The new SDK layout is shown in Figure 1.
Figure 1. DirectX 8 SDK sample folder layout
For our purposes, the folders of interest are Common and Direct3D. The sample framework is now contained in the Common folder and the Direct3D samples based on the graphics framework are contained in the Direct3D folder. Within the Common folder are Include and Src folders containing the headers and source files that make up the graphics framework (as well as support for the samples that showcase the other components of DirectX, but that's beyond the scope of this column).
Now that it's clear where the sample bits are located, it's time to dig in. If you really want to understand how the samples work, as opposed to the features showcased in them, a basic understanding of how the sample framework works is required. Here I give a high-level overview of the graphics sample framework, which is all that's necessary to understand and use the framework to write your own samples. The DirectX 8 graphics framework consists of five source modules, located in the Common\src folder that's contained in the Multimedia folder as shown in Figure 1. The files that make up the Direct3D sample framework are:
- d3dapp.cpp implements the application class, CD3DApplication, which is the base class used for the D3D samples.
- d3dfile.cpp furnishes x-file support to enable samples to load x-files. Of particular interest are classes CD3DMesh and CD3DFrame.
- d3dfont.cpp furnishes the CD3DFont class, which provides basic font output support, to enable things like statistics views.
- d3dutil.cpp provides generally useful 3d functions, such as material, light, and texture helper functions.
- dxutil.cpp provides generally useful DirectX functions, such as media, registry, and timer helper functions.
There are corresponding header files located in the Common\Include folder as well.
If you recall from the May article, each sample implements a subclass of CD3DApplication (typically named CMyD3DApplication) and a set of "overridable" methods as shown below:
// Overridable functions for the 3D scene created by the app
virtual HRESULT ConfirmDevice(D3DCAPS8*,DWORD,D3DFORMAT) { return S_OK; }
virtual HRESULT OneTimeSceneInit() { return S_OK; }
virtual HRESULT InitDeviceObjects() { return S_OK; }
virtual HRESULT RestoreDeviceObjects() { return S_OK; }
virtual HRESULT FrameMove() { return S_OK; }
virtual HRESULT Render() { return S_OK; }
virtual HRESULT InvalidateDeviceObjects() { return S_OK; }
virtual HRESULT DeleteDeviceObjects() { return S_OK; }
virtual HRESULT FinalCleanup() { return S_OK; }
The prototypes for these methods are contained in d3dapp.h in the CD3DApplication class. All you need to do to create a new D3D application using the sample framework is create a new project and new implementations of these functions; let's call them the "overridable" interface for a framework sample. That's exactly what the D3D SDK samples do. These methods are essentially the same as we covered in the May 2000 article; in the interest of brevity I'll refer you to the May issue for more detail. One important distinction to remember is between the InitDeviceObjects/DeleteDeviceObjects pair and the RestoreDeviceObjects/InvalidateDeviceObjects pair. Init/Delete are called when you are creating/destroying a device completely; Invalidate/Restore are called before/after calling Reset on a device, which happens when you change some aspects of the device or have to deal with a lost device. To make a well-formed new framework sample you need to understand where to put your code. For example, managed textures can be created/destroyed in Init/Delete, but unmanaged textures need to be created/destroyed in Restore/Invalidate.
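To make the ordering concrete, here is a minimal sketch that only records the sequence of calls for device creation, a Reset, and device destruction. The class LifecycleSketch and its three driver methods are hypothetical stand-ins, not part of d3dapp.cpp; the sequence reflects my reading of the framework and of the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a framework sample. It touches no Direct3D
// objects; it only logs the order in which the overridables would run.
class LifecycleSketch
{
public:
    std::vector<std::string> log;

    // Device is created from scratch.
    void CreateDevice()
    {
        assert(!m_hasDevice);
        m_hasDevice = true;
        log.push_back("InitDeviceObjects");       // managed resources
        log.push_back("RestoreDeviceObjects");    // unmanaged resources, state
    }

    // Device is reset (resize, mode change, lost device).
    void ResetDevice()
    {
        assert(m_hasDevice);
        log.push_back("InvalidateDeviceObjects"); // release before Reset
        log.push_back("Reset");
        log.push_back("RestoreDeviceObjects");    // re-create after Reset
    }

    // Device is destroyed completely.
    void DestroyDevice()
    {
        assert(m_hasDevice);
        log.push_back("InvalidateDeviceObjects");
        log.push_back("DeleteDeviceObjects");
        m_hasDevice = false;
    }

private:
    bool m_hasDevice = false;
};
```

Calling CreateDevice, ResetDevice, and DestroyDevice in turn shows why an unmanaged vertex buffer belongs in Restore/Invalidate: that pair brackets every Reset, while Init/Delete run only once per device.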
Base Framework Sample
With those details in mind, let's proceed to create a simple sample that we can use for the basis of experimentation. This sample needs to be located in the Direct3D folder in the SDK for all the path information to work without requiring you to fix things up. Figure 2 below shows the project view from Visual C++® 6 for the "Base" sample that I will use for the next several columns.
Figure 2. DirectX 8 "Base" sample project view
All five DirectX graphics sample framework modules are used here for completeness, although technically d3dfile.cpp isn't required. The class definition for this sample, shown below, simply overrides the "overridable" functions discussed above.
class CMyD3DApplication : public CD3DApplication
{
protected:
CD3DFont* m_pFont; // Font for drawing text
LPDIRECT3DVERTEXBUFFER8 m_pVB; // Buffer to hold vertices
HRESULT ConfirmDevice( D3DCAPS8*, DWORD, D3DFORMAT );
HRESULT OneTimeSceneInit();
HRESULT InitDeviceObjects();
HRESULT RestoreDeviceObjects();
HRESULT InvalidateDeviceObjects();
HRESULT DeleteDeviceObjects();
HRESULT FinalCleanup();
HRESULT Render();
HRESULT FrameMove();
public:
CMyD3DApplication();
};
For our "Base" sample, rendering a simple "quad" consisting of two triangles will be sufficient to illustrate the use of the framework. From such humble beginnings all things shader-ish will flow. Do note that the "quad" will be contained in a vertexbuffer; in DirectX 8, vertexbuffers are the container for geometry data. For details on using vertexbuffers, see the Direct3D documentation.
The definition of the quad and its vertex information is quite simple, as shown below:
//a quad
#define NUM_VERTS 4
#define NUM_TRIS 2
// A structure for our custom vertex type
struct CUSTOMVERTEX
{
FLOAT x, y, z; // The untransformed position for the vertex
DWORD color; // The vertex color
};
// Our custom FVF, which describes our custom vertex structure
#define D3DFVF_CUSTOMVERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE)
// Initialize vertices for a QUAD
CUSTOMVERTEX g_Vertices[] =
{
// x y z diffuse
{ -1.0f,-1.0f, 0.0f, 0xff00ff00, },//bl
{ 1.0f,-1.0f, 0.0f, 0xff00ff00, },//br
{ 1.0f, 1.0f, 0.0f, 0xffff0000, },//tr
{ -1.0f, 1.0f, 0.0f, 0xff0000ff, },//tl
};
A quad can be completely specified by two triangles and four vertices in a triangle fan. The vertex information consists of {x, y, z} 3D vector location information as well as diffuse color. The definition of the initial conditions of the location information places the quad about the origin. Now it's on to the "overridable" functions.
//---------------------------------------------------------------------
// Name: CMyD3DApplication()
// Desc: Application constructor. Sets attributes for the app.
//---------------------------------------------------------------------
CMyD3DApplication::CMyD3DApplication()
{
m_strWindowTitle = _T("Base: D3D Basic Example");
m_bUseDepthBuffer = TRUE;
m_pVB = NULL;
m_pFont = new CD3DFont( _T("Arial"), 12, D3DFONT_BOLD );
}
In the CMyD3DApplication class constructor shown above we initialize the window title, enable using a depth-buffer, pre-initialize our vertexbuffer interface pointer to NULL, and set up a font to use for debugging output.
//---------------------------------------------------------------------
// Name: OneTimeSceneInit()
// Desc: Called during initial app startup, this function performs all the
// permanent initialization.
//---------------------------------------------------------------------
HRESULT CMyD3DApplication::OneTimeSceneInit()
{
return S_OK;
}
OneTimeSceneInit is called by the framework when the app starts up, and can be used to initialize structures that aren't tied to a D3DDevice. This sample has no one-time initialization needs, so the OneTimeSceneInit method is a stub as shown above.
//---------------------------------------------------------------------
// Name: FrameMove()
// Desc: Called once per frame, the call is the entry point for animating
// the scene.
//---------------------------------------------------------------------
HRESULT CMyD3DApplication::FrameMove()
{
// For our world matrix, just rotate the object about the y-axis.
D3DXMATRIX matWorld;
D3DXMatrixRotationY( &matWorld, m_fTime * 4.0f );
m_pd3dDevice->SetTransform( D3DTS_WORLD, &matWorld );
return S_OK;
}
The FrameMove method above allows us to specify animation actions that happen once a frame. For this sample, we set up a simple y-rotation to rotate the "quad" about the Y-axis at the origin.
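As a rough illustration of what the y-rotation does to a vertex, the sketch below applies the same rotation by hand. It assumes the row-vector convention (p' = p * M) and the sign layout documented for D3DXMatrixRotationY; Vec3 and RotateAboutY are hypothetical names, not D3DX types.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Rotates p about the y-axis by 'angle' radians. This mirrors multiplying a
// row vector by a matrix with M11 = cos, M13 = -sin, M31 = sin, M33 = cos.
Vec3 RotateAboutY(const Vec3& p, float angle)
{
    assert(std::isfinite(angle));
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    Vec3 r = { p.x * c + p.z * s, p.y, -p.x * s + p.z * c };
    return r;
}
```

If m_fTime is elapsed time in seconds (an assumption about the framework), an angle of m_fTime * 4.0f means 4 radians per second, so the quad completes a full turn roughly every 1.57 seconds.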
//-----------------------------------------------------------------------
// Name: Render()
// Desc: Called once per frame, the call is the entry point for 3d
// rendering. This function sets up render states, clears the
// viewport, and renders the scene.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::Render()
{
// Clear the viewport
m_pd3dDevice->Clear( 0L, NULL, D3DCLEAR_TARGET|D3DCLEAR_ZBUFFER,
D3DCOLOR_XRGB(0,0,128), 1.0f, 0L );
// Begin the scene
if( SUCCEEDED( m_pd3dDevice->BeginScene() ) )
{
// specify the source of stream 0, which is our vertex buffer.
// then let D3D know what vertex shader to use.
// call DrawPrimitive() which does the actual rendering
m_pd3dDevice->SetStreamSource( 0, m_pVB, sizeof(CUSTOMVERTEX) );
m_pd3dDevice->SetVertexShader( D3DFVF_CUSTOMVERTEX );
m_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLEFAN, 0, NUM_TRIS );
// Output statistics
m_pFont->DrawText( 2, 0, D3DCOLOR_ARGB(255,255,255,0),
m_strFrameStats );
m_pFont->DrawText( 2, 20, D3DCOLOR_ARGB(255,255,255,0),
m_strDeviceStats );
// End the scene.
m_pd3dDevice->EndScene();
}
return S_OK;
}
The Render method above first clears the viewport using Clear. Then, within the BeginScene/EndScene pair, we call SetStreamSource to tell the D3D device that we are using vertexbuffer m_pVB with a stride of the size of our custom vertex type. Then we inform the D3D device that we are using a fixed-function FVF shader. Finally, we invoke DrawPrimitive to render the quad.
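The DrawPrimitive call passes NUM_TRIS (2) as the primitive count for four vertices. The following sketch shows how a triangle fan is expanded into individual triangles, which is why N vertices yield N - 2 triangles. ExpandTriangleFan is a hypothetical helper written for illustration; Direct3D performs this expansion internally.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Expands a triangle fan of 'vertexCount' vertices into explicit index
// triples. Every triangle shares vertex 0 and uses two consecutive vertices.
std::vector<std::array<int, 3> > ExpandTriangleFan(int vertexCount)
{
    assert(vertexCount >= 3);
    std::vector<std::array<int, 3> > triangles;
    for (int i = 1; i + 1 < vertexCount; ++i)
    {
        std::array<int, 3> tri = {{ 0, i, i + 1 }};
        triangles.push_back(tri);
    }
    return triangles;
}
```

For the quad, the fan (bl, br, tr, tl) becomes the triangles (bl, br, tr) and (bl, tr, tl).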
//------------------------------------------------------------------------
// Name: InitDeviceObjects()
// Desc: This creates all device-dependent managed objects, such as
// managed textures and managed vertex buffers.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::InitDeviceObjects()
{
// Initialize the font's internal textures
m_pFont->InitDeviceObjects( m_pd3dDevice );
return S_OK;
}
In this sample the InitDeviceObjects method's only task, shown above, is to initialize our debugging output font, using its corresponding InitDeviceObjects method.
//------------------------------------------------------------------------
// Name: RestoreDeviceObjects()
// Desc: Restore device-memory objects and state after a device is created
// or resized.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::RestoreDeviceObjects()
{
// Restore the device objects for the font
m_pFont->RestoreDeviceObjects();
// Create the vertex buffer. Here we are allocating enough memory
// (from the default pool) to hold all 4 of our custom vertices. We also
// specify the FVF, so the vertex buffer knows what data it contains.
if( FAILED( m_pd3dDevice->CreateVertexBuffer(
NUM_VERTS*sizeof(CUSTOMVERTEX),
D3DUSAGE_WRITEONLY,
D3DFVF_CUSTOMVERTEX,
D3DPOOL_DEFAULT, &m_pVB ) ) )
{
return E_FAIL;
}
// Now we fill the vertex buffer. To do this, we need to Lock() the VB
// to gain access to the vertices. This mechanism is required because
// vertex buffers may be in device memory.
VOID* pVertices;
if( FAILED( m_pVB->Lock( 0, sizeof(g_Vertices),
(BYTE**)&pVertices, 0 ) ) )
return E_FAIL;
memcpy( pVertices, g_Vertices, sizeof(g_Vertices) );
m_pVB->Unlock();
// Set the projection matrix
D3DXMATRIX matProj;
FLOAT fAspect = m_d3dsdBackBuffer.Width /
(FLOAT)m_d3dsdBackBuffer.Height;
D3DXMatrixPerspectiveFovLH( &matProj, D3DX_PI/4, fAspect,
1.0f, 100.0f );
m_pd3dDevice->SetTransform( D3DTS_PROJECTION, &matProj );
// Set up our view matrix. A view matrix can be defined given an eye
// point, a point to look at, and a direction for which way is up.
// Here, we set the eye 4 units back along the z-axis and up 1
// unit(s), look at the origin, and define "up" to be in the y-
// direction.
D3DXMATRIX matView;
D3DXMatrixLookAtLH( &matView, &D3DXVECTOR3( 0.0f, 1.0f,-4.0f ), //from
&D3DXVECTOR3( 0.0f, 0.0f, 0.0f ), //at
&D3DXVECTOR3( 0.0f, 1.0f, 0.0f ));//up
m_pd3dDevice->SetTransform( D3DTS_VIEW, &matView );
// Set up the default texture states
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_COLOROP,
D3DTOP_SELECTARG2 );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_COLORARG1,
D3DTA_TEXTURE );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_COLORARG2,
D3DTA_DIFFUSE );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_ALPHAOP,
D3DTOP_SELECTARG1 );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_ALPHAARG1,
D3DTA_TEXTURE );
m_pd3dDevice->SetTextureStageState( 1, D3DTSS_COLOROP,
D3DTOP_DISABLE );
m_pd3dDevice->SetTextureStageState( 1, D3DTSS_ALPHAOP,
D3DTOP_DISABLE );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_MINFILTER,
D3DTEXF_LINEAR );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_MAGFILTER,
D3DTEXF_LINEAR );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_ADDRESSU,
D3DTADDRESS_CLAMP );
m_pd3dDevice->SetTextureStageState( 0, D3DTSS_ADDRESSV,
D3DTADDRESS_CLAMP );
m_pd3dDevice->SetRenderState( D3DRS_DITHERENABLE, TRUE );
m_pd3dDevice->SetRenderState( D3DRS_ZENABLE, TRUE );
// Turn off D3D lighting, since we are providing our own vertex colors
m_pd3dDevice->SetRenderState( D3DRS_LIGHTING, FALSE );
// Turn off culling, so we see the front and back of the quad
m_pd3dDevice->SetRenderState( D3DRS_CULLMODE, D3DCULL_NONE );
return S_OK;
}
RestoreDeviceObjects shown above performs quite a bit of work in this sample. First, the font's device objects are restored and the vertexbuffer m_pVB is created using CreateVertexBuffer. Next, we lock the vertexbuffer, populate it with our quad's data, and then unlock it. Then the projection and view matrices are initialized and SetTransform is used to set each matrix on the device. Finally, a series of texture stage states and renderstates are initialized using SetTextureStageState and SetRenderState. Note that since this sample does not use multi-texture, only stage 0 is configured; stage 1 is explicitly disabled.
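To give a feel for what the projection matrix does with the 1.0f near plane and 100.0f far plane, here is a small sketch that projects a view-space point by hand. It assumes the left-handed perspective form documented for D3DXMatrixPerspectiveFovLH (yScale = cot(fovY/2), xScale = yScale / aspect); ClipPoint and ProjectLH are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct ClipPoint { float x, y, z, w; };

// Projects a view-space point (vx, vy, vz) into clip space using a
// left-handed perspective projection. After dividing by w, depth runs
// from 0 at the near plane to 1 at the far plane.
ClipPoint ProjectLH(float vx, float vy, float vz,
                    float fovY, float aspect, float zn, float zf)
{
    assert(aspect > 0.0f && zn > 0.0f && zf > zn);
    const float yScale = 1.0f / std::tan(fovY * 0.5f);
    const float xScale = yScale / aspect;
    const float q = zf / (zf - zn);
    ClipPoint c = { vx * xScale, vy * yScale, vz * q - zn * q, vz };
    return c;
}
```

Note that the sample casts the back buffer height to FLOAT before dividing; without the cast, the aspect ratio would be computed with integer division and truncated.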
//------------------------------------------------------------------------
// Name: InvalidateDeviceObjects()
// Desc: Called when the device-dependent objects are about to be lost.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::InvalidateDeviceObjects()
{
m_pFont->InvalidateDeviceObjects();
SAFE_RELEASE ( m_pVB );
return S_OK;
}
InvalidateDeviceObjects above releases our vertexbuffer using the SAFE_RELEASE macro and invokes the InvalidateDeviceObjects method on our font to release any internal objects the font created.
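The release-then-null pattern behind SAFE_RELEASE can be sketched without Direct3D. Below, FakeBuffer is a hypothetical reference-counted stand-in for a COM interface, and SKETCH_SAFE_RELEASE is a local macro written in the style of the one in dxutil.h as I understand it; neither is SDK code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a COM-style object with a reference count.
struct FakeBuffer
{
    static int liveCount;  // number of FakeBuffer objects currently alive
    int refs;

    FakeBuffer() : refs(1) { ++liveCount; }
    ~FakeBuffer() { --liveCount; }

    // Drops one reference and destroys the object when none remain.
    unsigned long Release()
    {
        assert(refs > 0);
        const unsigned long remaining = static_cast<unsigned long>(--refs);
        if (remaining == 0)
            delete this;
        return remaining;
    }
};
int FakeBuffer::liveCount = 0;

// Releases the object if the pointer is non-null, then nulls the pointer so
// a second invocation is harmless.
#define SKETCH_SAFE_RELEASE(p) { if (p) { (p)->Release(); (p) = NULL; } }
```

Nulling the pointer matters here because InvalidateDeviceObjects can run many times over the life of the application, and RestoreDeviceObjects assigns a fresh buffer each time.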
//------------------------------------------------------------------------
// Name: DeleteDeviceObjects()
// Desc: Called when the app is exiting, or the device is being changed,
// this function deletes any device dependent objects.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::DeleteDeviceObjects()
{
m_pFont->DeleteDeviceObjects();
return S_OK;
}
DeleteDeviceObjects above simply invokes the DeleteDeviceObjects method on the font.
//------------------------------------------------------------------------
// Name: FinalCleanup()
// Desc: Called before the app exits, this function gives the app the
// chance to cleanup after itself.
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::FinalCleanup()
{
SAFE_DELETE( m_pFont );
return S_OK;
}
FinalCleanup shown above uses the SAFE_DELETE macro to finalize the debug display font.
//------------------------------------------------------------------------
// Name: ConfirmDevice()
// Desc: Called during device initialization, this code checks the device
// for some minimum set of capabilities
//------------------------------------------------------------------------
HRESULT CMyD3DApplication::ConfirmDevice( D3DCAPS8* pCaps,
DWORD dwBehavior,
D3DFORMAT Format )
{
return S_OK;
}
ConfirmDevice, as shown above, is simply an empty stub. This means that the framework will consider all D3D devices valid for use in this sample, regardless of their caps, vertex processing type, or back buffer format.
Figure 3 shows a screenshot from Base, showing a quad rendered with a "rainbow" diffuse color effect.
Figure 3. DirectX 8 "Base" sample, rendering a quad using diffuse color shading
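The "rainbow" comes from the per-vertex diffuse values in g_Vertices, which are packed as 0xAARRGGBB and interpolated across the quad. A small sketch, using hypothetical helper names rather than the D3DCOLOR macros, shows how those values break down:

```cpp
#include <cassert>
#include <cstdint>

struct Argb { std::uint32_t a, r, g, b; };

// Splits a packed 0xAARRGGBB color into its four 8-bit channels.
Argb UnpackArgb(std::uint32_t color)
{
    Argb out = { (color >> 24) & 0xFFu, (color >> 16) & 0xFFu,
                 (color >> 8) & 0xFFu, color & 0xFFu };
    return out;
}

// Packs four 8-bit channels into 0xAARRGGBB, the layout D3DCOLOR_ARGB uses.
std::uint32_t PackArgb(std::uint32_t a, std::uint32_t r,
                       std::uint32_t g, std::uint32_t b)
{
    assert(a <= 255u && r <= 255u && g <= 255u && b <= 255u);
    return (a << 24) | (r << 16) | (g << 8) | b;
}
```

Decoded this way, the two bottom vertices (0xff00ff00) are opaque green, the top right (0xffff0000) is red, and the top left (0xff0000ff) is blue.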
BaseVS Vertex Shader Sample
With our simple DirectX 8 graphics framework sample that renders a quad in hand, it's now time to experiment with vertex shaders. Remember that we are using software vertex shaders and that AMD and Intel have provided highly tuned implementations. Indeed, you can consider software vertex shaders as the easiest way to get portable SIMD code. While it will be great to get hardware shader cards, significant prototyping work with vertex shaders can be done now, so don't let lack of hardware stop you from experimenting.
This sample, BaseVS, uses several vertex shaders in order to give an idea about both the syntax and power of vertex shaders, as well as to familiarize you with the process of using shaders. This sample also needs to be located in the Direct3D folder in the SDK for all the path information to work. Figure 4 below shows the BaseVS project window.
Figure 4. BaseVS project
In the interest of time, I am not going to cover all the framework functions again; you can check the code out in the download. Please note the download contains a bitmap, dx8_logo.bmp, that needs to be copied to the DX SDK media directory for the sample to execute correctly. One interesting code detail is the ConfirmDevice method shown below. Here the device is rejected if it is created with hardware or mixed vertex processing and does not support vertex shaders in hardware. This allows us to run on a hardware TnL device (with software vertex processing) like the GeForce2, as well as on a shader card with hardware vertex processing.
You will note that vertex shaders in software still perform quite well, thanks to the processor-specific tuning AMD and Intel put into the software pipelines. Still, once shader cards are available most serious developers won't be able to live without them.
//-----------------------------------------------------------------------------
// Name: ConfirmDevice()
// Desc: Called during device initialization, this code checks the device
// for some minimum set of capabilities
//-----------------------------------------------------------------------------
HRESULT CMyD3DApplication::ConfirmDevice( D3DCAPS8* pCaps, DWORD dwBehavior,
D3DFORMAT Format )
{
if( (dwBehavior & D3DCREATE_HARDWARE_VERTEXPROCESSING ) ||
(dwBehavior & D3DCREATE_MIXED_VERTEXPROCESSING ) )
{
if( pCaps->VertexShaderVersion < D3DVS_VERSION(1,0) )
return E_FAIL;
}
return S_OK;
}
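The accept/reject decision in this ConfirmDevice can be exercised without a device. In the sketch below, the constants and VsVersion are stand-ins I have written to mirror the D3DCREATE_ flags and the D3DVS_VERSION macro; the numeric values are my understanding of the DirectX 8 headers and should be treated as assumptions, and AcceptDevice is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the D3DCREATE_*_VERTEXPROCESSING behavior flags (assumed values).
const std::uint32_t kSoftwareVP = 0x20u;
const std::uint32_t kHardwareVP = 0x40u;
const std::uint32_t kMixedVP    = 0x80u;

// Stand-in for D3DVS_VERSION(major, minor) (assumed encoding).
std::uint32_t VsVersion(std::uint32_t major, std::uint32_t minor)
{
    assert(major < 256u && minor < 256u);
    return 0xFFFE0000u | (major << 8) | minor;
}

// Same decision as the ConfirmDevice above: hardware and mixed vertex
// processing require vs.1.0 in hardware; software processing always passes.
bool AcceptDevice(std::uint32_t behavior, std::uint32_t vertexShaderVersion)
{
    if (behavior & (kHardwareVP | kMixedVP))
        return vertexShaderVersion >= VsVersion(1, 0);
    return true;
}
```

A TnL card without shader hardware is therefore accepted only in its software vertex processing configuration, while a shader-capable card is accepted in every configuration.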
Let’s now focus on vertex shaders. There are several pieces to master when working with vertex shaders:
- API Usage
- Vertex Definitions
- Shader Declarations
- Shader Definitions
API Usage
Last month's column briefly covered the new API methods for working with shaders:
- D3DXAssembleShader
- IDirect3DDevice8::CreateVertexShader
- IDirect3DDevice8::SetVertexShaderConstant
- IDirect3DDevice8::SetStreamSource
- IDirect3DDevice8::SetVertexShader
- IDirect3DDevice8::DeleteVertexShader
D3DXAssembleShader and IDirect3DDevice8::CreateVertexShader are used to generate the runtime representation of the shader and return a shader handle to that representation. The code snippet below shows usage, assuming SimpleVertexShader0 has a shader definition and dwDecl0 has a shader declaration (more on those in a bit):
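ID3DXBuffer* pshader0 = NULL;
ID3DXBuffer* perrors = NULL;
// Assemble the shader. The last two arguments receive the assembled
// shader and any error text, so both buffers are passed by address.
HRESULT rc = D3DXAssembleShader( SimpleVertexShader0 ,
sizeof(SimpleVertexShader0)-1, 0 , NULL ,
&pshader0 , &perrors );
if( FAILED( rc ) )
return rc; // pshader0 is not valid if assembly failed
// Create the vertex shader handle
rc = m_pd3dDevice->CreateVertexShader( dwDecl0,
(DWORD*)pshader0->GetBufferPointer(),
&m_hVertexShader0, 0 );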
IDirect3DDevice8::SetVertexShaderConstant is used to load constant definitions used by a shader. IDirect3DDevice8::SetStreamSource informs the runtime as to the source of vertex components, that is, what vertex buffer contains the vertex components and what size or stride of data the shader expects. IDirect3DDevice8::SetVertexShader takes the shader handle created by IDirect3DDevice8::CreateVertexShader and makes that the current vertex shader on the device. The code snippets below use these methods:
//set shader constants
float color[4] = {0,1,0,0};
m_pd3dDevice->SetVertexShaderConstant( 8 , color , 1 );
//set shader stream
m_pd3dDevice->SetStreamSource( 0, m_pVBVertexShader0,
sizeof(VERTEXSHADER0VERTEX) );
//set shader handle
m_pd3dDevice->SetVertexShader( m_hVertexShader0 );
Note an alternate usage for IDirect3DDevice8::SetVertexShader takes an FVF code to enable a fixed-function vertex shader using stream 0 only, as shown below:
//set the FVF vertex shader
m_pd3dDevice->SetVertexShader( D3DFVF_VERTEXSHADER0VERTEX );
Finally, IDirect3DDevice8::DeleteVertexShader is used to remove a vertex shader from a device; it's important to do this to leave a device "clean" without dangling assets. The code snippet below shows how to invoke this method:
//delete shader from device
m_pd3dDevice->DeleteVertexShader( m_hVertexShader0 );
With those APIs in hand, you can examine the BuildVertexShaders and DestroyVertexShaders functions that are invoked in RestoreDeviceObjects and InvalidateDeviceObjects to handle the mechanics of creating and destroying shaders. Now we are ready to dive into the shader details.
Vertex Definitions
Using the "quad" from the Base sample as a starting point, let's expand the vertex definitions for several simple shaders. We again will render a quad:
//a quad
#define NUM_VERTS 4
#define NUM_TRIS 2
Here are the vertex definitions for the four different shaders we will apply to our quad:
// A structure for vertex shader0 vertex type
struct VERTEXSHADER0VERTEX
{
FLOAT x, y, z; // The untransformed position for the vertex
DWORD color; // The vertex color
};
// Our custom FVF, which describes our vertex shader0 vertex structure
#define D3DFVF_VERTEXSHADER0VERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE)
// Initialize vertices for rendering a quad
VERTEXSHADER0VERTEX g_VertexShaderVertices0[] =
{
// x y z diffuse
{ -1.0f,-1.0f, 0.0f, 0xff00ff00, },//bl
{ 1.0f,-1.0f, 0.0f, 0xff00ff00, },//br
{ 1.0f, 1.0f, 0.0f, 0xffff0000, },//tr
{ -1.0f, 1.0f, 0.0f, 0xff0000ff, },//tl
};
// A structure for vertex shader1 vertex type
struct VERTEXSHADER1VERTEX
{
FLOAT x, y, z; // The untransformed position for the vertex
DWORD color; // The vertex color
};
// Our custom FVF, which describes our vertex shader1 vertex structure
#define D3DFVF_VERTEXSHADER1VERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE)
// Initialize vertices for rendering a quad
VERTEXSHADER1VERTEX g_VertexShaderVertices1[] =
{
// x y z diffuse
{ -1.0f,-1.0f, 0.0f, 0xff00ff00, },//bl
{ 1.0f,-1.0f, 0.0f, 0xff00ff00, },//br
{ 1.0f, 1.0f, 0.0f, 0xffff0000, },//tr
{ -1.0f, 1.0f, 0.0f, 0xff0000ff, },//tl
};
// A structure for vertex shader2 vertex type
struct VERTEXSHADER2VERTEX
{
FLOAT x, y, z; // The untransformed position for the vertex
DWORD color; // The vertex color
FLOAT tu, tv; // the texture coords
};
// Our custom FVF, which describes our vertex shader2 vertex structure
#define D3DFVF_VERTEXSHADER2VERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE|D3DFVF_TEX1)
// Initialize vertices for rendering a quad
VERTEXSHADER2VERTEX g_VertexShaderVertices2[] =
{
// x y z diffuse u1 v1
{ -1.0f,-1.0f, 0.0f, 0xff00ff00, 1.0f, 1.0f,},//bl
{ 1.0f,-1.0f, 0.0f, 0xff00ff00, 0.0f, 1.0f,},//br
{ 1.0f, 1.0f, 0.0f, 0xffff0000, 0.0f, 0.0f,},//tr
{ -1.0f, 1.0f, 0.0f, 0xff0000ff, 1.0f, 0.0f,},//tl
};
// A structure for vertex shader3 vertex type
struct VERTEXSHADER3VERTEX
{
FLOAT x, y, z; // The untransformed position for the vertex
FLOAT nx, ny, nz; // The normal for the vertex
DWORD color; // The vertex color
FLOAT tu, tv; // the texture coords
};
// Our custom FVF, which describes our vertex shader3 vertex structure
#define D3DFVF_VERTEXSHADER3VERTEX (D3DFVF_XYZ|D3DFVF_NORMAL|D3DFVF_DIFFUSE|D3DFVF_TEX1)
// Initialize vertices for rendering a quad
VERTEXSHADER3VERTEX g_VertexShaderVertices3[] =
{
// x y z nx ny nz diffuse u1 v1
{ -1.0f,-1.0f, 0.0f, 0.0f,0.0f, 1.0f, 0xff00ff00, 1.0f, 1.0f,},//bl
{ 1.0f,-1.0f, 0.0f, 0.0f,0.0f, 1.0f, 0xff00ff00, 0.0f, 1.0f,},//br
{ 1.0f, 1.0f, 0.0f, 0.0f,0.0f, 1.0f, 0xffff0000, 0.0f, 0.0f,},//tr
{ -1.0f, 1.0f, 0.0f, 0.0f,0.0f, 1.0f, 0xff0000ff, 1.0f, 0.0f,},//tl
};
Note we have defined four vertex types:
- VERTEXSHADER0VERTEX
- VERTEXSHADER1VERTEX
- VERTEXSHADER2VERTEX
- VERTEXSHADER3VERTEX
They will correspond to four vertex shaders. The vertex definitions used here include a structure, an FVF code, and initialized vertices.
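Each structure has to match its FVF code byte for byte, because the stride passed to SetStreamSource is sizeof the structure. The sketch below checks that relationship for three of the layouts using plain types. The SketchVertex structures and ExpectedStride are hypothetical; the helper assumes the FVF component order of position, normal, diffuse, then texture coordinates, with two floats per texture coordinate set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain-type mirrors of the article's vertex structures (DWORD -> uint32_t).
struct SketchVertex0 { float x, y, z; std::uint32_t color; };
struct SketchVertex2 { float x, y, z; std::uint32_t color; float tu, tv; };
struct SketchVertex3 { float x, y, z; float nx, ny, nz;
                       std::uint32_t color; float tu, tv; };

// Sums the component sizes that an XYZ-based FVF code implies.
std::size_t ExpectedStride(bool hasNormal, bool hasDiffuse, int texCoordSets)
{
    assert(texCoordSets >= 0 && texCoordSets <= 8);
    std::size_t stride = 3 * sizeof(float);               // D3DFVF_XYZ
    if (hasNormal)  stride += 3 * sizeof(float);          // D3DFVF_NORMAL
    if (hasDiffuse) stride += sizeof(std::uint32_t);      // D3DFVF_DIFFUSE
    stride += static_cast<std::size_t>(texCoordSets) * 2 * sizeof(float);
    return stride;
}
```

On a platform with 4-byte floats and no structure padding, the strides come out to 16, 24, and 36 bytes. A mismatch between a structure and its FVF code would make every vertex after the first read from the wrong offset.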
Shader Declarations
The declaration portion of a vertex shader defines the static external interface of the shader. For our purposes, a vertex shader declaration includes the following information:
- Loading data in the constant memory at the time that a shader is set as the current shader.
- Binding stream data to vertex shader input registers.
Each constant token specifies values for one or more contiguous 4-DWORD constant registers. This enables the shader to update an arbitrary subset of the constant memory, overwriting the device state, which contains the current values of the constant memory. While a specific shader is bound to a device, these values can be overwritten between IDirect3DDevice8::DrawPrimitive calls by invoking the IDirect3DDevice8::SetVertexShaderConstant method. We will use this procedure for simple constants used by the shaders.
The stream binding information defines the type and vertex input register assignment of each element in each data stream. The type specifies the arithmetic data type and the dimensionality: one, two, three, or four values. Stream data elements that have fewer than four values are always expanded to four values with zero or more 0.0f values and one 1.0f value. For clarity, we'll have only one stream at this time.
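The expansion rule can be sketched in a few lines. ExpandToRegister is a hypothetical helper that mimics what the runtime does when it loads a stream element into a four-component input register: missing components default to 0.0f, except the last, which defaults to 1.0f.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Expands a 1-4 component stream element into a four-component register,
// starting from the defaults (0, 0, 0, 1) and overwriting what is supplied.
std::array<float, 4> ExpandToRegister(const float* values, std::size_t count)
{
    assert(values != NULL && count >= 1 && count <= 4);
    std::array<float, 4> reg = {{ 0.0f, 0.0f, 0.0f, 1.0f }};
    for (std::size_t i = 0; i < count; ++i)
        reg[i] = values[i];
    return reg;
}
```

This is why a FLOAT3 position can be fed straight into a four-component transform: the shader sees (x, y, z, 1). A FLOAT2 texture coordinate arrives as (u, v, 0, 1).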
Since this sample has four shaders, it must contain four shader declarations and they are included below:
//shader decl
float c[4] = {0.0f,0.5f,1.0f,2.0f};
DWORD dwDecl0[] =
{
D3DVSD_STREAM(0),
D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3 ), //D3DVSDE_POSITION,0
D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR ), //D3DVSDE_DIFFUSE, 5
D3DVSD_CONST(0,1),*(DWORD*)&c[0],*(DWORD*)&c[1],*(DWORD*)&c[2],*(DWORD*)&c[3],
D3DVSD_END()
};
DWORD dwDecl1[] =
{
D3DVSD_STREAM(0),
D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3 ), //D3DVSDE_POSITION,0
D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR ), //D3DVSDE_DIFFUSE, 5
D3DVSD_CONST(0,1),*(DWORD*)&c[0],*(DWORD*)&c[1],*(DWORD*)&c[2],*(DWORD*)&c[3],
D3DVSD_END()
};
DWORD dwDecl2[] =
{
D3DVSD_STREAM(0),
D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3 ), //D3DVSDE_POSITION, 0
D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR ), //D3DVSDE_DIFFUSE, 5
D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2 ), //D3DVSDE_TEXCOORD0, 7
D3DVSD_CONST(0,1),*(DWORD*)&c[0],*(DWORD*)&c[1],*(DWORD*)&c[2],*(DWORD*)&c[3],
D3DVSD_END()
};
DWORD dwDecl3[] =
{
D3DVSD_STREAM(0),
D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3 ), //D3DVSDE_POSITION, 0
D3DVSD_REG(D3DVSDE_NORMAL, D3DVSDT_FLOAT3 ), //D3DVSDE_NORMAL, 3
D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR ), //D3DVSDE_DIFFUSE, 5
D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2 ), //D3DVSDE_TEXCOORD0, 7
D3DVSD_CONST(0,1),*(DWORD*)&c[0],*(DWORD*)&c[1],*(DWORD*)&c[2],*(DWORD*)&c[3],
D3DVSD_END()
};
All of the shader declarations we use here define constants, and all of them bind data stream 0. When you use an explicit shader declaration, as is the case here, the D3DVSD_REG preprocessor macros define the vertex register to vertex component mapping. In addition, the predefined D3DVSDE_ macros are used to specify the registers since we are using FVF vertex buffers.
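The expressions of the form *(DWORD*)&c[0] in the declarations reinterpret each float's bit pattern as a DWORD so it can sit in the DWORD token array after D3DVSD_CONST. A sketch of the same reinterpretation using memcpy, with hypothetical helper names and assuming 32-bit IEEE 754 floats:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw 32-bit pattern of a float without changing any bits.
std::uint32_t FloatBits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes 32-bit floats");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Packs one four-float constant into the four DWORDs that follow a
// constant token in a shader declaration.
std::array<std::uint32_t, 4> PackConstantRegister(const float* c)
{
    assert(c != NULL);
    std::array<std::uint32_t, 4> out =
        {{ FloatBits(c[0]), FloatBits(c[1]), FloatBits(c[2]), FloatBits(c[3]) }};
    return out;
}
```

For the constant {0.0f, 0.5f, 1.0f, 2.0f} the four DWORDs are 0x00000000, 0x3F000000, 0x3F800000, and 0x40000000. A plain numeric cast such as (DWORD)0.5f would store 0 instead, which is why the declarations go through a pointer.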
The first two shader declarations, dwDecl0 and dwDecl1, define position and diffuse color in register 0 and register 5 using the D3DVSD_REG macro and explicit register assignment. Note how this matches the VERTEXSHADER0VERTEX, D3DFVF_VERTEXSHADER0VERTEX and VERTEXSHADER1VERTEX, D3DFVF_VERTEXSHADER1VERTEX vertex definitions for the g_VertexShaderVertices0 and g_VertexShaderVertices1 vertex initializations.
The next two shader declarations, dwDecl2 and dwDecl3, are a bit more complex. The dwDecl2 declaration defines position, diffuse, and texture coordinates. This indicates we will be performing a texturing vertex shader. Note how this matches the VERTEXSHADER2VERTEX, D3DFVF_VERTEXSHADER2VERTEX vertex definition and the g_VertexShaderVertices2 vertex initialization. Finally, the dwDecl3 declaration defines position, normal, diffuse, and texture coordinates. This indicates we will be performing a texturing vertex shader with lighting. Note how this matches the VERTEXSHADER3VERTEX, D3DFVF_VERTEXSHADER3VERTEX vertex definition and the g_VertexShaderVertices3 vertex initialization.
While we perform lighting in this final vertex shader, in the future most interesting lighting tasks will be performed in pixel shaders. Still, certain global, standard lighting operations (like ambient) could be performed in vertex shaders for efficiency.
Shader Definitions
Since this sample has four shaders, it must contain four shader definitions that parallel the vertex definitions and shader declarations. Let's cover these briefly, as I will cover them in more detail later.
SimpleVertexShader0 performs a 4x4 matrix transform using the m4x4 instruction and writes the result to the oPos position output register. It then copies a constant color to the oD0 diffuse color output register.
SimpleVertexShader1 performs the same transform, but computes each component of the position with a four-component dot product using the dp4 instruction, writing each result to the corresponding component of oPos. Instead of copying the constant color, it copies the diffuse color specified with the vertices to oD0.
SimpleVertexShader2 is slightly more complex. It too uses the dp4 instruction for the transform and writes the resulting position to oPos. This shader adds texturing to the mix, copying the vertex texture coordinates to the oT0 output texture coordinate register.
Finally, SimpleVertexShader3 performs the same dp4 transform and writes the resulting position to oPos. Instead of simply copying a color value or a texture value, this shader determines if the "quad" is facing a light, and if not, does not render any bits. It does this with the dp3 instruction, using the normal component and a light direction stored in a constant register.
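To see why one m4x4 and four dp4 instructions produce the same position, the transform can be emulated on the CPU. In this sketch Float4, Dp4, M4x4, and ColumnsAsRegisters are hypothetical names. It also assumes, as my reading of the dp4 form rather than something stated above, that each of the four matrix registers holds one column of the row-major WorldViewProj matrix, that is, the matrix is loaded transposed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

typedef std::array<float, 4> Float4;

// dp4: four-component dot product of two registers.
float Dp4(const Float4& a, const Float4& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// m4x4 emulated as four dp4 operations against four consecutive registers.
Float4 M4x4(const Float4& v, const std::array<Float4, 4>& regs)
{
    Float4 r = {{ Dp4(v, regs[0]), Dp4(v, regs[1]),
                  Dp4(v, regs[2]), Dp4(v, regs[3]) }};
    return r;
}

// Splits a row-major 4x4 matrix (16 floats) into four registers that each
// hold one COLUMN, so that Dp4(v, regs[i]) equals component i of v * M.
std::array<Float4, 4> ColumnsAsRegisters(const float* rowMajor)
{
    assert(rowMajor != NULL);
    std::array<Float4, 4> regs;
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t row = 0; row < 4; ++row)
            regs[col][row] = rowMajor[row * 4 + col];
    return regs;
}
```

With a translation of (2, 3, 4) in the fourth row of the matrix, the point (1, 1, 1, 1) comes out as (3, 4, 5, 1) through either path, which is the equivalence SimpleVertexShader0 and SimpleVertexShader1 rely on.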
The source for these shaders is included below:
// Simple vertex shader0
// Constants
// reg c0 = (0,0.5,1.0,2.0)
// reg c4-7 = WorldViewProj matrix
// reg c8 = constant color
// Stream 0
// reg v0 = position ( 4x1 vector )
// reg v5 = diffuse color
const char SimpleVertexShader0[] =
"vs.1.0 // Shader version 1.0 \n"\
"m4x4 oPos , v0 , c4 // emit projected position \n"\
"mov oD0 , c8 // Diffuse color = c8 \n";
// Simple vertex shader1
// Constants
// reg c0 = (0,0.5,1.0,2.0)
// reg c4-7 = WorldViewProj matrix
// reg c8 = constant color
// Stream 0
// reg v0 = position ( 4x1 vector )
// reg v5 = diffuse color
const char SimpleVertexShader1[] =
"vs.1.0 // Shader version 1.0 \n"\
"dp4 oPos.x , v0 , c4 // emit projected x position \n"\
"dp4 oPos.y , v0 , c5 // emit projected y position \n"\
"dp4 oPos.z , v0 , c6 // emit projected z position \n"\
"dp4 oPos.w , v0 , c7 // emit projected w position \n"\
"mov oD0 , v5 // Diffuse color = vertex color \n";
// Simple vertex shader2
// Constants
// reg c0 = (0,0.5,1.0,2.0)
// reg c4-7 = WorldViewProj matrix
// reg c8 = constant color
// Stream 0
// reg v0 = position ( 4x1 vector )
// reg v5 = diffuse color
// reg v7 = texcoords ( 2x1 vector )
const char SimpleVertexShader2[] =
"vs.1.0 // Shader version 1.0 \n"\
"dp4 oPos.x , v0 , c4 // emit projected x position \n"\
"dp4 oPos.y , v0 , c5 // emit projected y position \n"\
"dp4 oPos.z , v0 , c6 // emit projected z position \n"\
"dp4 oPos.w , v0 , c7 // emit projected w position \n"\
"mov oT0.xy , v7 // copy texcoords \n";
// Simple vertex shader3
// Constants
// reg c0 = (0,0.5,1.0,2.0)
// reg c4-7 = WorldViewProj matrix
// reg c8 = constant color
// reg c12 = light dir
// Stream 0
// reg v0 = position ( 4x1 vector )
// reg v3 = normal ( 4x1 vector )
// reg v5 = diffuse color
// reg v7 = texcoords ( 2x1 vector )
const char SimpleVertexShader3[] =
"vs.1.0 //Shader version 1.0 \n"\
"dp4 oPos.x , v0 , c4 //emit projected x position \n"\
"dp4 oPos.y , v0 , c5 //emit projected y position \n"\
"dp4 oPos.z , v0 , c6 //emit projected z position \n"\
"dp4 oPos.w , v0 , c7 //emit projected w position \n"\
"dp3 r0.x , v3 , c12 //N dot L in world space \n"\
"mul oD0 , r0.x , v5 //Calculate color intensity \n"\
"mov oT0.xy , v7 //copy texcoords \n";
Each of these shader definitions is specified as a text string, making it easy to use with D3DXAssembleShader. In addition, I clearly document the constant registers and stream register definitions with each shader to make sure the bindings are clear. I highly suggest you adopt some similar self-documenting scheme for your own shader development.
Shader Debugging
Now that we have discussed the details of implementing shaders, let's talk about debugging. Determining what a shader is doing when it has gone astray is much easier with a tool. Graphics chip IHV nVidia provides a shader debugging tool on its web site (see the developer area of http://www.nvidia.com). I have marked the views provided by the nVidia shader debugger, shown below in Figure 5, so you can quickly locate where the shader details are presented.
First and foremost is the "Current Program" view. This contains our shader program and allows us to step through the instructions. The current instruction is displayed in blue, along with a dash symbol ("-") on the left. Any program lines with breakpoints set will have an asterisk ("*") alongside them. The program display shows a disassembly of the currently set instruction; note that it may look quite different from the source file originally used to compile the shader (particularly in terms of how register swizzles and masks are displayed). All programs finish with a "-end-" indicator to show that the shader is finished. Above the program display, in the column title, the current handle of the active shader is displayed. This matches the handle that the API returns as the result of an IDirect3DDevice8::CreateVertexShader call.
Figure 5. nVidia shader debugger
Other important views include:
- "Temp Registers" view, which allows us to view intermediate results as we step through the program.
- "Input Streams" view, which allows us to see our original source data, as does the "Constant" view for any constants we loaded.
- "Shader Output" view, which is where you will see a numerical representation of the shader output.
Finally, the bottom of the debugger window shows us debugger status, scene status, and a message output view.
Figure 6. nVidia shader debugger toolbar, with relevant commands shown
The debugger can be driven from the menu or from the toolbar. All options available in the menus can be accessed from the buttons on the toolbar. It is often most convenient to use the toolbar. The toolbar contains icons for the available debugger commands, and the ones relevant to vertex shaders are shown in Figure 6.
See nVidia's documentation for full details on using these buttons. If you are going to be doing any hard-core shader development, I highly suggest downloading this extremely useful tool and getting familiar with it. Note that it also allows you to perform pixel shader debugging, but that's a future column.
Some Simple Shaders for Your Viewing Pleasure
We've covered the APIs to use shaders, vertex definitions, shader declarations, shader definitions, and the nVidia shader debugger. Now it's time to see our vertex shaders in action. The BaseVS sample starts up using the fixed-function (FF) pipeline. Hitting the 'v' key enables vertex shaders. You can then choose a shader from the Vertex Shader menu.
The first vertex shader, SimpleVertexShader0, performs the basic transform and uses a constant color. Here is the code again:
vs.1.0 // Shader version 1.0
m4x4 oPos , v0 , c4 // emit projected position
mov oD0 , c8 // Diffuse color = c8
This vertex shader uses the m4x4 4x4 matrix transform instruction to transform the vertex position by a combined world*view*projection matrix and place the resulting transformed position (a vector) in the oPos output register. Then a constant diffuse color is loaded into the oD0 color output register. Note this shader depends on us loading a combined world*view*projection matrix as a constant. We did that using a utility routine as follows:
LoadMatrix4( m_pd3dDevice , 4 , m_matWorld * m_matView * m_matProj );
The body of LoadMatrix4 is shown below:
void LoadMatrix4( IDirect3DDevice8* pdevice , DWORD creg , const D3DXMATRIX& matrix )
{
    D3DXMATRIX trans;
    D3DXMatrixTranspose( &trans , &matrix );
    pdevice->SetVertexShaderConstant( creg , &trans , 4 );
}
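Note it loads the transpose of the original matrix, and uses IDirect3DDevice8::SetVertexShaderConstant to load the matrix into four consecutive constant registers. The transpose is needed because each output component is a dot product of the vertex with one constant register, so each register must hold one column of the original matrix. In addition, a constant color was loaded in c8 due to our invoking: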
float color[4] = {0,1,0,0};
m_pd3dDevice->SetVertexShaderConstant( 8 , color , 1 );
It can't get much simpler than that. Figure 7 shows the result of this simple vertex shader.
Figure 7. Constant-color vertex shader
I suggest you use the nVidia shader debugger to step through this shader if anything is unclear. Hint: Figure 5 shows the shader debugger running on this shader.
The second shader, SimpleVertexShader1, produces almost the same result, but does so in a slightly different fashion. Here is the code for SimpleVertexShader1:
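To see why LoadMatrix4 transposes the matrix before loading it, the following CPU-side sketch emulates the m4x4 instruction with plain arrays. It is only an illustration: Vec4, Mat4, mulRowVector, shaderTransform, and checkTransposeNeeded are hypothetical names invented for this sketch and are not part of Direct3D or D3DX. It assumes the usual Direct3D convention that a position is a row vector multiplied on the left of the matrix.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for this sketch; these are NOT D3DX types.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;   // m[row][col], row-major storage

// Four-component dot product, as the dp4 instruction computes it.
float dp4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Row-vector transform v * M (the assumed Direct3D convention).
Vec4 mulRowVector(const Vec4& v, const Mat4& m) {
    Vec4 out{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[col] += v[row] * m[row][col];
    return out;
}

Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            t[c][r] = m[r][c];
    return t;
}

// Emulates "m4x4 oPos, v0, c4": one dp4 per constant register c4..c7.
// regs[i] plays the role of constant register c(4 + i).
Vec4 shaderTransform(const Vec4& v0, const Mat4& regs) {
    return {dp4(v0, regs[0]), dp4(v0, regs[1]),
            dp4(v0, regs[2]), dp4(v0, regs[3])};
}

// Usage example: a translation by (10, 20, 30) stored in the fourth row.
void checkTransposeNeeded() {
    Mat4 world{};
    for (int i = 0; i < 4; ++i) world[i][i] = 1.0f;
    world[3] = {10.0f, 20.0f, 30.0f, 1.0f};

    const Vec4 v0 = {1.0f, 2.0f, 3.0f, 1.0f};
    const Vec4 expected = mulRowVector(v0, world);

    // Loading the transpose reproduces v * M.
    assert(shaderTransform(v0, transpose(world)) == expected);
    // Loading the matrix as-is does not.
    assert(shaderTransform(v0, world) != expected);
}
```

With the transposed matrix in the registers, each register holds one column of the original matrix, which is exactly what a per-register dot product needs.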
vs.1.0 // Shader version 1.0
dp4 oPos.x , v0 , c4 // emit projected x position
dp4 oPos.y , v0 , c5 // emit projected y position
dp4 oPos.z , v0 , c6 // emit projected z position
dp4 oPos.w , v0 , c7 // emit projected w position
mov oD0 , v5 // Diffuse color = vertex color
We see that instead of transforming the whole vertex in one operation using the m4x4 4x4 matrix transform instruction, this shader computes each component of the output position using the dp4 four-component dot product instruction. This is exactly equivalent; if you remember from last month's column, the m4x4 instruction was shown to expand to 4 slots/clocks, and the above four dp4 instructions are exactly what m4x4 expands to. No mystery now! Note the vertex position is in input register v0, and the diffuse color is in input register v5, just like our shader declaration stated. Also note that some standard constants are preloaded in constant register c0 by way of the shader constant declaration line:
D3DVSD_CONST(0,1),*(DWORD*)&c[0],*(DWORD*)&c[1],*(DWORD*)&c[2],*(DWORD*)&c[3],
In addition, the transform is again loaded in constant registers c4-c7, due to our invoking:
LoadMatrix4( m_pd3dDevice , 4 , m_matWorld * m_matView * m_matProj );
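The D3DVSD_CONST line above stores each float of the array c in the DWORD-based declaration by reinterpreting its bits through a pointer cast. As a minimal sketch of what that cast produces, the helper below extracts the same bit patterns with std::memcpy. The names floatBits, packConstant, and checkStandardConstants are hypothetical, std::uint32_t stands in for DWORD, and the values (0, 0.5, 1.0, 2.0) are taken from the register comments in the shader listings.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the raw IEEE-754 bit pattern of a float. std::uint32_t is a
// stand-in for DWORD in this sketch.
std::uint32_t floatBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes 32-bit floats");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Packs one four-component constant into four 32-bit tokens, in x, y, z, w
// order, mirroring the four cast expressions in the declaration line.
void packConstant(const float c[4], std::uint32_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = floatBits(c[i]);
}

// Usage example with the constant documented for register c0.
void checkStandardConstants() {
    const float c[4] = {0.0f, 0.5f, 1.0f, 2.0f};
    std::uint32_t tokens[4] = {};
    packConstant(c, tokens);
    assert(tokens[0] == 0x00000000u);
    assert(tokens[1] == 0x3F000000u);
    assert(tokens[2] == 0x3F800000u);
    assert(tokens[3] == 0x40000000u);
}
```

The memcpy form is just one way to write this; the result is the same bit pattern the cast in the declaration yields on the platforms this sample targets.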
Figure 8. Diffuse-color vertex shader
Here the color stored in the input stream register is favored over the one loaded in the constant register that was used in the previous shader. See Figure 8 for an example of a simple diffuse color vertex shader.
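The equivalence between one m4x4 instruction and four dp4 instructions can also be checked on the CPU. The sketch below models the constant registers as a small array and compares the two forms. Vec4, Constants, m4x4, fourDp4, and checkExpansion are hypothetical names for this illustration only, and the register contents are made-up test values rather than data from the BaseVS sample.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
using Constants = std::array<Vec4, 16>;  // models c0..c15 for this sketch

// Four-component dot product, as the dp4 instruction computes it.
float dp4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Emulates "m4x4 dst, src, cN": uses four consecutive constant registers
// starting at index first.
Vec4 m4x4(const Vec4& src, const Constants& c, int first) {
    Vec4 dst{};
    for (int i = 0; i < 4; ++i)
        dst[i] = dp4(src, c[first + i]);
    return dst;
}

// Emulates the four masked writes at the top of SimpleVertexShader1.
Vec4 fourDp4(const Vec4& v0, const Constants& c) {
    Vec4 oPos{};
    oPos[0] = dp4(v0, c[4]);  // dp4 oPos.x , v0 , c4
    oPos[1] = dp4(v0, c[5]);  // dp4 oPos.y , v0 , c5
    oPos[2] = dp4(v0, c[6]);  // dp4 oPos.z , v0 , c6
    oPos[3] = dp4(v0, c[7]);  // dp4 oPos.w , v0 , c7
    return oPos;
}

// Usage example with arbitrary register contents.
void checkExpansion() {
    Constants c{};
    c[4] = {1.0f, 0.0f, 0.0f, 10.0f};
    c[5] = {0.0f, 1.0f, 0.0f, 20.0f};
    c[6] = {0.0f, 0.0f, 1.0f, 30.0f};
    c[7] = {0.0f, 0.0f, 0.0f, 1.0f};
    const Vec4 v0 = {1.0f, 2.0f, 3.0f, 1.0f};
    assert(m4x4(v0, c, 4) == fourDp4(v0, c));
}
```

Each line of fourDp4 corresponds to one instruction of the shader, which makes the write masks (.x, .y, .z, .w) easy to follow when stepping through the shader in a debugger.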
The third shader, SimpleVertexShader2, adds texturing to our vertex shader bag of tricks. Here is the code for SimpleVertexShader2:
vs.1.0 // Shader version 1.0
dp4 oPos.x , v0 , c4 // emit projected x position
dp4 oPos.y , v0 , c5 // emit projected y position
dp4 oPos.z , v0 , c6 // emit projected z position
dp4 oPos.w , v0 , c7 // emit projected w position
mov oT0.xy , v7 // copy texcoords
The same constant registers are loaded as we saw for SimpleVertexShader1. The input stream bindings are the same, with the addition of texture coordinates bound to input register v7. We see that this shader also transforms the position one component at a time using the dp4 four-component dot product instruction. Then this shader moves the texture coordinates from input register v7 to output register oT0. That produces the textured output seen in Figure 9.
Figure 9. Simple texturing vertex shader
The fourth shader, SimpleVertexShader3, adds lighting to our vertex shader bag of tricks. Here is the code for SimpleVertexShader3:
vs.1.0 // Shader version 1.0
dp4 oPos.x , v0 , c4 // emit projected x position
dp4 oPos.y , v0 , c5 // emit projected y position
dp4 oPos.z , v0 , c6 // emit projected z position
dp4 oPos.w , v0 , c7 // emit projected w position
dp3 r0.x , v3 , c12 // N dot L in world space
mul oD0 , r0.x , v5 // Calculate color
mov oT0.xy , v7 // copy texcoords
We see that this shader also transforms the position one component at a time using the dp4 four-component dot product instruction. It then performs the classic N dot L lighting operation, using the dp3 three-component dot product instruction on the normal in register v3 and the light direction in constant register c12. If the result of this operation is less than zero, the polygon is facing away from the light and hence receives no light. We use that fact when calculating the result color with the mul instruction into the output register oD0. Here we rely on the post-shader clamping of values to the range [0,1]; technically that calculation is part of the lighting phase, but why do it twice? Finally, the texture coordinates are copied into output register oT0 from register v7. One other detail: we have to change the texture stage operation to modulate so that the diffuse color and the texture are combined in the resulting output color. This results in the "quad" shown in Figure 10.
Figure 10. Simple lighting and texturing vertex shader
Note that in addition to the texture being modulated by the diffuse color, the "quad" shows all black on the back face as it rotates away from the light. This is visually correct, since the back face receives no light and should be solid black.
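The lighting arithmetic of SimpleVertexShader3 can be sketched on the CPU to show why the back face comes out black. This is an illustration under stated assumptions, not the sample's code: Vec4, dp3, litDiffuse, and checkFacing are hypothetical names, the clamp to [0,1] is written out explicitly to stand in for the post-shader clamping, and the light direction is assumed to point from the surface toward the light.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Three-component dot product, as the dp3 instruction computes it
// (the w components are ignored).
float dp3(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Emulates:
//   dp3 r0.x , v3 , c12
//   mul oD0  , r0.x , v5
// followed by the clamp to [0,1] that is applied after the shader runs.
Vec4 litDiffuse(const Vec4& normal, const Vec4& lightDir, const Vec4& diffuse) {
    const float nDotL = dp3(normal, lightDir);
    Vec4 oD0{};
    for (int i = 0; i < 4; ++i) {
        const float scaled = nDotL * diffuse[i];
        oD0[i] = std::min(1.0f, std::max(0.0f, scaled));
    }
    return oD0;
}

// Usage example: the same quad facing toward and away from the light.
void checkFacing() {
    const Vec4 light   = {0.0f, 0.0f, 1.0f, 0.0f};
    const Vec4 diffuse = {1.0f, 0.5f, 0.25f, 1.0f};
    const Vec4 front   = {0.0f, 0.0f, 1.0f, 0.0f};
    const Vec4 back    = {0.0f, 0.0f, -1.0f, 0.0f};
    const Vec4 black   = {0.0f, 0.0f, 0.0f, 0.0f};

    assert(litDiffuse(front, light, diffuse) == diffuse);  // fully lit
    assert(litDiffuse(back, light, diffuse) == black);     // clamped to black
}
```

Because mul writes all four components of oD0, the alpha component is scaled along with the color in this sketch, just as it is in the shader.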
That concludes our coverage of simple vertex shaders. We've just started to scratch the surface. Other uses for vertex shaders include character animation, custom fog effects, and wacky lighting effects. The fact that you write the shader code and are not dependent on the functionality provided by the fixed function pipeline really does give developers freedom and control. I encourage you to take this code and the nVidia shader debugger tool and continue to experiment on your own.
Last Word
With a basic understanding of vertex shaders in hand, you now have the tools to begin exploration of the limits of the possible in the vertex pipeline. You can now use constant and diffuse colors, perform texturing, and handle basic lighting tasks. Shaders and the programmable pipeline offer great creative freedom for 3D programmers and provide a vehicle to stay on top of the ever-increasing feature set of today's 3D graphics hardware. Next month we will continue the coverage of shaders.
I'd like to acknowledge the help of Mike Burrows and Mike Anderson (Microsoft) and Chris Seitz and Chris Maughan (nVidia) in producing this column.
Your feedback is welcome. Feel free to drop me a line at the address below with your comments, questions, topic ideas, or links to your own variations on topics the column covers. Please, though, don't expect an individual reply or send me support questions. Remember, Microsoft maintains active mailing lists as forums for like-minded developers to share information:
Driving DirectX