## Dec 31, 2012

### [Intro to Shader] 01. What is Shader

Source Code: GitHub, Zip

The very first question I used to get in my class was “what the heck is a shader?” Looking back, I asked the same question the first few times I heard about shaders, but no one was able to give me a one-sentence explanation that made sense to a noob like me. I eventually figured it out, and I think the following sentence is the easiest way to define what a shader is.

Shaders are functions which calculate the position and colour of a pixel on screen.

Still not clear enough? Maybe it is easier to understand if we look at where shaders are used in a modern graphics pipeline.

Simplified 3D Graphics Pipeline
Only vertex and pixel shaders are covered in this book. So, let me quickly show you how these shaders are used in a modern 3D graphics pipeline.

One of the reasons a 3D pipeline exists is to display a 3D object on a 2D screen. First, take a look at Figure 1.1, which shows an overly simplified 3D graphics pipeline. [1]

 Figure 1.1 Overly simplified 3D graphics pipeline

In Figure 1.1, the input to the vertex shader is the 3D model itself. A 3D model is usually represented with polygons, which are nothing more than a collection of triangles. To make a triangle, you need 3 vertices, right? So, you can just say the input for a vertex shader is an array of vertices instead. A-ha! Now you know why it is called a vertex shader.

Then what does a vertex shader do? The most important responsibility of a vertex shader is transforming the vertices of a 3D object into screen space. You can compare this to how a painter captures real-world scenery on a canvas. Have you heard of perspective drawing? Even if you draw two identical objects onto the canvas, their final sizes on the canvas can differ depending on the distance between each object and your eyes. In other words, close-by objects look bigger and far-away objects look smaller.

In graphics, we like to say that those two objects had the same size in world space, but now they have different sizes in screen(=canvas) space. Well, do not worry about these spaces. We will learn more about them in the next chapter. I just wanted to tell you that vertex shaders must transform an object from one space to another.

Remember I told you a 3D model is basically a bunch of vertices? So if you transform all the vertices that make up a 3D model one by one, it is the same as transforming the 3D object itself. This is exactly how vertex shaders work. Then, how many times will a vertex shader be executed? Exactly. As many times as there are vertices in the model.

We can sum up the last paragraph with the following sentence:

The main role of a vertex shader is transforming each vertex’s position into another space.

So, one thing a vertex shader must output at the end of its execution is the vertex position in screen space. Every three of these vertex positions make a triangle, also in screen space. [2]
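To make the “close-by objects look bigger” idea a bit more concrete, here is a minimal sketch in plain C++ (not HLSL, and not how the shaders in this book are actually written). It assumes an eye at the origin looking down the +z axis and uses a single perspective divide. The names Vec3, Vec2, ProjectToScreen and TransformModel are made up for this illustration. Real vertex shaders do this job with matrices, which the next chapter covers.

```cpp
#include <cassert>

// Hypothetical types for illustration only; they are not part of DirectX.
struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// A toy "vertex shader": maps one vertex from 3D space onto a 2D canvas.
// Assumes the eye sits at the origin and looks down the +z axis.
// focalLength is the assumed distance from the eye to the canvas.
Vec2 ProjectToScreen(const Vec3& v, float focalLength)
{
    assert(v.z > 0.0f); // the vertex must be in front of the eye
    // Dividing by depth is what makes far-away things look smaller.
    return Vec2{ v.x * focalLength / v.z, v.y * focalLength / v.z };
}

// Runs the toy vertex shader once per vertex, so the number of calls
// equals the number of vertices in the model.
void TransformModel(const Vec3* in, Vec2* out, int vertexCount, float focalLength)
{
    for (int i = 0; i < vertexCount; ++i)
    {
        out[i] = ProjectToScreen(in[i], focalLength);
    }
}
```

With a focal length of 1, a vertex at (1, 1, 2) lands at (0.5, 0.5), while the same vertex pushed back to z = 4 lands at (0.25, 0.25): same object, smaller on the canvas.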

Now, can you tell me how many pixels would be inside each triangle? The screen is made of pixels, so we should know how many pixels we need to draw and where we need to draw them, right? This is what the rasterizer unit does. The rasterizer groups every three vertex positions the vertex shader outputs into a triangle and finds out which pixels are inside it. So now you can guess how many times the pixel shader should be executed, right? Of course, as many times as the number of pixels the rasterizer finds.

Then, what would be the main job of a pixel shader? Here is a hint. It is the last stage of the 3D graphics pipeline before the image reaches the screen.
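If you are curious how “finding the pixels inside a triangle” could work, here is a rough C++ sketch. It tests the centre of every pixel against the three edges of the triangle. This is only one simple way to do it and it is an assumption on my part for illustration; real rasterizers are hardware units with extra rules (for example, so that two triangles sharing an edge do not draw the same pixel twice). Point, EdgeFunction and CountCoveredPixels are hypothetical names.

```cpp
#include <cassert>

// Hypothetical 2D point in screen space, measured in pixels.
struct Point { float x, y; };

// Tells which side of the edge a->b the point p is on: the sign flips when
// p crosses the edge, and the result is zero exactly on the edge.
float EdgeFunction(const Point& a, const Point& b, const Point& p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Counts the pixels whose centre lies inside or on the triangle (a, b, c).
// Each counted pixel is one place where the pixel shader would run.
int CountCoveredPixels(const Point& a, const Point& b, const Point& c,
                       int screenWidth, int screenHeight)
{
    assert(screenWidth > 0 && screenHeight > 0);
    int count = 0;
    for (int y = 0; y < screenHeight; ++y)
    {
        for (int x = 0; x < screenWidth; ++x)
        {
            const Point centre{ x + 0.5f, y + 0.5f };
            const float w0 = EdgeFunction(a, b, centre);
            const float w1 = EdgeFunction(b, c, centre);
            const float w2 = EdgeFunction(c, a, centre);
            // Accept either winding order: all three on the same side.
            const bool allNonNegative = w0 >= 0 && w1 >= 0 && w2 >= 0;
            const bool allNonPositive = w0 <= 0 && w1 <= 0 && w2 <= 0;
            if (allNonNegative || allNonPositive)
            {
                ++count;
            }
        }
    }
    return count;
}
```

For the triangle (0,0), (4,0), (0,4) on a 4x4 screen this counts 10 pixels, so a pixel shader would be executed 10 times for that triangle.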

The main responsibility of a pixel shader is to calculate the final colour on screen.

If you combine the roles of vertex and pixel shaders, you finally get the definition I mentioned earlier:

Shaders are functions which calculate the position and colour of a pixel on screen.

Even though we tried to define what a shader is, I honestly do not think most beginners would have a firm grasp of it yet. You should actually write some sample code to get there. Would it help if I said a shader is a way to manipulate the positions and colours of pixels while we are drawing a 3D object? Maybe not? Don’t worry. Just keep reading, you will get your eureka moment pretty soon. :)

K, now we kind of know what a shader is, but what does it mean to write a shader program? Let’s look at Figure 1.1 again. In Figure 1.1, do you see that some stages are rectangular while the others are round? Round stages are what the GPU (Graphics Processing Unit) does automatically for you, which means we programmers have no control over them. On the other hand, rectangular stages are what programmers can manipulate “freely”. You get to write a function for each of those rectangles. This is what we call shader programming. So, you see there are only two rectangular stages in Figure 1.1, right? Yes, vertex and pixel shaders! So when someone talks about shader programming, they mean writing a function for the Vertex Shader unit and another one for the Pixel Shader unit, that’s it! [3]

Just like anything in life, there are multiple shader languages out there, but this book uses HLSL (High Level Shader Language) from DirectX. HLSL uses C-like syntax and is very close to other shader languages, such as GLSL[4] and CgFX[5]. Once you learn HLSL, switching to another shader language is trivial.

The best way to learn a programming language is to write code. Debating over the philosophy and syntax of a language only makes beginners bored, uninterested or clueless. Once you feel the fun of coding in that new language, all the other things naturally follow. So I will not try to turn you off by listing all HLSL syntax at this moment. Instead, I will force you to write very easy shaders in HLSL first. If you are one of those people who cannot live without knowing all the syntax, please refer to the appendix at the very back. I really do not like to bore people.

Well, I lied. There are still some initial setups we need to do. It is boring, I know. But you will need it to learn shader programming with this book, so please bear with me? It is not that long.

Preparation for Shader Programming
As mentioned in the Introduction, the only focus of this book is shader programming. I decided not to cover anything about DirectX because there are many good DirectX books out there, so I did not want to waste any of my time (and pages[6]) discussing it. Also, I wanted to allow technical artists to learn HLSL programming through this book, so covering DirectX, mostly programmer-only material, was a no-no to me.

To allow technical artists to find this book useful, I separated each chapter into two steps. The first step involves writing a shader program in an application called Render Monkey from AMD. Both programmers and technical artists should do this step.

The second step, which is only for programmers, plugs the shaders authored in Render Monkey into a C++/DirectX framework. If you are a programmer who is not interested in C++/DirectX, feel free to skip this step, too.

Now, it is time to prepare something for these two steps.

Render Monkey
Render Monkey is a shader authoring tool provided by AMD. I found this tool great for quick prototyping. You can download version 1.82 from AMD’s website.

Just use the default options when you install it.

Optional: Simple DirectX Framework
If you are one of those brave souls wishing to run shaders in the C++/DirectX framework, please read this section.

First, install Visual C++ 2010 and the DirectX SDK. If you do not have Visual C++, you can download the Express edition for free from Microsoft’s website. You can also download the DirectX SDK from the same website.

Once you have installed the above two programs, open the 01_DxFramework/BasicFramework.sln file from this book’s code samples. (You can download them from my blog, www.popekim.com). If you just run this program, you will see something like Figure 1.2.

 Figure 1.2. Super simple framework

This framework “supports” the following “features”:

• Basic window functions, such as window creation and message loop
• Direct 3D device creation
• Simple game loop
• Simple keyboard input handling

By the way, this stripped-down framework is made to run shader code quickly. As a result, all functions are in a single .cpp file, and it does not use any OOP (Object Oriented Programming) concepts. In other words, everything is written in C-style and all variables are globally defined. You see the problem? Yes. IF YOU ARE MAKING A REAL GAME, NEVER EVER WRITE YOUR FRAMEWORK THIS WAY. Again, this framework is intentionally made very simple to allow you to run shader demos very quickly.

Alright, that was enough warning, I think. Now, let’s take a look at the framework. First, open the BasicFramework.h file.

//***************************************************************
//
//
// Super simple C-style framework for Shader Demo
// (NEVER ever write framework like this when you are making real
// games.)
//
// Author: Pope Kim
//
//***************************************************************

#pragma once

#include <d3d9.h>
#include <d3dx9.h>

// ---------- constants ------------------------------------
#define WIN_WIDTH 800
#define WIN_HEIGHT 600

// ---------------- function prototype  ------------------------

// Message procedure related
LRESULT WINAPI MsgProc( HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam );
void ProcessInput(HWND hWnd, WPARAM keyPress);

// Initialization-related
bool InitEverything(HWND hWnd);
bool InitD3D(HWND hWnd);
bool LoadAssets();
LPD3DXEFFECT LoadShader( const char * filename );
LPDIRECT3DTEXTURE9 LoadTexture(const char * filename);
LPD3DXMESH LoadModel(const char * filename);

// game loop related
void PlayDemo();
void Update();

// Rendering related
void RenderFrame();
void RenderScene();
void RenderInfo();

// cleanup related
void Cleanup();

This header file is very straightforward. You probably noticed WIN_WIDTH and WIN_HEIGHT. These define the window size. Everything else is just function declarations, and the implementations are all inside ShaderFramework.cpp. So let’s take a look at ShaderFramework.cpp.

You can see all the global variables at the top of the file.

//---------------------------------------------------------------
// Globals
//---------------------------------------------------------------

// D3D-related
LPDIRECT3D9       gpD3D       = NULL;   // D3D
LPDIRECT3DDEVICE9 gpD3DDevice = NULL;   // D3D device

// Fonts
ID3DXFont*        gpFont      = NULL;

// Models

// Textures

// Application name
const char* gAppName = "Super Simple Shader Demo Framework";

Now it is time to create the window.

//---------------------------------------------------------------
// Application entry point/message loop
//---------------------------------------------------------------

// entry point
INT WINAPI WinMain( HINSTANCE hInst, HINSTANCE, LPSTR, INT )
{

To create a window, you need to register a window class first.

// register windows class
WNDCLASSEX wc = {sizeof(WNDCLASSEX), CS_CLASSDC, MsgProc, 0L, 0L,
GetModuleHandle(NULL), NULL, NULL, NULL, NULL,
gAppName, NULL };
RegisterClassEx( &wc );

Now, it is time to create an instance of the window class that we just registered. The CreateWindow() function does this. Use WIN_WIDTH and WIN_HEIGHT for the width and height of the window, respectively.

// creates program window
DWORD style = WS_OVERLAPPED | WS_CAPTION | WS_SYSMENU | WS_MINIMIZEBOX;
HWND hWnd = CreateWindow( gAppName, gAppName,
style, CW_USEDEFAULT, 0, WIN_WIDTH, WIN_HEIGHT,
GetDesktopWindow(), NULL, wc.hInstance, NULL );

The funny thing about a windowed program is that the actual area you can render onto is smaller than WIN_WIDTH and WIN_HEIGHT. That is because the window also has other junk like the title bar and borders. So you need to adjust the window size once it is created so that the renderable area, or client rect, is equal to WIN_WIDTH and WIN_HEIGHT.

// Client Rect size will be same as WIN_WIDTH and WIN_HEIGHT
POINT ptDiff;
RECT rcClient, rcWindow;

GetClientRect(hWnd, &rcClient);
GetWindowRect(hWnd, &rcWindow);
ptDiff.x = (rcWindow.right - rcWindow.left) - rcClient.right;
ptDiff.y = (rcWindow.bottom - rcWindow.top) - rcClient.bottom;
MoveWindow(hWnd,rcWindow.left, rcWindow.top, WIN_WIDTH + ptDiff.x,
WIN_HEIGHT + ptDiff.y, TRUE);
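The arithmetic above is easy to get backwards, so here it is again as a small portable C++ sketch that you can run without Windows. Rect and Size are stand-ins I made up for the Win32 RECT structure, and OuterSizeForClient is a hypothetical helper, not something in the framework. (Win32 also offers AdjustWindowRect(), which computes the outer rectangle from a desired client rectangle before the window is created; the framework simply does not use it.)

```cpp
#include <cassert>

// Minimal stand-ins for the Win32 RECT and a width/height pair, so the
// arithmetic can run anywhere. These are illustrative, not Win32 types.
struct Rect { long left, top, right, bottom; };
struct Size { long width, height; };

// Given the current outer window rectangle and the current client rectangle
// (whose left/top are always 0, which is how GetClientRect() reports them),
// returns the outer size needed for the client area to become desiredClient.
Size OuterSizeForClient(const Rect& window, const Rect& client, const Size& desiredClient)
{
    assert(client.left == 0 && client.top == 0);
    // Space eaten by the title bar and borders.
    const long extraWidth  = (window.right - window.left) - client.right;
    const long extraHeight = (window.bottom - window.top) - client.bottom;
    return Size{ desiredClient.width + extraWidth, desiredClient.height + extraHeight };
}
```

For example, if an 800x600 window reports a 784x561 client area (made-up numbers; the real ones depend on the Windows version and theme), the title bar and borders take 16x39 pixels, so the window has to grow to 816x639 for the client area to be exactly 800x600.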

Now that we have the correct window size, let’s show the window!

ShowWindow( hWnd, SW_SHOWDEFAULT );
UpdateWindow( hWnd );

Next, we initialize Direct3D and load all D3D resources, such as textures, shaders and meshes. The InitEverything() function takes care of all these things. If the program fails to initialize Direct3D or anything else, it simply quits.

// Initialize everything including D3D
if( !InitEverything(hWnd) )
{
PostQuitMessage(1);
}

Once D3D initialization is completed, all that is left is to keep running the demo until a WM_QUIT message arrives. WM_QUIT is a window message which nicely asks us to finish the execution of the program.

// Message loop
MSG msg;
ZeroMemory(&msg, sizeof(msg));
while(msg.message!=WM_QUIT)
{
if( PeekMessage( &msg, NULL, 0U, 0U, PM_REMOVE ) )
{
TranslateMessage( &msg );
DispatchMessage( &msg );
}
else // If there's no message to handle, update and draw the game
{
PlayDemo();
}
}
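The important idea in this loop is “handle a message if one is waiting, otherwise play one frame”. The following portable C++ sketch imitates that shape with a std::queue instead of the real Windows message queue. Everything in it (Message, LoopStats, RunLoop, and the rule that the user “closes the window” after a fixed number of frames) is an assumption made up for this illustration.

```cpp
#include <cassert>
#include <queue>

// Hypothetical message ids standing in for WM_KEYDOWN and WM_QUIT.
enum class Message { KeyDown, Quit };

struct LoopStats
{
    int messagesHandled;
    int framesPlayed;
};

// Imitates the loop above: pending messages always win over drawing, and a
// frame is only played when the queue is empty. To keep the simulation
// finite, a Quit message is queued once framesUntilQuit frames were played.
LoopStats RunLoop(std::queue<Message> pending, int framesUntilQuit)
{
    assert(framesUntilQuit >= 0);
    LoopStats stats{ 0, 0 };
    bool quit = false;
    while (!quit)
    {
        if (!pending.empty()) // like PeekMessage() finding a message
        {
            const Message msg = pending.front();
            pending.pop();
            ++stats.messagesHandled;
            if (msg == Message::Quit)
            {
                quit = true;
            }
        }
        else if (stats.framesPlayed < framesUntilQuit) // idle: like PlayDemo()
        {
            ++stats.framesPlayed;
        }
        else
        {
            pending.push(Message::Quit); // pretend the user closed the window
        }
    }
    return stats;
}
```

Starting with two key presses in the queue and quitting after 3 frames, the loop handles 3 messages (two key presses plus the quit) and plays 3 frames. If a quit message is already waiting at the start, no frame is played at all.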

When we need to finish the demo, we unregister the window class and return from the program.

UnregisterClass( gAppName, wc.hInstance );
return 0;
}

We also need to see the function that takes care of window messages.

// Message handler
LRESULT WINAPI MsgProc( HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam )
{
switch( msg )
{

Keyboard input is handled by the ProcessInput() function.

case WM_KEYDOWN:
ProcessInput(hWnd, wParam);
break;

When the window is being closed, all D3D resources we loaded during the initialization step should be released by calling the Cleanup() function. Once that is done, a “please terminate this program” message is sent.

case WM_DESTROY:
Cleanup();
PostQuitMessage(0);
return 0;
}

Any message that is not being handled by this function will be sent to the default message procedure, which will, in turn, handle them.

return DefWindowProc( hWnd, msg, wParam, lParam );
}

The only keyboard input this framework listens to at the moment is the ESC key. When this key is pressed, the program is terminated.

// Keyboard input handler
void ProcessInput( HWND hWnd, WPARAM keyPress)
{
switch(keyPress)
{
// when ESC key is pressed, quit the demo
case VK_ESCAPE:
PostMessage(hWnd, WM_DESTROY, 0L, 0L);
break;
}
}

Now, let’s look at the initialization code more closely.

//------------------------------------------------------------
// initialization code
//------------------------------------------------------------
bool InitEverything(HWND hWnd)
{

First, we initialize D3D by calling the InitD3D() function. If that succeeds, we call the LoadAssets() function to load D3D resources, such as textures, models and shaders.

// init D3D
if( !InitD3D(hWnd) )
{
return false;
}

// load models, shaders and textures
if( !LoadAssets() )
{
return false;
}

Next up is font loading. We will use this font to display debug information on screen.

if(FAILED(D3DXCreateFont( gpD3DDevice, 20, 10, FW_BOLD, 1, FALSE,
DEFAULT_CHARSET, OUT_DEFAULT_PRECIS,
DEFAULT_QUALITY, (DEFAULT_PITCH | FF_DONTCARE),
"Arial", &gpFont )))
{
return false;
}

return true;
}

The meanings of parameters used with D3DXCreateFont():

• gpD3DDevice: D3D device
• 20: the height of the font
• 10: the width of the font
• FW_BOLD: use bold style
• 1: the number of mipmap levels
• FALSE: do not use italic style
• DEFAULT_CHARSET: use default character set
• OUT_DEFAULT_PRECIS: defines how close the final font properties displayed on the screen should be to the ones we are setting here
• DEFAULT_QUALITY: defines how close the final font quality displayed on the screen should be to the one we are setting here
• DEFAULT_PITCH | FF_DONTCARE: Use default pitch, and I don’t care about the font family
• "Arial": the name of font to use
• gpFont: stores the newly created font

Now let’s take a look at InitD3D() function, which creates a D3D object and D3D device. In order to load resources or draw with DirectX, you must create a D3D device.

// D3D and device initialization
bool InitD3D(HWND hWnd)
{

First, we create a Direct3D object.

// D3D
gpD3D = Direct3DCreate9( D3D_SDK_VERSION );
if ( !gpD3D )
{
return false;
}

Now, we need to fill in the structure to create a D3D device.

// fill in the structure needed to create a D3D device
D3DPRESENT_PARAMETERS d3dpp;
ZeroMemory( &d3dpp, sizeof(d3dpp) );

d3dpp.BackBufferWidth    = WIN_WIDTH;
d3dpp.BackBufferHeight    = WIN_HEIGHT;
d3dpp.BackBufferFormat    = D3DFMT_X8R8G8B8;
d3dpp.BackBufferCount    = 1;
d3dpp.MultiSampleType    = D3DMULTISAMPLE_NONE;
d3dpp.MultiSampleQuality    = 0;
d3dpp.SwapEffect    = D3DSWAPEFFECT_DISCARD;
d3dpp.hDeviceWindow    = hWnd;
d3dpp.Windowed    = TRUE;
d3dpp.EnableAutoDepthStencil    = TRUE;
d3dpp.AutoDepthStencilFormat    = D3DFMT_D24X8;
d3dpp.FullScreen_RefreshRateInHz = 0;
d3dpp.PresentationInterval    = D3DPRESENT_INTERVAL_ONE;

Here are some fields worth understanding:

• BackBufferWidth: the width of the back buffer (rendering area)
• BackBufferHeight: the height of the back buffer
• BackBufferFormat: the format of the back buffer
• AutoDepthStencilFormat: the format of the depth/stencil buffer
• SwapEffect: how the back buffer is swapped. For performance reasons, D3DSWAPEFFECT_DISCARD is recommended.
• PresentationInterval: the relationship between the monitor’s refresh rate and how often the back buffer is swapped. D3DPRESENT_INTERVAL_ONE means the back buffer is swapped whenever the monitor’s v-sync happens. Most computer games swap the back buffer without waiting for v-sync (D3DPRESENT_INTERVAL_IMMEDIATE), mainly for performance reasons. The most noticeable downside of that mode is screen tearing.

Now that we have this structure filled, we can create a D3D device.

// create D3D device
if( FAILED( gpD3D->CreateDevice( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
hWnd,
D3DCREATE_HARDWARE_VERTEXPROCESSING,
&d3dpp, &gpD3DDevice ) ) )
{
return false;
}

return true;
}

The LoadAssets() function is supposed to load D3D resources, but there is no code in it at this moment. You will get to call other functions, such as LoadShader(), LoadTexture() and LoadModel(), to load resources in the later chapters.

bool LoadAssets()
{

return true;
}

Next is the LoadShader() function, which loads a shader program saved in a .fx file. A .fx file is a text file which can contain both vertex and pixel shader functions. It is compiled and loaded at run time via the D3DXCreateEffectFromFile() function. So, if there is any syntax error in the HLSL code you write, this function will fail with compile errors. The last parameter of this function is how you retrieve the error messages. We will print out the error messages in Visual C++’s output window.

LPD3DXEFFECT LoadShader( const char * filename )
{
LPD3DXEFFECT ret = NULL;

LPD3DXBUFFER pError = NULL;
DWORD dwShaderFlags = 0;

#if _DEBUG
#endif

D3DXCreateEffectFromFile(gpD3DDevice, filename,
NULL, NULL, dwShaderFlags, NULL, &ret, &pError);

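// if loading failed, print the compile errors to the output window
if(!ret && pError)
{
void *ack = pError->GetBufferPointer();

if(ack)
{
// the buffer holds a null-terminated message, so print it as-is
// instead of passing it to sprintf() as a format string
OutputDebugString((const char*)ack);
}
}

// the error buffer is a COM object, so release it when we are done
if(pError)
{
pError->Release();
}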

return ret;
}

The meanings of the parameters of D3DXCreateEffectFromFile() function are:

• gpD3DDevice: D3D device
• filename: the name of shader file to load
• NULL: do not use additional #define definitions for shader compilation
• NULL: do not use additional #includes
• dwShaderFlags: shader compilation flags
• NULL: do not use an effect pool object for shared parameters
• ret: will store compiled shader
• pError: will point to error messages, if any

Next is model loading code. It assumes models are stored in .x format, which is supported by DirectX natively.

LPD3DXMESH LoadModel(const char * filename)
{
LPD3DXMESH ret = NULL;
if ( FAILED(D3DXLoadMeshFromX(filename,D3DXMESH_SYSTEMMEM, gpD3DDevice,
NULL,NULL,NULL,NULL, &ret)) )
{
OutputDebugString(filename);
OutputDebugString("\n");
}

return ret;
}

Again, the meanings of the parameters of the D3DXLoadMeshFromX() function call are:

• filename: the name of the mesh file to load
• D3DXMESH_SYSTEMMEM: load the mesh to system memory
• gpD3DDevice: D3D device
• NULL: Don’t give me adjacency data
• NULL: Don’t give me material information
• NULL: Don’t give me effect instance
• NULL: Don’t give me the number of materials
• ret: will store loaded mesh

Finally, let’s look at LoadTexture(), which loads a texture(image) file.

LPDIRECT3DTEXTURE9 LoadTexture(const char * filename)
{
LPDIRECT3DTEXTURE9 ret = NULL;
if ( FAILED(D3DXCreateTextureFromFile(gpD3DDevice, filename, &ret)) )
{
OutputDebugString(filename);
OutputDebugString("\n");
}

return ret;
}

Next is our game loop function, PlayDemo(). This function is called whenever there is no window message to handle. For real games, you would calculate the elapsed time since the last frame and use it in both the update and rendering functions, but that is omitted here for simplicity.

//------------------------------------------------------------
// game loop
//------------------------------------------------------------
void PlayDemo()
{
Update();
RenderFrame();
}
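In case you wonder what the omitted elapsed-time part might look like, here is a minimal sketch using the standard C++ std::chrono library. It is not part of the framework: FrameTimer and Advance are hypothetical names, and a real game would probably wrap this differently.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of the framework: measures the time
// between two consecutive frames in seconds.
class FrameTimer
{
    using Clock = std::chrono::steady_clock;

public:
    FrameTimer() : mLast(Clock::now()) {}

    // Returns the seconds elapsed since the previous call
    // (or since construction, for the first call).
    float Tick()
    {
        const Clock::time_point now = Clock::now();
        const std::chrono::duration<float> elapsed = now - mLast;
        mLast = now;
        return elapsed.count();
    }

private:
    Clock::time_point mLast;
};

// Frame-rate independent movement: distance = speed * time.
float Advance(float position, float unitsPerSecond, float elapsedSeconds)
{
    assert(elapsedSeconds >= 0.0f);
    return position + unitsPerSecond * elapsedSeconds;
}
```

The idea would be to call Tick() once at the top of PlayDemo() and pass the result down to Update(), so an object moving at 2 units per second travels 1 unit in half a second no matter how many frames were drawn in between.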

There is nothing in the Update() function yet. One day, we will add something here.

// Game logic update
void Update()
{
}

Next is the RenderFrame() function, which draws stuff onto the screen.

//------------------------------------------------------------
// Rendering
//------------------------------------------------------------

void RenderFrame()
{

We first clear the back buffer with a blue colour.

D3DCOLOR bgColour = 0xFF0000FF; // background colour - blue

gpD3DDevice->Clear( 0, NULL, (D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER),
bgColour, 1.0f, 0 );

Then, we draw our scene and debug info.

gpD3DDevice->BeginScene();
{
RenderScene();    // draw 3D objects and so on
RenderInfo();     // show debug info
}
gpD3DDevice->EndScene();

Once rendering is done, we simply present what is drawn onto the back buffer to the screen.

gpD3DDevice->Present( NULL, NULL, NULL, NULL );
}
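If the magic number 0xFF0000FF in RenderFrame() looks cryptic, it helps to know that a D3DCOLOR packs four 8-bit channels into one 32-bit value in the order alpha, red, green, blue. The sketch below shows that packing in portable C++. PackARGB is a hypothetical stand-in I wrote for the D3DCOLOR_ARGB macro (the one RenderInfo() uses) so that the example compiles without the DirectX headers.

```cpp
#include <cstdint>

// Packs four 8-bit channels into one 32-bit value laid out as 0xAARRGGBB.
// Hypothetical stand-in for the D3DCOLOR_ARGB macro.
constexpr std::uint32_t PackARGB(std::uint32_t a, std::uint32_t r,
                                 std::uint32_t g, std::uint32_t b)
{
    return ((a & 0xFFu) << 24) | ((r & 0xFFu) << 16) | ((g & 0xFFu) << 8) | (b & 0xFFu);
}

// Opaque blue: the background colour used by RenderFrame().
static_assert(PackARGB(255, 0, 0, 255) == 0xFF0000FFu, "opaque blue");
// Opaque white: the text colour used by RenderInfo().
static_assert(PackARGB(255, 255, 255, 255) == 0xFFFFFFFFu, "opaque white");
```

So changing the background to, say, opaque red would mean writing 0xFFFF0000 instead.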

Just like the Update() function, there is no code in the RenderScene() function yet. We will write some code here in the next chapter to draw a 3D object.

// draw 3D objects and so on
void RenderScene()
{
}

The RenderInfo() function simply displays key mapping information on the screen.

// show debug info
void RenderInfo()
{
// text colour
D3DCOLOR fontColor = D3DCOLOR_ARGB(255,255,255,255);

// location to show the text
RECT rct;
rct.left=5;
rct.right=WIN_WIDTH / 3;
rct.top=5;
rct.bottom = WIN_HEIGHT / 3;

// show debug keys
gpFont->DrawText(NULL, "Demo Framework\n\nESC: Quit demo", -1, &rct, 0,
fontColor);
}

When the program is being shut down, we must release all D3D resources to prevent memory leaks. Once all resources are released, the D3D device and the D3D object need to be released too.

//------------------------------------------------------------
// cleanup code
//------------------------------------------------------------

void Cleanup()
{
// release fonts
if(gpFont)
{
gpFont->Release();
gpFont = NULL;
}

// release models

// release textures

// release D3D
if(gpD3DDevice)
{
gpD3DDevice->Release();
gpD3DDevice = NULL;
}

if(gpD3D)
{
gpD3D->Release();
gpD3D = NULL;
}
}
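You probably noticed that Cleanup() repeats the same “if not NULL, Release(), then set to NULL” pattern three times, and it will repeat it even more once we add models and textures. One common way to tidy this up is a small template helper. The sketch below is my own illustration and is not in the framework; SafeRelease is a hypothetical name, and FakeResource only exists so the example can run without DirectX.

```cpp
#include <cassert>

// Hypothetical helper: releases a COM-style object and clears the pointer
// so it cannot be released twice. Works with any type that has Release().
template <typename T>
void SafeRelease(T*& resource)
{
    if (resource)
    {
        resource->Release();
        resource = nullptr;
    }
}

// Stand-in for a D3D resource, used only to demonstrate the helper.
struct FakeResource
{
    int releaseCount = 0;
    void Release() { ++releaseCount; }
};

// Small self-check showing the intended behaviour.
inline void DemoSafeRelease()
{
    FakeResource res;
    FakeResource* p = &res;
    SafeRelease(p);
    SafeRelease(p); // the second call is a harmless no-op
    assert(p == nullptr && res.releaseCount == 1);
}
```

With such a helper, the body of Cleanup() could shrink to one SafeRelease() call per resource, still releasing the device and the D3D object last.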

That is it. We just finished writing a very simple framework. Even if you do not understand the above code very well, that is completely fine. It does not really prevent you from learning HLSL with this book. But if your dream is to be a rendering dude, I highly recommend that you learn DirectX properly after finishing this book.

Thank you so much for suffering through the boring preparation steps. After the following quick summary, we are off to the next chapter, where you will actually have some fun making something show up on the screen!

Summary
The short summary of what we discussed in this chapter:

• Shaders are functions calculating the position and colour of each pixel.
• If you think of shaders in terms of a painter’s workflow, the vertex shader is the perspective sketch and the pixel shader is the colouring.
• Shader programming is nothing more than writing functions which are executed by vertex and pixel shading units.
• Render Monkey is a great tool for quick shader prototyping.

------
Footnotes:

1. I intentionally over-simplified this figure to help you understand the roles of vertex and pixel shaders. Real 3D graphics pipelines are way more complicated than this.
2. Vertex shaders often output more information than just vertex positions. You will see more of it in the following chapters.
3. New shader types were introduced with DirectX 10 and 11. But they are not for beginners and are currently not used enough in the real world to be included in this book.
4. It stands for OpenGL Shading Language. As the name suggests, it is OpenGL’s shader language, which is somewhat different from HLSL syntax-wise.
5. It is a shader programming language supported by NVIDIA. It is almost identical to HLSL except for a few things.
6. More pages = higher price = less beer = sadder life

## Dec 29, 2012

### [Intro to Shader] 00. Introduction

Source Code: GitHub, Zip

Introduction
This book is a collection of lecture materials I developed and used to teach a shader programming course at the Art Institute of Vancouver between 2007 and 2009.

I started to work on this book back in 2010 without knowing for sure whether I wanted to publish it or just release it online. But before I finished writing it, a South Korean publisher wanted to publish it in South Korea, so I had to put it on hold to work on the Korean version first.

Now the Korean version is out and selling pretty well: in fact, it became a best seller in the computer programming category and has been picked up by a number of computer science universities and game schools as the textbook for their shader programming classes. So I thought it was time to come back to where it all started and finish the unfinished business.

I am still not sure whether I want to publish this book or not. If there is any publisher who wants to publish this easy-to-follow introductory shader programming book for both programmers and technical artists, feel free to contact me.

Why I Wrote This Book
When I started teaching in 2007, I was looking for a decent textbook that could teach college students how to program shaders. Unfortunately, I could not find one, and I believe there is still none out there. All the shader books out there focus more on advanced techniques, which are good for intermediate-to-advanced graphics programmers like me. It was very obvious to me that these books would scare off most college students rather than excite them. Sure, most DirectX books contain some introductory shader programming material, but I found none of it very useful for the following reasons:

• In those books, shader is a second-class citizen; they only touch the surface
• They are too academic and often put too much focus on theory and syntax
• They have too many examples which have no real-world use
• They have too many pages because they cover both DirectX and shaders, which sacrifices the portability of the book and increases the price unnecessarily. 600~1000 pages? Jeez.

So what did I do? I started to teach without any textbook. I truly believe that students should get their hands dirty first to find the fun in any new programming language, so I somewhat ignored a lot of theory and maths and focused more on practical techniques, or on basic material that eventually leads to practical techniques. One great thing about teaching at a college is that you discover the things students have a hard time understanding while you have been taking them for granted. This really helped me refine my teaching method again and again, and the result is this book.

My shader programming course was picked as one of the best courses offered in the school for 3 years, so I have no doubt this book will help anyone who wants to learn shader programming. Some of my students used the demo they made in my class as part of their portfolio and managed to get jobs at various game companies, including big ones like Ubisoft and Electronic Arts, so I can say I did not do such a bad job there, eh? :P

I stopped teaching after December 2009 to focus on my day job, game programming, but it kept bothering me that this material would never see the light of day again. So, I decided to make a book out of it. I hope anyone who wants to learn the black art of shader programming finds this book very useful.

As I did in my class, this book follows these simple rules:

• Get your hands dirty: Of course, you cannot completely ignore maths or theory when you write shader code. But I found that coding first and learning theory later is a much better way to learn shader programming. You have to ask yourself ‘why does it work this way?’ first and then get the answer by learning the theory. Once you learn a theory this way, you never forget it. Therefore, this book takes a code-first approach. Just follow the book as you code. It will explain the background theory here and there whenever necessary. After a while, you will find yourself able to not only write shader code but also understand a good amount of the theory behind it.
• Easy real-world explanation: I even had some game art students sitting in on my lectures, and I wanted them to understand my lectures easily, too! What both artists and programmers can understand is real-world examples, so I tried my best to use them in this book as well. Keen readers might find something that is not 100% correct in theory. This is most likely intentional in order to make the material easier to understand. (But, sometimes, it will be because I do not even understand the theory 100%. I tend to think computer graphics is a big giant trick show that makes human eyes believe it is real, so as long as it is not obviously wrong and the end result “looks” the same as the correct one, I do not make a big fuss about it.)
• Strictly for beginners: This book is strictly for beginners. I do not see any reason to compete against tons of great shader books already out there for advanced readers. I also want this book to be very compact: it should not cost a fortune to start to learn. Once you find shader programming fun after reading this book, go ahead and read other great books to adventure further, and if you find some amazing new techniques, do not forget to email me about it. :-)
• Order matters: Another blessing of teaching at a college was that I had full control over the order of learning. This book assumes the same; you should read it from beginning to end, in order. I will not repeat something I explained in previous chapters. For example, while I am covering the normal mapping technique, I will not re-explain how lighting works because it is already covered in earlier chapters. Please read this book from beginning to end as if you were taking a college course. I do not think I am asking too much; after all, this book is pretty slim, right?

What This Book Covers
This book covers how to implement basic and some intermediate shader techniques using vertex and pixel shaders. It consists of three main parts:

• Part 1 shows the definition of shader, the hello-world of shader programming – colour shader, texture mapping and lighting techniques.
• Part 2 extends what is learned in Part 1 to teach most common shader techniques used in games: specular mapping, normal mapping, shadow mapping and so on.
• Part 3 introduces 2D post-processing techniques, which have become more important recently.

This book doesn’t cover the new shader units added in DirectX 10 or 11. Geometry, Hull and Compute Shaders are some examples. They are not really for beginners, and I have yet to see enough practical use of these new shaders in the real world to find out the best way to teach them. However, I don’t think it would be too hard to learn those new ones by yourself once you build a solid foundation with this book.

Programmers
Most of my students were in their second year of college. Prerequisites of my class included C++, 3D Math and DirectX. I don’t think it is too different from how game programmers get to shader programming, so I would assume any programmer who reads this book is comfortable with the above topics. C++ and DirectX are musts. If you know 3D math very well, that is a bonus.

Technical Artists
Who does not love Technical Artists? They make my life easier, and I bet they make artists’ life easier too! Nowadays, they are even authoring shaders, mostly by using some type of graph editors. Do you feel like graph editors are too limiting, and want to learn how to code instead? Then, this book is for you. Even some art school students were able to follow my lectures, so go for it. In this book, the only difference for technical artists is that you get to skip the last part of each chapter, which is playing with the DirectX framework. But do not worry. You get to learn exactly the same thing in another great program called Render Monkey.

If You Have A Question
If you have any question while reading this book, feel free to visit my blog. Other updates including the errata will be uploaded there too.

http://www.popekim.com

Special Thanks To
It was impossible for me to write this book without the tremendous support from so many people around me.

First, I would like to thank all my students who asked me endless questions and challenged me every day. Their cat-like curiosity really made me rethink what I thought I had already known very well, which helped me make this book easier to follow and easier to understand.

I also would like to acknowledge two test readers, who very carefully read over this book and tested every single line of code for me. Jinyoung Song from NeoPle, a Nexon company, and Kyle Lee from Youth HiTech: without your hawk eyes, this book would still have 100 bugs.

Also, I should say thanks to a talented concept artist, David Sloan, who drew a number of figures in this book. If you find anything that does not look like programmer art, that is his work.

Last but not least, there were my friends and ex-coworkers who encouraged me to finish this book whenever I made excuses, such as ‘is it worth writing it?’ and ‘I do not have bandwidth to do it’. They are all very talented game developers, so I figured it would be fun to list their names here: Vladimir Kouznetsov, Noel Austin, Karl Schmidt, Daniel Barrero, Aaron Lake and Andreas Loebenstein.

Pope Kim
from Burnaby and Montreal, Canada

## Dec 28, 2012

### Surviving Winter in Montreal - Snowfall - ALOT

So yesterday we had some snow.. I mean.. somewhat a lot.... 45 to 50 cm in a day. fun eh?

First I shoveled the pathway in front of my house... it took me about 30 mins.

Then I moved to my backyard/parking lot. I spent about 2~3 hours clearing 2/3 of the parking lot until I had to go out for my Thursday ritual: pub crawl with one of my buddies here. The city does a really good job at cleaning major roads. All those snow piles at the curbside will be removed by the city in the coming days. I was told it would take about a week this time.

One nice thing about having a heavy snow was that there were not many people at the pub. So it was quiet and cozy: just how I like it. Yum~

So while having a couple of drinks, I asked my friend if it's common to have this much snow. He said "Yeah, this is decent.." So I thought 'okay I have to be prepared for worse snowfalls and shoveling'... But... now I'm watching the news while writing this blog post. It says it was the record-high snowfall for a single day in Montreal's history. I'm gonna go f**king kick his ass... :P   Oh well, the good news is that now I know I can survive the worst snowfall here.. right?

By the way, I shoveled about 3 more hours today to finish my parking lot.

## Dec 3, 2012

### [Slides] Relic's FX System

I totally forgot to share this.  It's presented at KGC 2012 by my ex-coworker, Daniel Barrero.  I had to nag him enough to prepare this.. so I'll take a tiny bit of credit :)

## Nov 18, 2012

### SSD height-based fade-out effect by ttmayrin

I just heard from a Korean game developer, ttmayrin, who implemented Screen Space Decals in his engine and he loves it. The only problem he had was how the decal border was too obvious because my implementation used in Space Marine lacks any smoothing effect.

He added a fade-out effect based on the height and kindly shared his code on his blog. His blog is in Korean, but the code and screenshots below ("stolen" from his blog) should explain themselves very well.

It's a very simple and cheap way of implementing it and definitely could be one of the decal types artists can choose.

Please note that SSD's projection box has unit length, centered at 0, 0, 0.
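For readers who can't see the original code, here is a rough sketch of the idea in plain Python. It is not ttmayrin's shader: the function name `height_fade`, the `fade_start` parameter and the choice of Y as the height axis are all assumptions made for illustration. The only thing taken from above is that the projection box has unit length and is centered at the origin, so a point inside it has a local height between -0.5 and 0.5.

```python
def height_fade(local_y, fade_start=0.3):
    """Return a 0..1 opacity multiplier for a point in decal-local space.

    Assumption: Y is the height axis of the projection box.
    fade_start is a hypothetical artist-tunable distance from the box
    center at which the decal starts to fade.
    """
    half_extent = 0.5  # unit-length box centered at 0 spans -0.5..0.5
    distance = abs(local_y)
    if distance >= half_extent:
        return 0.0  # at or beyond the box border: fully faded
    if distance <= fade_start:
        return 1.0  # close to the center plane: fully opaque
    # Linear ramp from 1 at fade_start down to 0 at the box border.
    return (half_extent - distance) / (half_extent - fade_start)
```

In a real pixel shader the same multiplier would be applied to the decal's alpha, which is why the border stops being a hard edge.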

## Oct 12, 2012

### [Siggraph 2012] Screen Space Decals in Warhammer 40,000: Space Marine

As I promised, here are the slides I presented at Siggraph 2012. It's about the Screen Space Decal technique I developed for Space Marine.

Although it was independently developed, it's very similar to other techniques, such as Deferred Decals. However, I believe we were the only team that used this tech very extensively in a game. So I focused more on the problems we had, and the solutions we used to solve or avoid them.

## Aug 27, 2012

### Joining Square Enix / Eidos Montreal for....

Today, my ex-student & friend, Friso, asked me why I'm not saying anything about joining Eidos Montreal on my blog. I guess I was just busy dealing with the slowest moving company and finding a new place in Montreal.  Now those things are kinda under control, so let me break the old news.

I've recently accepted a senior rendering programmer position at Eidos Montreal. I don't think I can reveal too much info about the actual project I'm going to work on, but I'm pretty sure I'm allowed to say it's an unannounced AAA project since the programming director says it on his public LinkedIn profile :P

K, so how did it happen?

My plan was to start a serious job hunting after my Siggraph 2012 presentation, but several recruiters contacted me 1 to 2 weeks before Siggraph.  (I guess recruiters contact Siggraph presenters first?) I started to talk to them and had some phone interviews before Siggraph. Eidos Montreal was one of them. Then, a couple of people from Eidos Montreal wanted to meet up during Siggraph, so I did.

To be honest, I didn't think I'd accept this position because it meant I had to move to Montreal: I love Vancouver so much and I already had a couple of better offers from Vancouver companies. But after a quick 10-minute meeting with Eidos Montreal, I pretty much decided to join the team. It was simply a project that I couldn't ever pass on.

I think it was around August 20th when I accepted the offer. Now I'm moving out from my place early next week, and starting at Eidos the week after. Pretty quick, eh?  I wanted to start early. If this moving company had not been unreasonably slow, I would have started next week even.  The reason is because I'm presenting at Korea Game Conference 2012 in early October, so I wanted to work at this new company at least for a month before taking a "vacation".

Am I excited? I was a lot, and now a bit.  I'm not that kind of person who easily gets or stays excited for a while. We will see how this new journey turns out. I might get more excited once I start.. We will see...

Okay, I think that's enough to summarize my last month or so. Ciao~ until I write next time.

## Jul 11, 2012

### Why Da Hell Am I Inventing the Wheel Again?

My goal for this week: finishing the Screen Space Decal presentation that I'll give at Siggraph 2012. If it was purely just making a ppt file I think I would have been more inspired to finish it. The problem is that I'm not with Relic any more and I have no access to their codebase or running editor. So, I'm re-implementing all the features again just to have a demo program where I can take enough screen captures.

It's pretty boring to be honest. I'm not really learning anything by doing this. Just repeating what I already know inside out. Anyways all the feature implementation is done, and some test assets are all placed.  So I'll try to finish this week.

I have a couple of interesting projects I wanna work on after this, so I might be able to use that as my motivation to finish the presentation... we will seee...

## Jul 8, 2012

### OpenCL: Getting CL_OUT_OF_RESOURCES or CL_MEM_OBJECT_ALLOCATION_FAILURE?

I recently had a chance to use OpenCL to speed up an app I was developing. It was pretty fun, but certainly debugging OpenCL was not so easy. I wish it had more error messages at least.

This is the problem I had:

When I called clEnqueueNDRangeKernel(), it suddenly gave me a CL_MEM_OBJECT_ALLOCATION_FAILURE error, and later I also got a CL_OUT_OF_RESOURCES error. The following is everything I tried and how I eventually solved it.... This post is more for my own reference if I run into the same issue later. :P

1. Check how much memory is really being allocated for OpenCL
The sum of all the cl_mem buffers I was using was less than 3MB. OpenCL's minimal global memory size requirement is 128MB. So no problem here. Move on.

2. Implement notification function
So apparently we can attach a notification function that will get some descriptive messages whenever there's an error.  I managed to get a bit more error messages after implementing this: clEnqueueWriteBuffer() function calls were giving me CL_MEM_OBJECT_ALLOCATION_FAILURE. But this one didn't really help to solve the problem.

3. Try to run OpenCL on CPU
It ran fine on my Intel CPU; it only happened on my NVidia GPU. So I installed the up-to-date GPU driver. Still the same problem :(

4. Check memory stomp
So I looked into my OpenCL code more closely, and it turned out I was reading/writing outside of the allocated local memory.. DOH! After fixing this, the problem disappeared.. Yay... I solved the problem.. But somehow I still feel stupid. :)
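To make step 4 a bit more concrete, here is a tiny sketch of the kind of index check that would have caught it. It is plain Python, not OpenCL C, and the access pattern is an assumption: the post doesn't say what the kernel actually did, so this pretends every work-item touches `tile[local_id + offset]` in a local array of `tile_length` elements. All three names are hypothetical.

```python
def out_of_bounds_work_items(local_size, tile_length, offset):
    """List the local IDs whose access tile[local_id + offset] would
    land outside a local-memory array of tile_length elements.

    Hypothetical access pattern, used only to show the index math.
    """
    return [
        local_id
        for local_id in range(local_size)
        if not 0 <= local_id + offset < tile_length
    ]
```

For example, with 64 work-items and a 64-element tile, reading the right-hand neighbour (`offset=1`) makes the last work-item step one element past the end. Sizing the tile with room for the neighbours, or clamping the index, removes the stomp.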

## Jun 20, 2012

### Won Best Speaker Award

I won the best speaker award for the presentation I gave at KGC last year. And this is the proof :)

I think this is my first award since my school years. heh~ YAY ME!

## Jun 11, 2012

### I'm Presenting at Siggraph 2012

Yay~ My Screen-Space Decal talk is accepted at Siggraph 2012. And it looks like I'll be presenting on Monday! From this link, look at the bottom!

I'm very excited... but now I'm worried that I have to fend for myself because I'm not with any company now... Poor me :(

### Unity, I Love You, But Your UV Is Horrible

I love Unity.  This is the first engine I felt like I can't make anything better than this.  Of course, it's mostly due to its awesome editor.  But the other day, I found one thing that annoys me: UV coordinates system.

Normally, I (or most graphics programmers, I think?) would map UV coordinates to a texture this way: top-left: (0,0) and bottom-right: (1,1)

(0,0)        (1,0)
+-----+-----+
|     |     |
|     |     |
+-----+-----+
|     |     |
|     |     |
+-----+-----+
(0,1)        (1,1)

This one has a nice benefit: the memory layout of textures matches the UV coordinate system. So any UV-based texture manipulation becomes easy and straightforward. One example would be copying a sub-rect of a texture based on the UV coords.

But for some reason, Unity maps UV coords differently:  bottom-left(0,0) and top-right(1,1).... so basically it's flipped vertically.

(0,1)        (1,1)
+-----+-----+
|     |     |
|     |     |
+-----+-----+
|     |     |
|     |     |
+-----+-----+
(0,0)        (1,0)

What..? Why would you ever do that? I know some math books use this way, but I can't think of any practical benefits of mapping UVs in this way. Sometimes we rendering programmers make a wrong decision because we are so used to certain notations. And, to me, this is clearly the case.
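Converting between the two conventions is at least trivial. Here is a small Python sketch; the function names are made up for this post, and it assumes (as described above) that row 0 of the texture in memory is the top row.

```python
def flip_v(u, v):
    """Convert a UV between the top-left-origin and bottom-left-origin
    conventions. The same function works in both directions."""
    return (u, 1.0 - v)


def uv_to_texel_row(v, height, origin_top_left=True):
    """Map a V coordinate to a row index in texture memory.

    Assumption: row 0 in memory is the top row of the image.
    """
    if not origin_top_left:
        v = 1.0 - v  # bottom-left origin needs the extra flip first
    # Clamp so that v == 1.0 still lands on the last valid row.
    return min(int(v * height), height - 1)
```

With the top-left origin, V goes straight to a memory row. With the bottom-left origin, every UV-based copy needs that extra `1.0 - v`, which is exactly the kind of thing that gets forgotten.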

If there's any benefit I'm missing here, please let me know.  But just don't tell me "it's traditionally done this way, so it must be right". If I was a person who'd accept that kinda BS argument, I'd be a good Christian believing that Jesus was reborn in 3 days after he was killed.

## May 23, 2012

### Using P4SandBox for 3 Weeks or So

I have been using P4SandBox for 3 weeks or so, and I'm pretty happy with it. The private repo is great: creating new streams(read it as branches) and switching between them are fast. (It internally uses shelving, of course)  So streams almost work as branches + automatic shelving.

One bug I found:

• If you MANUALLY shelve in a stream, you won't be able to unshelve it!  So currently I'm not doing any manual shelving in a stream: Instead, I keep creating a new stream if I need to shelve what I've been working on and start to work on something else. By the way, it works fine in the traditional central-controlled scheme with no P4SandBox involved.

One annoyance I found:

• When you copy/merge (it's basically push/pull in Git), it doesn't propagate change histories. Perforce said the next version will have this feature implemented.

## May 1, 2012

### Personal Choice of Version Control System in 2012

Some time ago, I wrote a blog post about my personal choice of VCS and the winner was Subversion. But now there have been two changes from Perforce that made me want to re-evaluate. Those two changes are:
• New feature called P4SandBox
• More generous free version limitations: 20 users and 20 workspaces

What is P4SandBox?
This new feature, available in Perforce 2012.1, enables us to use a local private depot similar to Git. But this is not a completely distributed system: it is more like you make a mirror on your HDD and work offline. Then when you are ready to commit to the central server, you push all your local changes to the central server. Also, P4SandBox has the concept of streams, which work almost like Git's powerful branching.

This video will explain these new features better than me :)

Background
Being in the gaming industry for about 10 years and playing with some open-source projects means I have dealt with different version control systems, or VCSs, from the completely free, brute-force manual file copy method to the very expensive, commercial-grade Perforce.

So which program have I been using at home? Subversion.... I know! A lot of people will argue that other programs are better, and I am not gonna say they are wrong. The reason why I have been using Subversion is because it did what I want with the least amount of annoyance. But now I'm gonna switch back to Perforce because I have come to believe Perforce will serve me better now. Below is the list of what I need/want from my personal VCS and how the most popular VCSs do the job:

Windows Support
Yes, I'm a MS whore.  I use Windows all the time, and I, as a game programmer, personally don't see huge need for Linux for myself.  Also if I can maintain only one OS at home, that's less drama for me. (Yay?).
• Git(-2): I like Git a lot, especially how it handles branches, so I really wanted to use it on Windows.  But, as far as I know, the only way to use Git server on Windows is through Cygwin or msysgit.  Cygwin is basically doing Linux emulation in a sense, and I personally don't enjoy installing Cygwin. msysgit is a bit easier to install on Windows, but still I had to set up SSH or what not, so there's no one-button solution for Git on Windows.  So big no-no to a MS whore like me.
• Perforce(+1): Perforce supports Windows pretty well.  It comes with easy-to-install server program/service for windows.
• Subversion(+2): This was actually a big surprise to me.  There is a program called VisualSVN Server, which is one-click solution for Subversion server on Windows.  It just works and comes with https access and access control all in one nice and simple GUI.  This was even easier than installing Perforce.

Occasional Multi-User Support
Although my VCS is mostly there to keep a history and backups of my own codes, sometimes I open it up to my friends so that I can get useful feedback from them.  So having multi-user support is very useful for me.

• Git(+1): Git can easily support multiple users. But setting access control for each user can be a bit of a PITA on Windows. When I tried it last time, I had to make fake Windows user accounts and hook'em up with SSH.
• Perforce(+1): Perforce used to be free for either i) 2 users and 5 client workspaces, or ii) unlimited users and up to 1000 files. But now it's free for 20 users and 20 client workspaces with unlimited # of files. I believe each user needs 2 workspaces per computer to get P4SandBox working (one for main and the other for the mirror). So if we say each user uses 2 computers, one user will take up 4 workspaces. That means 20 / 4 = 5 full-time users... I think it's very reasonable for my personal projects. So I changed this score from -1 to +1.
• Subversion(+2): As I said earlier in Windows Support section, VisualSVN Server comes with a nice GUI where you can simply setup users and access control.  So another big thumbs up from me.
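The workspace math in the Perforce bullet is easy to redo for your own setup. A quick Python sketch: the function and parameter names are made up, the defaults are the numbers quoted above, and the "2 workspaces per computer" figure is my belief from that bullet, not something checked against Perforce's documentation.

```python
def max_full_time_users(workspace_limit=20, user_limit=20,
                        workspaces_per_computer=2, computers_per_user=2):
    """How many users fit under the free-license limits.

    Assumption: P4SandBox needs one workspace for main and one for
    the mirror on every computer a user works from.
    """
    workspaces_per_user = workspaces_per_computer * computers_per_user
    # Whichever limit is hit first decides the answer.
    return min(user_limit, workspace_limit // workspaces_per_user)
```

With the defaults this gives 5 users; if everyone sticks to a single computer, it goes up to 10.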

Cost
I'm cheap. I love free stuff.

• Git(+1): free
• Perforce(0): free for limited use. and this time the limitation is not severe.. So I changed the score from -1 to 0.
• Subversion(+1): free

GUI Client
I'm in love with Perforce's nice GUI clients. Not so much with P4V; more with P4Win. But P4Win is discontinued.... Oh well, P4V is still good enough. Sure, I still use the command line a lot for certain things GUI clients don't support, but I found that 90% of the time, using a GUI client is much faster and easier.

• Git(-1): it doesn't have any nice free GUI client as far as I know.  There are some being developed at this moment, but they don't seem to be mature or free enough to use them.  TortoiseGit is good enough most of the time.  But I still prefer P4Win style, real GUI clients.
• Perforce(+2): P4Win is awesome. P4V is great, too.
• Subversion(+1): I found a program called SmartSVN.  It has limited functionalities unless you buy the pro version, but I found the basic free version is good enough for day-to-day operations.  Anything that cannot be done through the free SmartSVN version, I use TortoiseSVN.  Then anything that cannot be done by TortoiseSVN, I use command-line.

Branching
Who doesn't love branching? It's such a neat tool to fuck around (read it as experiment) with your code without ruining your projects.
• Git(+2): I love the powerful branching feature of Git. You don't need to make a copy in different directories, so it helps a lot with path referencing in the code.  Say you have a program that links with library Awesome, and now you wanna branch library Awesome.  With Git, you simply need to switch to different branch and build.  But with other source control systems like Perforce, you will have to branch the library into a different directory and change the library path in your program code.
• Perforce(+2): As I just explained in the Git section above, branching into a different folder sucks. Also, branching a large number of files is slow because the Perforce server controls everything. Network speed is slower than your HDD's spin rate. However, if you use streams in P4SandBox, branching becomes almost like Git's branching. Everything works inside your sandbox without connecting to the central server, so it's super fast. So I changed the score from -2 to +2 here.
• Subversion(-1): The speed is fast enough.  But still you have to branch into different directory.. uggh.. that bothers me.

Final Score
So final score for me is like this.
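If you want to check the math, the totals are just the sums of the per-category scores above. A small Python sketch; the `SCORES` table is copied from the sections above, and the `total_scores` helper is only there for illustration.

```python
# Per-category scores, copied from the sections above.
SCORES = {
    "Windows Support": {"Git": -2, "Perforce": 1, "Subversion": 2},
    "Occasional Multi-User Support": {"Git": 1, "Perforce": 1, "Subversion": 2},
    "Cost": {"Git": 1, "Perforce": 0, "Subversion": 1},
    "GUI Client": {"Git": -1, "Perforce": 2, "Subversion": 1},
    "Branching": {"Git": 2, "Perforce": 2, "Subversion": -1},
}


def total_scores(scores):
    """Add up each VCS's score across all categories."""
    totals = {}
    for per_vcs in scores.values():
        for vcs, score in per_vcs.items():
            totals[vcs] = totals.get(vcs, 0) + score
    return totals
```

That works out to Git 1, Perforce 6 and Subversion 5.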
Don't forget. This is the score for my personal needs, not for the big giant game studios. So if you ever come back and say "but Perforce is better because it can support 200 users easily", I'm gonna make you watch this video for 2 hours before you go to bed.