<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-398682525365778708</id><updated>2012-01-27T22:23:29.364-08:00</updated><title type='text'>Diary of a Graphics Programmer</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default?start-index=101&amp;max-results=100'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>101</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3237631949993947948</id><published>2011-09-18T12:16:00.000-07:00</published><updated>2012-01-26T12:25:00.620-08:00</updated><title type='text'>Graphics Demo Programming</title><content type='html'>About 8 or ten years ago I started thinking 
about doing a graphics demo for the demo scene. I started to prepare a minimal skeleton that should compile to the smallest possible exe. Over the years I kept this skeleton alive, going from Windows XP to Windows 7 and from DirectX 8 to DirectX 10.&lt;br /&gt;More than three years ago I put the source code up on Google Code here and kept updating it:&lt;br /&gt;&lt;br /&gt;http://code.google.com/p/graphicsdemoskeleton/&lt;br /&gt;&lt;br /&gt;Although the source code is rather short, I played around with many ideas over the years. I read articles from the demo scene about getting smaller exes just by using Visual Studio. After realizing that my exe got bigger with every new Visual Studio version, I switched to Pelles C, a free development environment with a compiler that follows the C99 standard:&lt;br /&gt;&lt;br /&gt;http://www.smorgasbordet.com/pellesc/&lt;br /&gt;&lt;br /&gt;My exe is now 838 bytes in size without violating Windows rules about releasing occupied resources. I tried to replace some of the code with assembly code, especially the entry points of the D3D functions, and saved a few bytes that way, but removed it again because it was too inconvenient.&lt;br /&gt;At some point (probably while it was running on DirectX 9) I implemented a small GPU particle system that didn't add much to the size, which was pretty cool.&lt;br /&gt;One of the interesting things I found out was that HLSL code in some cases packed smaller than C code for the CPU. I found this remarkable and I thought it would be a cool idea to write a small CPU stub and then go from there in HLSL.&lt;br /&gt;I know there will be times when I go back to this piece of code, wonder what else I can do with it and spend half an hour looking through it. It was certainly one of my lowest-priority projects of the last ten years ... 
maybe you can take the source and do something cool with it :-)&lt;br /&gt;&lt;br /&gt;There is also a whole demo framework released by Inigo Quilez here:&lt;br /&gt;&lt;br /&gt;http://www.iquilezles.org/www/material/isystem1k4k/isystem1k4k.htm&lt;br /&gt;&lt;br /&gt;Other useful links are:&lt;br /&gt;&lt;br /&gt;http://yupferris.blogspot.com/&lt;br /&gt;http://4klang.untergrund.net/&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3237631949993947948?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3237631949993947948/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3237631949993947948' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3237631949993947948'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3237631949993947948'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/09/graphics-demo-programming.html' title='Graphics Demo Programming'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2095424413759655494</id><published>2011-09-14T12:17:00.000-07:00</published><updated>2012-01-26T12:25:09.617-08:00</updated><title type='text'>DirectX 11.1 Notes</title><content type='html'>I plan to 
update this blog in the next couple of days with more information as soon as it becomes available.&lt;br /&gt;&lt;br /&gt;The upgrade from DirectX 11 to DirectX 11.1 targets the following areas: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;Enabling Metro-style applications (there’s some additional access control for this which impacted almost every API in the OS, so there’s no Direct3D 9 in Metro-style, but Direct3D 11.x is fully supported)&lt;/li&gt;&lt;li&gt;Technical computing (optional extended double-precision instructions (division, etc.) and an optional Sum of Absolute Differences instruction; enabling hardware access via Direct3D from session 0; HLSL compiler improvements)&lt;/li&gt;&lt;li&gt;Low-power device optimizations (reduced shader precision hints for HLSL/driver, Tiled Graphics Rendering optimizations, 16 bit-per-pixel DXGI formats 565, 5551, 4444)&lt;/li&gt;&lt;li&gt;Various graphics stack unification efforts (Direct3D 11 Video, Direct3D interop improvements for Direct2D &amp;amp; Media Foundation, etc.)&lt;/li&gt;&lt;li&gt;Windows on ARM&lt;/li&gt;&lt;/ul&gt;The main new feature targeting consumers is stereoscopic rendering. The new DirectX 11.1 features are described on the &lt;a href="http://msdn.microsoft.com/en-us/library/hh404562%28v=VS.85%29.aspx#support_a_larger_number_of_uavs" title="MSDN website"&gt;MSDN website&lt;/a&gt;. For me the most remarkable things on this page are:&lt;br /&gt;1. Use UAVs at every pipeline stage&lt;br /&gt;2. Use logical operations in a render target&lt;br /&gt;3. Shader tracing: looks like a new way to measure shader performance&lt;br /&gt;&lt;br /&gt;The functions that check for supported features and formats (ID3D11Device::CheckFeatureSupport / ID3D11Device::CheckFormatSupport) were extended as well. D3D11_FEATURE_DATA_ARCHITECTURE_INFO seems to be aimed at the tile-based rendering hardware commonly used in mobile GPUs, and so does D3D11_FEATURE_DATA_SHADER_MIN_PRECISION_SUPPORT. 
DXGI_FORMAT was extended to support video formats that can now be processed with shaders by using resource views (SRV/RTV/UAV).&lt;br /&gt;&lt;br /&gt;There is a new D3D_FEATURE_LEVEL_11_1 defined (i.e. a minor revision of the hardware feature set), but I don’t (yet) have a good link to give you to summarize the required features. Of course, there are no public drivers or hardware out yet that expose FL 11.1 anyhow. As before, DirectX 11.1 (the API) works with a range of Feature Levels (the hardware).&lt;br /&gt;&lt;br /&gt;WARP on Windows 8 supports FL 11.0 (and 10.1 as before) and includes support for the DXGI 1.2 16bpp formats (565, 5551, 4444).&lt;br /&gt;&lt;br /&gt;The Windows 8 Developer Preview SDK includes the latest HLSL compiler FXC tool and D3DCompiler.DLL, the Debug Layers DLL, and the REF device DLL for DirectX 11.1 on Windows 8.&lt;br /&gt;&lt;br /&gt;The MSDN documentation now includes details on SM 4.x and SM 5.0 shader assembly (for deciphering the compiler’s disassembly output) plus details on BC6H/BC7 compression formats:&lt;br /&gt;&amp;lt;&lt;a href="http://msdn.microsoft.com/en-us/library/bb943998(v=VS.85).aspx"&gt;http://msdn.microsoft.com/en-us/library/bb943998(v=VS.85).aspx&lt;/a&gt;&amp;gt;&lt;br /&gt;&amp;lt;&lt;a href="http://msdn.microsoft.com/en-us/library/hh447232(v=VS.85).aspx"&gt;http://msdn.microsoft.com/en-us/library/hh447232(v=VS.85).aspx&lt;/a&gt;&amp;gt;&lt;br /&gt;&amp;lt;&lt;a href="http://msdn.microsoft.com/en-us/library/hh308955(v=VS.85).aspx"&gt;http://msdn.microsoft.com/en-us/library/hh308955(v=VS.85).aspx&lt;/a&gt;&amp;gt;&lt;br /&gt;&lt;br /&gt;What is different from DirectX 11 on PC: &lt;ul&gt;&lt;li&gt;D3DX9, D3DX10, D3DX11 are not supported for Metro-style applications.&lt;/li&gt;&lt;li&gt;The Texconv sample includes the "DirectXTex" library which has all the texture processing functionality, WIC-based IO, DDS codec, BC software compression/decompression, etc. 
as shared source that was in D3DX11.&lt;/li&gt;&lt;li&gt;D3DCSX_44.DLL is in the Windows SDK for redist with applications, and I believe is supported in Metro-style applications.&lt;/li&gt;&lt;li&gt;D3DCompiler_44.DLL is available for REDIST with Desktop applications and for development, but is not supported for REDIST in Metro-style applications. We've long recommended not doing run-time compilation, and Metro style enforces this at deployment time.&lt;/li&gt;&lt;li&gt;XINPUT1_4.DLL and XAUDIO2_8.DLL are included in the OS and are fully supported for Metro-style applications.&lt;/li&gt;&lt;/ul&gt;Here are reference links:&lt;br /&gt;&lt;br /&gt;Outlines all the new features: http://msdn.microsoft.com/en-us/library/windows/desktop/hh404457.aspx&lt;br /&gt;DirectX 11.1 Features &amp;lt;&lt;a href="http://msdn.microsoft.com/en-us/library/hh404562(v=VS.85).aspx"&gt;http://msdn.microsoft.com/en-us/library/hh404562(v=VS.85).aspx&lt;/a&gt;&amp;gt;&lt;br /&gt;DXGI 1.2 &amp;lt;&lt;a href="http://msdn.microsoft.com/en-us/library/hh404490(v=VS.85).aspx"&gt;http://msdn.microsoft.com/en-us/library/hh404490(v=VS.85).aspx&lt;/a&gt;&amp;gt;&lt;br /&gt;WDDM 1.2 &amp;lt;&lt;a href="http://go.microsoft.com/fwlink/?LinkId=226814"&gt;http://go.microsoft.com/fwlink/?LinkId=226814&lt;/a&gt;&amp;gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2095424413759655494?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2095424413759655494/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2095424413759655494' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2095424413759655494'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/2095424413759655494'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/09/directx-111-notes.html' title='DirectX 11.1 Notes'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-9113293353124742789</id><published>2011-07-31T12:18:00.000-07:00</published><updated>2012-01-26T12:25:18.600-08:00</updated><title type='text'>Even Error-Distribution: Rules for Designing Graphics Sub-Systems (Part III)</title><content type='html'>The design rule of "Even Error-Distribution" is very common in everything we do as graphics/engine programmers. Compared to the "&lt;a href="http://altdevblogaday.com/2011/07/03/no-look-up-tables-rules-for-designing-graphics-sub-systems-part-ii/" title="No Look-up Tables"&gt;No Look-up Tables&lt;/a&gt;" and "&lt;a href="http://altdevblogaday.com/2011/06/13/screen-space-rules-for-designing-graphics-sub-systems-part-i/" title="Screen-Space."&gt;Screen-Space&lt;/a&gt;" rules, it is probably easier to agree on this principle in general. The idea is that, whatever technique you implement, you always present the observer with a consistent "error" level. Here the word error describes the difference between what we consider the real-world experience and the visual experience in a game. Obvious examples are toon and hatch shading, where we do not even try to render anything that resembles the real world but something that is considered beautiful. 
More complex examples are the penumbras of shadow maps, ambient occlusion or a real-time global illumination approach that has a rather low granularity.&lt;br /&gt;The idea behind this design rule is that whatever you do, you do it consistently and hope that the user will adjust to the error and no longer notice it after a while. Because the error is evenly distributed throughout your whole game, it is more easily tolerated by the user.&lt;br /&gt;&lt;br /&gt;Let's look at it from a different perspective. At Confetti we target most of the available gaming platforms. We can render very similar geometry and textures on different platforms, for example iOS/Android with OpenGL ES 2.0, Windows with DirectX 11 or XBOX 360 with its Direct3D. For iOS/Android you want to pick different lighting and shadowing techniques than for the higher-end platforms. For shadows it might be stencil shadow volumes on low-end platforms and shadow maps on high-end platforms. Those two shadowing techniques have very different performance and visual characteristics. The "error" resulting from stencil shadow volumes is that the shadows are, by default, very sharp and pronounced, while shadow maps on the higher-end platforms can be softer and more like real-life shadows.&lt;br /&gt;A user who watches the same game running on those platforms will adjust to the "even" error of each of those shadowing techniques as long as they do not change on the fly. If you mix the sharp and the soft shadows, users will complain that the shadow quality changes. If you provide only one or the other, there is a high chance that people will just get used to the shadow appearance.&lt;br /&gt;Similar ideas apply to all the graphics programming techniques we use. Light mapping might be a viable option on low-end platforms and provide pixel-perfect lighting, while a dynamic solution replacing those light maps might have a higher error level and not be pixel perfect. 
As long as the lower quality version always looks consistent, there is a high chance that users won't complain. If we changed the quality level in-game, we would probably be faced with reviews saying that the quality is changing.&lt;br /&gt;&lt;br /&gt;Following this idea, one can exclude techniques that change the error level on the fly during game play. There were certainly a lot of shadow map techniques in the past that had different quality levels based on the angle between the camera and the sun. Although in many cases they looked better than the competing techniques, users perceived the cases when their quality was lowest as a problem.&lt;br /&gt;Any technique based on re-projection, where the quality of shadows, Ambient Occlusion or Global Illumination changes while the user watches a scene, would violate the "Even Error-Distribution" rule.&lt;br /&gt;A game that mixes light maps that hold shadow and/or light data with dynamic lights and/or regular shadow maps faces the challenge of making sure that there is no visible difference between the light and shadow quality. Quite often the light mapped data looks better than the dynamic techniques and the experience is inconsistent. Evenly distributing the error into the light map data would improve the user experience because the user is able to adjust better to an even error distribution. The same is true for any form of megatexture approach.&lt;br /&gt;A common problem of mixing light mapped and generated light and shadow data is that in many cases dynamic objects like cars or characters do not receive the light mapped data. 
Users seem to have adjusted to the difference in quality here because it was consistent.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-9113293353124742789?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/9113293353124742789/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=9113293353124742789' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/9113293353124742789'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/9113293353124742789'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/07/even-error-distribution-rules-for.html' title='Even Error-Distribution: Rules for Designing Graphics Sub-Systems (Part III)'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5859364890433403233</id><published>2011-07-03T12:19:00.000-07:00</published><updated>2012-01-26T12:25:26.617-08:00</updated><title type='text'>No Look-up Tables: Rules for Designing Graphics Sub-systems (Part II)</title><content type='html'>One of the design principles I apply to graphics systems nowadays is to avoid any look-up table. 
That means not only mathematical look-up tables, the way we used them in the 1980s and 1990s, but also cached lighting, shadow and other data that is sometimes stored in light or radiosity maps.&lt;br /&gt;&lt;br /&gt;This design principle follows the development of GPUs. While decent GPUs offer a larger number of arithmetic instructions with each new iteration, memory bandwidth has been stagnating for several years. Additionally, transporting data to the GPU might be slow due to several bottlenecks like DVD speed, the PCI Express bus etc. In many cases it might be more efficient to calculate a result with the help of arithmetic instructions instead of doing a look-up in a texture or any other memory area. Saving streaming bandwidth throughout the hardware is also a good motivation to avoid look-up textures like this. Quite often a look-up technique doesn't allow a 24 hour game cycle, where the light and shadows have to move according to the time of day.&lt;br /&gt;In many cases using pre-baked textures to store lighting, shadowing or other data also leads to a scenario where the geometry to which the texture is applied is no longer destructible. &lt;br /&gt;&lt;br /&gt;Typical examples are:&lt;br /&gt;- Pre-calculating a lighting equation and storing results in a texture like a 2D, Cube or 3D map&lt;br /&gt;- Large terrain textures, like megatextures. Texture synthesis might be more efficient here&lt;br /&gt;- Light and radiosity maps and other pre-calculated data for Global Illumination approaches&lt;br /&gt;- Signed distance fields (if they don't allow 24 hour cycle lights and shadows)&lt;br /&gt;- Voxels as long as they require a large amount of data to be re-read each frame and don't allow dynamic lighting and shadows (24 hour cycle)&lt;br /&gt;- ... 
and more ...&lt;br /&gt;&lt;br /&gt;Following the "No Look-up Table" design principle, one of the options to store intermediate data is to cache it in GPU memory, so that the data doesn't need to be generated on the fly. This might be a good option depending on the amount of memory available on the GPU. &lt;br /&gt;Depending on the underlying hardware platform or the requirements of the game, the choice between different caching schemes makes a system like this very flexible.&lt;br /&gt;&lt;br /&gt;Here is a collection of ideas that might help to make an informed decision on when to apply a caching scheme that keeps data around in GPU memory:&lt;br /&gt;- Whenever the data is not visible to the user, it doesn't need to be generated. For example color, light and shadow data only need to be generated if the user can see them. That requires that they are on-screen with sufficient size. Projecting an object into screen-space allows us to calculate its size. If it is too small or not visible, any data attached to it doesn't need to be generated. This idea applies not only to geometric objects but also to lights and shadows. If a shadow is too small on the screen, we do not have to re-generate it.&lt;br /&gt;- Cascaded Shadow Maps introduced a "level of shadowing" system that distributes shadow resolution along the view frustum so that objects farther away get less resolution, while closer objects receive relatively more. Similarly, lighting quality should increase and decrease based on distance. Cascaded Reflective Shadow Maps extend the idea to any global illumination data, like one-bounce diffuse lighting and ambient occlusion.&lt;br /&gt;- If the quality requirements are rather low because the object is far away or small, screen-space techniques might allow storing data at a higher density. 
For example, Screen-Space Global Illumination approaches that are based on the G-Buffer (which is already used to apply Deferred Lights in a scene) can offer an efficient way to light far away objects.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5859364890433403233?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5859364890433403233/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5859364890433403233' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5859364890433403233'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5859364890433403233'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/07/no-look-up-tables-rules-for-designing.html' title='No Look-up Tables: Rules for Designing Graphics Sub-systems (Part II)'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8739156312668700220</id><published>2011-06-13T12:20:00.000-07:00</published><updated>2012-01-26T12:25:34.036-08:00</updated><title type='text'>Screen-Space: Rules for Designing Graphics Sub-systems (Part I)</title><content type='html'>Since programmable GPUs allow one to design more complex graphics systems, I started to 
develop a few simple rules that have survived the test of time while designing graphics sub-systems like Skydome, PostFX, Vegetation, Particle, Global Illumination, Light &amp;amp; Shadow systems etc.&lt;br /&gt;Here are three of them:&lt;br /&gt;1. Screen-Space (Part I - this part)&lt;br /&gt;2. No Look-up Tables (Part II)&lt;br /&gt;3. Even Error Distribution (Part III)&lt;br /&gt;&lt;br /&gt;Today we focus on the Screen-Space design rule. It says: "do everything you can in Screen-Space because it is more efficient most of the time". This is easy to say for the wide range of effects that are part of a Post-Processing Pipeline like Depth of Field, Motion Blur, Tone Mapping and color filters, light streaks and others (read more in [Engel07]), as well as anti-aliasing techniques like MLAA that anti-alias the image in screen-space.&lt;br /&gt;With the increased number of arithmetic instructions available and the stagnating growth of memory bandwidth, two new groups of sub-systems can be moved into screen-space.&lt;br /&gt;Accompanying Deferred Lighting systems, more expensive materials like skin and hair can now be applied in screen-space; this way a screen-space material system is possible [Engel], solving some of the bigger challenges of implementing a Deferred Lighting pipeline.&lt;br /&gt;Global Illumination and Shadow filter kernels can be moved into screen-space as well. For example, for a large number of Point or Ellipsoidal Shadow Maps, all the shadow data can be stored in a shadow collector in screen-space and then an expensive filter kernel can be applied to this screen-space texture [Engel2010].&lt;br /&gt;&lt;br /&gt;The wide range of abilities available with screen-space filter kernels makes it worthwhile to look at the general challenges of implementing them. The common challenges of applying materials or lights and shadows with the help of large-scale filter kernels in screen-space are mostly:&lt;br /&gt;1. 
Scale filter kernel based on camera distance&lt;br /&gt;2. Add anisotropic "behavior" to the screen-space filter kernel&lt;br /&gt;3. Restricting the filter kernel based on the Z value of the Tap&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Scaling Filter Kernel based on Camera Distance&lt;/strong&gt;&lt;br /&gt;Using a screen-space filter kernel for filtering shadows, GI, emulating sub-surface scattering for skin or rendering hair requires at some point scaling the filter kernel based on the distance from the camera or, better yet, from the near plane to the pixel in question. What has worked in the past is:&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;// linear depth read more in [Gilham]&lt;br /&gt;// Q = FarClip / (FarClip - NearClip)&lt;br /&gt;// Depth = value from a hyperbolic depth buffer&lt;br /&gt;float depthLin = (-NearClip * Q) / (Depth - Q);&lt;/pre&gt;&lt;pre lang="LANG=C" escaped="true"&gt;// scale based on distance to the viewer&lt;br /&gt;// renderer-&amp;gt;setShaderConstant4f("TexelSize", vec4(width, height, 1.0f / width, width / height));&lt;br /&gt;sampleStep.xy = float2(1.0f, TexelSize.w) * sqrt(1.0f / ((depthLin.xx * depthLin.xx) * bias));&lt;/pre&gt;Scaling is based only on the linearized depth value, which takes the camera's near and far plane settings into account. Note that, as written, the expression above yields a view-space depth between NearClip and FarClip; divide it by FarClip if a normalized value is required. The bias value is a user-defined "magic" value. The last channel in the TexelSize variable holds the ratio between the x and y direction of the pixel (width / height). The inner term of the equation, 1.0/distance&lt;sup&gt;2&lt;/sup&gt;, resembles a simple light attenuation function. 
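&lt;br /&gt;&lt;br /&gt;As a rough sketch of how the two snippets above could be combined, the helper below wraps them in a single function. The name ComputeSampleStep and the choice to pass NearClip, FarClip, bias and TexelSize as parameters are assumptions made for this illustration; they are not taken from the original shader.&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;// Hypothetical helper: name and parameter list are illustrative only&lt;br /&gt;// Depth     = value read from a hyperbolic depth buffer&lt;br /&gt;// TexelSize = (width, height, 1 / width, width / height)&lt;br /&gt;float2 ComputeSampleStep(float Depth, float NearClip, float FarClip, float bias, float4 TexelSize)&lt;br /&gt;{&lt;br /&gt;  // linearize the hyperbolic depth value&lt;br /&gt;  float Q = FarClip / (FarClip - NearClip);&lt;br /&gt;  float depthLin = (-NearClip * Q) / (Depth - Q);&lt;br /&gt;  // attenuation-like falloff: the step shrinks as the pixel moves away&lt;br /&gt;  float scale = sqrt(1.0f / (depthLin * depthLin * bias));&lt;br /&gt;  // TexelSize.w applies the width / height ratio to the y direction&lt;br /&gt;  return float2(1.0f, TexelSize.w) * scale;&lt;br /&gt;}&lt;/pre&gt;Keeping bias as a parameter in this sketch leaves the "magic" value open to tuning per effect. 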
We will improve this equation in the near future.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Anisotropic Screen-Space Filter Kernel&lt;/strong&gt;&lt;br /&gt;Following [Geusebroek], anisotropy can be added to a screen-space filter kernel by projecting it into an ellipse that follows the orientation of the geometry.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/06/ssss_aniso.jpg"&gt;&lt;img class="alignleft size-full wp-image-8458" src="http://altdevblogaday.com/wp-content/uploads/2011/06/ssss_aniso.jpg" alt="" width="530" height="488" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 1 - Anisotropic Screen-Space Filter Kernel&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Normals that are stored in world space in a G-Buffer can be compared to the view vector. 
The elliptical "response" is achieved by taking the square root of the resulting dot product.&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;float Aniso = saturate(sqrt(dot( viewVec, normal )));&lt;/pre&gt;&lt;br /&gt;&lt;strong&gt;Restricting the filter kernel based on the Z value of the Tap&lt;br /&gt;&lt;/strong&gt;One of the challenges with any screen-space filter kernel is the fact that the wide filter kernel can smear values into the penumbra around "corners" of geometry (read more in [Gumbau]).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/06/Error.jpg"&gt;&lt;img class="alignleft size-large wp-image-8460" src="http://altdevblogaday.com/wp-content/uploads/2011/06/Error-1024x575.jpg" alt="" width="695" height="390" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 2 - Error introduced by running a large filter kernel in screen-space&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;A common way to solve this problem is to compare the depth value at the center of the filter kernel with the depth values of the filter kernel taps and to define a threshold above which we consider the difference between the depth values large enough to early out. 
A source code snippet for this might look like this:&lt;br /&gt;&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;bool isValidSample = bool( abs(sampleDepth - d) &amp;lt; errDepth );&lt;br /&gt;if (isValidSample &amp;amp;&amp;amp; isShadow)&lt;br /&gt;{&lt;br /&gt;  // the sample is considered valid&lt;br /&gt;  sumWeightsOK += weights[i+1];     // accumulate valid weights&lt;br /&gt;  Shadow += sampleL0.x * weights[i+1];   // accumulate weighted shadow value&lt;br /&gt;}&lt;/pre&gt;&lt;strong&gt;Acknowledgements&lt;br /&gt;&lt;/strong&gt;I would like to thank Carlos Dominguez for the discussions about how to scale filter kernels based on camera distance.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br /&gt;[Engel] Wolfgang Engel, "Deferred Lighting / Shadows / Materials", FMX 2011, &lt;a href="http://www.confettispecialfx.com/confetti-on-fmx-in-stuttgart-ii"&gt;http://www.confettispecialfx.com/confetti-on-fmx-in-stuttgart-ii&lt;/a&gt;&lt;br /&gt;[Engel07] Wolfgang Engel, "Post-Processing Pipeline", GDC 2007, http://www.coretechniques.info/index_2007.html&lt;br /&gt;[Engel2010] Wolfgang Engel, "Massive Point Light Soft Shadows", &lt;a href="http://www.confettispecialfx.com/massive-point-light-soft-shadows"&gt;http://www.confettispecialfx.com/massive-point-light-soft-shadows&lt;/a&gt;&lt;br /&gt;[Geusebroek] Jan-Mark Geusebroek, Arnold W. M. Smeulders, J. van de Weijer, “Fast anisotropic Gauss filtering”, IEEE Transactions on Image Processing, Volume 12 (8), pp. 938 - 943, 2003&lt;br /&gt;[Gilham] David Gilham, "Real-Time Depth-of-Field Implemented with a Post-Processing only Technique", ShaderX5: Advanced Rendering, Charles River Media / Thomson, pp. 163 - 175, ISBN 1-58450-499-4&lt;br /&gt;[Gumbau] Jesus Gumbau, Miguel Chover, and Mateu Sbert, “Screen-Space Soft Shadows”, GPU Pro, pp. 
477 - 490&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8739156312668700220?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8739156312668700220/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8739156312668700220' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8739156312668700220'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8739156312668700220'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/06/screen-space-rules-for-designing.html' title='Screen-Space: Rules for Designing Graphics Sub-systems (Part I)'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4046500821353565663</id><published>2011-05-29T12:22:00.000-07:00</published><updated>2012-01-26T12:25:43.183-08:00</updated><title type='text'>Points, Vertices and Vectors</title><content type='html'>This post covers some facts about Points, Vertices and Vectors that might be useful. This is a collection of ideas to create a short math primer for engineers that want to explore computer graphics. The resulting material will be used in future computer graphics classes. 
Your feedback is highly welcome!&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Points&lt;/strong&gt;&lt;br /&gt;A 3D point is a location in space, in a 3D coordinate system. We can find a point &lt;em&gt;P&lt;/em&gt; with coordinates [&lt;em&gt;P&lt;/em&gt;&lt;sub&gt;x&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;y&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;z&lt;/sub&gt;] by starting from the origin at [0, 0, 0] and moving the distances &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;x&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;y&lt;/sub&gt; and &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;z&lt;/sub&gt; along the x-, y- and z-axes.&lt;br /&gt;&lt;br /&gt;Two points define a line segment between them, three points define a triangle with corners at those points, and several interconnected triangles can be used to define the surface of an object, sometimes also called a mesh.&lt;br /&gt;&lt;br /&gt;Points that are used to define geometric entities like triangles are often called vertices. In graphics programming, vertices are stored as an array of structures or a structure of arrays; they describe not only a position but also other data such as color, a normal vector or texture coordinates.&lt;br /&gt;&lt;br /&gt;The difference of two points is a vector: &lt;strong&gt;V&lt;/strong&gt; = &lt;em&gt;P&lt;/em&gt; - &lt;em&gt;Q&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Vectors&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;While a point is a reference to a location, a vector is the difference between two points, which describes a direction and a distance (length), in other words a displacement.&lt;br /&gt;&lt;br /&gt;Like points, vectors can be represented by three coordinates. 
Those three values are retrieved by subtracting the tail of the vector from its head.&lt;br /&gt;&lt;br /&gt;Δx = (x&lt;sub&gt;h&lt;/sub&gt; - x&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;Δy = (y&lt;sub&gt;h&lt;/sub&gt; - y&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;Δz = (z&lt;sub&gt;h&lt;/sub&gt; - z&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/Vector.jpg"&gt;&lt;img class="alignleft size-large wp-image-4655" src="http://altdevblogaday.com/wp-content/uploads/2011/04/Vector-1024x563.jpg" alt="Vector components" width="695" height="382" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure 1 - Vector components Δx, Δy and Δz&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Two vectors are equal if they have the same component values. 
Thus, considering a vector as a difference of two points, there are any number of vectors with the same direction and length.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/InstancesOfVector.jpg"&gt;&lt;img class="alignleft size-large wp-image-4649" src="http://altdevblogaday.com/wp-content/uploads/2011/04/InstancesOfVector-1024x590.jpg" alt="" width="695" height="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;2 - Instances of one vector&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The difference between points and vectors is reiterated by saying they live in different spaces, the Euclidean space &lt;img src="http://www.codecogs.com/eq.latex? \ \mathbb{E}^3" alt="" /&gt; and the vector space &lt;img src="http://www.codecogs.com/eq.latex? \ \mathbb{R}^3" alt="" /&gt;. Read more in [Farin].&lt;br /&gt;&lt;br /&gt;The primary reason for differentiating between points and vectors is to achieve geometric constructions which are coordinate independent. Such constructions are manipulations applied to objects that produce the same result regardless of the location of the coordinate origin.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Scalar Multiplication, Addition and Subtraction of Vectors&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;A vector &lt;strong&gt;V&lt;/strong&gt; can be multiplied by a scalar. Multiplying by 2 doubles the vector's components.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ v = \left[ {\begin{array}{*{20}{c}} 2\\ 3\\ 4\\ 0\\ \end{array}} \right] then 2v = \left[ {\begin{array}{*{20}{c}} 4\\ 6\\ 8\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ v = \left[ {\begin{array}{*{20}{c}} n1\\ n2\\ n3\\ 0\\ \end{array}} \right]\, \, then \, \lambda \, v = \left[ {\begin{array}{*{20}{c}} \lambda n1\\ \lambda n2\\ \lambda n3\\ 0\\ \end{array}}\right]\, where \, [\lambda\, \in \,\mathbb{R}]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Similarly, dividing the vector by 2 halves its components. For a positive scalar the direction of the vector remains unchanged; only its magnitude changes.&lt;br /&gt;&lt;br /&gt;The result of adding two vectors &lt;strong&gt;V&lt;/strong&gt; and &lt;strong&gt;W&lt;/strong&gt; can be obtained geometrically.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorAddition.jpg"&gt;&lt;img class="alignleft size-full wp-image-4656" src="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorAddition.jpg" alt="" width="347" height="213" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;3 - Adding two vectors&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Placing the tail of &lt;strong&gt;W&lt;/strong&gt; at the head of &lt;strong&gt;V&lt;/strong&gt; leads to the resulting vector, going from &lt;strong&gt;V&lt;/strong&gt;'s tail to &lt;strong&gt;W&lt;/strong&gt;'s head. 
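&lt;br /&gt;&lt;br /&gt;Arithmetically, both operations work component by component. The following C fragment is a minimal sketch of that idea; the Vec4 type and the helper names vec4_scale and vec4_add are assumptions made for this illustration and are not taken from any particular math library. The w component stays 0 to mark a vector.&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;
```c
/* Hypothetical 4-component vector; w = 0 marks a vector (direction). */
typedef struct { float x, y, z, w; } Vec4;

/* Multiply every component of v by the scalar lambda. */
Vec4 vec4_scale(Vec4 v, float lambda)
{
    Vec4 r = { v.x * lambda, v.y * lambda, v.z * lambda, v.w * lambda };
    return r;
}

/* Add two vectors component by component. */
Vec4 vec4_add(Vec4 a, Vec4 b)
{
    Vec4 r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}
```
&lt;/pre&gt;With v = (2, 3, 4, 0), vec4_scale(v, 2.0f) reproduces the doubled vector (4, 6, 8, 0) from the example above.&lt;br /&gt;&lt;br /&gt;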
In a similar manner vector subtraction can be visualized.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorSubtraction.jpg"&gt;&lt;img class="alignleft size-full wp-image-4658" src="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorSubtraction.jpg" alt="" width="324" height="238" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;4 - Subtracting two vectors&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Similar to addition, the vector that should be subtracted, &lt;strong&gt;W&lt;/strong&gt;, is first negated and its tail is then placed at the head of &lt;strong&gt;V&lt;/strong&gt;. The resulting vector runs from &lt;strong&gt;V&lt;/strong&gt;'s tail to the head of the negated &lt;strong&gt;W&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;Alternatively, by the parallelogram law, the vector sum can be seen as the diagonal of the parallelogram formed by the two vectors.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorParallelogrammRule.jpg"&gt;&lt;img class="alignleft size-full wp-image-4657" src="http://altdevblogaday.com/wp-content/uploads/2011/04/VectorParallelogrammRule.jpg" alt="" width="562" height="453" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br 
/&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;5 - Parallelogram rule&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The vectors &lt;strong&gt;V&lt;/strong&gt; - &lt;strong&gt;W&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; + &lt;strong&gt;W&lt;/strong&gt; are the diagonals of the parallelogram defined by &lt;strong&gt;V&lt;/strong&gt; and &lt;strong&gt;W&lt;/strong&gt;. Arithmetically, vectors are added or subtracted by adding or subtracting the components of each vector.&lt;br /&gt;&lt;br /&gt;All the vector additions and subtractions are coordinate independent operations, since vectors are defined as differences of points.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Homogeneous Coordinates&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Representing both points and vectors with three coordinates can be confusing. Homogeneous coordinates are a useful tool to make the distinction explicit. Adding a fourth coordinate, named w, allows us to describe a direction or a vector by setting this coordinate to 0. In all other cases we have a point or location.&lt;br /&gt;&lt;br /&gt;Dividing a homogeneous point [&lt;em&gt;P&lt;/em&gt;&lt;sub&gt;x&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;y&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;z&lt;/sub&gt;, &lt;em&gt;P&lt;/em&gt;&lt;sub&gt;w&lt;/sub&gt;] by the w component leads to the corresponding 3D point. If the w component equals zero, the point would be infinitely far away, which is then interpreted as a direction. Scaling all four components by any non-zero value leads to homogeneous points that all correspond to the same 3D point. For example the point (3, 4, 5) has homogeneous coordinates (6, 8, 10, 2) or (12, 16, 20, 4).&lt;br /&gt;&lt;br /&gt;The reason why this coordinate system is called "homogeneous" is that it is possible to transform functions f(x, y, z) into the form f(x/w, y/w, z/w) without disturbing the degree of the curve. 
This is useful in the field of projective geometry. For example a collection of 2D homogeneous points (x/t, y/t, t) exists on an xy-plane where t is the z-coordinate as illustrated in figure 6.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/ProjectiveGeometry.jpg"&gt;&lt;img class="alignleft size-large wp-image-4650" src="http://altdevblogaday.com/wp-content/uploads/2011/04/ProjectiveGeometry-1024x592.jpg" alt="" width="695" height="401" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;6 - 2D homogeneous coordinates can be visualized as a plane in 3D space&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Figure &lt;/em&gt;6 shows a triangle on the t = 1 plane, and a similar, much larger triangle on a distant plane. This creates an arbitrary xy plane in three dimensions. The t- or z-coordinate of the plane is immaterial because the x- and y-coordinates are eventually scaled by t.&lt;br /&gt;Homogeneous coordinates are also used to create a translation transform.&lt;br /&gt;&lt;br /&gt;In game development, some math libraries have dedicated point and vector classes. The main distinction is made by setting the fourth component to zero for vectors and one for points [Eberly].&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Pythagorean Theorem&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The length or magnitude of a vector can be obtained by applying the Pythagorean Theorem. 
The opposite side (b) and the adjacent side (a) of a right-angled triangle represent orthogonal directions. The hypotenuse (c) is the shortest path between their end points.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ a^2 + b^2 = c^2" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheorem.jpg"&gt;&lt;img class="alignleft size-full wp-image-4651" src="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheorem.jpg" alt="" width="370" height="300" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;7 - Pythagorean Theorem&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;It helps to think of the Pythagorean Theorem as a tool to compare "things" moving at right angles. 
For example if a is 3, b equals 4, then c equals 5 [Azad].&lt;br /&gt;&lt;br /&gt;The Pythagorean Theorem can also be applied to right-angled triangles chained together.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremChainedTogether.jpg"&gt;&lt;img class="alignleft size-large wp-image-4652" src="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremChainedTogether-814x1024.jpg" alt="" width="350" height="441" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;8 - Pythagorean Theorem with two triangles chained together&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ a^2 + b^2 = c^2" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ c^2 + d^2 = e^2" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Replacing &lt;img src="http://www.codecogs.com/eq.latex? \ c^2" alt="" /&gt; with &lt;img src="http://www.codecogs.com/eq.latex? \ a^2 + b^2" alt="" /&gt; leads to&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ a^2 + b^2 + d^2 = e^2" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ e^2" alt="" /&gt; is now written in three orthogonal components. 
Instead of lining the triangles flat, we can now tilt the green one a bit and therefore consider an additional dimension.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremin3D.jpg"&gt;&lt;img class="alignleft size-large wp-image-4654" src="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremin3D-1024x605.jpg" alt="" width="695" height="410" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;9 - Pythagorean Theorem in 3D&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Renaming the sides to x, y and z instead of a, b and d we get:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ x^2 + y^2 + z^2 = distance^2" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;This works with any number of dimensions.&lt;br /&gt;&lt;br /&gt;The Pythagorean Theorem is the basis for computing distance between two points. 
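&lt;br /&gt;&lt;br /&gt;As a small sketch of this relation, the following C function sums the squared coordinate differences of two points. The Point3 type and the name distance_squared are hypothetical and only serve this illustration. The function returns the squared distance; the distance itself is the square root of that value, for example via sqrtf from math.h.&lt;br /&gt;&lt;pre lang="LANG=C" escaped="true"&gt;
```c
/* Hypothetical 3D point type used only for this sketch. */
typedef struct { float x, y, z; } Point3;

/* Squared distance between two points: dx^2 + dy^2 + dz^2. */
float distance_squared(Point3 tail, Point3 head)
{
    float dx = head.x - tail.x;
    float dy = head.y - tail.y;
    float dz = head.z - tail.z;
    return dx * dx + dy * dy + dz * dz;
}
```
&lt;/pre&gt;For the points (4, 3, 0) and (8, 5, 0) the function returns 20.&lt;br /&gt;&lt;br /&gt;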
Consider the following two triangles:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremDistance.jpg"&gt;&lt;img class="alignleft size-large wp-image-4653" src="http://altdevblogaday.com/wp-content/uploads/2011/04/PythagoreanTheoremDistance-1024x636.jpg" alt="" width="556" height="345" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;10 - Pythagorean Theorem used for distance calculations&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The distance from the tip of the blue triangle at coordinate (4, 3) to the tip of the green triangle at coordinate (8, 5) can be calculated by creating a virtual triangle between those points. Subtracting the points leads to a 2D vector.&lt;br /&gt;&lt;br /&gt;Δx = (x&lt;sub&gt;head&lt;/sub&gt; - x&lt;sub&gt;tail&lt;/sub&gt;)&lt;br /&gt;Δy = (y&lt;sub&gt;head&lt;/sub&gt; - y&lt;sub&gt;tail&lt;/sub&gt;)&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |v| = \sqrt {{{(\Delta x)}^2} + {{(\Delta y)}^2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;In this case&lt;br /&gt;&lt;br /&gt;Δx = 8 - 4 = 4&lt;br /&gt;Δy = 5 - 3 = 2&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |v| = \sqrt {{{(4)}^2} + {{(2)}^2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |v| = \sqrt {20}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ |v| \approx 4.47" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Extending the idea to three dimensions yields the well-known equation:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |v| = \sqrt {{{(\Delta x)}^2} + {{(\Delta y)}^2} + {{(\Delta z)}^2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Unit Vectors&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;A unit vector has a length or magnitude of 1. This is a useful property for vector multiplications, because those consider the magnitude of a vector and the computation time can be reduced if this magnitude is one (more on this later). A unit column vector might look like this:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ v = \left[ {\begin{array}{*{20}{c}} 1\\ 0\\ 0\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |v| = 1" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Converting a vector into unit form is called normalizing and is achieved by dividing the vector's components by its magnitude. Its magnitude is retrieved by applying the Pythagorean Theorem.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |v| = \sqrt {{x^2} + {y^2} + {z^2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ {v_{unit}} = \frac{1}{{|v|}}\left[ {\begin{array}{*{20}{c}} x\\ y\\ z\\ \end{array}} \right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;An example might be:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ v = \left[ {\begin{array}{*{20}{c}} 1\\ 2\\ 3\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |v| = \sqrt {{1^2} + {2^2} + {3^2}} = \sqrt {14}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ {v_{unit}} = \frac{1}{{\sqrt{14}}}\left[ {\begin{array}{*{20}{c}} 1\\ 2\\ 3\\ 0\\ \end{array}} \right] \approx  \left[ {\begin{array}{*{20}{c}} 0.267\\ 0.535\\ 0.802\\ 0 \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Cartesian Unit Vectors&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Now that we have investigated the scalar multiplication of vectors, vector addition and subtraction and unit vectors, we can combine those to permit the algebraic manipulation of vectors (read more at [Vince][Lengyel]). A tool that helps to achieve this is called Cartesian unit vectors. The three Cartesian unit vectors &lt;strong&gt;i&lt;/strong&gt;, &lt;strong&gt;j&lt;/strong&gt; and &lt;strong&gt;k&lt;/strong&gt; are aligned with the x-, y- and z-axes.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ i = \left[ {\begin{array}{*{20}{c}} 1\\ 0\\ 0\\ 0\\ \end{array}}\right] j = \left[ {\begin{array}{*{20}{c}} 0\\ 1\\ 0\\ 0\\ \end{array}}\right] k = \left[ {\begin{array}{*{20}{c}} 0\\ 0\\ 1\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Any vector aligned with the x-, y- or z-axis can be defined by a scalar multiple of the corresponding unit vector &lt;strong&gt;i&lt;/strong&gt;, &lt;strong&gt;j&lt;/strong&gt; or &lt;strong&gt;k&lt;/strong&gt;. For example a vector 15 units long aligned with the y-axis is simply 15&lt;strong&gt;j&lt;/strong&gt;. A vector 25 units long aligned with the z-axis is 25&lt;strong&gt;k&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;By employing the rules of vector addition and subtraction, we can compose a vector &lt;strong&gt;R&lt;/strong&gt; by summing three scaled Cartesian unit vectors as follows.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = ai + bj + ck" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;This is equivalent to writing &lt;strong&gt;R&lt;/strong&gt; as&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ R = \left[ {\begin{array}{*{20}{c}} a\\ b\\ c\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The magnitude of R would then be computed as&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |R| = \sqrt {{a^2} + {b^2} + {c^2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Any pair of Cartesian vectors such as R and S can be combined as follows&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = ai + bj + ck" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = di + ej + fk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R \pm S = (a \pm d)i + (b \pm e)j + (c \pm f)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;An example would be&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = 2i + 3j + 4k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = 5i + 6j + 7k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R + S = 7i + 9j + 11k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |R + S| = \sqrt {{7^2} + {9^2} + {11^2}} \approx 15.84" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Vector Multiplication&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Vector multiplication provides some powerful ways of computing angles and surface orientations. While the multiplication of two scalars is a familiar operation, the multiplication of vectors is a multiplication of two 3D lines, which is not an easy operation to visualize. 
In vector analysis, there are generally two ways to multiply vectors: one results in a scalar value and the other one in a vector.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Scalar or Dot Product&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Multiplying the magnitudes of two vectors |R| and |S| is a valid operation but it ignores the orientation of the vectors, which is one of their important features. Therefore we want to include the angle between the vectors. In case of the scalar product, this is done by projecting one vector onto the other.&lt;br /&gt;&lt;br /&gt;&lt;img class="alignleft size-large wp-image-4919" src="http://altdevblogaday.com/wp-content/uploads/2011/04/dotProductGeometric-1024x605.jpg" alt="" width="695" height="410" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;11 - Projecting &lt;strong&gt;R&lt;/strong&gt; on &lt;strong&gt;S &lt;/strong&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The projection of &lt;strong&gt;R&lt;/strong&gt; on &lt;strong&gt;S&lt;/strong&gt; creates the basis for the scalar product, because it takes into account their relative orientation. The length of the projection of &lt;strong&gt;R&lt;/strong&gt; onto &lt;strong&gt;S&lt;/strong&gt; is&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ |R|cos\beta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Then we can multiply the projected length of &lt;strong&gt;R &lt;/strong&gt;with the magnitude of &lt;strong&gt;S&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ R \cdot S = |S||R|cos\beta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;or commonly written&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R \cdot S = |R||S|cos\beta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The &lt;img src="http://www.codecogs.com/eq.latex?  \cdot" alt="" /&gt; symbol is used to represent scalar multiplications and to distinguish it from the vector product, which employs the &lt;img src="http://www.codecogs.com/eq.latex? \ \times" alt="" /&gt; symbol. Because of this symbol, the scalar product is often referred to as the dot product. This geometric interpretation of the scalar product shows that in case the magnitude of &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt; is one -in other words they are unit vectors- the calculation of the scalar product only relies on &lt;img src="http://www.codecogs.com/eq.latex? \ cos \beta" alt="" /&gt;. The following figure shows a number of dot product scenarios.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/04/dotProductVectors.jpg"&gt;&lt;img class="alignleft size-large wp-image-4920" src="http://altdevblogaday.com/wp-content/uploads/2011/04/dotProductVectors-1024x661.jpg" alt="" width="556" height="358" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;12 - Dot product&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The geometric representation of the dot product is useful to imagine how it works but it doesn't map well to computer hardware. 
The algebraic representation maps better to computer hardware and is calculated with the help of Cartesian components:&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = R_xi + R_yj + R_zk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = S_xi + S_yj + S_zk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S = (R_xi + R_yj + R_zk) \cdot (S_xi + S_yj + S_zk) \\ = R_xi \cdot (S_xi + S_yj + S_zk) + R_yj \cdot (S_xi + S_yj + S_zk) + R_zk \cdot (S_xi + S_yj + S_zk)" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S = R_xS_xi \cdot i + R_xS_yi \cdot j + R_xS_zi \cdot k   \\ + R_yS_xj \cdot i + R_yS_yj \cdot j + R_yS_zj \cdot k  \\ + R_zS_xk \cdot i + R_zS_yk \cdot j + R_zS_zk \cdot k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;There are various dot product terms such as &lt;img src="http://www.codecogs.com/eq.latex?  i \cdot i, i \cdot j, i \cdot k" alt="" /&gt; etc. in this equation. With the help of the geometric representation of the dot product it can be determined that terms with mutually perpendicular vectors like &lt;img src="http://www.codecogs.com/eq.latex?  i \cdot j, i \cdot k, j \cdot k" alt="" /&gt; are zero because the cosine of 90 degrees is zero. This leads to&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S = R_xS_xi \cdot i + R_yS_yj \cdot j + R_zS_zk \cdot k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Finally, terms in which a vector is multiplied with itself lead to a value of one, because the cosine of zero degrees is one and the Cartesian vectors are all unit vectors, which leads to&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ i \cdot i = |i||i|cos(0)= 1" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;So we end up with the familiar equation&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\\ R \cdot S = R_xS_x + R_yS_y + R_zS_z" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;An example:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = \left[ {\begin{array}{*{20}{c}} 2\\ 0\\ 4\\ 0\\ \end{array}}\right] S = \left[ {\begin{array}{*{20}{c}} 5\\ 6\\ 10\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S= |R||S|cos \beta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |R| = \sqrt {{2^2} + {0^2} + {4^2}} \approx 4.472" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |S| = \sqrt {{5^2} + {6^2} + {10^2}} \approx 12.689" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Comparing the two ways of calculating the scalar product shows the same result:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S= |R||S|cos \beta = 2 * 5 + 0 * 6 + 4 * 10 = 50" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot S= 12.689 * 4.472  cos \beta = 50" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ cos \beta = \frac{50}{12.689 * 4.472} \approx 0.8811" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Solving for &lt;img src="http://www.codecogs.com/eq.latex? \\  \beta" alt="" /&gt; leads to the angle between the two vectors:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\  \beta = cos^{-1} (0.8811) \approx 28.22^\circ" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The angle returned by the scalar or dot product ranges between &lt;img src="http://www.codecogs.com/eq.latex? \\  0^\circ" alt="" /&gt; and &lt;img src="http://www.codecogs.com/eq.latex? \\  180^\circ" alt="" /&gt;, because, as the angle between two vectors increases beyond &lt;img src="http://www.codecogs.com/eq.latex? \\  180^\circ" alt="" /&gt; the returned angle &lt;img src="http://www.codecogs.com/eq.latex? 
\\  \beta" alt="" /&gt; is always the smallest angle associated with the geometry.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Scalar Product in Lighting Calculations&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Many games utilize the Blinn-Phong lighting model (see &lt;a title="Wikipedia" href="http://en.wikipedia.org/wiki/Blinn%E2%80%93Phong_shading_model" target="_blank"&gt;Wikipedia&lt;/a&gt;; ignore the code on this page). Part of the diffuse component of this lighting model is the Lambert's Law term, published in 1760. Lambert stated that the intensity of illumination on a diffuse surface is proportional to the cosine of the angle between the surface normal vector and the light source direction.&lt;br /&gt;&lt;br /&gt;Let's assume our light source is located in our reference space for lighting at (20, 30, 40), while the head of our unit normal vector is located at (0, 11, 0). The point where the intensity of illumination is measured, which is also the tail of the normal, is located at (0, 10, 0).&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/DotProductLighting.jpg"&gt;&lt;/a&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/DotProductLighting.jpg"&gt;&lt;img class="alignleft size-large wp-image-6746" src="http://altdevblogaday.com/wp-content/uploads/2011/05/DotProductLighting-1024x765.jpg" alt="" width="695" height="519" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br 
/&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure &lt;/em&gt;13 - Lighting Calculation&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The light and normal vector are calculated by subtracting the position of the point where the intensity is measured -representing their tails- from their heads.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ L = \left[ {\begin{array}{*{20}{c}} 20 - 0\\ 30 - 10\\ 40 - 0\\ 0\\ \end{array}}\right] N = \left[ {\begin{array}{*{20}{c}} 0\\ 11 - 10\\ 0\\ 0\\ \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ L \cdot N= |L||N|cos \beta = 20  * 0 + 20 * 1 + 40 * 0 = 20" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |L| = \sqrt {{20^2} + {20^2} + {40^2}} \approx 48.9898" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |N| = 1" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ L \cdot N= 48.9898 * 1.0 * cos \beta = 20" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ cos \beta = \frac{20}{48.9898 * 1.0} \approx 0.4082" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Instead of using the original light vector, the following scalar product normalizes the light vector first, before using it in the lighting equation.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ {L_{unit}} = \frac{1}{{|L|}}\left[ {\begin{array}{*{20}{c}} x\\ y\\ z\\ \end{array}} \right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ {L_{unit}} = \frac{1}{{48.9898}}\left[ {\begin{array}{*{20}{c}} 20\\ 20\\ 40\\ 0\\ \end{array}} \right] \approx  \left[ {\begin{array}{*{20}{c}} 0.4082\\ 0.4082\\ 0.8165\\ 0 \end{array}}\right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;To test if the light vectors magnitude is one:&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
|L_{unit}| = \sqrt {{0.4082^2} + {0.4082^2} + {0.8165^2}} \approx 1.0" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Plugging the unit light vector and the unit normal vector into the algebraic representation of the scalar product gives:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ L \cdot N= |L||N|cos \beta = 0.4082  * 0 + 0.4082 * 1 + 0.8165 * 0 = 0.4082" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Now solve the geometrical representation for the cosine of the angle:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ L \cdot N= |L||N|cos \beta = 0.4082" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ cos \beta = \frac{0.4082}{1.0 * 1.0} = 0.4082" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;If the light and the normal vector are unit vectors, the result of the algebraic scalar product calculation equals the cosine of the angle. The algebraic scalar product is implemented in the dot product intrinsic available for the CPU and GPU. In other words, if the involved vectors are unit vectors, a processor can calculate the cosine of the angle faster, because no division by the magnitudes is needed. This is the reason why normalized vectors might be more efficient in programming computer hardware.&lt;br /&gt;&lt;br /&gt;Following Lambert's law, the intensity of illumination on a diffuse surface is proportional to the cosine of the angle between the surface normal and the light source direction. That means that the point at (0, 10, 0) receives about 0.4082 of the original light intensity at (20, 30, 40) (attenuation is not considered in this example).&lt;br /&gt;&lt;br /&gt;Coming back to image 12, if the unit light vector had a y component of one or minus one, and therefore x and z components of zero, it would point in the same or opposite direction as the normal and the last equation would result in one or minus one. 
If the unit light vector instead had a z or x component equal to one, so that the other components were zero, it would be perpendicular to the normal and those equations would result in zero.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The Vector Product&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Like the scalar product, the vector or cross product depends on the modulus of two vectors and the angle between them, but the result of the vector product is essentially different: it is another vector, at right angles to both the original vectors.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = T" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ |T| = |R||S|sin\theta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;For an understanding of the vector product of &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt;, it helps to imagine a plane through those two vectors as shown in figure 14.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductInAPlane.jpg"&gt;&lt;img class="alignleft size-large wp-image-6752" src="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductInAPlane-1024x680.jpg" alt="" width="695" height="461" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure&lt;/em&gt; 14 - Vector Product&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The angle &lt;img 
src="http://www.codecogs.com/eq.latex? \\ \theta" alt="" /&gt; between the directions of the vectors satisfies &lt;img src="http://www.codecogs.com/eq.latex? \\ 0 \leq \theta \leq 180^\circ" alt="" /&gt;. There are two possible choices for the direction of the resulting vector, each the negation of the other; the one chosen here is determined by the right-hand rule. Hold your right hand so that your forefinger points forward, your middle finger points out to the left, and your thumb points up. If you roughly align your forefinger with &lt;strong&gt;R&lt;/strong&gt;, and your middle finger with &lt;strong&gt;S&lt;/strong&gt;, then the cross product will point in the direction of your thumb.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductHandRule.jpg"&gt;&lt;img src="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductHandRule.jpg" alt="" width="398" height="340" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure&lt;/em&gt; 15 - Right-Hand rule Vector Product&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The resulting vector of the cross product is perpendicular to &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt;, that is&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \cdot T = 0" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ S \cdot T = 0" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The two vectors &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt; can be orthogonal but do not have to be. This makes the vector product an ideal way of computing normals. 
A property of the vector product that will be covered later is that the magnitude of &lt;strong&gt;T&lt;/strong&gt; is the area of the parallelogram defined by &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt;.&lt;br /&gt;&lt;br /&gt;Let's multiply two vectors together using the vector product.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = R_xi + R_yj + R_zk" alt="" /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = S_xi + S_yj + S_zk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = (R_xi + R_yj + R_zk) \times (S_xi + S_yj + S_zk) \\ = R_xi \times (S_xi + S_yj + S_zk) + R_yj \times (S_xi + S_yj + S_zk) + R_zk \times (S_xi + S_yj + S_zk)" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = R_xS_xi \times i + R_xS_yi \times j + R_xS_zi \times k   \\ + R_yS_xj \times i + R_yS_yj \times j + R_yS_zj \times k  \\ + R_zS_xk \times i + R_zS_yk \times j + R_zS_zk \times k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;There are various vector product terms such as &lt;img src="http://www.codecogs.com/eq.latex?\\ i \times i, i \times j, i \times k" alt="" /&gt; etc. in this equation. The terms &lt;img src="http://www.codecogs.com/eq.latex? \\ i \times i, j \times j, k \times k" alt="" /&gt; will result in a vector whose magnitude is zero, because the angle between those vectors is &lt;img src="http://www.codecogs.com/eq.latex?  \\ 0^\circ" alt="" /&gt;, and sin&lt;img src="http://www.codecogs.com/eq.latex? \\ 0^\circ = 0" alt="" /&gt;. This leaves&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = R_xS_yi \times j + R_xS_zi \times k  + R_yS_xj \times i + R_yS_zj \times k + R_zS_xk \times i + R_zS_yk \times j" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The other products between the unit vectors can be reasoned as:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\\i \times j = k \\ j \times i = -k \\ j \times k = i \\ k \times j = -i \\ k \times i = j \\ i \times k = -j" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Those results show that the commutative law of multiplication does not apply to vector products: swapping the operands reverses the sign. In other words&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\i \times j  \neq j \times i" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Applying those findings reduces the vector product term to&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = R_xS_yk - R_xS_zj  - R_yS_xk + R_yS_zi  + R_zS_xj - R_zS_yi" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Now re-grouping the equation to bring like terms together leads to:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = (R_yS_z - R_zS_y)i + (R_zS_x - R_xS_z)j + (R_xS_y - R_yS_x)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;To achieve a visual pattern for remembering the vector product, some authors reverse the sign of the &lt;strong&gt;j&lt;/strong&gt; scalar term.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = (R_yS_z - R_zS_y)i - (R_xS_z - R_zS_x)j + (R_xS_y - R_yS_x)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Re-writing the vector product as determinants might help to memorize it as well.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = \begin{vmatrix} R_y &amp;amp; R_z \\ S_y &amp;amp; S_z \end{vmatrix} i - \begin{vmatrix} R_x &amp;amp; R_z \\ S_x &amp;amp; S_z\end{vmatrix}j + \begin{vmatrix} R_x &amp;amp; R_y \\ S_x &amp;amp; S_y \end{vmatrix}k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;A 2x2 determinant is the product of the terms on the main diagonal minus the product of the terms on the other diagonal. With determinants a "recipe" for a vector product consists of the following steps:&lt;br /&gt;&lt;br /&gt;1. Write the two vectors that should be multiplied as Cartesian vectors&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\ R = R_xi + R_yj + R_zk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = S_xi + S_yj + S_zk" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;2. Write the cross product of those two vectors in determinant form, if this helps to memorize the process; otherwise skip to step 3.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = \begin{vmatrix} R_y &amp;amp; R_z \\ S_y &amp;amp; S_z \end{vmatrix} i - \begin{vmatrix} R_x &amp;amp; R_z \\ S_x &amp;amp; S_z\end{vmatrix}j + \begin{vmatrix} R_x &amp;amp; R_y \\ S_x &amp;amp; S_y \end{vmatrix}k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;3. Then compute by plugging in the numbers into&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = (R_yS_z - R_zS_y)i - (R_xS_z - R_zS_x)j + (R_xS_y - R_yS_x)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;A simple example of a vector product calculation is to show that the assumptions that were made above, while simplifying the vector product, hold up.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\i \times j = k \\ j \times i = -k \\ j \times k = i \\ k \times j = -i \\ k \times i = j \\ i \times k = -j" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;To show that there is a sign reversal when the vectors are reversed &lt;img src="http://www.codecogs.com/eq.latex? \\i \times k = -j,  k \times i = j" alt="" /&gt;, let's calculate the cross product of those terms.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ i = 1i + 0j + 0k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ k = 0i + 0j + 1k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\\ i \times k = \begin{vmatrix} 0 &amp;amp; 0 \\ 0 &amp;amp; 1 \end{vmatrix} i - \begin{vmatrix} 1 &amp;amp; 0 \\ 0 &amp;amp; 1\end{vmatrix}j + \begin{vmatrix} 1 &amp;amp; 0 \\ 0 &amp;amp; 0 \end{vmatrix}k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ i \times k = (0  * 1 - 0  * 0)i - (1  * 1 - 0  * 0)j + (1  * 0 - 0  * 0)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The i and k terms are both zero, but the j term is -1, which makes &lt;img src="http://www.codecogs.com/eq.latex? \\i \times k = -j" alt="" /&gt;. Now reversing the vector product&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ k = 0i + 0j + 1k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ i = 1i + 0j + 0k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ k \times i = \begin{vmatrix} 0 &amp;amp; 1 \\ 0 &amp;amp; 0 \end{vmatrix} i - \begin{vmatrix} 0 &amp;amp; 1 \\ 1 &amp;amp; 0\end{vmatrix}j + \begin{vmatrix} 0 &amp;amp; 0 \\ 1 &amp;amp; 0 \end{vmatrix}k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ k \times i = (0  * 0 - 1  * 0)i - (0  * 0 - 1  * 1)j + (0  * 0 - 0  * 1)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;Which shows &lt;img src="http://www.codecogs.com/eq.latex? \\k \times i = j" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Deriving a Unit Normal Vector for a Triangle&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Image 16 shows a triangle with vertices defined in anti-clockwise order. The side pointing towards the viewer is defined as the visible side in this scene. 
That means that the normal is expected to point roughly in the direction of where the viewer is located.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/DerivingAUnitNormal.jpg"&gt;&lt;img class="alignleft size-large wp-image-6952" src="http://altdevblogaday.com/wp-content/uploads/2011/05/DerivingAUnitNormal-1024x589.jpg" alt="" width="695" height="399" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure&lt;/em&gt; 16 - Deriving a Unit Normal Vector&lt;/em&gt;&lt;br /&gt;The vertices of the triangle are:&lt;br /&gt;&lt;br /&gt;P1 (0, 2, 1)&lt;br /&gt;P2 (0, 1, 4)&lt;br /&gt;P3 (2, 0, 1)&lt;br /&gt;&lt;br /&gt;The two vectors &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt; are retrieved by subtracting the vertex at the tail from the vertex at the head. Here both vectors have their tails at P3; &lt;strong&gt;R&lt;/strong&gt; has its head at P1 and &lt;strong&gt;S&lt;/strong&gt; has its head at P2.&lt;br /&gt;&lt;br /&gt;Δx = (x&lt;sub&gt;h&lt;/sub&gt; - x&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;Δy = (y&lt;sub&gt;h&lt;/sub&gt; - y&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;Δz = (z&lt;sub&gt;h&lt;/sub&gt; - z&lt;sub&gt;t&lt;/sub&gt;)&lt;br /&gt;&lt;br /&gt;Bringing the result into the Cartesian form&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ R = (0-2)i + (2-0)j + (1-1)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ S = (0-2)i + (1-0)j + (4-1)k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? 
\\ R \times S = \begin{vmatrix} 2 &amp;amp; 0 \\ 1 &amp;amp; 3 \end{vmatrix} i - \begin{vmatrix} -2 &amp;amp; 0 \\ -2 &amp;amp; 3\end{vmatrix}j + \begin{vmatrix} -2 &amp;amp; 2 \\ -2 &amp;amp; 1 \end{vmatrix}k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ R \times S = (2 * 3 - 0 * 1)i - (-2 * 3 - 0 * -2)j + (-2 * 1 - 2 * -2)k" alt="" /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ N = 6i + 6j + 2k" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |N| = \sqrt {{6^2} + {6^2} + {2^2}}" alt="" /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? |N| = \sqrt {{76}} \approx 8.7178" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ {N_{unit}} = \frac{1}{{|N|}}\left[ {\begin{array}{*{20}{c}} 6\\ 6\\ 2\\ \end{array}} \right]" alt="" /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \ {N_{unit}} = \frac{1}{{8.7178}}\left[ {\begin{array}{*{20}{c}} 6\\ 6\\ 2\\ \end{array}} \right] \approx \left[ {\begin{array}{*{20}{c}} 0.6882\\ 0.6882\\ 0.2294\\ \end{array}} \right]" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;It is a common mistake to believe that if &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt; are unit vectors, the cross product will also be a unit vector. The vector product equation shows that this is only true when the angle between the two vectors is 90 degrees and therefore the sine of the angle theta is 1.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ |T| = |R||S|sin\theta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;See [Van Verth] for CPU implementation details.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Areas&lt;/strong&gt;&lt;br /&gt;The vector product might be used to determine the area of a parallelogram or a triangle (with the vertices at P&lt;sub&gt;1&lt;/sub&gt; - P&lt;sub&gt;3&lt;/sub&gt;). 
Image 17 shows the two vectors helping to form a parallelogram and a triangle.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductAreaCalculation.jpg"&gt;&lt;img class="alignleft size-large wp-image-6955" src="http://altdevblogaday.com/wp-content/uploads/2011/05/VectorProductAreaCalculation-1024x590.jpg" alt="" width="695" height="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;em&gt;Figure&lt;/em&gt; 17 - Deriving the Area of a Parallelogram / Triangle with the Vector Product&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The height h is &lt;img src="http://www.codecogs.com/eq.latex? \\ h = |S|sin\theta" alt="" /&gt;, therefore the area of the parallelogram is&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\ area = |R|*h = |R||S|sin\theta" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;This equals the magnitude of the cross product vector &lt;strong&gt;T&lt;/strong&gt;. Thus when we calculate the vector product of &lt;strong&gt;R&lt;/strong&gt; and &lt;strong&gt;S&lt;/strong&gt;, the length of the resulting non-normalized normal vector equals the area of the parallelogram formed by those vectors. The triangle forms half of the parallelogram and therefore has half of the area.&lt;br /&gt;&lt;br /&gt;area of parallelogram = &lt;img src="http://www.codecogs.com/eq.latex? \\ |T|" alt="" /&gt;&lt;br /&gt;area of triangle =&lt;img src="http://www.codecogs.com/eq.latex? 
\\  \frac{1}{{2}}|T|" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;or&lt;br /&gt;&lt;br /&gt;area of triangle =&lt;img src="http://www.codecogs.com/eq.latex? \\  \frac{1}{{2}}|R \times S|" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;To compute the surface area of a mesh constructed from triangles, the magnitudes of its non-normalized face normals can be summed and halved like this (for a mesh of parallelograms the division by two is omitted).&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.codecogs.com/eq.latex? \\  \frac{MagnitudeOfAllNormals }{{2}}" alt="" /&gt;&lt;br /&gt;&lt;br /&gt;The direction of the normal, rather than its magnitude (which is never negative), shows if the vertices are clockwise or counter-clockwise oriented: reversing the vertex order reverses the vector product and therefore flips the normal.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;[Azad] Kalid Azad, "Math Better Explained", &lt;a href="http://betterexplained.com/articles/math-betterexplained-ebook-available/"&gt;http://betterexplained.com/articles/math-betterexplained-ebook-available/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[Eberly] David H. Eberly, "3D Game Engine Design", p. 15, 2nd Edition, Morgan Kaufmann 2007&lt;br /&gt;&lt;br /&gt;[Farin] Gerald Farin, Dianne Hansford, "The Geometry Toolbox - For Graphics and Modeling", p. 16, AK Peters 1998&lt;br /&gt;&lt;br /&gt;[Lengyel] Eric Lengyel, "Mathematics for 3D Game Programming and Computer Graphics", Second Edition, Charles River Media 2003&lt;br /&gt;&lt;br /&gt;[Vince] John Vince, "Mathematics for Computer Graphics", Springer, 3rd Edition, 2010&lt;br /&gt;&lt;br /&gt;[Van Verth] James M. Van Verth, Lars M. 
Bishop, "Essential Mathematics for Games &amp;amp; Interactive Applications - A Programmer's Guide", Morgan Kaufmann 2004&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4046500821353565663?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4046500821353565663/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4046500821353565663' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4046500821353565663'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4046500821353565663'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/05/points-vertices-and-vectors.html' title='Points, Vertices and Vectors'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5358135164480502994</id><published>2011-03-15T12:23:00.000-07:00</published><updated>2012-01-26T12:25:52.012-08:00</updated><title type='text'>Thoughts on the knowledge of an up-to-date Graphics Programmer</title><content type='html'>I am teaching a class called "GPU Programming" at UCSD -now the second time-. While getting feedback from my students this year, I realized that the amount of knowledge to program a next-gen graphics engine is pretty high. 
For my class I only look at certain pieces of a graphics engine that are easy to modularize. For each of those modules I explain the whole architecture, how it maps to a GPU and why you want to architect it the way I am describing it. I don't cover many of the smaller parts of a renderer, like the streaming system, object serialization, memory management etc.&lt;br /&gt;&lt;br /&gt;There are certainly more pieces that you could talk about; I just picked the ones I believe can easily be separated from the renderer or the ones that I consider the more important features of a renderer. Here is the list from the last class:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;DirectX 11 API&lt;/li&gt;&lt;li&gt;Deferred Lighting / MSAA and more&lt;/li&gt;&lt;li&gt;Order-Independent Transparency&lt;/li&gt;&lt;li&gt;Shadows: Cascaded, Cube, Soft Shadows and more&lt;/li&gt;&lt;li&gt;PostFX: HDR, Depth of Field, Motion Blur, Color Filters and more&lt;/li&gt;&lt;li&gt;GPU Particle System&lt;/li&gt;&lt;li&gt;Real-Time Dynamic Global Illumination - several techniques&lt;/li&gt;&lt;li&gt;CUDA, DirectCompute&lt;/li&gt;&lt;/ul&gt;What follows is a short overview of why I believe those are important topics for future graphics programmers:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;DirectX 11 API&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;I am totally agnostic to any graphics API. I don't care which API I am using as long as the API exposes all the hardware features. In fact in the last two years I worked more with OpenGL ES 2.0 on mobile devices (that unfortunately doesn't expose many hardware features) than with DirectX 11. My students at UCSD prefer OpenGL. The reason why I expose them in one session to DirectX 11 is that this API currently exposes more features than OpenGL (although OpenGL is catching up ... fortunately). On Windows platforms DirectX has better driver support, while on Apple platforms you will want to use OpenGL. 
As far as I know Apple is beta testing OpenGL 3.2 and is otherwise still on 2.0.&lt;br /&gt;&lt;br /&gt;I am trying to teach the Direct3D API by highlighting the concepts and not talking too much about API calls and parameters. Having learned one API should enable anyone to use any other graphics API on their own, because all the underlying principles are the same. Graphics APIs are not that different anymore; just the amount of hardware functionality they expose is different.&lt;br /&gt;&lt;br /&gt;One thing that is remarkable is that there is no good Direct3D 11 book available. Together with others I published a short Direct3D 10 book that was mostly written about 3 - 4 years ago at [&lt;a title="Programming Vertex, Geometry, and Pixel Shaders" href="http://wiki.gamedev.net/index.php/D3DBook:Book_Cover" target="_blank"&gt;Direct3D10&lt;/a&gt;]. There is a book by A.K. Peters coming out on &lt;a href="Practical Rendering and Computation With Direct3d 11" target="_blank"&gt;Direct3D 11&lt;/a&gt;; several of the authors who worked on the Direct3D 10 book worked on this one too, and it looks promising.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Deferred Lighting / MSAA / Order-Independent Transparency&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Nowadays, it is easy to say that Deferred Lighting is a standard setup for a rendering system. Using Deferred Lighting in a rendering system streamlines your renderer design in a certain direction, so you have to be fully aware of the side effects of Deferred Lighting.&lt;br /&gt;&lt;br /&gt;Currently we still distinguish between rendering opaque and transparent objects. Only opaque objects get the Deferred Lighting treatment, and transparent objects -that can't be rendered into the depth buffer- require a simplified lighting model that is only applied to transparent objects.&lt;br /&gt;&lt;br /&gt;If we want to reach CG movie lighting and shadows, we need thousands of lights and hundreds of shadows. 
I think most hardware can now render thousands of lights for objects that are in the depth buffer; the shadows are harder to achieve. Unfortunately there is no generic solution for rendering shadows on transparent objects either.&lt;br /&gt;&lt;br /&gt;Designing a renderer so that it supports Order-Independent Transparency (OIT) might help here, although currently available techniques running on average hardware are still too expensive.&lt;br /&gt;&lt;br /&gt;Following the development of OIT is certainly of great interest to graphics programmers, so I added this topic to the curriculum of my class.&lt;br /&gt;&lt;br /&gt;MSAA is expensive when used with Deferred Lighting (it is commonly implemented by running the lighting/shadow shader per-sample on edges of objects and per-pixel everywhere else). MLAA doesn't cover moving objects very well although it is a good replacement for everything else.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Shadows: Cascaded, Cube, Soft Shadows and more&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Shadows are still expensive because they consume lots of vertex/geometry shader cycles and/or are memory bandwidth hungry. For cube shadow maps, my last article on a typical Ellipsoid Light Shadow setup can be found &lt;a title="Shadows" href="http://altdevblogaday.com/2011/02/28/shadows-thoughts-on-ellipsoid-light-shadow-rendering/" target="_blank"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Cascaded Shadow maps and the future development for outdoor shadows represent a natural Level-of-Shadow (LOS) system. With each cascade, the shadow resolution gets lower, so the ratio of shadow map area to on-screen pixels is already part of the approach, with future approaches probably offering a more detailed LOS system. 
The expectation is that the "Multi-Frustum Shadow" approach taken with Cascaded Shadow Maps will be brought to the next level with finer granularity and better LOS.&lt;br /&gt;&lt;br /&gt;Cube shadow maps can cover many different light types. Like with Cascaded Shadow Maps, the culling of objects and therefore the amount of geometry rendered into those maps is a challenge. Their error distribution is better than that of their closest competitor, Dual-Paraboloid Shadow maps, which makes them preferable.&lt;br /&gt;&lt;br /&gt;Soft Shadows are a refinement that will be available in more and more games. Rendering perceptually correct shadows that show a softer penumbra based on the distance of the occluder to the shadow receiver is a nice looking feature that should be widely available soon.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;PostFX: HDR, Depth of Field, Motion Blur, Color Filters and more&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Different parts of a modern PostFX pipeline fulfill very different tasks in a rendering system. Many of them deal with color quality in general, like HDR rendering and tone mapping. Others mimic real-world camera systems so that the player -who is expected to be accustomed to all the errors camera lenses introduce- feels comfortable while playing a game.&lt;br /&gt;&lt;br /&gt;Depth of Field and motion blur are quite often used to get over Level-of-Detail (LOD) rendering shortcomings. In an open world game Depth of Field can be used to hide the fact that the buildings 200 meters away from the camera use a lower LOD level. Motion blur is usually used to offer the sense of speed.&lt;br /&gt;In recent development, very nice looking Depth of Field with Bokeh is used to guide the attention of the user to certain parts of the screen. 
Those effects are more expensive, although you can also run them on an integrated GPU like Intel's Sandy Bridge [&lt;a title="RawK" href="http://www.confettispecialfx.com/rawk®-graphics-demo-for-sandy-bridge" target="_blank"&gt;RawK&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;GPU Particle System&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Modern particle systems are used to cover much more than just explosion effects. They can represent liquid or small objects that are flocking, cast light and shadows, and exhibit many other behavior patterns like collision response, flight physics or, for example, leaf, trash or grass behavior.&lt;br /&gt;Mimicking those systems is now part of a graphics sub-system that preferably runs on the GPU to achieve large numbers of particles. As long as all the memory access is happening on the GPU in "streaming" patterns, those systems can simulate very high numbers of particles.&lt;br /&gt;&lt;br /&gt;With a full-featured list of requirements for a next-gen particle system, it should be easy to define a position for one or more programmers who deal only with this system.&lt;br /&gt;A GPU Particle System is a "mini" game engine with all the features of a game engine like drawing, simulation, collision detection, collision response, audio support, networking etc. It demonstrates the GPU usage patterns of next-gen engines.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Real-Time Dynamic Global Illumination - several techniques&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;It looks like the next level of detail in lighting is expected to be what is commonly named Global Illumination. Everything said about Deferred Lighting and shadows also applies to Global Illumination, so shadows are more difficult and transparent objects are challenging.&lt;br /&gt;Whatever the Global Illumination technique of choice is, the most critical aspect is that it is fully dynamic and does not occupy much memory. 
A typical system based on a Light Propagation Volume approach consumes about 1.5 to 2.5 MB of memory and extends the shadow map already used, following the Reflective Shadow Map idea developed by Carsten Dachsbacher et al.&lt;br /&gt;Looking at it from a bird's-eye view, Reflective Shadow Maps seem to be a good starting point for any development in the area. Collecting the bouncing diffuse and specular light and the occlusion "somewhere" and then re-applying the light data to a scene is difficult while balancing quality and performance [Dachsbacher], [DachsbacherSii], [Kaplanyan].&lt;br /&gt;As usual I keep stressing the fact that whatever we do should be as "dynamic" as possible without the usage of look-up textures or light maps. That especially applies to Global Illumination.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;CUDA, DirectCompute&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Another important topic for a graphics programmer is the new "General Programming" interfaces that allow programming the GPU more like a CPU. That means that algorithms that didn't fit well into the rasterized graphics pipeline assumed by most GPUs nowadays can be implemented more easily, as long as the data set is suitable for GPU usage.&lt;br /&gt;&lt;br /&gt;CUDA is a good entry point here. It gives a very good overview of how NVIDIA GPUs actually work, how different types of memory need to be involved and how code is executed on those GPUs.&lt;br /&gt;&lt;br /&gt;DirectCompute and OpenCL are more abstract and hide some of the valuable knowledge required for CUDA programming. 
In return, both are expected to work on all GPUs and are therefore more portable.&lt;br /&gt;&lt;br /&gt;All this being said, what will my future list for the class look like?&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Direct3D X API&lt;/li&gt;&lt;li&gt;CUDA, DirectCompute&lt;/li&gt;&lt;li&gt;Deferred Lighting / MSAA and more&lt;/li&gt;&lt;li&gt;Order-Independent Transparency&lt;/li&gt;&lt;li&gt;Shadows: Cascaded, Cube, Soft Shadows and more&lt;/li&gt;&lt;li&gt;PostFX: HDR, Depth of Field, Motion Blur, Color Filters and more&lt;/li&gt;&lt;li&gt;GPU Particle System&lt;/li&gt;&lt;li&gt;Real-Time Dynamic Global Illumination - several techniques&lt;/li&gt;&lt;/ul&gt;I will move CUDA, DirectCompute and/or OpenCL to the second lesson. Then I will replace parts of the implementation of the following lessons by implementing them with the GPGPU programming APIs. Deferred Lighting / AA will be tile-based, Order-Independent Transparency will use a compute API to store data, for example in a linked list, shadows will use new storage patterns that are driven by a simple rasterizer that creates depth values, PostFX will use blur kernels that are more random and complex with the help of the compute APIs, the GPU Particle System will do all the simulations with the help of the compute APIs ... 
and Real-Time Dynamic Global Illumination will store light data in volumes that are not evenly spaced out and calculate light propagation with the help of the compute API.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;[Dachsbacher] Carsten Dachsbacher, Marc Stamminger, “Reflective Shadow Maps”,&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/publications.html"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/publications.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/publications.html"&gt;&lt;/a&gt;[DachsbacherSii] Carsten Dachsbacher, Marc Stamminger, “Splatting Indirect Illumination”,&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf"&gt;&lt;/a&gt;[Kaplanyan] Anton Kaplanyan, Wolfgang Engel, Carsten Dachsbacher,&lt;br /&gt;“Diffuse Global Illumination with Temporally Coherent Light Propagation Volumes”, GPU Pro 2,&lt;br /&gt;pp 185 – 203, AK Peters, 2011&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5358135164480502994?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5358135164480502994/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5358135164480502994' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/5358135164480502994'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5358135164480502994'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/03/thoughts-on-knowledge-of-up-to-date.html' title='Thoughts on the knowledge of an up-to-date Graphics Programmer'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4925821427344093812</id><published>2011-02-28T09:01:00.000-08:00</published><updated>2012-01-26T12:26:00.514-08:00</updated><title type='text'>Shadows - Thoughts on Ellipsoid Light Shadow Rendering</title><content type='html'>A shadow system in a modern game needs to be able to mimic a wide range of shadows. The following text describes a shadow system that was used in the RawK® demo that is tailored to Intel's Sandy Bridge chipset [&lt;a title="RawK" href="http://www.confettispecialfx.com/rawk%C2%AE-graphics-demo-for-sandy-bridge" target="_blank"&gt;RawK&lt;/a&gt;].&lt;br /&gt;This demo prototypes the characteristics of an open world game, when it comes to indoor shadow rendering.  
In an open-world game where the viewer can go inside buildings and stay outside as well, there might be the following kinds of shadows:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Cloud shadows: most of the time the clouds are just projected down&lt;/li&gt;&lt;li&gt;Self-Shadowing for the main character or more characters: those are optional shadows with their own frustum that just cover characters' bodies close to the camera&lt;/li&gt;&lt;li&gt;Sun shadows: Cascaded Shadow Maps&lt;/li&gt;&lt;li&gt;Shadows from point, spot and other light types&lt;/li&gt;&lt;/ul&gt;For the first three types of shadows one might consider a shadow collector that collects the shadow data of all three types in a screen-space texture that is then filtered and applied to the scene.&lt;br /&gt;Shadows from point, spot and other light types might be cached. Trading memory against the effort of updating shadow maps makes sense on some platforms. The following text will focus on shadows coming from ellipsoidal and point lights, but similar thoughts apply to light types other than directional lights.&lt;br /&gt;Developing a shadow system for those light types usually means facing the following challenges:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Shadow Rendering&lt;/li&gt;&lt;li&gt;Shadow Caching&lt;/li&gt;&lt;li&gt;Shadow Bias value&lt;/li&gt;&lt;li&gt;Softening the Penumbra&lt;/li&gt;&lt;/ol&gt;&lt;strong&gt;Shadow Rendering&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;For point lights and similar light types, the favorite storage method is a cube texture map. Compared to its main competitor, the dual-paraboloid shadow map, it offers a more even error distribution. The hemispheric projection for dual-paraboloid shadow maps requires a high level of tessellation that might not be common in a game where normal maps mimic the finer details.&lt;br /&gt;&lt;br /&gt;Rendering into a cube shadow map can be done with a DirectX 10 and above capable graphics card in one draw call with the help of the geometry shader. 
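&lt;br /&gt;&lt;br /&gt;As a minimal sketch of how this one-draw-call routing could be set up (an assumption for illustration, not taken from the RawK® source): the geometry shader output carries the standard Direct3D 10 system value SV_RenderTargetArrayIndex, which selects the array slice a triangle is rasterized into. With the six cube faces bound as a six-slice render target or depth-stencil array, writing the face index into that field sends each emitted triangle to the matching face. The struct name PsIn and its members position and face mirror the names used in the shader listings of this article; the exact layout shown here is hypothetical.&lt;br /&gt;&lt;pre escaped="true" lang="C"&gt;// Hypothetical geometry shader output layout (assumed, for illustration only)&lt;br /&gt;struct PsIn&lt;br /&gt;{&lt;br /&gt;  float4 position : SV_Position;               // clip-space position for the selected cube face&lt;br /&gt;  uint   face     : SV_RenderTargetArrayIndex; // array slice (cube face 0..5) to rasterize into&lt;br /&gt;};&lt;/pre&gt;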
The performance of the geometry shader on some graphics cards is not as good as one would expect. In those cases it helps to move some of the calculations from the geometry into the vertex shader. The inner loop of a typical geometry shader used to render into a cube map might look like this:&lt;br /&gt;&lt;pre escaped="true" lang="C"&gt;// Loop over cube faces&lt;br /&gt;[unroll]&lt;br /&gt;for (int i = 0; i &amp;lt; 6; i++)&lt;br /&gt;{&lt;br /&gt;  // Translate the view projection matrix to the position of the light&lt;br /&gt;  float4x4 pViewProjArray = viewProjArray[i];&lt;br /&gt;&lt;br /&gt;  //&lt;br /&gt;  // translate&lt;br /&gt;  //&lt;br /&gt;  // access the row HLSL[row][column]&lt;br /&gt;  pViewProjArray[0].w += dot(pViewProjArray[0].xyz, -In[0].lightpos.xyz);&lt;br /&gt;  pViewProjArray[1].w += dot(pViewProjArray[1].xyz, -In[0].lightpos.xyz);&lt;br /&gt;  pViewProjArray[2].w += dot(pViewProjArray[2].xyz, -In[0].lightpos.xyz);&lt;br /&gt;  pViewProjArray[3].w += dot(pViewProjArray[3].xyz, -In[0].lightpos.xyz);&lt;br /&gt;&lt;br /&gt;  float4 pos[3];&lt;br /&gt;  pos[0] = mul(pViewProjArray, float4(In[0].position.xyz, 1.0));&lt;br /&gt;  pos[1] = mul(pViewProjArray, float4(In[1].position.xyz, 1.0));&lt;br /&gt;  pos[2] = mul(pViewProjArray, float4(In[2].position.xyz, 1.0));&lt;br /&gt;&lt;br /&gt;  // Use frustum culling to improve performance&lt;br /&gt;  float4 t0 = saturate(pos[0].xyxy * float4(-1, -1, 1, 1) - pos[0].w);&lt;br /&gt;  float4 t1 = saturate(pos[1].xyxy * float4(-1, -1, 1, 1) - pos[1].w);&lt;br /&gt;  float4 t2 = saturate(pos[2].xyxy * float4(-1, -1, 1, 1) - pos[2].w);&lt;br /&gt;  float4 t = t0 * t1 * t2;&lt;br /&gt;&lt;br /&gt;  [branch]&lt;br /&gt;  if (!any(t))&lt;br /&gt;  {&lt;br /&gt;   // Use backface culling to improve performance&lt;br /&gt;   float2 d0 = pos[1].xy * pos[0].w - pos[0].xy * pos[1].w;&lt;br /&gt;   float2 d1 = pos[2].xy * pos[0].w - pos[0].xy * pos[2].w;&lt;br /&gt;&lt;br /&gt;   [branch]&lt;br /&gt;   if (d1.x 
* d0.y &amp;gt; d0.x * d1.y || min(min(pos[0].w, pos[1].w), pos[2].w) &amp;lt; 0.0)&lt;br /&gt;   {&lt;br /&gt;    Out.face = i;&lt;br /&gt;&lt;br /&gt;    [unroll]&lt;br /&gt;    for (int k = 0; k &amp;lt; 3; k++)&lt;br /&gt;    {&lt;br /&gt;     Out.position = pos[k];&lt;br /&gt;     Stream.Append(Out);&lt;br /&gt;    }&lt;br /&gt;    Stream.RestartStrip();&lt;br /&gt;   }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;To relieve the workload of the geometry shader the offset and transformation code was moved into the vertex shader:&lt;br /&gt;&lt;pre escaped="true" lang="C"&gt;[Vertex shader]&lt;br /&gt;&lt;br /&gt;float4x4 viewProjArray[6];&lt;br /&gt;float3 LightPos;&lt;br /&gt;&lt;br /&gt;GsIn main(VsIn In)&lt;br /&gt;{&lt;br /&gt;  GsIn Out;&lt;br /&gt;&lt;br /&gt;  float3 position = In.position - LightPos;&lt;br /&gt;&lt;br /&gt;  [unroll]&lt;br /&gt;  for (int i=0; i&amp;lt;3; ++i)&lt;br /&gt;  {&lt;br /&gt;    Out.position[i] = mul(viewProjArray[i*2], float4(position.xyz, 1.0));&lt;br /&gt;    Out.extraZ[i] = mul(viewProjArray[i*2+1], float4(position.xyz, 1.0)).z;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;  return Out;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;//------------------------------------------------------------------------------&lt;br /&gt;[Geometry shader]&lt;br /&gt;&lt;br /&gt;#define POSITIVE_X 0&lt;br /&gt;#define NEGATIVE_X 1&lt;br /&gt;#define POSITIVE_Y 2&lt;br /&gt;#define NEGATIVE_Y 3&lt;br /&gt;#define POSITIVE_Z 4&lt;br /&gt;#define NEGATIVE_Z 5&lt;br /&gt;&lt;br /&gt;float4 UnpackPositionForFace(GsIn data, int face)&lt;br /&gt;{&lt;br /&gt;  float4 res = data.position[face/2];&lt;br /&gt;&lt;br /&gt;  [flatten]&lt;br /&gt;  if (face%2)&lt;br /&gt;  {&lt;br /&gt;    res.w = -res.w;&lt;br /&gt;    res.z = data.extraZ[face/2];&lt;br /&gt;    [flatten]&lt;br /&gt;    if (face==NEGATIVE_Y)&lt;br /&gt;     res.y = -res.y;&lt;br /&gt;    else&lt;br /&gt;     res.x = -res.x;&lt;br /&gt;  }&lt;br /&gt; return res;&lt;br /&gt;}&lt;br /&gt;&lt;br 
/&gt;[maxvertexcount(18)]&lt;br /&gt;void main(triangle GsIn In[3], inout TriangleStream&amp;lt;PsIn&amp;gt; Stream)&lt;br /&gt;{&lt;br /&gt;  PsIn Out;&lt;br /&gt;&lt;br /&gt;  // Loop over cube faces&lt;br /&gt;  [unroll]&lt;br /&gt;  for (int i = 0; i &amp;lt; 6; i++)&lt;br /&gt;  {&lt;br /&gt;    float4 pos[3];&lt;br /&gt;    pos[0] = UnpackPositionForFace(In[0], i);&lt;br /&gt;    pos[1] = UnpackPositionForFace(In[1], i);&lt;br /&gt;    pos[2] = UnpackPositionForFace(In[2], i);&lt;br /&gt;&lt;br /&gt;    // Use frustum culling to improve performance&lt;br /&gt;    float4 t0 = saturate(pos[0].xyxy * float4(-1, -1, 1, 1) - pos[0].w);&lt;br /&gt;    float4 t1 = saturate(pos[1].xyxy * float4(-1, -1, 1, 1) - pos[1].w);&lt;br /&gt;    float4 t2 = saturate(pos[2].xyxy * float4(-1, -1, 1, 1) - pos[2].w);&lt;br /&gt;    float4 t = t0 * t1 * t2;&lt;br /&gt;&lt;br /&gt;    [branch]&lt;br /&gt;    if (!any(t))&lt;br /&gt;    {&lt;br /&gt;     // Use backface culling to improve performance&lt;br /&gt;     float2 d0 = pos[1].xy * pos[0].w - pos[0].xy * pos[1].w;&lt;br /&gt;     float2 d1 = pos[2].xy * pos[0].w - pos[0].xy * pos[2].w;&lt;br /&gt;&lt;br /&gt;     [branch]&lt;br /&gt;     if (d1.x * d0.y &amp;gt; d0.x * d1.y || min(min(pos[0].w, pos[1].w), pos[2].w) &amp;lt; 0.0)&lt;br /&gt;     {&lt;br /&gt;      Out.face = i;&lt;br /&gt;&lt;br /&gt;      [unroll]&lt;br /&gt;      for (int k = 0; k &amp;lt; 3; k++)&lt;br /&gt;      {&lt;br /&gt;       Out.position = pos[k];&lt;br /&gt;       Stream.Append(Out);&lt;br /&gt;      }&lt;br /&gt;      Stream.RestartStrip();&lt;br /&gt;     }&lt;br /&gt;    }&lt;br /&gt;  }&lt;br /&gt;}&lt;/pre&gt;Cube shadow maps are not only useful to store point light shadows but shadows from other light types as well. 
For example, shadows from ellipsoidal lights, where each of the directions has its own attenuation value, can be stored in cube maps as well.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/shad1.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1336" src="http://altdevblogaday.com/wp-content/uploads/2011/02/shad1-300x168.jpg" alt="" width="300" height="168" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 1: Ellipsoid Lighting&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/shad2.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1343" src="http://altdevblogaday.com/wp-content/uploads/2011/02/shad2-300x168.jpg" alt="" width="300" height="168" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 2: 8 Ellipsoidal Light Shadow Maps&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/shad5.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1344" src="http://altdevblogaday.com/wp-content/uploads/2011/02/shad5-300x168.jpg" alt="" width="300" height="168" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 3: Ellipsoid Lighting&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows1.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1339" src="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows1-300x175.jpg" alt="" width="300" height="175" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 4: Many Shadows&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows2.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1331" src="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows2-300x175.jpg" alt="" width="300" height="175" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 5: Level 
Shadows&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows31.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1341" src="http://altdevblogaday.com/wp-content/uploads/2011/02/Shadows31-300x175.jpg" alt="" width="300" height="175" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 6: More Shadows&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Shadow Caching&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Depending on the amount of memory that is available on the platform, caching 16-bit depth cube shadow maps might become an option. For example, integrated graphics chips usually share memory with the CPU and might have a larger amount of (usually slower) memory available. Storing, for example, 100 16-bit cube shadow maps of 256x256x6 texels takes about 75 MB (256 x 256 x 6 texels x 2 bytes = 0.75 MB per map).&lt;br /&gt;&lt;br /&gt;To find a good caching algorithm, the following parameters might be useful:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Distance from shadow to camera&lt;/li&gt;&lt;li&gt;Size of shadow on screen&lt;/li&gt;&lt;li&gt;Is there a moving object in the area of the light / shadow?&lt;/li&gt;&lt;/ul&gt;Of those parameters and others, the question of whether anything moves in the area of influence of the light / shadow is certainly the most important one. As long as nothing moves or changes in the area of the light, an update of the shadow map is not necessary and the shadow data can stay unaltered in memory.&lt;br /&gt;&lt;br /&gt;Even if something is moving in the area of influence of the light, an update of the shadow map might not be necessary if the shadow is not easily visible from the point of view of the player. 
If the shadow is far away and it is hard to spot that an object is moving through the shadow, it would make sense to not update the map and to keep it cached.&lt;br /&gt;The question of whether a light whose shadow map covers only a very small visible area on screen needs to be updated follows a similar logic.&lt;br /&gt;&lt;br /&gt;If there is not enough memory available, caching might be restricted by distance, and maps are then moved in and out of the cache.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Shadow Bias Value&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The classical shadow mapping algorithm generates a binary value based on a comparison. Because this comparison relies on hardware precision, it is prone to generate slight errors in edge cases.&lt;br /&gt;In the case of a regular 2D shadow map, the usual solution is to introduce a shadow bias value. Commonly this value needs to be picked by the user, which makes it scene dependent. In the case of cube shadow maps that are attached to a moving light, there is no sensible way to pick a working value.&lt;br /&gt;&lt;br /&gt;Approximating the binary comparison with an exponential function will lead to better overall results [Salvi].&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/ESM.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1345" src="http://altdevblogaday.com/wp-content/uploads/2011/02/ESM-300x190.jpg" alt="" width="300" height="190" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 7: Exponential Shadow Mapping Function&lt;/em&gt;&lt;br /&gt;&lt;pre escaped="true" lang="C"&gt;// Occluder depth stored in the shadow map&lt;br /&gt;float depth = tex2D(ShadowSampler, pos.xy).x;&lt;br /&gt;// Exponential falloff instead of a binary depth test; k controls how sharp the transition is&lt;br /&gt;shadow = saturate(2.0 - exp((pos.z - depth) * k));&lt;/pre&gt;&lt;strong&gt;Softening the Penumbra&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;There are many approaches that cover the softening of the Penumbra. 
Certainly all the probability-based shadow filtering techniques that can leverage hardware filtering have a very good quality / performance ratio.&lt;br /&gt;Screen-space filtering to achieve perceptually correct cube shadow maps is an area where game developers have only just started to do research. An implementation is described in [&lt;a title="Massive Screen-Space Soft Point Light Shadows" href="http://www.confettispecialfx.com/massive-point-light-soft-shadows" target="_blank"&gt;Engel&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/16-PointLightSoftShadows.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1348" src="http://altdevblogaday.com/wp-content/uploads/2011/02/16-PointLightSoftShadows-300x225.jpg" alt="" width="300" height="225" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 8: 16 Screen-Space Soft Point Light Shadows&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://altdevblogaday.com/wp-content/uploads/2011/02/32-PointLightSoftShadows.jpg"&gt;&lt;img class="alignnone size-medium wp-image-1349" src="http://altdevblogaday.com/wp-content/uploads/2011/02/32-PointLightSoftShadows-300x168.jpg" alt="" width="300" height="168" /&gt;&lt;/a&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Image 9: 32 Screen-Space Soft Point Light Shadows&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Future Development&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Game developers try to move away from pre-calculated lighting and shadowing and any other pre-calculated data. 
The main reasons to do this are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;it is hard to mimic a 24-hour cycle with pre-calculated data&lt;/li&gt;&lt;li&gt;storing those light or radiosity maps on disk or even on the DVD / Blu-ray requires a lot of storage space&lt;/li&gt;&lt;li&gt;streaming the light or radiosity maps from disc or hard drive through hardware to the GPU consumes valuable memory bandwidth&lt;/li&gt;&lt;li&gt;geometry with light maps or radiosity maps is not destructible anymore (this is a reason to avoid any solution with pre-computed maps)&lt;/li&gt;&lt;li&gt;while the environment is lit nicely, it is hard to light characters in a way that is consistent with this environment&lt;/li&gt;&lt;/ul&gt;A shadow caching scheme might be one tool to remove pre-calculated data. Following the recent development in dynamic global illumination in the area of one-bounce lighting effects [Dachsbacher][DachsbacherSii][Kaplanyan], it is possible to store not only shadow data but also data for reflective shadow maps in cube maps. All the ideas mentioned above then apply to this approach. One question that then remains is whether it is best to cache the data in cube shadow maps or to use a memory area with higher density for this, like a Light Propagation Volume.&lt;br /&gt;In any case, temporal coherence can be used to improve shadows and global illumination data over time.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Acknowledgements&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;I want to thank my business partner Peter Santoki for the help, feedback and encouragement while implementing the ideas covered above. 
I would also like to thank Tim Martin for help in researching the general topic of cube shadow map rendering and Igor Lobanchikov for the cube map optimization trick.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;[Engel] Wolfgang Engel, "Massive Screen-Space Soft Point Light Shadows",&lt;br /&gt;&lt;a href="http://www.confettispecialfx.com/massive-point-light-soft-shadows"&gt;http://www.confettispecialfx.com/massive-point-light-soft-shadows&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[Dachsbacher] Carsten Dachsbacher, Marc Stamminger, "Reflective Shadow Maps",&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/publications.html"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/publications.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[DachsbacherSii] Carsten Dachsbacher, Marc Stamminger, "Splatting Indirect Illumination",&lt;br /&gt;&lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf"&gt;http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[Kaplanyan] Anton Kaplanyan, Wolfgang Engel, Carsten Dachsbacher,&lt;br /&gt;"Diffuse Global Illumination with Temporally Coherent Light Propagation Volumes", GPU Pro 2,&lt;br /&gt;pp 185 - 203, AK Peters, 2011&lt;br /&gt;&lt;br /&gt;[RawK] RawK®, &lt;a href="http://www.confettispecialfx.com/rawk%C2%AE-graphics-demo-for-sandy-bridge"&gt;http://www.confettispecialfx.com/rawk%C2%AE-graphics-demo-for-sandy-bridge&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[Salvi] Marco Salvi, "Rendering Filtered Shadows with Exponential Shadow Maps", ShaderX6&lt;br /&gt;Marco Salvi's website: &lt;a href="http://pixelstoomany.wordpress.com/?s=Exponential"&gt;http://pixelstoomany.wordpress.com/?s=Exponential&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/398682525365778708-4925821427344093812?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4925821427344093812/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4925821427344093812' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4925821427344093812'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4925821427344093812'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2011/02/shadows-thoughts-on-ellipsoid-light.html' title='Shadows - Thoughts on Ellipsoid Light Shadow Rendering'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4692718683222767452</id><published>2010-03-20T10:40:00.000-07:00</published><updated>2010-03-20T10:40:27.304-07:00</updated><title type='text'>Edge Detection Trick</title><content type='html'>Benualdo posted in the Light Pre-Pass Thread a cool trick on how to detect edges to run a per-sample shader for MSAA (just in case centroid sampling doesn't work for you). 
Here it is:&lt;br /&gt;----------&lt;br /&gt;another stupid trick for edge detection pass on platforms that support sampling the MSAA surface with linear sampling: sample the normal buffer twice, once with POINT sampling and once with LINEAR sampling. Use clip(-abs(L-P)+eps). The linear sampled value should be used to compute the lighting of "non-MSAA" texels in the same shader to avoid an extra pass.&lt;br /&gt;----------&lt;br /&gt;eps is a small threshold value that biases the texkill test: when the multisampled normals differ only slightly, the averaged value can be used to perform the lighting at non-MSAA resolution during the first pass as an optimization.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4692718683222767452?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4692718683222767452/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4692718683222767452' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4692718683222767452'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4692718683222767452'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2010/03/edge-detection-trick.html' title='Edge Detection Trick'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6431847953147509348</id><published>2010-02-27T10:30:00.000-08:00</published><updated>2010-02-27T10:37:49.667-08:00</updated><title type='text'>GPU Pro</title><content type='html'>There is a blog concerning the upcoming book GPU Pro at&amp;nbsp;&lt;a href="http://gpupro.blogspot.com/"&gt;http://gpupro.blogspot.com/&lt;/a&gt;. &lt;br /&gt;I posted the Table of Contents for GPU Pro. You can pre-order it on &lt;a href="http://www.amazon.com/gp/product/1568814720/ref=s9_simh_gw_p14_t1?pf_rd_m=ATVPDKIKX0DER&amp;pf_rd_s=center-2&amp;pf_rd_r=18KGKN63DF6R4GCHNV7W&amp;pf_rd_t=101&amp;pf_rd_p=470938631&amp;pf_rd_i=507846"&gt;Amazon here&lt;/a&gt;.&lt;br /&gt;There is another blog for GPU Pro 2 with a call for authors, in case you want to see your name written in golden letters in a book :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6431847953147509348?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6431847953147509348/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6431847953147509348' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6431847953147509348'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6431847953147509348'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2010/02/gpu-pro.html' title='GPU Pro'/><author><name>Wolfgang 
Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-624103099810146227</id><published>2010-01-31T09:31:00.000-08:00</published><updated>2010-01-31T10:17:11.539-08:00</updated><title type='text'>Hardware Tessellation</title><content type='html'>I was thinking about the advantages of Hardware Tessellation. I can mainly see three: &lt;br /&gt;- Compression&lt;br /&gt;Reduces on-disk storage as well as system and video memory usage -&gt; only the coarse mesh is stored&lt;br /&gt;Animation data is only stored for the coarse mesh&lt;br /&gt;- Memory bandwidth &lt;br /&gt;GPU fetches only vertex data of coarse mesh through PCI-E bus -&gt; higher vertex cache and fetch performance&lt;br /&gt;- Scalability&lt;br /&gt;Subdivision is recursive -&gt; offers auto-LOD with adaptive metrics&lt;br /&gt;&lt;br /&gt;With the DirectX 11 implementation it might also reduce the workload of the vertex shader because the shader transforms or animates only the coarse mesh. But if we add up the additional workload of the hull and domain shader it might be a wash.&lt;br /&gt;&lt;br /&gt;For console developers, being able to store more world geometry on disc and in memory would be a great advantage. Reducing the memory read bandwidth would also increase efficiency.&lt;br /&gt;The main question is whether tessellating the geometry puts such a huge workload on the GPU that it is not feasible. 
I would love to have some real-world data here ...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-624103099810146227?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/624103099810146227/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=624103099810146227' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/624103099810146227'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/624103099810146227'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2010/01/hardware-tessellation.html' title='Hardware Tessellation'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7843121246357066077</id><published>2009-12-30T21:46:00.001-08:00</published><updated>2010-01-16T21:33:20.326-08:00</updated><title type='text'>Direct3D 11 Overview</title><content type='html'>Here is a first draft for the data flow in the DirectX 11 rendering pipeline:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_2YU3pmPHKN4/S1KhDSPmotI/AAAAAAAAAcw/d38b4oA_DxM/s1600-h/DX11.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; 
cursor:hand;width: 303px; height: 400px;" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/S1KhDSPmotI/AAAAAAAAAcw/d38b4oA_DxM/s400/DX11.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5427577578743833298" /&gt;&lt;/a&gt;&lt;br /&gt;And here is the DirectCompute overview:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_2YU3pmPHKN4/Szw7AyiTTSI/AAAAAAAAAcQ/iAhd7TKPk_Y/s1600-h/DirectComputeCheatSheet.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 195px;" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/Szw7AyiTTSI/AAAAAAAAAcQ/iAhd7TKPk_Y/s400/DirectComputeCheatSheet.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5421272936198917410" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I would consider those now beta.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7843121246357066077?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7843121246357066077/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7843121246357066077' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7843121246357066077'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7843121246357066077'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/12/direct3d-11-overview.html' title='Direct3D 11 Overview'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2YU3pmPHKN4/S1KhDSPmotI/AAAAAAAAAcw/d38b4oA_DxM/s72-c/DX11.JPG' height='72' width='72'/><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4021830829048860265</id><published>2009-12-29T20:26:00.000-08:00</published><updated>2010-01-02T17:37:34.421-08:00</updated><title type='text'>Direct3D 10 Overview</title><content type='html'>I started working on a Direct3D 10 overview that only covers one page. Here is the latest version.&lt;p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_2YU3pmPHKN4/Sz_0vqlzrBI/AAAAAAAAAcg/CpDXxOB-r3U/s1600-h/D3D10CheatSheet.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 301px; height: 400px;" src="http://3.bp.blogspot.com/_2YU3pmPHKN4/Sz_0vqlzrBI/AAAAAAAAAcg/CpDXxOB-r3U/s400/D3D10CheatSheet.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5422321576101260306" /&gt;&lt;/a&gt;&lt;p&gt;&lt;br /&gt;Please note that this overview has nothing to do with the way the hardware works. It is just a diagram that shows the data flow and the usage of the Direct3D 10 API to stream the data through several logical stages that might be represented in hardware by one unit. 
If you are interested in the actual hardware design, I would recommend reading &lt;br /&gt;&lt;br /&gt;&lt;a href="http://graphics.stanford.edu/~kayvonf/papers/fatahalianCACM.pdf"&gt;A Closer Look at GPUs&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4021830829048860265?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4021830829048860265/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4021830829048860265' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4021830829048860265'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4021830829048860265'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/12/direct3d-10-overview.html' title='Direct3D 10 Overview'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2YU3pmPHKN4/Sz_0vqlzrBI/AAAAAAAAAcg/CpDXxOB-r3U/s72-c/D3D10CheatSheet.jpg' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7573478702769800184</id><published>2009-12-24T10:57:00.000-08:00</published><updated>2010-02-25T11:14:52.757-08:00</updated><title type='text'>New Links</title><content type='html'>I 
updated my list of links on the right side with some of the websites I keep an eye on.&lt;br /&gt;I have never met Brian Karis, but he has a few very forward-thinking posts on his blog. The same is true for Pierre Terdiman. He covers many non-graphics topics, and I believe I have been reading his blog and former website for 7 years (?). Aurelio Reis has some cool procedural stuff on his blog. Simon Green worked on some of the coolest stuff that you can find in the NVIDIA SDK. His blog has some interesting entries on how GPUs can nowadays render CG movie content in real time while CPUs still need a lot more time to do the same. Then I also added Mike Acton's blog. I wonder how I could have forgotten it, given how often Mike and I have met in the last few months. He is certainly one of the authorities on SPU and multi-core programming in the industry. I especially like his opinion regarding C++ and data-centric design. Lots of people have repeated this mantra in the last two years, but I heard it from him before that.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7573478702769800184?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7573478702769800184/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7573478702769800184' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7573478702769800184'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7573478702769800184'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/12/new-links.html' title='New Links'/><author><name>Wolfgang 
Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4440835170573279813</id><published>2009-11-29T11:14:00.001-08:00</published><updated>2009-11-29T11:15:47.819-08:00</updated><title type='text'>CSE 190 GPU Programming UCSD</title><content type='html'>I am going to teach GPU Programming in the upcoming quarter at UCSD. Look out for course CSE 190. Here is the announcement:&lt;br /&gt;&lt;br /&gt;Course Objectives:&lt;br /&gt;This course will cover techniques on how to implement 3D graphics&lt;br /&gt;techniques in an efficient way on the Graphics Processing Unit (GPU).&lt;br /&gt;&lt;br /&gt;Course Description:&lt;br /&gt;This course focuses on algorithms and approaches for programming a&lt;br /&gt;GPU, including vertex, hull, tessellator, domain, geometry, pixel and&lt;br /&gt;compute shaders. After an introduction into each of the algorithms,&lt;br /&gt;the students will learn step-by-step on how to implement those&lt;br /&gt;algorithms on the GPU. 
Particular subjects may include geometry&lt;br /&gt;manipulations, lighting, shadowing, real-time global illumination,&lt;br /&gt;image space effects and 3D Engine design.&lt;br /&gt;&lt;br /&gt;Example Textbook(s):&lt;br /&gt;A list of reading assignments will be given out each week.&lt;br /&gt;&lt;br /&gt;Laboratory work:&lt;br /&gt;Programming assignments.&lt;br /&gt;&lt;br /&gt;Very exciting :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4440835170573279813?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4440835170573279813/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4440835170573279813' title='16 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4440835170573279813'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4440835170573279813'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/11/cse-190-gpu-programming-ucsd.html' title='CSE 190 GPU Programming UCSD'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>16</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3541752327201661516</id><published>2009-11-29T10:42:00.000-08:00</published><updated>2009-12-01T07:50:03.412-08:00</updated><title type='text'>Order-Independent 
Transparency</title><content type='html'>Transparent objects that require alpha blending cannot simply be rendered on top of each other in a G-Buffer: blending two or more normals, depth or position values leads to wrong results.&lt;br /&gt;In other words, deferred lighting of objects that need to be visible through each other is not easily possible, because the data for an object that is visible through another object is lost in a G-Buffer that can only store one layer of normals, depth and position.&lt;br /&gt;The traditional way to work around this is a separate rendering path that deals with rendering and lighting transparent objects that need to be alpha blended. In essence that means there is a second, forward-rendered lighting system that usually has a lower quality than the deferred lights.&lt;br /&gt;This system breaks down as soon as you have more than a few dozen lights, because forward rendering can't handle that many. In that case it would be an advantage to use the same deferred lighting system for opaque objects and for transparent objects that require alpha blending.&lt;br /&gt;The simple case is, for example, windows, where you can look through one window and maybe two more windows behind it and see what is behind them. For example, you look through a window from the outside into a house, in the house is another glass wall through which you can look, and behind that glass wall is a freshwater tank that is lit ... etc. You get the idea.&lt;br /&gt;This would be the "light" case to solve. Much harder are scenarios in which the number of transparent objects that can be behind each other is much higher ... like with particles or a room full of transparent teapots :-).&lt;br /&gt;&lt;br /&gt;On DirectX 9 and DirectX 10 class hardware, one of the solutions that is mentioned for the problem of order-independent transparency is called Depth Peeling. 
It seems this technique was first described by Abraham Mammen ("Transparency and antialiasing algorithms Implemented with the virtual pixel maps technique", IEEE Computer Graphics and Applications, vol. 9, no. 4, pp. 43-55, July/Aug. 1989) and Paul Diefenbach ("Pipeline rendering: Interaction and realism through hardware-based multi-pass rendering", Ph.D. thesis, University of Pennsylvania, 1996, 152 pages) (I don't have access to those papers). A description of the implementation was given by Cass Everitt &lt;a href="http://developer.nvidia.com/object/Interactive_Order_Transparency.html"&gt;here&lt;/a&gt;. The idea is to extract each unique depth in a scene into layers. Those layers are then composited in depth-sorted order to produce the correct blended image.&lt;br /&gt;In other words: the standard depth test gives us the nearest fragment/pixel. The next pass over the scene gives us the second nearest fragment/pixel; the pass after that the third nearest fragment/pixel. Each pass after the first uses the depth buffer computed in the previous pass to "peel away" depth values that are less than or equal to the values in that depth buffer. All the values that are not "peeled away" are stored in another depth buffer. Pseudo code might look like this:&lt;br /&gt;&lt;br /&gt;const float bias = 0.0000001;&lt;br /&gt;&lt;br /&gt;// peel away pixels from previous layers&lt;br /&gt;// use a small bias to avoid precision issues.&lt;br /&gt;clip(In.pos.z - PreviousPassDepth - bias);&lt;br /&gt;&lt;br /&gt;By using the depth values from the previous pass for the following pass, multiple layers of depth can be stored. As soon as all the depth layers are generated, the G-Buffer data needs to be generated for each of the layers. This might be the color and normal render targets. 
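&lt;br /&gt;&lt;br /&gt;To make the peeling test above a bit more concrete, here is a minimal sketch of a pixel shader built around it. It is only an illustration under assumptions, not code from any of the referenced papers: PrevDepthSampler, PeelInput and PeelPS are placeholder names of mine, and I assume that In.pos.z holds the same depth value that the previous pass wrote into its depth texture.&lt;br /&gt;&lt;br /&gt;

```hlsl
// Sketch only: PrevDepthSampler, PeelInput and PeelPS are hypothetical names.
sampler2D PrevDepthSampler;   // depth layer written by the previous pass

// small bias to avoid precision issues when comparing against the old layer
static const float bias = 0.0000001;

struct PeelInput
{
    float4 pos : TEXCOORD0;   // position passed down from the vertex shader (assumed)
    float2 uv  : TEXCOORD1;   // screen-space coordinate used to look up the old layer
};

float4 PeelPS(PeelInput In) : COLOR0
{
    // depth of the layer that was extracted in the previous pass
    float PreviousPassDepth = tex2D(PrevDepthSampler, In.uv).r;

    // clip() discards the fragment when its argument is negative, so everything
    // at or in front of the previous layer is peeled away here
    clip(In.pos.z - PreviousPassDepth - bias);

    // write the depth of the surviving fragment into the new layer
    return float4(In.pos.z, 0.0, 0.0, 0.0);
}
```

Fragments that survive the clip still go through the regular depth test, so the nearest of them ends up as the next layer.&lt;br /&gt;&lt;br /&gt;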
If we want to store three layers of depth, color and normal data also need to be stored for each of those three layers.&lt;br /&gt;In a scene where many transparent objects overlap each other, the number of layers increases substantially, and with it the memory consumption.&lt;br /&gt;&lt;br /&gt;A more advanced depth peeling technique, named Dual Depth Peeling, was described by Louis Bavoil et al. &lt;a href="http://developer.download.nvidia.com/SDK/10.5/opengl/src/dual_depth_peeling/doc/DualDepthPeeling.pdf"&gt;here&lt;/a&gt;. The main advantage of this technique is that it peels a layer from the front and a layer from the back at the same time. This way four layers can be peeled away in two geometry passes.&lt;br /&gt;On hardware that doesn't support independent blending equations in MRTs, the two layers per pass are generated by using MAX blending and writing out each component of a float2(-depth, depth) variable into a dedicated render target that is part of an MRT. &lt;br /&gt;&lt;br /&gt;In "Robust Order-Independent Transparency via Reverse Depth Peeling in DirectX 10" in ShaderX6, Nicolas Thibieroz describes a technique called Reverse Depth Peeling. While depth peeling extracts layers in front-to-back order and stores them for later usage, his technique peels the layers in back-to-front order and can blend with the backbuffer immediately. Unlike depth peeling, there is no need to store all the layers. Especially on console platforms this is a huge advantage.&lt;br /&gt;The order of operations is:&lt;br /&gt;&lt;br /&gt;1. Determine furthest layer&lt;br /&gt;2. Fill-up depth buffer texture&lt;br /&gt;3. Fill-up normal and color buffer&lt;br /&gt;4. Do lighting &amp; shadowing&lt;br /&gt;5. Blend in backbuffer&lt;br /&gt;6. Go to 1 for the next layer&lt;br /&gt;&lt;br /&gt;Another technique gives up MSAA and uses the samples to store up to eight layers of data. 
&lt;a href="http://www.sci.utah.edu/~bavoil/research/kbuffer/StencilRoutedABuffer_Sigg07.pdf"&gt;Kevin Myers et al.&lt;/a&gt; use the stencil buffer in the article "Stencil Routed A-Buffer" to do sub-pixel routing of fragments. This way eight layers can be written in one pass. Because the layers are not ordered by depth, they need to be sorted afterwards. The drawbacks are that the algorithm is limited to eight layers, allocates lots of memory (depending on the underlying implementation, 8xMSAA can mean an 8x screen-size render target), requires hardware that supports 8xMSAA, and the bitonic sort used for the ordering might be expensive. If you can give up MSAA, the "light" case described above is easily possible with this technique at satisfying performance, but it won't work on scenes where many objects are visible behind several other objects.&lt;br /&gt;&lt;br /&gt;Another technique extends Dual Depth Peeling by attaching a sorted bucket list. The article &lt;a href="http://portal.acm.org/citation.cfm?id=1572769.1572779"&gt;"Efficient Depth Peeling via Bucket Sort"&lt;/a&gt; by Fang Liu et al. describes an adaptive scheme that requires two geometry passes to store depth value ranges in a bucket list, sorted with the help of a depth histogram. An implementation will be described in the upcoming book &lt;a href="http://gpupro.blogspot.com/"&gt;GPU Pro&lt;/a&gt;. 
The following image from this article shows the required passes.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/SxK9GdW0BLI/AAAAAAAAAbI/iA1wHV5f8Yg/s1600/DualDepthPeelingwithBucketList.PNG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 274px;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/SxK9GdW0BLI/AAAAAAAAAbI/iA1wHV5f8Yg/s400/DualDepthPeelingwithBucketList.PNG" border="0" alt=""id="BLOGGER_PHOTO_ID_5409594021082563762" /&gt;&lt;/a&gt;&lt;br /&gt;The Initial Pass is similar to Dual Depth Peeling. Similar to other techniques that utilize eight render targets, 32:32:32:32 each, the technique has huge memory requirements.&lt;br /&gt;&lt;br /&gt;To my knowledge those are the widely known techniques for order-independent transparency on DirectX 10 today. Do you know of any newer techniques suitable for DirectX 10 or DirectX 11 hardware?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3541752327201661516?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3541752327201661516/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3541752327201661516' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3541752327201661516'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3541752327201661516'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/11/order-independent-transparency.html' title='Order-Independent 
Transparency'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_2YU3pmPHKN4/SxK9GdW0BLI/AAAAAAAAAbI/iA1wHV5f8Yg/s72-c/DualDepthPeelingwithBucketList.PNG' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4444559633463429798</id><published>2009-11-15T13:30:00.001-08:00</published><updated>2009-11-17T18:44:09.978-08:00</updated><title type='text'>You want to become a Graphics Programmer ...</title><content type='html'>I regularly receive e-mails with the question what kind of books I recommend if someone wants to become a graphics programmer. Here is my current list (maybe some of you guys can add to this list?):&lt;br /&gt;First of all math is required:&lt;br /&gt;- &lt;a href="http://www.amazon.com/Vector-Calculus-Jerrold-E-Marsden/dp/0716749920/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258320777&amp;sr=8-1"&gt;Vector Calculus&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://www.amazon.com/Vector-Calculus-Linear-Algebra-Differential/dp/0971576653/ref=sr_1_2?ie=UTF8&amp;s=books&amp;qid=1258320878&amp;sr=8-2"&gt;Vector Calculus, Linear Algebra, and Differential Forms&lt;/a&gt; I have the 1999 version of this book&lt;br /&gt;- &lt;a href="http://www.amazon.com/Computer-Graphics-Mathematical-First-Steps/dp/0135995728/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258321011&amp;sr=1-1"&gt;Computer Graphics Mathematical First Steps&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://www.amazon.com/Mathematics-Computer-Graphics-John-Vince/dp/1846280346/ref=sr_1_2?ie=UTF8&amp;s=books&amp;qid=1258321048&amp;sr=1-2"&gt;Mathematics for Computer Graphics&lt;/a&gt;&lt;br 
/&gt;&lt;br /&gt;For a general knowledge in programming the CPU:&lt;br /&gt;- &lt;a href="http://www.amazon.com/Write-Great-Code-Understanding-Machine/dp/1593270038/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258321123&amp;sr=1-1"&gt;Write Great Code Volume 1: Understanding the Machine&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For a better knowledge on how to program the GPU:&lt;br /&gt;- &lt;a href="http://msdn.microsoft.com/en-us/directx/default.aspx"&gt;DirectX documentation&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://developer.nvidia.com/object/gpu_programming_guide.html"&gt;NVIDIA GPU Programming Guide&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://developer.amd.com/media/gpu_assets/ATI_Radeon_HD_2000_programming_guide.pdf"&gt;ATI GPU Programming Guide&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To learn about how to program certain effects in an efficient way:&lt;br /&gt;- &lt;a href="http://www.amazon.com/s/ref=nb_ss?url=search-alias%3Dstripbooks&amp;field-keywords=ShaderX&amp;x=0&amp;y=0"&gt;ShaderX - ShaderX7&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://developer.nvidia.com/page/home.html"&gt;GPU Gems - GPU Gems 3&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://www.akpeters.com/product.asp?ProdCode=4728"&gt;GPU Pro&lt;/a&gt; and &lt;a href="http://gpupro.blogspot.com/"&gt;GPU Pro Blog&lt;/a&gt; &lt;br /&gt;&lt;br /&gt;To start learning DirectX 10 API + Shader Programming:&lt;br /&gt;- &lt;a href="http://www.amazon.com/Introduction-3D-Game-Programming-DirectX/dp/1598220535/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258330039&amp;sr=8-1"&gt;Introduction to 3D Programming with DirectX 10&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://wiki.gamedev.net/index.php/D3DBook:Book_Cover"&gt;Programming Vertex, Geometry and Pixel Shaders&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To start learning OpenGL &amp; OpenGL ES:&lt;br /&gt;- &lt;a href="http://www.khronos.org/"&gt;Khronos group&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For general overview:&lt;br /&gt;- &lt;a 
href="http://www.amazon.com/Real-Time-Rendering-Third-Tomas-Akenine-Moller/dp/1568814240/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258330141&amp;sr=1-1"&gt;Real-Time Rendering&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://www.amazon.com/Fundamentals-Computer-Graphics-Peter-Shirley/dp/1568814690/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258330205&amp;sr=1-1"&gt;Fundamentals of Computer Graphics (this one also belongs in the math section)&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To get started with C:&lt;br /&gt;- &lt;a href="http://www.amazon.com/Programming-Language-2nd-Brian-Kernighan/dp/0131103628/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258330464&amp;sr=1-1"&gt;C Programming Language&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To learn C++&lt;br /&gt;- &lt;a href="http://www.amazon.com/C-Game-Programmers-Development/dp/1584504528/ref=sr_1_2?ie=UTF8&amp;s=books&amp;qid=1258331002&amp;sr=8-2"&gt;C++ for Game Developers&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://www.amazon.com/Cookbook-Cookbooks-OReilly-Ryan-Stephens/dp/0596007612/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1258331060&amp;sr=1-1"&gt;C++ Cookbook&lt;/a&gt;&lt;br /&gt;- there is a long list of more advanced C++ books ...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4444559633463429798?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4444559633463429798/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4444559633463429798' title='22 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4444559633463429798'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/4444559633463429798'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/11/you-want-to-be-graphics-programmer.html' title='You want to become a Graphics Programmer ...'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>22</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4212479198664237325</id><published>2009-10-22T20:54:00.000-07:00</published><updated>2009-10-22T20:57:02.740-07:00</updated><title type='text'>River of Lights II</title><content type='html'>More work-in-progress shots.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_2YU3pmPHKN4/SuEpWKavNQI/AAAAAAAAAbA/gu3QB_tBFwg/s1600-h/ss3.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 225px;" src="http://3.bp.blogspot.com/_2YU3pmPHKN4/SuEpWKavNQI/AAAAAAAAAbA/gu3QB_tBFwg/s400/ss3.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5395639289296925954" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_2YU3pmPHKN4/SuEpPzeCskI/AAAAAAAAAa4/YZkM94l808c/s1600-h/ss1.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 225px;" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/SuEpPzeCskI/AAAAAAAAAa4/YZkM94l808c/s400/ss1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5395639180057555522" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/398682525365778708-4212479198664237325?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4212479198664237325/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4212479198664237325' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4212479198664237325'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4212479198664237325'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/10/river-of-lights-ii.html' title='River of Lights II'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2YU3pmPHKN4/SuEpWKavNQI/AAAAAAAAAbA/gu3QB_tBFwg/s72-c/ss3.jpg' height='72' width='72'/><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5247695988191285811</id><published>2009-10-15T14:35:00.000-07:00</published><updated>2009-10-18T19:31:55.091-07:00</updated><title type='text'>BitMasks / Packing Data into fp Render Targets</title><content type='html'>Recently I had the need to pack bit fields into 32-bit channels of a 32:32:32:32 fp render target.&lt;br /&gt;First of all we can assume that all registers in the pixel shader operate in 32-bit precision and output data is written into a 32-bit fp render target. 
The 32-bit (or single-precision) floating-point format uses 1 sign bit, 8 bits of exponent, and 23 bits of mantissa, following the IEEE 754 standard. &lt;div&gt;&lt;br /&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 80px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5391556586446327314" border="0" alt="" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/StKoJlGNDhI/AAAAAAAAAaw/hv6Ce8TFNps/s400/32-bitIEEE754.jpg" /&gt;&lt;/div&gt;&lt;br /&gt;Most floating-point computations use normalized values because they keep the maximum number of bits of precision in a computation. If several higher-order bits of the mantissa are all zero, the mantissa has that many fewer bits of precision available for computation. Therefore a floating-point computation will be more accurate if it involves only normalized values whose highest-order mantissa bit is one.&lt;br /&gt;&lt;br /&gt;The IEEE 754 32-bit floating-point format defines special cases for exponents whose bits are all zeros or all ones. If all exponent bits are set, then the number represents either +/- infinity or a NaN (not-a-number), depending on the mantissa value. 
If all exponent bits are zero, then the number is either zero or denormalized, and denormalized values are automatically flushed to zero as specified in the Direct3D 10 single-precision floating-point rules (see Nicolas Thibieroz, "Packing Arbitrary Bit Fields into 16-bit Floating-Point Render Targets in DirectX10", ShaderX7).&lt;br /&gt;&lt;br /&gt;When packing bit values, those cases need to be avoided.&lt;br /&gt;&lt;br /&gt;&lt;div&gt;// Pack three positive normalized numbers between 0.0 and 1.0 into a 32-bit fp&lt;/div&gt;&lt;div&gt;// channel of a render target&lt;/div&gt;&lt;div&gt;float Pack3PNForFP32(float3 channel)&lt;/div&gt;&lt;div&gt;{&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// layout of a 32-bit fp register&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// 1 sign bit; 8 bits for the exponent and 23 bits for the mantissa&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;uint uValue;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// pack x&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;uValue = ((uint)(channel.x * 65535.0 + 0.5)); // occupies bits 0 to 15&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// pack y in EMMMMMMM&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;uValue |= ((uint)(channel.y * 255.0 + 0.5)) &lt;&lt; 16;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// pack z in 
SEEEEEEE&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// the byte is limited to 254 (11111110b), so it is never all ones&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// adding 1 keeps it from being 0, so its range is 1..254&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// caution: the exponent field is the low 7 bits of this byte plus the top bit of y,&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// so a byte of 127 with the top bit of y set still gives an all-ones exponent,&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// and a byte of 128 with the top bit of y clear still gives an all-zero exponent&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;uValue |= ((uint)(channel.z * 253.0 + 1.5)) &lt;&lt; 24;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;return asfloat(uValue);&lt;/div&gt;&lt;div&gt;}&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;// unpack three positive normalized values from a 32-bit float&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;float3 Unpack3PNFromFP32(float fFloatFromFP32)&lt;/div&gt;&lt;div&gt;{&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;float a, b, c;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;uint uInputFloat = asuint(fFloatFromFP32);&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// unpack a&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; 
&lt;/span&gt;// mask out all the stuff above 16-bit with 0xFFFF &lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;a = ((uInputFloat) &amp;amp; 0xFFFF) / 65535.0;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt;  &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;b = ((uInputFloat &gt;&gt; 16) &amp;amp; 0xFF) / 255.0;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// extract the 1..254 value range and subtract 1&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;// ending up with 0..253&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;c = (((uInputFloat &gt;&gt; 24) &amp;amp; 0xFF) - 1.0) / 253.0;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space:pre"&gt; &lt;/span&gt;return float3(a, b, c);&lt;/div&gt;&lt;div&gt;}&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5247695988191285811?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5247695988191285811/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5247695988191285811' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5247695988191285811'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/5247695988191285811'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/10/bitmasks-packing-data-into-fp-render.html' title='BitMasks / Packing Data into fp Render Targets'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2YU3pmPHKN4/StKoJlGNDhI/AAAAAAAAAaw/hv6Ce8TFNps/s72-c/32-bitIEEE754.jpg' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-1575903519704746632</id><published>2009-09-30T00:11:00.000-07:00</published><updated>2009-09-30T00:14:19.209-07:00</updated><title type='text'>River of Lights</title><content type='html'>Work in progress shot here. More than 8000 lights attached to particles in this hallway.&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 225px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5387154993014544674" border="0" alt="" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/SsME7HyamSI/AAAAAAAAAao/2NiBlfZshtY/s400/Demo+2009-09-30+00-10-35-11.jpg" /&gt;Resolution is 1280x720 and the GPU still runs with 158 frames per second. 
The whole level has about 16k lights.&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-1575903519704746632?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/1575903519704746632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=1575903519704746632' title='16 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1575903519704746632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1575903519704746632'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/09/river-of-lights.html' title='River of Lights'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2YU3pmPHKN4/SsME7HyamSI/AAAAAAAAAao/2NiBlfZshtY/s72-c/Demo+2009-09-30+00-10-35-11.jpg' height='72' width='72'/><thr:total>16</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-96526423432386028</id><published>2009-08-11T08:14:00.000-07:00</published><updated>2009-08-11T10:16:29.149-07:00</updated><title type='text'>SIGGRAPH 2009 Impressions: Inferred Lighting</title><content type='html'>There is a new lighting approach that extends the Light Pre-Pass idea. 
It is called Inferred Lighting and it was presented by Scott Kircher and Alan Lawrence from Volition. Here is the link:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://graphics.cs.uiuc.edu/~kircher/publications.html"&gt;http://graphics.cs.uiuc.edu/~kircher/publications.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;They assume a Light Pre-Pass concept, as covered here on this blog, with three passes: the geometry pass, where they fill up the G-Buffer; the lighting pass, where light properties are rendered into a light buffer; and a material pass, in which the whole scene is rendered again, this time reconstructing different materials.&lt;br /&gt;Their approach adds several new techniques to the toolset used to do deferred lighting / Light Pre-Pass.&lt;br /&gt;&lt;br /&gt;1. They use a much smaller G-Buffer and light buffer with a size of 800x540 on the XBOX 360. This way their memory bandwidth usage and pixel shading cost should be greatly reduced.&lt;br /&gt;&lt;br /&gt;2. To upscale the final light buffer, they use Discontinuity Sensitive Filtering (DSF). During the geometry pass, one 16-bit channel of the DSF buffer is filled with the linear depth of the pixel, and the other 16-bit channel is filled with an ID value that semi-uniquely identifies continuous regions. The upper 8 bits are an object ID, assigned per object (renderable instance) in the scene. Since 8 bits only allow 256 unique object IDs, scenes with more than this number of objects will have some objects sharing the same ID.&lt;br /&gt;The lower 8 bits of the channel contain a normal-group ID. This ID is pre-computed and assigned to each face of the mesh. Anywhere the mesh has continuous normals, the ID is also continuous. A normal is continuous across an edge if and only if the two triangles share the same normal at both vertices of the edge.&lt;br /&gt;By comparing normal-group IDs the discontinuity sensitive filter can detect normal discontinuities without actually having to reconstruct and compare normals. 
Both the object ID and normal-group ID must exactly match the material pass polygon being rendered before the light buffer sample can be used (depth must also match within an adjustable threshold).&lt;br /&gt;During the material pass, the pixel shader computes the locations of the four light buffer texels that would normally be accessed if regular bilinear filtering were used. These four locations are point sampled from the DSF buffer. The depth and ID values retrieved from the DSF buffer are compared against the depth and ID of the object being rendered. The results of this comparison are used to bias the usual bilinear filtering weights so as to discard samples that do not belong to the surface currently being rendered. These biased weights are then used in custom bilinear filtering of the light buffer. Since the filter only uses the light buffer samples that belong to the object being rendered, the resulting lighting gives the illusion of being at full resolution. This same method works even when the framebuffer is multisampled (hardware MSAA); however, sub-pixel artifacts can occur because the pixel shader is only run once per pixel, rather than once per sample.&lt;br /&gt;The authors report that such sub-pixel artifacts are typically not noticeable.&lt;br /&gt;&lt;br /&gt;3. The authors of this paper also implemented a technique that allows alpha polygons to be rendered with Light Pre-Pass / Deferred Lighting. 
It is based on stippling and the use of DSF filtering.&lt;br /&gt;During the geometry pass the alpha polygons are rendered using a stipple pattern, so that their G-Buffer samples are interleaved with opaque polygon samples.&lt;br /&gt;In the material pass the DSF for opaque polygons will automatically reject stippled alpha pixels, and alpha polygons are handled by finding the four closest light buffer samples in the same stipple pattern, again using DSF to make sure the samples were not overwritten by some other geometry.&lt;br /&gt;Since the stipple pattern is a 2x2 regular pattern, the effect is that the alpha polygon gets lit at half the resolution of opaque objects. Opaque objects covered by one layer of alpha have a slightly reduced lighting resolution (one out of every four samples cannot be used).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-96526423432386028?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/96526423432386028/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=96526423432386028' title='18 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/96526423432386028'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/96526423432386028'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/08/siggraph-2009-impressions-inferred.html' title='SIGGRAPH 2009 Impressions: Inferred Lighting'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>18</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2894711471860966330</id><published>2009-07-28T15:20:00.000-07:00</published><updated>2009-07-28T15:35:05.048-07:00</updated><title type='text'>SIGGRAPH 2009</title><content type='html'>SIGGRAPH is next week and I am still preparing my talk. If you are around please come by and say hi. My talk's title is "Light Pre-Pass Renderer Mark III" and it is part of the "Advances in Real-Time Rendering in 3D Graphics and Games" day on Monday next week:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.siggraph.org/s2009/sessions/courses/details/?id=12"&gt;http://www.siggraph.org/s2009/sessions/courses/details/?id=12&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I collected all the new developments in this area and added a few new things I found out while working on DirectX 10 / 11 implementations. I will post a link to the slides here. Especially on the PS3 there are lots of new and interesting developments (judging from the number of games that will ship with this approach, I want to believe that it is the most popular way to apply lots of lights in games now). I received a first draft of an article for ShaderX8 / GPU Pro from Steven Tovey about how they implemented the Light Pre-Pass in the upcoming game Blur on the PS3. They based their approach on work done by Matt Swoboda. The results look very cool. You can check out the screenshots on their website.&lt;br /&gt;&lt;br /&gt;There is lots of progress happening with the Oolong Engine for the iPhone / iPod Touch. 
Check out the change list on&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/oolongengine"&gt;http://code.google.com/p/oolongengine&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We got OpenGL ES 2.0 running and there is a new tutorial series that looks really cool.&lt;br /&gt;&lt;br /&gt;In other news somehow my name was mentioned on "The Escapist". Here is the link for your entertainment:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.escapistmagazine.com/articles/view/columns/publishers-note/6250-Publishers-Note-Made-By-People.2"&gt;http://www.escapistmagazine.com/articles/view/columns/publishers-note/6250-Publishers-Note-Made-By-People.2&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2894711471860966330?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2894711471860966330/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2894711471860966330' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2894711471860966330'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2894711471860966330'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/07/siggraph-2009.html' title='SIGGRAPH 2009'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7504453085829973572</id><published>2009-07-03T07:30:00.001-07:00</published><updated>2009-07-03T07:43:22.942-07:00</updated><title type='text'>MSAA on the PS3 with Light Pre-Pass on the SPU</title><content type='html'>In the previous "MSAA on the PS3" thread Matt Swoboda jumped in and mentioned that they implemented MSAA on the SPU in the Phyre Engine. I knew that they implemented the Light Pre-Pass on the SPU but I completely forgot that they also had a solution to do MSAA on the SPU.&lt;br /&gt;You can find the presentation "Deferred Lighting and Post Processing on PLAYSTATION®" &lt;a href="http://www.technology.scee.net/files/presentations/gdc2009/DeferredLightingandPostProcessingonPS3.ppt"&gt;here&lt;/a&gt;.&lt;br /&gt;Because it is possible to read and write per sample with the SPU, they can achieve a similar functionality as the per-sample frequency of DirectX 10.1-class graphics hardware where each sample can be treated separately. 
So they can calculate the lighting for each of the sample values and write the results into each of the samples in the light buffer.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7504453085829973572?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7504453085829973572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7504453085829973572' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7504453085829973572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7504453085829973572'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/07/msaa-on-ps3-with-light-pre-pass-on-spu.html' title='MSAA on the PS3 with Light Pre-Pass on the SPU'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3397419977766510124</id><published>2009-06-29T15:55:00.000-07:00</published><updated>2009-06-29T16:29:39.799-07:00</updated><title type='text'>Ambient Occlusion in Screen-Space</title><content type='html'>Screen-Space Ambient Occlusion (SSAO) is quite popular at the moment. 
ShaderX7 had several articles and there are lots of approaches to gradually improve the effect.&lt;br /&gt;A good way to look at SSAO or any similar approach is to consider it part of a whole pipeline of effects that can share resources, and to extend the idea to include one diffuse (and specular) indirect bounce of light by re-using those resources.&lt;br /&gt;The overall issues with SSAO are:&lt;br /&gt;1. It is quite expensive for the image quality improvement it brings. Using the astonishingly high amount of frame time for other effects is an intriguing idea. In other words, the performance / quality-improvement ratio is not very good compared to e.g. PostFX, where a bunch of effects consume a similar amount of time.&lt;br /&gt;2. A typical problem is that lighting is ignored by SSAO. Using the classical SSAO implementation under varying illumination introduces objectionable artifacts because the ambient term is darkened equally (obviously you can apply SSAO to the diffuse and specular term like a shadow term ... but then it isn't ambient anymore). If you have a "global ambient" light term like skylights, SSAO will diminish the effect. It also leads to problems with dynamic shadows.&lt;br /&gt;&lt;br /&gt;Overall I believe a fundamental shift to a more generic method is necessary to solve those issues. This is one of the things I am looking into ... 
so expect an update at some point in the future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3397419977766510124?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3397419977766510124/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3397419977766510124' title='21 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3397419977766510124'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3397419977766510124'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/06/ambient-occlusion-in-screen-space.html' title='Ambient Occlusion in Screen-Space'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4594567931167151760</id><published>2009-06-17T11:42:00.001-07:00</published><updated>2009-06-17T15:27:53.787-07:00</updated><title type='text'>MSAA on the PS3 with Deferred Lighting / Shading / Light Pre-Pass</title><content type='html'>&lt;p&gt;The Killzone 2 team came up with an interesting way to use MSAA on the PS3. 
You can find it on page 39 of the following slides:&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.dimension3.sk/mambo/Articles/Deferred-Rendering-In-Killzone/View-category.php" rel="nofollow" target="_blank"&gt;http://www.dimension3.sk/mambo/Articles/Deferred-Rendering-In-Killzone/View-category.php&lt;/a&gt;&lt;/p&gt;&lt;p&gt;What they do is read both samples in the multisampled render target, do the lighting calculations for both of them, then average the result and write it into the multisampled (... I assume it has to be multisampled because the depth buffer is multisampled) accumulation buffer. That somewhat decreases the effectiveness of MSAA because the pixel averages all samples regardless of whether they actually pass the depth-stencil test. The multisampled accumulation buffer may therefore contain different values per sample when it was supposed to contain a unique value representing the average of all samples. On the other hand, they might only store a value in one of the samples and resolve afterwards ... which would mean the pixel shader runs only once.&lt;br /&gt;This is also called "on-the-fly resolves".&lt;/p&gt;&lt;p&gt;It is better to write a dedicated value into each sample by using the sample mask, but then, in the case of 2xMSAA, you run your pixel shader twice ... DirectX 10.1+ has the ability to run the pixel shader per sample. That doesn't mean it fully runs per sample. The MSAA unit seems to replicate the color value accordingly. That's faster but not possible on the PS3. 
I can't remember if the XBOX 360 has the ability to run the pixel shader per-sample but this is possible.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4594567931167151760?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4594567931167151760/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4594567931167151760' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4594567931167151760'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4594567931167151760'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/06/msaa-on-ps3-with-deferred-lighting.html' title='MSAA on the PS3 with Deferred Lighting / Shading / Light Pre-Pass'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-1528641723873885570</id><published>2009-06-13T07:04:00.000-07:00</published><updated>2009-06-13T11:26:20.772-07:00</updated><title type='text'>Multisample Anti-Aliasing</title><content type='html'>Utilizing the Multisample Anti-Aliasing (MSAA) functionality of graphics hardware for deferred lighting can be challenging. 
Nicolas Thibieroz wrote an excellent article about MSAA published in ShaderX7 with the title "Deferred Shading with Multisampling Anti-Aliasing in DirectX10".&lt;br /&gt;The following figure from the ShaderX7 article shows how MSAA works:&lt;br /&gt;&lt;p&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 235px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5346817797673931410" border="0" alt="" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/SjO2dVUXUpI/AAAAAAAAAQ8/4YAuTRjLdfA/s400/FIGURE+2+-+Multisampling+Anti-Aliasing.jpg" /&gt;The pixel represented by a square has two triangles (blue and yellow) crossing some of its sample points. The black dot represents the pixel sample location (pixel center); this is where the pixel shader is executed. The cross symbol corresponds to the location of the multisamples where the depth tests are performed. Samples passing the depth test receive the output of the pixel shader. Those samples are replicated by the MSAA back-end into a multisampled render target that represents each pixel with, in this case, four samples. That means the render target for an intended resolution of 1280x720 holds as many samples as a 2560x1440 target, representing each pixel with four samples, but the pixel shader only writes 1280x720 times (assuming there is no overdraw) while the MSAA back-end replicates four samples for each pixel into the multisampled render target.&lt;br /&gt;With deferred lighting there can be several of those multisampled render targets as part of a Multiple-Render-Target (MRT). In the so-called Geometry stage, data is written into this MRT, which is therefore called the G-Buffer. In the case of 4xMSAA each of the render targets of the G-Buffer would hold the equivalent of 2560x1440 samples.&lt;br /&gt;In the case of Deferred Lighting / Light Pre-Pass the G-Buffer holds normal and depth data. 
This data must not be resolved because resolving it would lead to incorrect results, as shown by Nicolas in his article.&lt;br /&gt;After the Geometry phase comes the Lighting or Shading phase in a Deferred Lighting/Light Pre-Pass/Deferred Shading renderer. In an ideal world you could blit each sample (not pixel) into the multisampled render target that holds the result of the Shading phase by reading the G-Buffer sample and performing all the necessary calculations on it.&lt;br /&gt;In other words, to achieve the best possible MSAA quality with those renderer designs, lighting equations would need to be applied on a per-sample basis into a multisampled render target and then later resolved.&lt;br /&gt;This is possible with DirectX 10.1 graphics hardware (AMD's 10.1-capable cards; I didn't try whether S3 cards that support 10.1 can do this as well), which can execute a pixel shader at sample frequency.&lt;br /&gt;To make this a viable option, this operation needs to be restricted to samples that belong to pixel edges. Two passes are necessary to make this work. One pass uses a pixel shader that performs its operations per sample, and a second pass uses a pixel shader that performs its operations per pixel, which means the result of the pixel shader calculation is output to all samples passing the depth-stencil test.&lt;br /&gt;A stencil test is used to restrict the per-sample pixel shader to edge pixels.&lt;br /&gt;One interesting idea covered in the article is to detect edges with centroid sampling (available already on DirectX9 class graphics hardware). During the G-Buffer phase the vertex shader writes a variable unique to every pixel (e.g. pixel position data) into two outputs, while the associated pixel shader declares two inputs: one without and one with centroid sampling enabled. The pixel shader then compares the centroid-enabled input with the one without it. 
Differing values mean that samples were only partially covered by the triangle, indicating an edge pixel. A "centroid value" of 1.0 is then written out to a selected area of the G-Buffer (previously cleared to 0.0) to indicate that the covered samples belong to an edge pixel. Those values are then averaged while being resolved to find out the value per pixel. If the result is not exactly 0, then the current pixel is an edge pixel. This is shown in the following image from the article.&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_2YU3pmPHKN4/SjPLyu0foaI/AAAAAAAAARE/pajXSrth8JA/s1600-h/FIGURE+4+-+Centroid+Sampling.jpg"&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 294px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5346841255041016226" border="0" alt="" src="http://3.bp.blogspot.com/_2YU3pmPHKN4/SjPLyu0foaI/AAAAAAAAARE/pajXSrth8JA/s400/FIGURE+4+-+Centroid+Sampling.jpg" /&gt;&lt;/a&gt; On the left the pixel shader input will always be evaluated at the center of the pixel regardless of whether it is covered by the triangle. On the right, with centroid sampling, the two rightmost depth samples are covered by the triangle. Comparing the two values in the pixel shader shows that the samples were only partially covered by the triangle, indicating an edge pixel.&lt;br /&gt;Because DirectX10 capable graphics hardware does not support running the pixel shader at sample frequency, a different solution needs to be developed here.&lt;br /&gt;The best MSAA quality in that case is achieved by running the pixel shader multiple times per pixel, only enabling output to a single sample each pass. This can be achieved with the sample mask parameter of the OMSetBlendState() API. The results of this method would be identical to the DirectX 10.1 method, but it is more expensive because of the increased number of rendering passes and the slightly reduced texture cache effectiveness. 
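&lt;br /&gt;To make the edge-detection logic described above concrete, here is a minimal Python sketch for a single pixel with four samples. It is not code from the article: the function names are hypothetical, every covered sample is assumed to pass the depth test, and "the triangle covers only some of the samples" stands in for the comparison between the centroid input and the non-centroid input.

```python
def resolve(samples):
    """Box-filter resolve: average the per-sample values of one pixel."""
    return sum(samples) / len(samples)


def write_edge_flag(edge_flags, coverage):
    """Emulate the G-Buffer pixel shader for one triangle on one pixel.

    coverage holds one boolean per sample. Partial coverage is used as a
    stand-in (an assumption of this sketch) for "the centroid input
    differs from the center input". The flag is written to covered
    samples only, as the MSAA back-end would do.
    """
    partial = any(coverage) and not all(coverage)
    value = 1.0 if partial else 0.0
    for i, covered in enumerate(coverage):
        if covered:
            edge_flags[i] = value


def is_edge_pixel(edge_flags):
    """A resolved value that is not exactly 0 marks an edge pixel."""
    return resolve(edge_flags) != 0.0
```

Starting from flags cleared to 0.0, calling write_edge_flag with the coverage [False, True, True, False] resolves to 0.5, so is_edge_pixel reports an edge; a triangle covering all four samples resolves to 0 and the pixel is not treated as an edge.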
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-1528641723873885570?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/1528641723873885570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=1528641723873885570' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1528641723873885570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1528641723873885570'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/06/multisample-anti-aliasing.html' title='Multisample Anti-Aliasing'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_2YU3pmPHKN4/SjO2dVUXUpI/AAAAAAAAAQ8/4YAuTRjLdfA/s72-c/FIGURE+2+-+Multisampling+Anti-Aliasing.jpg' height='72' width='72'/><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3344054382409161876</id><published>2009-05-23T11:39:00.000-07:00</published><updated>2009-05-23T11:51:18.884-07:00</updated><title type='text'>Deferred Lighting / Particle System</title><content type='html'>Here is a shot of a GPU based particle system with lights attached to each particle. 
I used Emil Persson's example Deferred Shading program as a basis to implement a Light Pre-Pass renderer with 4k lights and 4k particles. It runs fairly well on a GeForce 9600 GT here:&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 225px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5339093769479491602" border="0" alt="" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/ShhFfuBTkBI/AAAAAAAAAQo/n0NNscJ0r-E/s400/Screenshot00.jpg" /&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3344054382409161876?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3344054382409161876/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3344054382409161876' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3344054382409161876'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3344054382409161876'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/05/deferred-lighting-particle-system.html' title='Deferred Lighting / Particle System'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://4.bp.blogspot.com/_2YU3pmPHKN4/ShhFfuBTkBI/AAAAAAAAAQo/n0NNscJ0r-E/s72-c/Screenshot00.jpg' height='72' width='72'/><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4895129740071607876</id><published>2009-05-18T10:15:00.000-07:00</published><updated>2009-05-18T11:24:52.497-07:00</updated><title type='text'>Light Pre-Pass: Knee-Deep</title><content type='html'>Several companies adopted the Light Pre-Pass idea, modified it or came up with similar ideas:&lt;div&gt;&lt;ul&gt;&lt;li&gt;Crytek: they call it Deferred lighting as opposed to Deferred shading. The technique is mentioned in the new Cry Engine 3 presentation &lt;a href="http://www.crytek.com/technology/presentations/"&gt;here&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Garagegames in their new Torque 3D engine, currently in beta. Read the article by Pat Wilson in ShaderX7 and the garagegames website&lt;/li&gt;&lt;li&gt;Insomniac came up with a Pre-lighting approach that is similar to this. See Mark Lee's presentation from GDC 2009 &lt;a href="http://www.gdconf.com/conference/Tutorial%20Handouts/200_insomniac/gdc09_insomniac_prelighting.pdf" style="text-decoration: none;"&gt;here&lt;/a&gt;&lt;/li&gt;&lt;li&gt;DICE has been using it for a long time already&lt;/li&gt;&lt;li&gt;I believe EA used it in Dead Space :-)&lt;/li&gt;&lt;li&gt;Carsten Dachsbacher described a similar idea in his article "Splatting of Indirect Illumination" &lt;a href="http://www.vis.uni-stuttgart.de/~dachsbcn/download/sii.pdf"&gt;here&lt;/a&gt; and in ShaderX5&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;One of the interesting areas in this context is the ability to implement a one-bounce global illumination effect with the data in the G-Buffer and the light buffer ...&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4895129740071607876?l=diaryofagraphicsprogrammer.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4895129740071607876/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4895129740071607876' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4895129740071607876'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4895129740071607876'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/05/light-pre-pass-knee-deep.html' title='Light Pre-Pass: Knee-Deep'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8539185554096531152</id><published>2009-04-30T18:17:00.000-07:00</published><updated>2009-05-03T08:21:56.497-07:00</updated><title type='text'>3D Supershape</title><content type='html'>Over the last few years I was looking into the 3D Supershape formula described by Paul Bourke &lt;a href="http://local.wasp.uwa.edu.au/~pbourke/geometry/supershape3d/"&gt;here&lt;/a&gt; and originally developed by Johan Gielis. I love the shape of the objects that are a result of those and therefore I always wanted to use it to create my own demos after I saw the one from Jetro Lauha (&lt;a href="http://jet.ro/creations"&gt;http://jet.ro/creations&lt;/a&gt;). 
Here is my first attempt to generate C source out of the equations: &lt;div&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 55px; CURSOR: hand" id="BLOGGER_PHOTO_ID_5330652904004402626" border="0" alt="" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/SfpIkrwWPcI/AAAAAAAAAQQ/m9U7N630Iz4/s400/equation1.jpg" /&gt;&lt;/div&gt;&lt;br /&gt;Suitable C pseudo code could be: &lt;p&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;float r = pow(pow(fabs(cos(m * o / 4)) / a, n2) + pow(fabs(sin(m * o / 4)) / b, n3), 1.0f / n1);&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;The result of this calculation is in polar coordinates. Please note the difference between the equation and the C code: the equation raises the sum to a negative power, while the C code uses the positive power 1 / n1 (written as 1.0f / n1 so that it cannot turn into an integer division). To extend this result into 3D, the spherical product of several superformulas is used. For example, the 3D parametric surface is obtained by multiplying two superformulas &lt;i&gt;S1&lt;/i&gt; and &lt;i&gt;S2&lt;/i&gt;. 
The coordinates are defined by the relations:&lt;/p&gt;&lt;p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_2YU3pmPHKN4/SfpMORWidrI/AAAAAAAAAQY/u3Bo-U2bYfc/s1600-h/equation2.jpg"&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 230px; DISPLAY: block; HEIGHT: 118px; CURSOR: pointer" id="BLOGGER_PHOTO_ID_5330656917006218930" border="0" alt="" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/SfpMORWidrI/AAAAAAAAAQY/u3Bo-U2bYfc/s400/equation2.jpg" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The sphere mapping code uses two r values:&lt;/p&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;point-&gt;x = (float)(cosf(t) * cosf(p) / r1 / r2);&lt;br /&gt;point-&gt;y = (float)(sinf(t) * cosf(p) / r1 / r2);&lt;br /&gt;point-&gt;z = (float)(sinf(p) / r2);&lt;/span&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Because r1 and r2 had a positive power value in the C code above we have to divide by those variables here. 
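&lt;br /&gt;For experimenting outside of C, the two steps above can be sketched in Python. This is only an illustration under assumptions: the function names and the parameter tuple layout (m, a, b, n1, n2, n3) are made up here, and the positive-exponent convention of the C line above is kept, so the sphere mapping divides by r1 and r2.

```python
import math


def inverse_radius(angle, m, a, b, n1, n2, n3):
    """Superformula with a positive outer exponent, as in the C line above.

    The result is the reciprocal of the usual superformula radius, so
    callers divide by it instead of multiplying.
    """
    c = abs(math.cos(m * angle / 4.0)) / a
    s = abs(math.sin(m * angle / 4.0)) / b
    return (c ** n2 + s ** n3) ** (1.0 / n1)


def supershape_point(t, p, shape1, shape2):
    """Spherical product of two superformulas.

    t is the longitude and p the latitude; shape1 and shape2 are
    (m, a, b, n1, n2, n3) tuples (a hypothetical layout for this sketch).
    """
    r1 = inverse_radius(t, *shape1)
    r2 = inverse_radius(p, *shape2)
    x = math.cos(t) * math.cos(p) / r1 / r2
    y = math.sin(t) * math.cos(p) / r1 / r2
    z = math.sin(p) / r2
    return (x, y, z)
```

With m = 0 and all other parameters set to 1, both r values are 1 and every point lies on the unit sphere, which makes a quick sanity check. The angles are commonly taken as t in [-pi, pi] and p in [-pi/2, pi/2]. The sketch mirrors the C sphere mapping code above and changes nothing about it.&lt;br /&gt;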
Here is a Mathematica render of this code:&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_2YU3pmPHKN4/SfplihFi-XI/AAAAAAAAAQg/dkDwHkP7Igg/s1600-h/Ship.jpg"&gt;&lt;img style="TEXT-ALIGN: center; MARGIN: 0px auto 10px; WIDTH: 400px; DISPLAY: block; HEIGHT: 336px; CURSOR: pointer" id="BLOGGER_PHOTO_ID_5330684752617994610" border="0" alt="" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/SfplihFi-XI/AAAAAAAAAQg/dkDwHkP7Igg/s400/Ship.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8539185554096531152?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8539185554096531152/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8539185554096531152' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8539185554096531152'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8539185554096531152'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/04/3d-supershape.html' title='3D Supershape'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2YU3pmPHKN4/SfpIkrwWPcI/AAAAAAAAAQQ/m9U7N630Iz4/s72-c/equation1.jpg' height='72' width='72'/><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7482326665547507108</id><published>2009-04-29T18:05:00.001-07:00</published><updated>2009-04-29T18:09:14.235-07:00</updated><title type='text'>Rockstar Games</title><content type='html'>GTA IV was launched a year ago today, and today is also my last day as an employee of Rockstar Games. After more than four fantastic years I felt I should take a break to go back to some research topics and watch my kids grow for a while :-), so I gave my notice two weeks ago. &lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7482326665547507108?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7482326665547507108/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7482326665547507108' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7482326665547507108'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7482326665547507108'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/04/rockstar-games.html' title='Rockstar Games'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3598828495898729039</id><published>2009-04-29T17:58:00.001-07:00</published><updated>2009-04-29T18:04:39.636-07:00</updated><title type='text'>Beagle Board</title><content type='html'>I got the whole development environment going and wrote a few small graphics demos for it. All the PowerVR demos I tried ran on it nicely. Very cool!&lt;div&gt;If you are interested in a next-gen mobile development platform I would definitely recommend looking into this at &lt;p&gt;&lt;a href="http://beagleboard.org/"&gt;http://beagleboard.org/&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Any further development has now moved to lowest priority ... maybe at some point I will play around more with Angstroem. There is an online image builder at &lt;/p&gt;&lt;p&gt;&lt;a href="http://amethyst.openembedded.net/~koen/narcissus/" target="_blank" style="color: rgb(51, 51, 204); "&gt;http://amethyst.openembedded.&lt;wbr&gt;net/~koen/narcissus/&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3598828495898729039?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3598828495898729039/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3598828495898729039' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3598828495898729039'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/3598828495898729039'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/04/beagle-board.html' title='Beagle Board'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2776860224925766802</id><published>2009-04-20T20:51:00.000-07:00</published><updated>2009-04-22T07:07:29.476-07:00</updated><title type='text'>BeagleBoard.org Ubuntu 8.04</title><content type='html'>In the last few days I set up a development environment for a BeagleBoard (see beagleboard.org). I wanted to hold the next-gen environment for future phones and the OpenPandora in my hands today. Overall the size of the board is astonishingly small and you can power it through the USB port. The board runs Angstroem, a Linux OS, on an OMAP3530 processor. It has a dedicated video decode DSP, the PowerVR SGX chipset, a sound chip and a few other things that I haven't used so far. You can even plug in a keyboard and a mouse and you have a full-blown computer with 256 MB RAM and 256 MB SDRAM.&lt;br /&gt;To get this going I had to install a Linux OS on one of my PCs; I chose Ubuntu 8.04. To relieve the pain of having to google all the Linux commands again and again I am writing down a few notes for myself here:&lt;br /&gt;- minicom is not installed by default. You have to install it yourself. 
To do this you have to open up Applications -&gt; Add/Remove and refresh the package list (you need an internet connection for this) and then install the build essentials first and then minicom by typing into a terminal:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;sudo apt-get install build-essential&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;sudo apt-get install minicom&lt;/span&gt;&lt;br /&gt;- to look for the RS232 serial device you can use&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;dmesg | grep tty&lt;/span&gt;&lt;br /&gt;I found adding entries to the PATH variable different on Ubuntu 8.04. You can set an environment variable by using&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;export VARNAME=some_string&lt;/span&gt;&lt;br /&gt;e.g.&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;export PATH=$PATH:some/other/path&lt;/span&gt;&lt;br /&gt;To check if it is set you can use&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;echo $PATH&lt;/span&gt;&lt;br /&gt;You set the PLATFORM variable by typing&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;export PLATFORM=LinuxOMAP3&lt;/span&gt;&lt;br /&gt;and you use&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;echo $PLATFORM&lt;/span&gt;&lt;br /&gt;to check if it is correct.&lt;br /&gt;Similarly, for library paths you type&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;export LIBDIR=$PWD &lt;/span&gt;&lt;br /&gt;from the directory where the lib files are. To check that this works you can use&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;echo $LIBDIR&lt;/span&gt;&lt;br /&gt;To make all those variable values persistent you can append those export statements to the end of the .bashrc file in your home directory. 
Some other things I found convenient were:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt; gksudo gedit&lt;/span&gt;&lt;br /&gt;starts the editor with root privileges.&lt;br /&gt;Copying a file from one directory into another can be done by using the cp command like this&lt;br /&gt;&lt;span id="intelliTXT" name="intelliTxt"  style="font-family:courier new;"&gt;&lt;strong&gt;&lt;span style="font-weight: normal;"&gt;$ cp -i goulash recipes/hungarian&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;cp: overwrite recipes/hungarian/goulash (y/n)?&lt;/span&gt;&lt;br /&gt;You can paste a file or directory path into the terminal by dragging the item from the file browser into the terminal command line.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2776860224925766802?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2776860224925766802/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2776860224925766802' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2776860224925766802'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2776860224925766802'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/04/beagleboardorg-ubuntu-804.html' title='BeagleBoard.org Ubuntu 8.04'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8151241452537747112</id><published>2009-03-21T10:49:00.000-07:00</published><updated>2009-03-21T10:56:45.833-07:00</updated><title type='text'>ShaderX7 on Sale</title><content type='html'>ShaderX7 has more than 800 pages. I like the following screenshot from Amazon.com:&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_2YU3pmPHKN4/ScUqOMZlMVI/AAAAAAAAAPs/6HXdzFZwpBw/s1600-h/ShaderX7Amazon.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5315701358515794258" style="DISPLAY: block; MARGIN: 0px auto 10px; WIDTH: 400px; CURSOR: hand; HEIGHT: 183px; TEXT-ALIGN: center" alt="" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/ScUqOMZlMVI/AAAAAAAAAPs/6HXdzFZwpBw/s400/ShaderX7Amazon.jpg" border="0" /&gt;&lt;/a&gt; ShaderX8 is already announced. Proposals are due by May 19th, 2009. Please send them to wolf at shaderx.com. An example proposal, writing guidelines and a FAQ can be downloaded from &lt;a href="http://www.shaderx6.com/ShaderX6.zip"&gt;www.shaderx6.com/ShaderX6.zip&lt;/a&gt;. The schedule is available on &lt;a href="http://www.shaderx8.com/"&gt;http://www.shaderx8.com/&lt;/a&gt;. 
&lt;div&gt;&lt;br /&gt;&lt;p&gt;Thanks to Eric Haines for reminding me to add this to this page :-)&lt;br /&gt;&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8151241452537747112?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8151241452537747112/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8151241452537747112' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8151241452537747112'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8151241452537747112'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/03/shaderx8-on-sale.html' title='ShaderX7 on Sale'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2YU3pmPHKN4/ScUqOMZlMVI/AAAAAAAAAPs/6HXdzFZwpBw/s72-c/ShaderX7Amazon.jpg' height='72' width='72'/><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5420073418126117977</id><published>2009-03-18T22:37:00.000-07:00</published><updated>2009-03-18T22:54:19.501-07:00</updated><title type='text'>Mathematica</title><content type='html'>I switched from Maple to Mathematica last week. 
One of my small projects is to store all the graphics algorithms I have wanted to visualize over the last few years in one file, as a kind of condensed memory of the things I worked on. Here is an example for a simple Depth of Field effect (as already covered in my GDC 2007 talk):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_2YU3pmPHKN4/ScHbgxpEuII/AAAAAAAAAPc/o8e7nrhLSwM/s1600-h/GraphicsMathVisualization.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 359px; height: 200px;" src="http://3.bp.blogspot.com/_2YU3pmPHKN4/ScHbgxpEuII/AAAAAAAAAPc/o8e7nrhLSwM/s400/GraphicsMathVisualization.jpg" alt="" id="BLOGGER_PHOTO_ID_5314770391401805954" border="0" /&gt;&lt;/a&gt;Distance runs along the axis called Z value, so 0 is close to the camera and 1.0 is far away. You can see how the near and far blur planes fade in and out as the value called Range increases. The equation to plot this in Mathematica is rather simple. In practice it is quite an efficient approach to achieve the effect.&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;Plot3D[R*Abs[0.5 - z], {z, 1, 0}, {R, 0, 1}, &lt;/span&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;PlotStyle -&gt; Directive[Pink, Specularity[White, 50], Opacity[0.8]], &lt;/span&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;PlotLabel -&gt; "Depth of Field", AxesLabel -&gt; {"Z value", "Range"}]&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;My plan is to develop a few new algorithms and show the results here. It will be an exercise in thinking about new things for me. 
If you have any suggestions on what I should cover, please do not hesitate to post them in the comment line.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5420073418126117977?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5420073418126117977/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5420073418126117977' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5420073418126117977'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5420073418126117977'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/03/mathematica.html' title='Mathematica'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2YU3pmPHKN4/ScHbgxpEuII/AAAAAAAAAPc/o8e7nrhLSwM/s72-c/GraphicsMathVisualization.jpg' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3222634932308852339</id><published>2009-02-22T16:09:00.000-08:00</published><updated>2009-02-23T15:54:51.653-08:00</updated><title type='text'>Team Leadership in the Game Industry</title><content type='html'>A few of my friends contributed to the book "Team Leadership in the Game Industry" by Seth 
Spaulding II. So I was curious what you can write about leaders in this industry. Having spent most of my professional life outside of the game industry I believe I developed a different frame of reference than many of my colleagues.&lt;br /&gt;&lt;br /&gt;First of all: the book is great and definitely worth a read. It is written in a very informative, instructive and entertaining way (... if you know the guys that contributed to it you know that it is worth it :-) ).&lt;br /&gt;&lt;br /&gt;With that being said, let's start the review by looking at the Table of Contents. I know that I usually spend more time than other people reading the TOC. This is the best way for me to figure out what a book has to offer. A good TOC shows you the big picture of a book and allows you to see the pattern that the author chose to approach the topic. In most cases it even allows you to check the underlying logic.&lt;br /&gt;The book consists of 9 chapters. Each chapter consists of an analysis of facts by the author followed by an interview with a game industry veteran. The topics span from "How We got here" over "Anatomy of a Game-Dev Company", "How Leaders are Chosen ...", "A Litmus Test for Leads" and "Leadership Types and Traits ..." and then go into more detail with "The Project Team Leader ...", "The Department Leader ...", "Difficult Employees ..." and "The Effect of Great Team Leadership", followed by a "Sample Skill Ladder" for artists in the appendix.&lt;br /&gt;&lt;br /&gt;You might feel the need to discuss some of the details covered in each chapter but it is clear that this is the right formal approach to slice up the delicate topic of leadership in our industry.&lt;br /&gt;&lt;br /&gt;When I first skimmed through the book I wanted to figure out what kind of values the author has. After all, a good leader makes it clear what kind of values he/she follows. I found them in the introduction. 
Here is the quote: "As will be seen, a major cause of people leaving a company is the perceived poor quality of their supervisors and senior management. The game business is a talent-based industry - the stronger and deeper your talent is, the better chances are of creating a great game. It is very difficult, in any hiring environment, to build the right mix of cross-disciplinary talent who function as a team at a high level; indeed, most companies never manage it. Once you get talented individuals on board, it's critical not to lose them. Finding and nurturing competent leaders who have the trust of the team will generate more retention than any addition of pool tables, movie nights, or verbal commitments to the value of "quality of life"."&lt;br /&gt;You might think this is the most obvious thing to say in the game industry.&lt;br /&gt;&lt;br /&gt;Obviously the book wants to cover the process of setting up a creative and great environment for all humans involved in creating great games. Creating a great working environment starts with picking the right leaders, who enable people by helping them to give their best. A great leader serves his/her people. He/she sees the best in everyone and has the ability to expose this talent. Many interviewees in the book also mention that humor is a leadership skill. I trained junior managers for BMW, Daimler, ABB and other companies back in Germany for two years on weekends and I have always thought of humor as a strong skill. Making people laugh starts a lot of processes in the body that make people more relaxed and in general brighten up their day. Whoever can do this can certainly improve the morale and therefore the efficiency of a team in seconds ... priceless.&lt;br /&gt;&lt;br /&gt;Managing a creative team is a completely different story than managing, for example, a sales team. The human factor in the relationship between people plays an important role. 
They have to create something together. While a salesperson is on his own out in the field, comes back with a number and relies on a relationship with a potential customer that only lasts a few hours of face-to-face time, a creative team stays together for years and has to overcome all the things that come up when humans have to live in a small space together. There is a complex social network in place that defines the relationships between those humans and it is important to keep the team running with all the constantly changing love/hate (and in-between) relationships on board. People on the team might even deal with difficult personal relationships and you end up with a mixture of chaos and randomness typical for family or close-friends scenarios. In that context it was interesting to see what the interviewees thought about the question of whether leaders are born and/or can be trained to be successful in the game industry. Obviously someone who was active as a boy-scout leader or as speaker/president of the students' association at his university, or who volunteered to work with other people in general, already showed some level of social commitment that is a good starting point for a leadership role in our industry.&lt;br /&gt;&lt;br /&gt;So defining and following the right values is a fundamental requirement for a book on leadership. Obviously after having set the values comes the part where those values need to be applied and used and this is where the book shines. It is hands-on, and even if you do not agree with the author in every detail, the fact that he wrote all this down earns the highest respect.&lt;br /&gt;&lt;br /&gt;So now that I made it obvious that I am excited about this book, let's think about how it might be improved in the future. A potential improvement I could see is to start the book with a target description. 
Not that the author fails to describe a target, but I would appreciate more detail in this area.&lt;br /&gt;What is the company you would want to work for? What is the environment you want to offer to make people as productive as possible? Obviously it is a chicken-and-egg problem. Good people want to work in good teams and good teams consist of good people ... there are social (soft) skills and knowledge (hard) skills attached to each person on that team.&lt;br /&gt;A good team starts with a good leader who sets values and standards and hires the right people.&lt;br /&gt;&lt;br /&gt;Assuming you are the leader of this future team, how would you create the environment for your dream team? How do you want people to feel when they are part of this team? What should they take home every night when they are exhausted? What do you want them to tell their wives / better halves about what it is like to work with you as their leader?&lt;br /&gt;A happy employee -fully enforced to be creative :-) - should tell his wife/girlfriend that he works very hard but is treated fairly and enjoys the family-related benefits of the company.&lt;br /&gt;He should tell his friends that he is working in a team where information is shared and where his potential is not only used as much as possible but also amplified. He needs to feel like he is growing with the team and the tasks.&lt;br /&gt;He should tell his colleagues that he enjoys working with them and the team, that he enjoys coming into work every day and that he is excited about the project he is working on ...&lt;br /&gt;&lt;br /&gt;So if we make that into a list of items we could describe how an employee should feel about working in a company with good leaders. 
Might be a great starting point for discussing leader core abilities.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3222634932308852339?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3222634932308852339/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3222634932308852339' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3222634932308852339'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3222634932308852339'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/02/team-leadership-in-game-industry.html' title='Team Leadership in the Game Industry'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7130082272516365078</id><published>2009-02-02T17:53:00.000-08:00</published><updated>2009-02-02T17:55:53.268-08:00</updated><title type='text'>Larrabee on GDC</title><content type='html'>I am really looking forward to Mike Abrash's and Tom Forsyth's talks at GDC about Larrabee:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://software.intel.com/en-us/articles/intel-at-gdc/"&gt;http://software.intel.com/en-us/articles/intel-at-gdc/&lt;/a&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Talking about the Larrabee instruction set will be super cool ... can't wait to see this.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7130082272516365078?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7130082272516365078/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7130082272516365078' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7130082272516365078'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7130082272516365078'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/02/larrabee-on-gdc.html' title='Larrabee on GDC'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5992441766359165729</id><published>2009-02-01T19:55:00.000-08:00</published><updated>2009-02-01T20:04:34.296-08:00</updated><title type='text'>ShaderX7 Update</title><content type='html'>I updated the ShaderX7 website at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.shaderx7.com/"&gt;http://www.shaderx7.com/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There is now the first draft of the cover and the Table of Content. Enjoy! 
:-)&lt;br /&gt;&lt;br /&gt;As before, I will rest for a second when the new book comes out and think about what has happened since I founded the series eight years ago ... my perception of time slows down for this second :-) and I hear myself saying: "Chewbacca, start the hyperdrive, let's go to the next planet, I need to play cards, drink alcohol and find some entertainment ... how about Tatooine?"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5992441766359165729?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5992441766359165729/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5992441766359165729' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5992441766359165729'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5992441766359165729'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/02/shaderx7-update.html' title='ShaderX7 Update'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-795958262780747528</id><published>2009-01-25T10:48:00.000-08:00</published><updated>2009-01-25T18:04:48.699-08:00</updated><title type='text'>iP* programming tip #9</title><content type='html'>This issue 
of the iPhone / iPod Touch programming tips series focuses on some aspects of VFP assembly programming. My friend Noel Llopis brought an oversight in the VFP math library to my attention that I still need to fix. So I start with the description of the problem here and promise to fix it soon in the VFP library :-)&lt;br /&gt;First let's start with the references. My friend Aaron Leiby has a blog entry on how to start programming the VFP unit here:&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://aleiby.blogspot.com/2008/12/iphone-vfp-for-n00bs.html"&gt;http://aleiby.blogspot.com/2008/12/iphone-vfp-for-n00bs.html&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;A typical inline assembly template might look like this:&lt;br /&gt;&lt;pre&gt;asm ( assembler template&lt;br /&gt;         : output operands                  /* optional */&lt;br /&gt;         : input operands                   /* optional */&lt;br /&gt;         : list of clobbered registers      /* optional */&lt;br /&gt;         );&lt;/pre&gt;The three lines after the template hold the output operands, the input operands and the so-called clobbers, which tell the compiler which registers the assembly code modifies.&lt;br /&gt;Here is a simple GCC assembly example (it doesn't use VFP assembly) that shows how the input and output operands are specified:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;asm("mov %0, %1, ror #1" : "=r" (result) : "r" (value));&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The idea is that "=r" (result) is the output operand and "r" (value) is the input operand. The "r" constraint asks for a general-purpose register and the "=" marks the operand as write-only. In the template, %0 refers to the "=r" operand and %1 refers to the "r" operand.&lt;br /&gt;Each operand is referenced by a number. The first output operand is numbered 0, and the numbers continue in increasing order through the input operands. There is a max number of operands ... I don't know what the max number is for the iPhone platform.&lt;br /&gt;&lt;br /&gt;Some instructions clobber some hardware registers. 
We have to list those registers in the clobber list, i.e. the field after the third ’&lt;b&gt;:&lt;/b&gt;’ in the asm statement. GCC will then not assume that the values it loaded into these registers are still valid afterwards.&lt;br /&gt;In other words, a clobber list tells the compiler which registers were used but not passed as operands. If a register is used as a scratch register, this register needs to be mentioned in there. Here is an example:&lt;br /&gt;&lt;pre class="coding"&gt;asm volatile("ands    r3, %1, #3"     "\n\t"&lt;br /&gt;          "eor     %0, %0, r3"     "\n\t"&lt;br /&gt;          "addne   %0, #4"      &lt;br /&gt;          : "=r" (len)        &lt;br /&gt;          : "0" (len)         &lt;br /&gt;          : "cc", "r3"&lt;br /&gt;         );&lt;br /&gt;&lt;/pre&gt;r3 is used as a scratch register here. The "cc" pseudo register tells the compiler that the assembly code modifies the condition code flags (here through the ands instruction). If the asm code changes memory, the "memory" pseudo register informs the compiler about this.&lt;br /&gt;&lt;br /&gt;&lt;pre class="coding"&gt;asm volatile("ldr     %0, [%1]"         "\n\t"&lt;br /&gt;           "str     %2, [%1, #4]"     "\n\t"&lt;br /&gt;           : "=&amp;amp;r" (rdv)&lt;br /&gt;           : "r" (&amp;amp;table), "r" (wdv)&lt;br /&gt;           : "memory"&lt;br /&gt;          );&lt;/pre&gt;This special clobber informs the compiler that the assembler code may modify any memory location. Btw. the volatile attribute instructs the compiler not to optimize away or move your assembler code.&lt;br /&gt;&lt;br /&gt;If you want to add something to this tip ... please do not hesitate to write it in the comment line. 
I will add it then with your name.&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-795958262780747528?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/795958262780747528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=795958262780747528' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/795958262780747528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/795958262780747528'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/01/ip-programming-tip-9.html' title='iP* programming tip #9'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3817660837338137585</id><published>2009-01-09T17:24:00.000-08:00</published><updated>2009-01-10T10:51:22.374-08:00</updated><title type='text'>Partial Derivative Normal Maps</title><content type='html'>To make my collection of normal map techniques more complete on this blog I also have to mention a special normal mapping technique that Insomniac's Mike Acton brought to my attention a long time ago (I wasn't sure if I am allowed to publish it ... but now they have slides on their website). 
&lt;div&gt;The idea is to store the partial derivatives of the normal in two channels of the map like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;dx = (-nx/nz);&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;dy = (-ny/nz);&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then you can reconstruct the normal like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nx = -dx;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;ny = -dy;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;nz = 1;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;normalize(n);&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The advantage is that you do not have to reconstruct Z, so you can skip one instruction in each pixel shader that uses normal maps.&lt;/div&gt;&lt;div&gt;This is especially cool on the PS3, while on the Xbox 360 you can also create a custom texture format to let the texture fetch unit do the scale and bias and save a cycle there.&lt;/div&gt;&lt;div&gt;More details can be found at&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.insomniacgames.com/tech/articles/1108/files/Ratchet_and_Clank_WWS_Debrief_Feb_08.pdf"&gt;http://www.insomniacgames.com/tech/articles/1108/files/Ratchet_and_Clank_WWS_Debrief_Feb_08.pdf&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Look for Partial Derivative Normal Maps. 
&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3817660837338137585?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3817660837338137585/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3817660837338137585' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3817660837338137585'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3817660837338137585'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/01/partial-derivative-normal-maps.html' title='Partial Derivative Normal Maps'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3798199598216710903</id><published>2009-01-04T10:45:00.000-08:00</published><updated>2009-01-08T08:27:59.962-08:00</updated><title type='text'>Handling Scene Geometry</title><content type='html'>I recently bumped into a post by Roderic Vicaire on the www.gamedev.net forums. 
It is &lt;a href="http://www.gamedev.net/community/forums/topic.asp?topic_id=515082&amp;amp;whichpage=1&amp;amp;#3351070"&gt;here&lt;/a&gt;.&lt;br /&gt;Obviously there is no generic solution to handle all scene geometry in the same way, but depending on the game his naming conventions make a lot of sense (read "Scene Graphs - just say no" in Tom Forsyth's &lt;a href="http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BScene%20Graphs%20-%20just%20say%20no%5D%5D"&gt;blog&lt;/a&gt;).&lt;br /&gt;- SpatialGraph: used for finding out what is visible and should be drawn. Should make culling fast&lt;br /&gt;- SceneTree: used for hierarchical animations, e.g. skeletal animation or a sword held in a character's hand&lt;br /&gt;- RenderQueue: is filled by the SpatialGraph. Renders visible stuff fast. It sorts sub arrays per key, each key holding data such as depth, shaderID etc. (see Christer Ericson's blog entry on sort-based draw call bucketing for &lt;a href="http://realtimecollisiondetection.net/blog/?p=86"&gt;this&lt;/a&gt;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3798199598216710903?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3798199598216710903/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3798199598216710903' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3798199598216710903'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3798199598216710903'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2009/01/handling-scene-geometry.html' 
title='Handling Scene Geometry'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-771025800941759786</id><published>2008-12-28T20:01:00.000-08:00</published><updated>2008-12-28T20:05:31.857-08:00</updated><title type='text'>Major Oolong Update</title><content type='html'>Two days ago I committed a major Oolong update. Please check out the Oolong Engine blog at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.oolongengine.com/"&gt;http://www.oolongengine.com&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I updated the memory manager and the math library, upgraded to the latest POWERVR POD format and added VBO support to each example. Please also note that in previous updates a new memory manager was added, the VFP math library was added and a bunch of smaller changes were done as well.&lt;br /&gt;The things on my list are: looking into the sound manager (it seems like the current version allocates memory during the frame) and adding the DOOM III level format as a game format. Obviously zip support would be nice as well ... 
let's see how far I get.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-771025800941759786?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/771025800941759786/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=771025800941759786' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/771025800941759786'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/771025800941759786'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/major-oolong-update.html' title='Major Oolong Update'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7503241779628886656</id><published>2008-12-25T07:38:00.001-08:00</published><updated>2008-12-25T20:34:31.396-08:00</updated><title type='text'>Programming Vertex, Geometry and Pixel Shaders</title><content type='html'>A christmas present: we just went public with "Programming Vertex, Geometry and Pixel Shaders". 
I am a co-author of this book and we published it free on www.gamedev.net at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://wiki.gamedev.net/index.php/D3DBook:Book_Cover"&gt;http://wiki.gamedev.net/index.php/D3DBook:Book_Cover&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If you have any suggestions, comments or additions to this book, please give me a sign or write it into the book comment pages.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7503241779628886656?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7503241779628886656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7503241779628886656' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7503241779628886656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7503241779628886656'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/programming-vertex-geometry-and-pixel.html' title='Programming Vertex, Geometry and Pixel Shaders'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5953177596911048606</id><published>2008-12-24T10:45:00.000-08:00</published><updated>2008-12-24T10:46:37.782-08:00</updated><title type='text'>Good 
Middleware</title><content type='html'>&lt;div&gt;Kyle Wilson wrote up a summary about how good middleware should be:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href="http://gamearchitect.net/2008/09/19/good-middleware/"&gt;http://gamearchitect.net/2008/09/19/good-middleware/&lt;/a&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An interesting read.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5953177596911048606?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5953177596911048606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5953177596911048606' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5953177596911048606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5953177596911048606'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/good-middleware.html' title='Good Middleware'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3794279591361320251</id><published>2008-12-23T23:33:00.000-08:00</published><updated>2008-12-23T23:36:31.291-08:00</updated><title type='text'>Quake III Arena for the iPhone</title><content type='html'>Just realized that one of the 
projects I contributed some code to went public in the meantime. You can get the source code at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/quake3-iphone/"&gt;http://code.google.com/p/quake3-iphone/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There is a list of issues. If you have more spare time than me, maybe you can help out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3794279591361320251?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3794279591361320251/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3794279591361320251' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3794279591361320251'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3794279591361320251'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/quake-iii-arena-for-iphone.html' title='Quake III Arena for the iPhone'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-886982659611548507</id><published>2008-12-23T11:45:00.001-08:00</published><updated>2008-12-23T11:57:58.096-08:00</updated><title type='text'>iP* programming tip #8</title><content type='html'>This is the christmas issue of the iPhone / iPod 
touch programming tips. This time we deal with the touch interface. The main challenge I found with the touch screen support is that it is hard to use it to track, for example, forward / backward / left / right and fire at the same time. Let's say the user presses fire and then he presses forward: what happens when he accidentally slides his finger a bit?&lt;br /&gt;The problem is that each event is defined by the region of the screen in which it happens. When the user slides his finger, he is leaving this region. In other words, if you handle on-screen touches as "touch is on" and "finger lifted is off", then when the finger is moved out of the region and lifted there, the event is still on.&lt;br /&gt;The workaround is that when the user slides his finger, the previous location of this finger is used to find the event region the touch belongs to, and the current location is then checked against that region. If it is no longer inside, the event defaults to off.&lt;br /&gt;Touch-screen support for a typical shooter might work like this:&lt;br /&gt;In touchesBegan, touchesMoved and touchesEnded there is a function call like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        // Enumerates through all touch objects&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        for (UITouch *touch in touches)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            [self _handleTouch:touch];&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            touchCount++;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;_handleTouch might look like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;- (void)_handleTouch:(UITouch *)touch&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    CGPoint location = [touch 
locationInView:self];&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    CGPoint previousLocation;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    // if we are in a touchMoved phase use the previous location but then check if the current&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    // location is still in there&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    if (touch.phase == UITouchPhaseMoved)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        previousLocation = [touch previousLocationInView:self];&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;    else&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        previousLocation = location;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;...&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        // fire event&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        // lower right corner .. 
box is 40 x 40&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        if (EVENTREGIONFIRE(previousLocation))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            if (touch.phase == UITouchPhaseBegan)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                // only trigger once: queue the event only if the fire bit is not set yet&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                if (!(_bitMask &amp;amp; Q3Event_Fire))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    [self _queueEventWithType:Q3Event_Fire value1:K_MOUSE1 value2:1];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    _bitMask|= Q3Event_Fire;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            else if (touch.phase == UITouchPhaseEnded)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                if (_bitMask &amp;amp; Q3Event_Fire)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    [self _queueEventWithType:Q3Event_Fire value1:K_MOUSE1 value2:0];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    _bitMask^= Q3Event_Fire;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                }&lt;/span&gt;&lt;br 
/&gt;&lt;span style="font-family: courier new;"&gt;            }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            else if (touch.phase == UITouchPhaseMoved)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                if (!(EVENTREGIONFIRE(location)))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    if (_bitMask &amp;amp; Q3Event_Fire)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                        [self _queueEventWithType:Q3Event_Fire value1:K_MOUSE1 value2:0];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                        _bitMask^= Q3Event_Fire;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                    }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;                }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;            }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;        }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;...&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Tracking if the switch is on or off can be done with a bit mask. 
The event is sent off to the game by a separate _queueEventWithType method.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-886982659611548507?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/886982659611548507/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=886982659611548507' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/886982659611548507'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/886982659611548507'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-8.html' title='iP* programming tip #8'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3185243266630361093</id><published>2008-12-14T18:50:00.000-08:00</published><updated>2008-12-14T20:36:04.734-08:00</updated><title type='text'>iP* programming tip #7</title><content type='html'>This iPhone / iPod touch programming tip covers Point Sprites. The idea is that a set of points, the simplest primitive in OpenGL ES rendering, describes the positions of Point Sprites, and their appearance comes from the current texture map. 
This way, Point Sprites are screen-aligned sprites that offer a reduced geometry footprint and transform cost because each one is represented by a single point, that is, one vertex. This is useful for particle systems, lens flare, light glow and other 2-D effects.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;glEnable(GL_POINT_SPRITE_OES) - this is the global switch that turns point sprites on. Once enabled, all points will be drawn as point sprites.&lt;/li&gt;&lt;li&gt;glTexEnvi(GL_POINT_SPRITE_OES,  GL_COORD_REPLACE_OES, GL_TRUE) - this enables  [0..1] texture coordinate generation for the four corners of the point sprite. It can be set per texture unit.  If disabled, all corners of the quad have the same texture coordinate.&lt;/li&gt;&lt;li&gt;glPointParameterfv(GLenum pname, const GLfloat * params) - this is used to set the point attenuation as described below.&lt;/li&gt;&lt;/ul&gt;The point size of a point sprite can be derived with the formula:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_2YU3pmPHKN4/SUXapAE8YII/AAAAAAAAAPI/kyEKyqrMVuU/s1600-h/DerivedPointSize.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 323px; height: 41px;" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/SUXapAE8YII/AAAAAAAAAPI/kyEKyqrMVuU/s400/DerivedPointSize.gif" alt="" id="BLOGGER_PHOTO_ID_5279866536092000386" border="0" /&gt;&lt;/a&gt;user_clamp represents the GL_POINT_SIZE_MIN and GL_POINT_SIZE_MAX settings of glPointParameterfv(). 
impl_clamp represents an implementation-dependent point size range.&lt;br /&gt;GL_POINT_DISTANCE_ATTENUATION is used to pass in params as an array containing the distance attenuation coefficients a, b, and c, in that order.&lt;br /&gt;In case multisampling is used (not officially supported), the point size is clamped to have a minimum threshold, and the alpha value of the point is modulated by the following equation:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_2YU3pmPHKN4/SUXdieSWCHI/AAAAAAAAAPQ/assJBDqdFaQ/s1600-h/minimumthreshold.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 157px; height: 36px;" src="http://3.bp.blogspot.com/_2YU3pmPHKN4/SUXdieSWCHI/AAAAAAAAAPQ/assJBDqdFaQ/s400/minimumthreshold.gif" alt="" id="BLOGGER_PHOTO_ID_5279869722477070450" border="0" /&gt;&lt;/a&gt;GL_POINT_FADE_THRESHOLD_SIZE specifies the point alpha fade threshold.&lt;br /&gt;Check out the Oolong engine example Particle System for an implementation. It uses 600 point sprites with nearly 60 fps. 
Increasing the number of point sprites to 3000 lets the framerate drop to around 20 fps.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3185243266630361093?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3185243266630361093/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3185243266630361093' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3185243266630361093'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3185243266630361093'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-7.html' title='iP* programming tip #7'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2YU3pmPHKN4/SUXapAE8YII/AAAAAAAAAPI/kyEKyqrMVuU/s72-c/DerivedPointSize.gif' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-725423560616826947</id><published>2008-12-12T06:51:00.000-08:00</published><updated>2008-12-12T06:59:45.035-08:00</updated><title type='text'>Free ShaderX Books</title><content type='html'>Eric Haines provided a home for the three ShaderX books that are now available for free. Thanks so much for this! 
Here is the URL&lt;br /&gt;&lt;br /&gt;&lt;a href="http://tog.acm.org/resources/shaderx/"&gt;http://tog.acm.org/resources/shaderx/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-725423560616826947?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/725423560616826947/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=725423560616826947' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/725423560616826947'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/725423560616826947'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/free-shaderx-books.html' title='Free ShaderX Books'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-364222467954568843</id><published>2008-12-11T19:32:00.000-08:00</published><updated>2008-12-14T20:38:46.662-08:00</updated><title type='text'>iP* programming tip #6</title><content type='html'>This time we are covering another fixed-function technique used in DirectX 7/8 times: Matrix Palettes support is an extension of OpenGL ES 1.1 that is supported on the iPhone.&lt;br /&gt;It allows the usage of a set of matrices to transform the vertices and the normals. 
Each vertex has a set of n indices into the palette, and a corresponding set of n weights.&lt;br /&gt;The vertex is transformed by the modelview matrices specified by the vertex's respective indices. These results are subsequently scaled by the weights of the respective units and then summed to create the eyespace vertex.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/SUKRoLI5OCI/AAAAAAAAAOg/hjQe4oXxO7M/s1600-h/MatrixPalette.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 44px;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/SUKRoLI5OCI/AAAAAAAAAOg/hjQe4oXxO7M/s400/MatrixPalette.jpg" alt="" id="BLOGGER_PHOTO_ID_5278941832602531874" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;A similar procedure is followed for normals. They are transformed by the inverse transpose of the modelview matrix.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/SUKRxeZkUaI/AAAAAAAAAOo/qtv-wiAGnwM/s1600-h/MatrixPalette2.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 42px;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/SUKRxeZkUaI/AAAAAAAAAOo/qtv-wiAGnwM/s400/MatrixPalette2.jpg" alt="" id="BLOGGER_PHOTO_ID_5278941992391561634" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The main OpenGL ES functions that support Matrix Palette are&lt;br /&gt;&lt;ul&gt;&lt;li&gt;glMatrixMode(GL_MATRIX_PALETTE) - Sets the matrix mode to palette&lt;/li&gt;&lt;li&gt;glCurrentPaletteMatrix(n) - Sets the currently active palette matrix; it is used to select and then load each matrix in the palette&lt;/li&gt;&lt;li&gt;To enable vertex arrays&lt;br /&gt;glEnableClientState(MATRIX_INDEX_ARRAY)&lt;br /&gt;glEnableClientState(WEIGHT_ARRAY)&lt;/li&gt;&lt;li&gt;To load the index and weight per-vertex data&lt;br /&gt;glWeightPointer()&lt;br 
/&gt;glMatrixIndexPointer()&lt;/li&gt;&lt;/ul&gt;On the iPhone there are up to nine bones per sub-mesh supported (check GL_MAX_PALETTE_MATRICES_OES). Check out the Oolong example MatrixPalette for an implementation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-364222467954568843?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/364222467954568843/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=364222467954568843' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/364222467954568843'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/364222467954568843'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-6.html' title='iP* programming tip #6'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_2YU3pmPHKN4/SUKRoLI5OCI/AAAAAAAAAOg/hjQe4oXxO7M/s72-c/MatrixPalette.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-1454586240169619200</id><published>2008-12-11T11:51:00.001-08:00</published><updated>2008-12-11T11:58:15.775-08:00</updated><title type='text'>GDC Talk</title><content type='html'>My GDC talk 
was accepted. I am happy ... yeaaahhh :-)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_2YU3pmPHKN4/SUFvvzeyQvI/AAAAAAAAAOY/kz9UIyi_-Vk/s1600-h/12-11-2008+11-48-37+AM.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 385px; height: 400px;" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/SUFvvzeyQvI/AAAAAAAAAOY/kz9UIyi_-Vk/s400/12-11-2008+11-48-37+AM.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5278623105318798066" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-1454586240169619200?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/1454586240169619200/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=1454586240169619200' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1454586240169619200'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1454586240169619200'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/gdc-talk.html' title='GDC Talk'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/_2YU3pmPHKN4/SUFvvzeyQvI/AAAAAAAAAOY/kz9UIyi_-Vk/s72-c/12-11-2008+11-48-37+AM.jpg' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5051788821535181374</id><published>2008-12-09T15:08:00.001-08:00</published><updated>2008-12-10T07:49:37.545-08:00</updated><title type='text'>Cached Shadow Maps</title><content type='html'>A friend just asked me how to design a shadow map system for many lights with shadows. A quite good explanation was already given in the following post in 2003:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); text-decoration: underline;"&gt;http://www.gamedev.net/community/forums/viewreply.asp?ID=741199&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Yann Lombard first explains how to pick the light sources that should cast a shadow. He uses distance, intensity, influence and other parameters to pick them.&lt;br /&gt;&lt;br /&gt;He has a cache of shadow maps that can have different resolutions. His cache solution is pretty generic. I would build a more dedicated cache just for shadow maps.&lt;br /&gt;After having picked the light sources that should cast shadows, I would only update those shadow maps in the cache that change. Whether a shadow map changes depends on whether there is an object with a dynamic flag in the shadow view frustum.&lt;br /&gt;Think about what happens when you approach a scene with lights that cast shadows:&lt;br /&gt;1. the lights that are close enough and appropriate to cast shadows are picked -&gt; shadow maps are updated&lt;br /&gt;2. then, while we move on, for the lights in 1. we only update shadow maps if there is an object in the shadow view that is moving / dynamic; we then start with the next bunch of shadows while the shadows in 1. are still in view&lt;br /&gt;3. 
and so on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5051788821535181374?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5051788821535181374/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5051788821535181374' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5051788821535181374'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5051788821535181374'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/cached-shadow-maps.html' title='Cached Shadow Maps'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8315980132116591425</id><published>2008-12-06T18:52:00.000-08:00</published><updated>2008-12-10T07:34:30.764-08:00</updated><title type='text'>Dual-Paraboloid Shadow Maps</title><content type='html'>Here is an interesting post on Dual-Paraboloid Shadow maps.  Pat Wilson describes a single pass approach here&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.gamedev.net/community/forums/topic.asp?topic_id=517022"&gt;http://www.gamedev.net/community/forums/topic.asp?topic_id=517022&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is pretty cool. 
Culling stuff into the two hemispheres is unnecessary here. Other than this the usual comparison between cube maps and dual-paraboloid maps applies:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt; the number of drawcalls is the same ... so you do not save on this front&lt;/li&gt;&lt;li&gt; you lose memory bandwidth with cube maps because in the worst case you render everything into six maps that are probably bigger than 256x256 ... in reality you won't render six times and therefore have fewer drawcalls than dual-paraboloid maps&lt;/li&gt;&lt;li&gt; the quality is much better for cube maps&lt;/li&gt;&lt;li&gt; the speed difference is not that huge because dual-paraboloid maps use things like texkill or alpha test to pick the right map and therefore rendering is pretty slow without Hierarchical Z.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;I think both techniques are equivalent for environment maps .. for shadows you might prefer cube maps; if you want to save memory, dual-paraboloid maps are the only way to go.&lt;br /&gt;&lt;br /&gt;Update: just saw this article on dual-paraboloid shadow maps:&lt;br /&gt;&lt;br /&gt;http://osman.brian.googlepages.com/dpsm.pdf&lt;br /&gt;&lt;br /&gt;The basic idea is that you do the WorldSpace -&gt; Paraboloid transformation in the pixel shader during your lighting pass. 
That avoids having the paraboloid co-ordinates interpolated incorrectly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8315980132116591425?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8315980132116591425/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8315980132116591425' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8315980132116591425'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8315980132116591425'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/dual-paraboloid-shadow-maps.html' title='Dual-Paraboloid Shadow Maps'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3936203197518685249</id><published>2008-12-06T17:07:00.000-08:00</published><updated>2008-12-06T18:07:27.833-08:00</updated><title type='text'>iP* programming tip #5</title><content type='html'>Let's look today at the "pixel shader" level of the hardware functionality. The iPhone Application programming guide says that the application should not use more than 24 MB for textures and surfaces. It seems like those 24 MB are not in video card memory. 
I assume that all of the data is stored in system memory and the graphics card memory is not used.&lt;br /&gt;Overall the iP* platform has the following limits:&lt;br /&gt;&lt;ul&gt;&lt;li&gt; The maximum texture size is 1024x1024 &lt;/li&gt;&lt;li&gt; 2D textures are supported; other texture types are not &lt;/li&gt;&lt;li&gt; Stencil buffers aren’t available &lt;/li&gt;&lt;/ul&gt;As far as I know stencil buffer support is available in hardware but not exposed. That means the Light Pre-Pass renderer can only be implemented with the help of the scissor (hopefully available). As a side note: one of the other things that do not seem to be exposed is MSAA rendering. With the unofficial SDK it seems like you can use MSAA.&lt;br /&gt;Texture filtering is described on page 99 of the iPhone Application programming guide. An extension for anisotropic filtering is also supported, but I haven't tried it.&lt;br /&gt;&lt;br /&gt;The pixel shader of the iP* platform is programmed via texture combiners. There is an overview of all OpenGL ES 1.1 calls at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.khronos.org/opengles/sdk/1.1/docs/man/"&gt;http://www.khronos.org/opengles/sdk/1.1/docs/man/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The texture combiners are described in the page on glTexEnv. Per-Pixel Lighting is a popular example:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;glTexEnvf(GL_TEXTURE_ENV,&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// N.L&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_TEXTURE_ENV_MODE, GL_COMBINE);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_COMBINE_RGB, GL_DOT3_RGB); // Blend0 = N.L&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_SOURCE0_RGB, GL_TEXTURE); // normal map&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_OPERAND0_RGB, GL_SRC_COLOR); &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. 
GL_SOURCE1_RGB, GL_PRIMARY_COLOR); // light vec&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_OPERAND1_RGB, GL_SRC_COLOR);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// N.L * color map &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_TEXTURE_ENV_MODE, GL_COMBINE);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_COMBINE_RGB, GL_MODULATE); // N.L * color map&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_SOURCE0_RGB, GL_PREVIOUS); // previous result: N.L&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_OPERAND0_RGB, GL_SRC_COLOR); &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_SOURCE1_RGB, GL_TEXTURE); // color map&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;.. GL_OPERAND1_RGB, GL_SRC_COLOR);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;Check out the Oolong example "Per-Pixel Lighting" in the folder Examples/Renderer for a full implementation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3936203197518685249?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3936203197518685249/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3936203197518685249' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3936203197518685249'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3936203197518685249'/><link rel='alternate' type='text/html' 
href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-5.html' title='iP* programming tip #5'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4849033801290320282</id><published>2008-12-05T08:08:00.000-08:00</published><updated>2008-12-05T08:55:17.801-08:00</updated><title type='text'>iP* programming tip #4</title><content type='html'>All of the source code presented in this series is based on the Oolong engine. I will refer to the examples where appropriate so that everyone can look the code up or try it on their own. This tip covers the very simple basics of an iP* app. Here is the most basic piece of code to start a game:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// “View” for games in applicationDidFinishLaunching&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// get screen rectangle&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;CGRect rect = [[UIScreen mainScreen] bounds];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// create one full-screen window&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;_window = [[UIWindow alloc] initWithFrame:rect];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// create OpenGL view&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;_glView = [[EAGLView alloc] initWithFrame: rect pixelFormat:GL_RGB565_OES depthFormat:GL_DEPTH_COMPONENT16_OES preserveBackBuffer:NO];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// attach the 
view to the window&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;[_window addSubview:_glView];&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// show the window&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;[_window makeKeyAndVisible];&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The screen dimensions are retrieved from a screen object. Erica Sadun compares the UIWindow functionality to a TV set and the UIView to actors in a TV show. I think this is a good way to memorize the functionality. In our case EAGLView, which comes with the Apple SDK, inherits from UIView and adds all the OpenGL ES functionality to it. We then attach this view to the window and make everything visible.&lt;br /&gt;Oolong assumes a full-screen window that does not rotate. It is always in widescreen view. The reason for this is that otherwise the accelerometer usage (to drive a camera with the accelerometer, for example) wouldn't be possible.&lt;br /&gt;There is a corresponding dealloc method to this code that frees all the allocated resources again.&lt;br /&gt;An Oolong engine example mainly consists of two files: a file with "delegate" in the name and the main application file. The main application file has the following methods:&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;- InitApplication()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;- QuitApplication()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;- UpdateScene()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;- RenderScene()&lt;/span&gt;&lt;br /&gt;The first pair of methods does the one-time, device-dependent resource allocations and deallocations, while UpdateScene() prepares scene rendering and RenderScene() does what the name says. 
If you would like to extend this framework to handle orientation changes, you would add a pair of methods with names like InitView() and ReleaseView() and handle all orientation-dependent code in there. Those methods would be called once each time the orientation changes and once at the start of the application.&lt;br /&gt;&lt;br /&gt;One other basic topic is the usage of C++. In Apple speak, mixing it with Objective-C is called Objective-C++. Cocoa Touch wants to be addressed with Objective-C, so an application written purely in C or C++ is not possible. For game developers there is lots of existing C/C++ code to be re-used, and its usage makes games easier to port to several platforms (it is quite common to launch an IP on several platforms at once). The best solution to this dilemma is to use Objective-C where necessary and then wrap to C/C++.&lt;br /&gt;If a file has the extension .mm, the compiler can handle Objective-C, C and C++ code pieces at the same time to a certain degree. If you look in Oolong for files with this extension you will find many of them. There are whitepapers and tutorials available for Objective-C++ that describe the limitations of the approach. Because garbage collection is not used on the iP* device, I want to believe that the challenges of making this work on this platform are smaller. Here are a few examples of how the bridge between Objective-C and C/C++ is built in Oolong. 
In our main application class in every Oolong example we bridge from the Objective-C code used in the "delegate" file to the main application file like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// in Application.h&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;class CShell&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    ..&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    bool UpdateScene();&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// in Application.mm&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;bool CShell::UpdateScene()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;..&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// in Delegate.mm&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;static CShell *shell = NULL;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;if(!shell-&gt;UpdateScene()) printf("Update error\n");&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;An example of how to call an Objective-C method from C++ can look like this (C wrapper):&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;// in PolarCamera.mm -&gt; C wrapper&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;void UpdatePolarCamera()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    [idFrame UpdateCamera];&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;-(void) UpdateCamera&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    ..&lt;/span&gt;&lt;br /&gt;&lt;span 
style="font-family:courier new;"&gt;// in Application.mm&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;bool CShell::UpdateScene()&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;{&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    UpdatePolarCamera();&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;    ..&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The idea is to retrieve the id of an Objective-C object and then use this id to call one of its methods from the outside.&lt;br /&gt;If you want to see all this in action, open up the skeleton example in the Oolong Engine source code. You can find it at&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;Examples/Renderer/Skeleton&lt;/span&gt;&lt;br /&gt;Now that we are at the end of this tip I would like to point to a blog post that my friend Canis wrote. He talks about memory management here. This blog entry applies to the iP* platforms quite well:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.wooji-juice.com/blog/cocoa-6-memory.html"&gt;http://www.wooji-juice.com/blog/cocoa-6-memory.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4849033801290320282?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4849033801290320282/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4849033801290320282' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4849033801290320282'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4849033801290320282'/><link rel='alternate' type='text/html' 
href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-4.html' title='iP* programming tip #4'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2694798315953096345</id><published>2008-12-03T08:37:00.000-08:00</published><updated>2008-12-03T13:21:30.217-08:00</updated><title type='text'>iP* programming tip #3</title><content type='html'>Today I will cover the necessary files of an iP* application and the folders that potentially hold data on the device from your application.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;.app folder holds everything without required hierarchy&lt;/li&gt;&lt;li&gt;.lproj language support&lt;/li&gt;&lt;li&gt;Executable&lt;/li&gt;&lt;li&gt;Info.plist – XML property list that holds the product identifier &gt; allows the app to communicate with other apps and register with Springboard&lt;/li&gt;&lt;li&gt;Icon.png (57x57); set UIPrerenderedIcon to true in Info.plist to avoid the gloss / shine effect &lt;/li&gt;&lt;li&gt;Default.png … should match game background; no “Please wait” sign ... smooth fade&lt;br /&gt;&lt;/li&gt;&lt;li&gt;XIB (NIB) files precooked addressable user interface classes &gt;remove NSMainNibFile key from Info.plist if you do not use it&lt;/li&gt;&lt;li&gt;Your files; for example in demoq3/quake3.pak&lt;/li&gt;&lt;/ul&gt;If the game boots very fast, you can give the user a good mobile phone experience by taking a screenshot when the user quits the app, showing that screenshot while the game boots, and then restoring the game to the state it was in before.&lt;br /&gt;Every iP* app is sandboxed. 
That means that only certain folders, network resources and hardware can be accessed. Here is a list of folders that might be affected by your application:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Preferences files are in var/mobile/Library/Preferences based on the product identifier (e.g. com.engel.Quake.plist); updated when you use something like NSUserDefaults to add persistence to game data like save and load&lt;br /&gt;&lt;/li&gt;&lt;li&gt;App plug-in /System/Library (not available)&lt;/li&gt;&lt;li&gt;Documents in /Documents&lt;/li&gt;&lt;li&gt;Each app has a tmp folder&lt;/li&gt;&lt;li&gt;Sandbox spec e.g. in /usr/share/sandbox &gt; don’t touch &lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;The sandbox paradigm is also responsible for a mechanism that stops your game if it eats up too many resources of the iPhone. I wonder under which conditions this is going to happen.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2694798315953096345?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2694798315953096345/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2694798315953096345' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2694798315953096345'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2694798315953096345'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-3.html' title='iP* programming tip #3'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2174803925804543789</id><published>2008-12-02T11:16:00.001-08:00</published><updated>2008-12-06T18:20:33.762-08:00</updated><title type='text'>HLSL 5.0 OOP / Dynamic Shader Linking</title><content type='html'>I just happened to bump into a few slides on the new HLSL 5.0 syntax. The slides are at&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: rgb(85, 26, 139); text-decoration: underline;"&gt;http://www.microsoft.com/downloads/details.aspx?FamilyId=32906B12-2021-4502-9D7E-AAD82C00D1AD&amp;amp;displaylang=en&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I thought I would comment on those slides because I do not get the main idea. The slides mention a combinatorial explosion for shaders. They show on slide 19 three arrows that go in all three directions. One is called Number of Lights, another one Environmental Effects and the third one is called Number of Materials. &lt;/div&gt;&lt;div&gt;Regarding the first one: even if one has never worked on a game, everyone knows the words Deferred Lighting. If you want many lights you want to do the lighting in a way that the same shader is used for each light type. Assuming that we have a directional, point and spot light this brings me to three shaders (I currently use three but I might increase this to six).&lt;/div&gt;&lt;div&gt;One arrow talks about Environmental Effects. Most environmental effects nowadays are part of PostFX or a dedicated sky dome system. That adds two more shaders.&lt;/div&gt;&lt;div&gt;The last arrow says Number of Materials. 
Usually we have up to 20 different shaders for different materials.&lt;/div&gt;&lt;div&gt;This brings me to, let's say, 30 to 40 different shaders in a game. I can't consider this a combinatorial explosion so far.&lt;/div&gt;&lt;div&gt;On slide 27 it is mentioned that the major driving point for introducing OOP is the dynamic shader linkage. It seems like there is a need for dynamic shader linkage because of the combinatorial explosion of the shaders.&lt;/div&gt;&lt;div&gt;So in essence the design of the HLSL language is driven by the fact that we have too many shaders and someone assumes that we can't cope with the sheer quantity. To fix this we need dynamic shader linkage and to make this happen we need OOP in HLSL.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It is hard for me to follow this logic. It looks to me like we are taking a huge step back here, not focusing on the real needs and adding code bloat.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Dynamic shader linkers have long been proven useless in game development; the previous attempts in this area were buried with DirectX 9 SDKs. The reason for this is that they do not allow you to hand-optimize code, which is a very important thing to do to make your title competitive. As soon as you change one of the shader fragments this has an impact on the performance of other shaders. Depending on whether you hit a performance sweet spot or not you can get very different performance out of graphics cards.&lt;/div&gt;&lt;div&gt;Because the performance of your code base becomes less predictable, you do not want to use a dynamic shader linker if you want to create competitive games in the AAA segment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Game developers need more control over the performance of the underlying hardware. 
We are already forced to use NV API and other native APIs to ship games on the PC platform with acceptable feature set and performance (especially SLI configs) because DirectX does not expose the functionality. For the DirectX 9 platform we look into Cuda and Cal support for PostFX.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This probably does not have much impact on the HLSL syntax but in general I would prefer having more abilities to squeeze out more performance from graphics cards over any OOP extension that does not sound like it increases performance. At the end of the day the language is a tool to squeeze out as much performance as possible from the hardware. What else do you want to do with it?&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2174803925804543789?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2174803925804543789/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2174803925804543789' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2174803925804543789'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2174803925804543789'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/hlsl-50-oop-dynamic-shader-linking.html' title='HLSL 5.0 OOP / Dynamic Shader Linking'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6180298719615545145</id><published>2008-12-02T08:13:00.000-08:00</published><updated>2008-12-02T10:03:27.197-08:00</updated><title type='text'>iP* programming tip #2</title><content type='html'>Today's tip will deal with the setup of your development environment. As a Mac newbie I had a hard time getting used to the environment when I started Mac development more than a year ago, and I still suffer from windowitis. I know that Apple does not want to copy MS's Visual Studio but most people who are used to working with Visual Studio would put that on their holiday wishlist :-)&lt;br /&gt;Here are a few starting points to get used to the environment:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;To work in one window only, use the "All-in-One" mode if you miss Visual Studio (&lt;a href="http://developer.apple.com/tools/xcode/newinxcode23.html"&gt;http://developer.apple.com/tools/xcode/newinxcode23.html&lt;/a&gt;)&lt;br /&gt;You have to load Xcode, but not load any projects.  Go straight to Preferences/General Tab, and you'll see "Layout: Default".   Switch that to "Layout: All-In-One".  Click OK.  Then, you can load your projects.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Apple+tilde – cycle between the windows of the foreground application&lt;/li&gt;&lt;li&gt;Apple+w - closes the front window in most apps&lt;/li&gt;&lt;li&gt;Apple+tab – cycle through the open applications&lt;/li&gt;&lt;/ul&gt;Please note that Apple did a revolutionary thing on the new MacBook Pros (probably also the new MacBooks) ... there is no Apple key anymore.  It is now called the command key.&lt;br /&gt;&lt;br /&gt;If you prefer hotkeys to start applications, you might check out Quicksilver. Automatically hiding and showing the Dock gives you more workspace. 
If you are giving presentations about your work, check out Stage Hand for the iPod touch / iPhone.&lt;br /&gt;&lt;br /&gt;For reference you should have the POWERVR SDK for Linux downloaded. It is a very helpful reference regarding the MBX chip in your target platforms.&lt;br /&gt;&lt;br /&gt;Not very game or graphics programming related but very helpful is Erica Sadun's book "The iPhone Developer's Cookbook". She does not waste your time with details you are not interested in and comes straight to the point. Just reading the first section of the book is already pretty cool.&lt;br /&gt;You want to have this book if you want to dive into any form of Cocoa interface programming.&lt;br /&gt;The last book I want to recommend is Andrew M. Duncan's "Objective-C Pocket Reference". I usually have it lying on my desk in case I stumble over Objective-C syntax. If you are a C/C++ programmer you probably do not need more than this. There are also Objective-C tutorials on the iPhone developer website and on the general Apple website.&lt;br /&gt;&lt;br /&gt;If you have any other tips that I can add to the website, I will mention them with your name.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;Update: PpluX sent me the following link:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.pplux.com/2008/11/24/the-return-to-the-dark-side/"&gt;http://www.pplux.com/2008/11/24/the-return-to-the-dark-side/&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;He describes here how he disables deep sleep mode and modifies the usage of spaces.&lt;br /&gt;&lt;br /&gt;The next iP* programming tip will be more programming related ... 
I promise :-)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6180298719615545145?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6180298719615545145/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6180298719615545145' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6180298719615545145'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6180298719615545145'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/12/ip-programming-tip-2.html' title='iP* programming tip #2'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2456206633963702932</id><published>2008-11-30T16:32:00.000-08:00</published><updated>2008-11-30T16:41:54.801-08:00</updated><title type='text'>iP* programming tip #1</title><content type='html'>This is the first of a series of iPhone / iPod programming tips.&lt;br /&gt;Starting iPhone development requires first the knowledge of the underlying hardware and what it can do for you. Here are the latest hardware specs I am aware of (a rumour was talking about iPods that run the CPU with 532 MHz ... 
I haven't found any evidence for this).&lt;br /&gt;&lt;ul&gt;&lt;li&gt;GPU: PowerVR MBXLite with VGPLite at 103 MHz&lt;/li&gt;&lt;li&gt;~DX8 hardware with vs_1_1 and ps_1_1 functionality&lt;/li&gt;&lt;li&gt;Vertex shader is not exposed &lt;/li&gt;&lt;li&gt;Pixel shader is programmed with texture combiners&lt;/li&gt;&lt;li&gt;16 MB VRAM – not mentioned anywhere&lt;/li&gt;&lt;li&gt;CPU: ARM 1176 at 412 MHz (can do 600 MHz)&lt;/li&gt;&lt;li&gt;VFP unit 128-bit Multimedia unit ~= SIMD unit&lt;/li&gt;&lt;li&gt;128 MB RAM; only 24 MB for apps allowed&lt;/li&gt;&lt;li&gt;320x480 px at 163 ppi screen&lt;/li&gt;&lt;li&gt;LIS302DL, a 3-axis accelerometer with 412 MHz (?) update rate&lt;/li&gt;&lt;li&gt;Multi-Touch: up to five fingers&lt;br /&gt;&lt;/li&gt;&lt;li&gt;PVRTC texture compression: color map 2-bit per pixel and normal map 4-bit per-pixel&lt;/li&gt;&lt;/ul&gt;The interesting part is that the CPU can do up to 600 MHz, so it would be possible to increase the performance here in the future.&lt;br /&gt;I wonder how the 16 MB of VRAM is handled. I assume that this is the place where VBOs and textures are stored. Regarding the 24 MB memory limit for apps: I wonder what happens if an application generates geometry and textures dynamically ... when does the sandbox of the iPhone / iPod touch stop the application? 
I did not find any evidence for this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2456206633963702932?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2456206633963702932/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2456206633963702932' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2456206633963702932'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2456206633963702932'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/11/ip-programming-tip-1.html' title='iP* programming tip #1'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2249894271570221681</id><published>2008-11-30T16:18:00.000-08:00</published><updated>2008-11-30T16:31:18.994-08:00</updated><title type='text'>WARP - Running DX10 and DX11 Games on CPUs</title><content type='html'>As an MVP I was involved in testing this new Windows Advanced Rasterization Platform. 
They just published the first numbers&lt;br /&gt;&lt;br /&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/dd285359.aspx"&gt;http://msdn.microsoft.com/en-us/library/dd285359.aspx&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Running Crysis on an 8-core CPU with a resolution of 800x600 at 7.2 fps is an achievement. If this were hand-optimized very well, it would be a great target to write code for. 4 - 8 cores will be a common target platform in the next two years. Because it can be switched off if there is a GPU, this is a perfect target for game developers. What this means is that you can write a game with the DirectX 10 API and not only target all the GPUs out there but also machines without a GPU ... this is one of the best developments for the PC market in a long time. I am excited!&lt;br /&gt;&lt;br /&gt;The other interesting consequence of this development is: if Intel's "Bread &amp;amp; Butter" chips run games with the most important game API, it would be a good idea for Intel to put a bunch of engineers behind this and optimize WARP (in case they haven't already done so). This is the big game market consisting of games like "The Sims" and "World of Warcraft" and similar games that we are talking about here. 
The high-end PC gaming market is much smaller.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2249894271570221681?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2249894271570221681/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2249894271570221681' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2249894271570221681'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2249894271570221681'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/11/warp-running-dx10-and-dx11-games-on.html' title='WARP - Running DX10 and DX11 Games on CPUs'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7527893703750196380</id><published>2008-11-06T07:17:00.000-08:00</published><updated>2008-11-06T07:31:21.577-08:00</updated><title type='text'>iPhone ARM VFP code</title><content type='html'>The iPhone has a kind of SIMD unit. It is called the VFP unit and it is pretty hard to figure out how to program it. Here is a place where you will soon find lots of VFP asm code. 
&lt;div&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/vfpmathlibrary/"&gt;http://code.google.com/p/vfpmathlibrary/&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With help from Matthias Grundmann I wrote my first piece of VFP code. Here it is: &lt;p&gt;&lt;code&gt;&lt;span class="Apple-style-span"  style="font-size:x-small;"&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;void MatrixMultiplyF(&lt;br /&gt;MATRIXf &amp;amp;mOut,&lt;br /&gt;const MATRIXf &amp;amp;mA,&lt;br /&gt;const MATRIXf &amp;amp;mB)&lt;br /&gt;{&lt;br /&gt;#if 0&lt;br /&gt;MATRIXf mRet;&lt;br /&gt;&lt;br /&gt;/* Perform calculation on a dummy matrix (mRet) */&lt;br /&gt;mRet.f[ 0] = mA.f[ 0]*mB.f[ 0] + mA.f[ 1]*mB.f[ 4] + mA.f[ 2]*mB.f[ 8] + mA.f[ 3]*mB.f[12];&lt;br /&gt;mRet.f[ 1] = mA.f[ 0]*mB.f[ 1] + mA.f[ 1]*mB.f[ 5] + mA.f[ 2]*mB.f[ 9] + mA.f[ 3]*mB.f[13];&lt;br /&gt;mRet.f[ 2] = mA.f[ 0]*mB.f[ 2] + mA.f[ 1]*mB.f[ 6] + mA.f[ 2]*mB.f[10] + mA.f[ 3]*mB.f[14];&lt;br /&gt;mRet.f[ 3] = mA.f[ 0]*mB.f[ 3] + mA.f[ 1]*mB.f[ 7] + mA.f[ 2]*mB.f[11] + mA.f[ 3]*mB.f[15];&lt;br /&gt;&lt;br /&gt;mRet.f[ 4] = mA.f[ 4]*mB.f[ 0] + mA.f[ 5]*mB.f[ 4] + mA.f[ 6]*mB.f[ 8] + mA.f[ 7]*mB.f[12];&lt;br /&gt;mRet.f[ 5] = mA.f[ 4]*mB.f[ 1] + mA.f[ 5]*mB.f[ 5] + mA.f[ 6]*mB.f[ 9] + mA.f[ 7]*mB.f[13];&lt;br /&gt;mRet.f[ 6] = mA.f[ 4]*mB.f[ 2] + mA.f[ 5]*mB.f[ 6] + mA.f[ 6]*mB.f[10] + mA.f[ 7]*mB.f[14];&lt;br /&gt;mRet.f[ 7] = mA.f[ 4]*mB.f[ 3] + mA.f[ 5]*mB.f[ 7] + mA.f[ 6]*mB.f[11] + mA.f[ 7]*mB.f[15];&lt;br /&gt;&lt;br /&gt;mRet.f[ 8] = mA.f[ 8]*mB.f[ 0] + mA.f[ 9]*mB.f[ 4] + mA.f[10]*mB.f[ 8] + mA.f[11]*mB.f[12];&lt;br /&gt;mRet.f[ 9] = mA.f[ 8]*mB.f[ 1] + mA.f[ 9]*mB.f[ 5] + mA.f[10]*mB.f[ 9] + mA.f[11]*mB.f[13];&lt;br /&gt;mRet.f[10] = mA.f[ 8]*mB.f[ 2] + mA.f[ 9]*mB.f[ 6] + mA.f[10]*mB.f[10] + mA.f[11]*mB.f[14];&lt;br /&gt;mRet.f[11] = mA.f[ 8]*mB.f[ 3] + mA.f[ 9]*mB.f[ 7] + mA.f[10]*mB.f[11] + mA.f[11]*mB.f[15];&lt;br /&gt;&lt;br /&gt;mRet.f[12] = 
mA.f[12]*mB.f[ 0] + mA.f[13]*mB.f[ 4] + mA.f[14]*mB.f[ 8] + mA.f[15]*mB.f[12];&lt;br /&gt;mRet.f[13] = mA.f[12]*mB.f[ 1] + mA.f[13]*mB.f[ 5] + mA.f[14]*mB.f[ 9] + mA.f[15]*mB.f[13];&lt;br /&gt;mRet.f[14] = mA.f[12]*mB.f[ 2] + mA.f[13]*mB.f[ 6] + mA.f[14]*mB.f[10] + mA.f[15]*mB.f[14];&lt;br /&gt;mRet.f[15] = mA.f[12]*mB.f[ 3] + mA.f[13]*mB.f[ 7] + mA.f[14]*mB.f[11] + mA.f[15]*mB.f[15];&lt;br /&gt;&lt;br /&gt;/* Copy result into mOut */&lt;br /&gt;mOut = mRet;&lt;br /&gt;#else&lt;br /&gt;#if (TARGET_CPU_ARM)&lt;br /&gt;const float* src_ptr1 = &amp;amp;mA.f[0];&lt;br /&gt;const float* src_ptr2 = &amp;amp;mB.f[0];&lt;br /&gt;float* dst_ptr = &amp;amp;mOut.f[0];&lt;br /&gt;&lt;br /&gt;asm volatile(&lt;br /&gt;// switch on ARM mode&lt;br /&gt;// involves unconditional jump and mode switch (opcode bx)&lt;br /&gt;// the lowest bit in the address signals whether ARM (bit cleared)&lt;br /&gt;// or Thumb should be selected (bit set)&lt;br /&gt;".align 4 \n\t"&lt;br /&gt;"mov r0, pc \n\t"&lt;br /&gt;"bx r0 \n\t"&lt;br /&gt;".arm \n\t"&lt;br /&gt;&lt;br /&gt;// set vector length to 4&lt;br /&gt;// example fadds s8, s8, s16 means that the content s8 - s11&lt;br /&gt;// is added to s16 - s19 and stored in s8 - s11&lt;br /&gt;"fmrx r0, fpscr \n\t" // loads fpscr status reg to r0&lt;br /&gt;"bic r0, r0, #0x00370000 \n\t" // bit clear stride and length&lt;br /&gt;"orr r0, r0, #0x00030000 \n\t" // set length to 4 (11)&lt;br /&gt;"fmxr fpscr, r0 \n\t" // upload r0 to fpscr&lt;br /&gt;// Note: this stalls the FPU&lt;br /&gt;&lt;br /&gt;// result[0][1][2][3] = mA.f[0][0][0][0] * mB.f[0][1][2][3]&lt;br /&gt;// result[0][1][2][3] = result + mA.f[1][1][1][1] * mB.f[4][5][6][7]&lt;br /&gt;// result[0][1][2][3] = result + mA.f[2][2][2][2] * mB.f[8][9][10][11]&lt;br /&gt;// result[0][1][2][3] = result + mA.f[3][3][3][3] * mB.f[12][13][14][15]&lt;br /&gt;// s0 - s31&lt;br /&gt;// if Fd == s0 - s7 -&gt; treated as scalar; all others are treated as vectors&lt;br /&gt;// load the whole matrix 
into VFP registers - transposed -&gt; second operand first&lt;br /&gt;"fldmias %2, {s8-s23} \n\t"&lt;br /&gt;// load first column to scalar bank&lt;br /&gt;"fldmias %1!, {s0 - s3} \n\t"&lt;br /&gt;// first column times matrix&lt;br /&gt;"fmuls s24, s8, s0 \n\t"&lt;br /&gt;"fmacs s24, s12, s1 \n\t"&lt;br /&gt;"fmacs s24, s16, s2 \n\t"&lt;br /&gt;"fmacs s24, s20, s3 \n\t"&lt;br /&gt;// save first column&lt;br /&gt;"fstmias %0!, {s24-s27} \n\t"&lt;br /&gt;&lt;br /&gt;// load second column to scalar bank&lt;br /&gt;"fldmias %1!, {s4-s7} \n\t"&lt;br /&gt;// second column times matrix&lt;br /&gt;"fmuls s28, s8, s4 \n\t"&lt;br /&gt;"fmacs s28, s12, s5 \n\t"&lt;br /&gt;"fmacs s28, s16, s6 \n\t"&lt;br /&gt;"fmacs s28, s20, s7 \n\t"&lt;br /&gt;// save second column&lt;br /&gt;"fstmias %0!, {s28-s31} \n\t"&lt;br /&gt;&lt;br /&gt;// load third column to scalar bank&lt;br /&gt;"fldmias %1!, {s0-s3} \n\t"&lt;br /&gt;// third column times matrix&lt;br /&gt;"fmuls s24, s8, s0 \n\t"&lt;br /&gt;"fmacs s24, s12, s1 \n\t"&lt;br /&gt;"fmacs s24, s16, s2 \n\t"&lt;br /&gt;"fmacs s24, s20, s3 \n\t"&lt;br /&gt;// save third column&lt;br /&gt;"fstmias %0!, {s24-s27} \n\t"&lt;br /&gt;&lt;br /&gt;// load fourth column to scalar bank&lt;br /&gt;"fldmias %1!, {s4-s7} \n\t"&lt;br /&gt;// fourth column times matrix&lt;br /&gt;"fmuls s28, s8, s4 \n\t"&lt;br /&gt;"fmacs s28, s12, s5 \n\t"&lt;br /&gt;"fmacs s28, s16, s6 \n\t"&lt;br /&gt;"fmacs s28, s20, s7 \n\t"&lt;br /&gt;// save fourth column&lt;br /&gt;"fstmias %0!, {s28-s31} \n\t"&lt;br /&gt;&lt;br /&gt;// reset vector length to 1&lt;br /&gt;"fmrx r0, fpscr \n\t" // load the fpscr status register into r0&lt;br /&gt;"bic r0, r0, #0x00370000 \n\t" // bit clear stride and length&lt;br /&gt;"fmxr fpscr, r0 \n\t" // write r0 back to fpscr&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;// switch to Thumb mode&lt;br /&gt;// lower bit of destination is set to 1&lt;br /&gt;"add r0, pc, #1 \n\t"&lt;br /&gt;"bx r0 \n\t"&lt;br /&gt;".thumb \n\t"&lt;br /&gt;&lt;br /&gt;// binds variables to 
registers&lt;br /&gt;: "=r" (dst_ptr), "=r" (src_ptr1), "=r" (src_ptr2)&lt;br /&gt;: "0" (dst_ptr), "1" (src_ptr1), "2" (src_ptr2)&lt;br /&gt;: "r0"&lt;br /&gt;);&lt;br /&gt;#endif&lt;br /&gt;#endif&lt;br /&gt;} &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/code&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7527893703750196380?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7527893703750196380/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7527893703750196380' title='23 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7527893703750196380'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7527893703750196380'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/11/iphone-arm-vfp-code.html' title='iPhone ARM VFP code'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>23</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7704436545539805256</id><published>2008-10-20T14:02:00.000-07:00</published><updated>2008-10-20T14:04:13.591-07:00</updated><title type='text'>Midnight Club: Los 
Angeles</title><content type='html'>Tomorrow is the day. Midnight Club Los Angeles will launch tomorrow. This is the third game I worked on for Rockstar. If you are into racing games you need to check it out :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7704436545539805256?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7704436545539805256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7704436545539805256' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7704436545539805256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7704436545539805256'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/10/midnight-club-los-angeles.html' title='Midnight Club: Los Angeles'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7418859464831293267</id><published>2008-10-16T09:35:00.000-07:00</published><updated>2008-10-16T09:45:25.515-07:00</updated><title type='text'>Hardware GPU / SPU / CPU</title><content type='html'>I follow all the discussions about the future of game hardware with talks about Larrabee and GPUs and the death of 3D APIs and -depending on the view point- different hardware 
designs. &lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;div&gt;The way I see it, all this is quite interesting and inspiring, but our cycles of change in computer graphics and graphics programming are pretty long. Most of the stuff we do is based on research papers that were released more than 30 years ago and written on typewriters.&lt;/div&gt;&lt;div&gt;Why should any new piece of hardware change all this in a very short amount of time?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is a game market out there that grows at double-digit percentage rates on all kinds of hardware. How much of this market and its growth would be influenced by any new hardware?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Some of the most widely distributed game hardware is pretty old and, by most standards, underpowered. Nevertheless, it offers entertainment that people enjoy.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So how important is it whether we program a CPU/SPU/GPU or whatever we call the next thing? 
Give me a washing machine with a display and I make an entertainment machine with robo rumble out of it.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7418859464831293267?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7418859464831293267/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7418859464831293267' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7418859464831293267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7418859464831293267'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/10/hardware-gpu-spu-cpu.html' title='Hardware GPU / SPU / CPU'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8076600201702914301</id><published>2008-10-02T20:35:00.000-07:00</published><updated>2008-10-03T05:23:10.301-07:00</updated><title type='text'>S3 Graphics Chrome 440 GTX</title><content type='html'>I bought a new S3 Chrome 440 GTX in the S3 online store. I wanted to know how this card is doing, especially because it is DirectX 10.1 compatible. The other reason why I bought it was that it has a HDMI output. 
Just putting it into my desktop machine was interesting. I removed an 8800 GTS, which was really heavy, and then put in this card, which is so small and doesn't even need an extra power supply. It looks like some of my graphics cards from the end of the 90s, when they started to put fans on the cards. Given how small the fan is, it should be easy to cool the card passively.&lt;br /&gt;&lt;br /&gt;I just went through the DirectX 10 SDK examples. Motion Blur runs at about 5.8 fps and NBodyGravity at about 1.8 fps. The instancing example runs at 11.90 fps. I use the VISTA 64-bit beta drivers 7.15.12.0217-18.05.03. The other examples run fast enough. The CPU does not seem to become overly busy.&lt;br /&gt;Just saw that there is a newer driver. The latest driver, which is WHQL'ed, has the version number 248. With it, the motion blur example runs at 6.3 fps with some artefacts (the beta driver had those as well), Instancing runs at 11.77 fps and the NBodyGravity example at 1.83 fps ... probably not an accurate way to measure this stuff at all, but at least it gives a rough idea.&lt;br /&gt;&lt;br /&gt;The integrated INTEL chip 4500 MHD in my notebook is slower than this, but then it at least supports DX10 and the notebook is super light :-) ... for development it just depends for me on the feature support (most of the time I prototype effects on PCs).&lt;br /&gt;While playing around with the two chipsets I just found out that the mobile INTEL chip also runs the new DirectX 10.1 SDK example Depth of Field at more than 20 fps. This is quite impressive. The Chrome 440 GTX runs this example at more than 100 fps. The new Raycast Terrain example runs at 19.6 fps on the Chrome and at less than 7.6 fps on the mobile INTEL chipset. The one example that does not run on the mobile INTEL chip is the ProceduralMaterial example. 
It runs at less than 1 fps on the Chrome 440 GTX.&lt;br /&gt;Nevertheless, it seems like both companies did their homework with the DirectX SDK.&lt;br /&gt;So I just ran a bunch of ShaderX7 example programs against the cards. While the INTEL mobile chip shows errors in some of the DirectX 9 examples and crashes in some of the DirectX 10 stuff, the Chrome seems to even take the DirectX 10.1 examples that I have, which usually only run on ATI hardware ... nice!&lt;br /&gt;One thing that I hadn't thought of is GLSL support. I thought that only ATI and NVIDIA had GLSL support, but S3 seems to have it as well. INTEL's mobile chip does not have it so ...&lt;br /&gt;&lt;br /&gt;I will try out the Futuremark 3DMark Vantage benchmark. It seems a Chrome 400 Series is in there with a score of 222. Probably not too bad, considering that they probably do not pay Futuremark for being a member of their program.&lt;br /&gt;Update October 4th: the S3 Chrome 440 GTX got 340 as the Graphics score in the trial version of 3DMark Vantage.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8076600201702914301?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8076600201702914301/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8076600201702914301' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8076600201702914301'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8076600201702914301'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/10/s3-graphics-chrome-440-gtx.html' 
title='S3 Graphics Chrome 440 GTX'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3092376272284919577</id><published>2008-10-01T16:06:00.000-07:00</published><updated>2008-10-01T16:14:51.135-07:00</updated><title type='text'>Old Interview</title><content type='html'>Just bumped into an old interview I gave to Gamedev.net. I still think everything in there is valid.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://www.gamedev.net/reference/business/features/wolfgang/wolfgang.asp"&gt;http://www.gamedev.net/reference/business/features/wolfgang/wolfgang.asp&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While reading it I thought it was kind of boring. Many of my answers are so obvious ... maybe this is just my perception. How can you make it into the game industry? Probably the same way you can make it into any industry: lots of education, or luck, or just being in the right place at the right time, and then being creative, a good thinker, etc. There is no magic trick, I think ... 
it all comes with lots of sweat.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3092376272284919577?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3092376272284919577/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3092376272284919577' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3092376272284919577'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3092376272284919577'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/10/old-interview.html' title='Old Interview'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-8076358883483826567</id><published>2008-09-30T18:20:00.000-07:00</published><updated>2008-09-30T18:34:31.987-07:00</updated><title type='text'>64-bit VISTA Tricks</title><content type='html'>I got a new notebook today with 64-bit VISTA pre-installed. It will replace a Desktop that had 64-bit VISTA on there. 
My friend Andy Firth provided me with the following tricks to make my life easier (it has a 64 GB solid state in there, so no hard-drive optimizations):&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Switch Off User Account Control&lt;/div&gt;&lt;div&gt;This gets rid of the on-going "are you sure" questions.&lt;/div&gt;&lt;div&gt;Go to Control Panel. Click on User Account and switch it off.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Disable Superfetch&lt;/div&gt;&lt;div&gt;Press Windows key + R. Start services.msc and scroll down until you find Superfetch. Double click on it and change the startup type to Disabled.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-8076358883483826567?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/8076358883483826567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=8076358883483826567' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8076358883483826567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/8076358883483826567'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/64-bit-vista-tricks.html' title='64-bit VISTA Tricks'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5632530319554432608</id><published>2008-09-28T09:46:00.000-07:00</published><updated>2008-09-28T10:02:28.356-07:00</updated><title type='text'>Light Pre-Pass: More Blood</title><content type='html'>I spent some more time with the Light Pre-Pass renderer. Here are my assumptions:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family: courier new;"&gt;N.H^n = (N.L * N.H^n * Att) / (N.L * Att)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This division happens in the forward rendering path. The light source has its own shininess value in there: the exponent n. With the specular component extracted, I can apply the material shininess value mn like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family: courier new;"&gt;(N.H^n)^mn&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Then I can re-construct the Blinn-Phong lighting equation. The data stored in the Light Buffer is treated like one light source. As a reminder, the first three channels of the Light Buffer hold:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family: courier new;"&gt;N.L * Att * DiffuseColor&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: courier new;"&gt;Color = Ambient + (LightBuffer.rgb * MatDiffInt) + MatSpecInt * (N.H^n)^mn * N.L * Att&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So how could I do this? :-)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family: courier new;"&gt;N.H^n = (N.L * N.H^n * Att) / (N.L * Att)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;N.L * Att is not in any channel of the Light Buffer. How can I get this? The trick here is to convert the first three channels of the Light Buffer to luminance. 
The value should be pretty close to N.L * Att.&lt;br /&gt;This also opens up a bunch of ideas for different materials. Every time you need the N.L * Att term you replace it with luminance. This should give you a wide range of materials.&lt;br /&gt;The results I get are very exciting. Here is a list of advantages over a Deferred Renderer:&lt;br /&gt;- less cost per light (you calculate much less in the Light pass)&lt;br /&gt;- easier MSAA&lt;br /&gt;- more material variety&lt;br /&gt;- less read memory bandwidth -&gt; fetches only two instead of the four textures it takes in a Deferred Renderer&lt;br /&gt;- runs on hardware without ps_3_0 and MRT -&gt; runs on DX8.1 hardware&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5632530319554432608?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5632530319554432608/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5632530319554432608' title='30 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5632530319554432608'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5632530319554432608'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/light-pre-pass-more-blood.html' title='Light Pre-Pass: More Blood'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>30</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5334309980957477800</id><published>2008-09-21T12:10:00.000-07:00</published><updated>2008-09-21T12:12:42.009-07:00</updated><title type='text'>Shader Workflow - Why Shader Generators are Bad</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: 'courier new';"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;[quote]As far as I can tell from this discussion, no one has really proposed an alternative to shader permutations, merely they've been proposing ways of managing those permutations.[/quote]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;If you define shader permutations as having lots of small differences but using the same code, then you have to live with the fact that whatever is sent to the hardware is a full-blown shader, even if you have exactly the same skinning code in every other shader.&lt;br /&gt;So the end result is always the same ... whatever you do on the level above that.&lt;br /&gt;What I describe is a practical approach to handling shaders with a large amount of material variety and a good workflow.&lt;br /&gt;Shaders are some of the most expensive assets in terms of production value and the time spent by the programming team. They need to be the most highly optimized pieces of code we have, because it is much harder to squeeze performance out of a GPU than out of a CPU.&lt;br /&gt;Shader generators or a material editor (... or whatever you call it) are not an appropriate way to generate or handle shaders because they are hard to maintain, do not offer enough material variety, and are not very efficient, since it is hard to hand-optimize code that is generated on the fly.&lt;br /&gt;This is why developers do not use them and do not want to use them. 
It is possible that they play a role in indie or non-profit development, because those teams are constrained in money and time and do not have to compete in the AAA sector.&lt;br /&gt;In general, the basic mistake of people who think that ueber-shaders, material editors or shader generators make sense is that they do not understand how to program a graphics card. They assume it is similar to programming a CPU and therefore think they can generate code for those cards.&lt;br /&gt;It would make more sense to generate code on the fly for CPUs (... which also happens in the graphics card drivers) and in other places (real-time assemblers) than for GPUs, because GPUs do not have anything close to linear performance behaviour. The difference between a performance hotspot and a point where you did something wrong can be 1:1000 in time (according to a presentation by Matthias Wloka). You hand-optimize shaders to hit those hotspots, and you do it by analyzing the results provided by PIX and other tools to find out where the performance hotspot of the shader is.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5334309980957477800?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5334309980957477800/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5334309980957477800' title='21 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5334309980957477800'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5334309980957477800'/><link rel='alternate' type='text/html' 
href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/shader-workflow-why-shader-generators.html' title='Shader Workflow - Why Shader Generators are Bad'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2003705697639121896</id><published>2008-09-18T12:16:00.000-07:00</published><updated>2008-09-18T12:18:59.565-07:00</updated><title type='text'>ARM VFP ASM development</title><content type='html'>Following Matthias Grundmann's invitation to join forces, I set up a Google Code repository for this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/vfpmathlibrary/"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The idea is to have a math library that is optimized for the VFP unit of an ARM processor. 
This should be useful on the iPhone / iPod touch.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2003705697639121896?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2003705697639121896/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2003705697639121896' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2003705697639121896'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2003705697639121896'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/arm-vfp-asm-development.html' title='ARM VFP ASM development'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-991214721257782829</id><published>2008-09-12T10:40:00.000-07:00</published><updated>2008-09-17T10:54:48.861-07:00</updated><title type='text'>More Mobile Development</title><content type='html'>Now that I had so much fun with the iPhone I am thinking about new challenges in the mobile phone development area. The Touch HD looks like a cool target. It has a DX8-class ATI graphics card in there. 
Probably on par with the iPhone graphics card and you can program it in C/C++ which is important for the performance.&lt;br /&gt;Depending on how easy it will be to get Oolong running on this I will extend Oolong to support this platform as well.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-991214721257782829?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/991214721257782829/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=991214721257782829' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/991214721257782829'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/991214721257782829'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/iphone-ipod-touch-input-that-maps-on.html' title='More Mobile Development'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6816656782574061927</id><published>2008-09-10T12:26:00.000-07:00</published><updated>2008-09-10T12:28:00.138-07:00</updated><title type='text'>Shader Workflow</title><content type='html'>I just posted a forum message about what I consider an ideal shader workflow in a team. 
I thought I would share it here:&lt;br /&gt;&lt;br /&gt;Setting up a good shader workflow is easy. You just set up a folder called shaderlib, then you set up a folder called shader. In shaderlib there are files like lighting.fxh, utility.fxh, normals.fxh, skinning.fxh etc., and in the shader directory there are files like metal.fx, skin.fx, stone.fx, eyelashes.fx, eyes.fx. In each of those *.fx files there is a technique for whatever special state you need. You might have techniques in there like lit, depthwrite etc.&lt;br /&gt;All the "intelligence" is in the shaderlib directory in the *.fxh files. The fx files just stitch together function calls. The HLSL compiler resolves those function calls by inlining the code.&lt;br /&gt;So it is easy to just send someone the shaderlib directory with all the files in there and share your shader code this way.&lt;br /&gt;In the lighting.fxh include file you will have all kinds of lighting models like Ashikhmin-Shirley, Cook-Torrance or Oren-Nayar and obviously Blinn-Phong, or just a different BRDF that can mimic a certain material especially well. In normals.fxh you have routines that can fetch normals in different ways and unpack them. Obviously all the DXT5 and DXT1 tricks are in there, but also routines that let you fetch height data to generate normals from it. In utility.fxh you have support for different color spaces and special optimizations for different platforms, like special texture fetches etc. In skinning.fxh you have all code related to skinning and animation ... etc.&lt;br /&gt;If you give this library to a graphics programmer, he obviously has to put together the shader on his own, but he can start by looking at what is requested and try different approaches to see what fits best for the job. 
He does not have to come up with ways on how to generate a normal from height or color data or how to deal with different color spaces.&lt;br /&gt;For a good, efficient and high quality workflow in a game team, this is what you want.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6816656782574061927?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6816656782574061927/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6816656782574061927' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6816656782574061927'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6816656782574061927'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/shader-workflow.html' title='Shader Workflow'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6898646628118198725</id><published>2008-09-09T13:07:00.000-07:00</published><updated>2008-09-09T16:30:55.637-07:00</updated><title type='text'>Calculating Screen-Space Texture Coordinates for the 2D Projection of a Volume</title><content type='html'>Calculating screen space texture coordinates for the 2D projection of a volume is more complicated than for an already 
transformed full-screen quad. Here is a step-by-step approach to achieve this:&lt;br /&gt;&lt;br /&gt;1. Transforming the position into projection space is done in the vertex shader by multiplying it by the concatenated World-View-Projection matrix.&lt;br /&gt;&lt;br /&gt;2. The Direct3D run-time will now divide those values by Z, which is stored in the W component. The resulting position is then considered to be in clipping space, where the x and y values are clipped to the [-1.0, 1.0] range.&lt;br /&gt;&lt;br /&gt;xclipspace = xproj / wproj&lt;br /&gt;yclipspace = yproj / wproj&lt;br /&gt;&lt;br /&gt;3. Then the Direct3D run-time transforms the position into viewport space, from the value range [-1.0, 1.0] to the range [0.0, ScreenWidth] for x and [0.0, ScreenHeight] for y.&lt;br /&gt;&lt;br /&gt;xviewport = xclipspace * ScreenWidth / 2 + ScreenWidth / 2&lt;br /&gt;yviewport = -yclipspace * ScreenHeight / 2 + ScreenHeight / 2&lt;br /&gt;&lt;br /&gt;This can be simplified to:&lt;br /&gt;&lt;br /&gt;xviewport = (xclipspace + 1.0) * ScreenWidth / 2&lt;br /&gt;yviewport = (1.0 - yclipspace )  * ScreenHeight / 2&lt;br /&gt;&lt;br /&gt;The result represents the position on the screen. The y component needs to be inverted because in world / view / projection space it increases in the opposite direction to screen coordinates.&lt;br /&gt;&lt;br /&gt;4. Because the result should be in texture space and not in screen space, the coordinates need to be transformed from clipping space to texture space. In other words from the range [-1.0, 1.0] to the range [0.0, 1.0].&lt;br /&gt;&lt;br /&gt;u = (xclipspace + 1.0) * 1 / 2&lt;br /&gt;v = (1.0 - yclipspace )  * 1 / 2&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;5. 
Due to the texturing algorithm used by Direct3D, we need to adjust texture coordinates by half a texel:&lt;br /&gt;&lt;br /&gt;u = (xclipspace + 1.0) * ½ + ½ / TargetWidth&lt;br /&gt;v = (1.0 - yclipspace )  * ½ + ½ / TargetHeight&lt;br /&gt;&lt;br /&gt;Plugging in the x and y clip-space coordinates from step 2:&lt;br /&gt;&lt;br /&gt;u = (xproj / wproj + 1.0) * ½ + ½ / TargetWidth&lt;br /&gt;v = (1.0 - yproj / wproj )  * ½ + ½ / TargetHeight&lt;br /&gt;&lt;br /&gt;6. Because most of this calculation should happen in the vertex shader, the results are sent down through the texture coordinate interpolator registers. Interpolating 1 / wproj is not the same as 1 / interpolated wproj. Therefore the term 1 / wproj needs to be extracted and applied in the pixel shader.&lt;br /&gt;&lt;br /&gt;u = 1 / wproj * ((xproj + wproj) * ½ + wproj * ½ / TargetWidth)&lt;br /&gt;v = 1 / wproj * ((wproj - yproj) * ½ + wproj * ½ / TargetHeight)&lt;br /&gt;&lt;br /&gt;The vertex shader source code looks like this (inScreenDim.xy is assumed here to hold 1 / TargetWidth and 1 / TargetHeight):&lt;br /&gt;&lt;br /&gt;float4 vPos = float4(0.5 * (float2(p.x + p.w, p.w - p.y) + p.w * inScreenDim.xy), p.zw);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The equation without the half pixel offset would start at No. 
4 like this:&lt;br /&gt;&lt;br /&gt;u = (xclipspace + 1.0) * 1 / 2&lt;br /&gt;v = (1.0 - yclipspace )  * 1 / 2&lt;br /&gt;&lt;br /&gt;Plugging in the x and y clip-space coordinates from step 2:&lt;br /&gt;&lt;br /&gt;u = (xproj / wproj + 1.0) * ½ &lt;br /&gt;v = (1.0 - yproj / wproj )  * ½ &lt;br /&gt;&lt;br /&gt;Moving 1 / wproj to the front leads to:&lt;br /&gt;&lt;br /&gt;u = 1 / wproj * ((xproj + wproj) * ½)&lt;br /&gt;v = 1 / wproj * ((wproj - yproj)  * ½)&lt;br /&gt;&lt;br /&gt;Because the pixel shader is doing the 1 / wproj, this would lead to the following vertex shader code:&lt;br /&gt;&lt;br /&gt;float4 vPos = float4(0.5 * (float2(p.x + p.w, p.w - p.y)), p.zw);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;All this is based on a response by mikaelc in the following thread:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.gamedev.net/community/forums/topic.asp?topic_id=482654"&gt;Lighting in a Deferred Renderer&lt;/a&gt; and a response by Frank Puig Placeres in the following thread:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.gamedev.net/community/forums/topic.asp?topic_id=506573"&gt;Reconstructing Position from Depth Data&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6898646628118198725?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6898646628118198725/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6898646628118198725' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6898646628118198725'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6898646628118198725'/><link 
rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/calculating-screen-space-texture.html' title='Calculating Screen-Space Texture Coordinates for the 2D Projection of a Volume'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6870099530983508072</id><published>2008-09-07T09:44:00.000-07:00</published><updated>2008-09-09T07:28:55.086-07:00</updated><title type='text'>Gauss Filter Kernel</title><content type='html'>Just found a good tutorial on how to set up a Gauss filter kernel here:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://prideout.net/bloom/index.php"&gt;OpenGL Bloom Tutorial&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The interesting part is that he shows a way to generate the offset values and he also mentions a trick that I have been using for a long time. He reduces the filter kernel size by utilizing the hardware linear filtering. So he can go down from 5 to 3 taps. I usually use bilinear filtering to go down from 9 to 4 taps or 25 to 16 taps (with non-separable filter kernels) ... you get the idea.&lt;br /&gt;&lt;br /&gt;Eric Haines just reminded me of the fact that this is also described in ShaderX2 - Tips and Tricks on page 451. You can find the -now free- book at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.gamedev.net/reference/programming/features/shaderx2/Tips_and_Tricks_with_DirectX_9.pdf"&gt;http://www.gamedev.net/reference/programming/features/shaderx2/Tips_and_Tricks_with_DirectX_9.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;BTW: Eric Haines contacted all the authors of this book to get permission to make it "open source". 
I would like to thank him for this.&lt;br /&gt;Check out his blog at&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.realtimerendering.com/blog/"&gt;http://www.realtimerendering.com/blog/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6870099530983508072?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6870099530983508072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6870099530983508072' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6870099530983508072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6870099530983508072'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/09/gauss-filter-kernel.html' title='Gauss Filter Kernel'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4603652014380269070</id><published>2008-08-18T14:11:00.000-07:00</published><updated>2008-08-18T16:51:59.173-07:00</updated><title type='text'>Beyond Programmable Shading</title><content type='html'>I was on SIGGRAPH to attend the "Beyond Programmable Shading" day. 
I spent the whole morning there and left during the last talk in the morning.&lt;br /&gt;Here is the URL for the Larrabee day:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://s08.idav.ucdavis.edu/"&gt;http://s08.idav.ucdavis.edu/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The talks are quite inspiring. I was hoping to see actual Larrabee hardware in action but they did not have any.&lt;br /&gt;I liked Chas Boyd's DirectX 11 talk because he made it clear that there are different software designs for different applications. Having looked into DirectX 11 for a while now, it seems like there is a great API coming up soon that solves some of the outstanding issues we had with DirectX 9 (DirectX 10 will probably be skipped by many in the industry).&lt;br /&gt;&lt;br /&gt;The other thing that impressed me is AMD's CAL. The source code looks very elegant for the amount of performance you can unlock with it. Together with Brook+ it lets you control a huge number of cards. It seems like CUDA will soon be able to handle many GPUs at once more easily too. PostFX are a good candidate for those APIs. CAL and CUDA can live in harmony with DirectX 9/10 and DirectX 11 will even have a compute shader model that is the equivalent of CAL and CUDA. 
Compute shaders are written in HLSL … so a consistent environment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4603652014380269070?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4603652014380269070/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4603652014380269070' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4603652014380269070'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4603652014380269070'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/08/beyond-programmable-shading.html' title='Beyond Programmable Shading'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4999220427617857456</id><published>2008-07-31T14:46:00.001-07:00</published><updated>2008-07-31T14:49:06.663-07:00</updated><title type='text'>ARM Assembly</title><content type='html'>So I decided to increase my relationship to iPhone programming a bit and bought an ARM assembly book to learn how to program ARM assembly. The target is to figure out how to program the MMX like instruction set that comes with the processor. Then I would create a vectorized math library ... 
let's see how this goes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4999220427617857456?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4999220427617857456/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4999220427617857456' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4999220427617857456'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4999220427617857456'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/07/arm-assembly.html' title='ARM Assembly'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6108395575124364976</id><published>2008-07-29T06:15:00.000-07:00</published><updated>2008-12-29T10:49:05.487-08:00</updated><title type='text'>PostFX - The Nx-Gen Approach</title><content type='html'>More than three years ago I wrote a PostFX pipeline (with a large number of effects) that I constantly improved up until the beginning of last year (&lt;a href="http://www.coretechniques.info/"&gt;www.coretechniques.info&lt;/a&gt; .. look for the outline of algorithms in the PostFX talk from 2007). Now it shipped in a couple of games. 
So what is nx-gen here?&lt;br /&gt;On my main target platforms (360 and PS3) it will be hard to squeeze out more performance. There is probably lots of room in everything related to HDR but overall I wouldn't expect any fundamental changes. The main challenge with the pipeline was not on a technical level, but explaining to the artists how they can use it. The tone mapping functionality was especially hard to explain and it was also hard to give them a starting point to work from.&lt;br /&gt;So I am thinking about making it easier for the artists to use this pipeline. The main idea is to follow the camera paradigm. Most of the effects (HDR, Depth of Field, Motion Blur, color filters) of the pipeline are expected to mimic a real-world camera so why not make it work like a real-world camera?&lt;br /&gt;The idea is to only expose functionality that is usually exposed by a camera and name all the sliders accordingly. Furthermore there will be different camera models with different basic properties as a starting point for the artists. It should also be possible to just switch between those on the fly. So a whole group of properties changes at the flip of a switch. 
This should make it easier to use cameras for cut scenes etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6108395575124364976?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6108395575124364976/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6108395575124364976' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6108395575124364976'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6108395575124364976'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/07/postfx-nx-gen-approach.html' title='PostFX - The Nx-Gen Approach'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3153733496858398433</id><published>2008-07-29T06:00:00.000-07:00</published><updated>2008-07-29T06:10:45.347-07:00</updated><title type='text'>iPhone development - Oolong Engine</title><content type='html'>Just read that John Carmack likes the iPhone as a dev platform. That reminds me of the fact how I started my iPhone engine &lt;a href="http://www.oolongengine.com"&gt;Oolong Engine&lt;/a&gt; in September 2007. Initially I wanted to do some development for the Dreamcast. 
I got a Dreamcast devkit, a CD burner and all the manuals from friends to start with this. My idea behind all this was to do graphics demos on this platform because I was looking for a new challenge. When I had all the pieces together to start my Dreamcast graphics demo career, a friend told me the specs of the iPhone ... and it became obvious that this would be an even better target :-) ... at the time everyone assumed that Apple would never allow programming for this platform. This was exactly what I was looking for. What can be better than a restricted platform that can't be used by everyone, that I can even take with me and show to the geekiest of my friends :-)&lt;br /&gt;With some initial help from a friend (thank you Andrew :-)) I wrote the initial version of the Oolong engine and had lots of fun figuring out what is possible on the platform and what is not. Then at some point Steve Jobs surprised us with the announcement that there would be an SDK, and judging from Apple's history I believed that they probably would not allow developing games for the platform. &lt;br /&gt;So now that we have an official SDK I am surprised how my initial small scale geek project turned out :-) ... 
suddenly I am the maintainer of a small little engine that is used in several productions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3153733496858398433?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3153733496858398433/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3153733496858398433' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3153733496858398433'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3153733496858398433'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/07/iphone-development-oolong-engine.html' title='iPhone development - Oolong Engine'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-723458188478959497</id><published>2008-07-29T05:48:00.000-07:00</published><updated>2008-07-29T06:11:39.024-07:00</updated><title type='text'>Light Pre-Pass - First Blood :-)</title><content type='html'>I was looking for a simple way to deal with different specular values coming from different materials. It seems that one of the most obvious ways is the most efficient way to deal with this. 
If you are used to starting with a math equation first -as I am- it is not easy to see this solution.&lt;br /&gt;To recap: what ends up in the four channels of the light buffer for a point light is the following:&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;br /&gt;&lt;span style="font-size:78%;"&gt;&lt;span style="font-family: courier new;"&gt;Diffuse.r * N.L * Att | Diffuse.g * N.L * Att | Diffuse.b * N.L * Att | N.H^n * N.L * Att&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So n represents the shininess value of the light source. My original idea for applying different specular values later in the forward rendering pass was to divide by N.L * Att like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:78%;"&gt;&lt;span style="font-family: courier new;"&gt;(N.H^n * N.L * Att) / (N.L * Att)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This way I would have re-constructed the N.H^n term and I could easily do something like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:78%;"&gt;(N.H^n)^mn&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;where mn represents the material specular. Unfortunately this requires storing the N.L * Att term in a separate render target channel. The more obvious way to deal with it is to just do this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:78%;"&gt;&lt;span style="font-family: courier new;"&gt;(N.H^n * N.L * Att)^mn&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;... 
maybe not quite right but it looks good enough for what I would want to achieve.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-723458188478959497?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/723458188478959497/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=723458188478959497' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/723458188478959497'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/723458188478959497'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/07/light-pre-pass-first-blood.html' title='Light Pre-Pass - First Blood :-)'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-2160809260487635700</id><published>2008-06-13T09:07:00.000-07:00</published><updated>2008-06-15T08:07:06.824-07:00</updated><title type='text'>Stable Cascaded Shadow Maps</title><content type='html'>I really like Michal Valient's article "Stable Cascaded Shadow Maps". It is a very practical approach to make Cascaded Shadow Maps more stable.&lt;br /&gt;What I also like about it is the ShaderX idea. I wrote an article in ShaderX5 describing a first implementation (.... btw. 
I re-wrote that three times since then), Michal picks up from there and brings it to the next level.&lt;br /&gt;There will now be a ShaderX7 article in which I will describe a slight improvement to Michal's approach. Michal picks the right shadow map with a rather cool trick. Mine is a bit different but it might be more efficient. So what I do to pick the right map is send down the sphere that is constructed for the light view frustum. I then check if the pixel is in the sphere. If it is, I pick that shadow map; if it isn't, I go to the next sphere. I also early out if it is not in any sphere by returning white.&lt;br /&gt;At first sight it does not look like a trick but if you think about the spheres lined up along the view frustum and the way they intersect, it is actually pretty efficient and fast.&lt;br /&gt;On my target platforms, especially on the one that Michal likes a lot, this makes a difference.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-2160809260487635700?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/2160809260487635700/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=2160809260487635700' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2160809260487635700'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/2160809260487635700'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/06/stable-cascaded-shadow-maps.html' title='Stable Cascaded Shadow Maps'/><author><name>Wolfgang 
Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-1155256234458286409</id><published>2008-06-12T21:02:00.000-07:00</published><updated>2008-06-12T21:05:53.751-07:00</updated><title type='text'>Screen-Space Global Illumination</title><content type='html'>I am thinking a lot about Crytek's Screen-Space Ambient Occlusion (SSAO) and the idea of extending this into a global illumination term.&lt;br /&gt;When combined with a Light Pre-Pass renderer, there is the light buffer with all the N.L * Att values that can be used as intensity and then there is the end-result of opaque rendering pass and we have a normal map lying around. Doing the light bounce along the normal and using the N.L*Att entry in the light buffer as intensity should do the trick. 
The way the values are fetched would be similar to SSAO.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-1155256234458286409?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/1155256234458286409/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=1155256234458286409' title='22 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1155256234458286409'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1155256234458286409'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/06/screen-space-global-illumination.html' title='Screen-Space Global Illumination'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>22</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3242859505819163793</id><published>2008-05-28T06:46:00.000-07:00</published><updated>2008-05-31T09:47:09.666-07:00</updated><title type='text'>UCSD Talk on Light Pre-Pass Renderer</title><content type='html'>So the Light Pre-Pass renderer had its first public performance :-) ... I talked yesterday at UCSD about this new renderer design. There will be a ShaderX7 article as well.&lt;br /&gt;&lt;br /&gt;Pat Wilson from Garagegames is sharing his findings with me. 
He came up with an interesting way to store LUV colors.&lt;br /&gt;Renaldas Zioma told me that a similar idea was used in Battlezone 2.&lt;br /&gt;&lt;br /&gt;This is exciting :-)&lt;br /&gt;&lt;br /&gt;The link to the slides is at the end of the March 16th post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3242859505819163793?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3242859505819163793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3242859505819163793' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3242859505819163793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3242859505819163793'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/05/ucsd-talk-on-light-pre-pass-renderer.html' title='UCSD Talk on Light Pre-Pass Renderer'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7062399822693129757</id><published>2008-05-15T08:46:00.000-07:00</published><updated>2008-05-15T12:26:53.742-07:00</updated><title type='text'>DX 10 Graphics demo skeleton</title><content type='html'>&lt;span style="font-family: arial;font-family:arial;font-size:100%;"  &gt;I setup a google code 
website with one of my small little side projects that I worked on more than a year ago. To compete in graphics demo competitions you need a very small exe. I wanted to figure out how to do this with DX10 and this is the result :-) ... follow the link&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;a style="font-family: arial;" href="http://code.google.com/p/graphicsdemoskeleton/"&gt;http://code.google.com/p/graphicsdemoskeleton/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family: arial;font-size:100%;" class="normaltextarial" &gt;What is it: it is just a minimum skeleton to start creating your own small-size apps with DX10. At some point I had a particle system running in 1.5kb this way (that was with DX9). If you think about the concept of small exes there is one interesting thing I figured out. When I use DX9 and I compile HLSL shader code to a header file and include it to use it, it is smaller than the equivalent C code. So what I was thinking was: hey let's write a math library in HLSL and use the CPU only with the stub code to launch everything and let it run on the GPU :-) &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7062399822693129757?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7062399822693129757/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7062399822693129757' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7062399822693129757'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7062399822693129757'/><link 
rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/05/dx-10-graphics-demo-skeleton.html' title='DX 10 Graphics demo skeleton'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-1442933821920760626</id><published>2008-04-29T08:25:00.001-07:00</published><updated>2008-04-29T08:27:58.013-07:00</updated><title type='text'>Today is the day: GTA IV is released</title><content type='html'>&lt;span style="font-family:arial;"&gt;I am really excited about this. This is the second game I worked on for Rockstar and it is finally coming out ...&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-1442933821920760626?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/1442933821920760626/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=1442933821920760626' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1442933821920760626'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/1442933821920760626'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/04/today-is-day-gta-iv-is-coming-out.html' title='Today is the day: GTA IV is released'/><author><name>Wolfgang 
Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4023451548648633185</id><published>2008-04-21T10:19:00.000-07:00</published><updated>2008-04-21T10:26:51.755-07:00</updated><title type='text'>RGB -&gt; XYZ conversion</title><content type='html'>&lt;span style="font-family:arial;"&gt;Here is the official way to do it:&lt;/span&gt;&lt;br /&gt;http://www.w3.org/Graphics/Color/sRGB&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;They use&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;// 0.4124 0.3576 0.1805&lt;br /&gt;// 0.2126 0.7152 0.0722&lt;br /&gt;// 0.0193 0.1192 0.9505&lt;br /&gt;&lt;br /&gt;to convert from RGB to XYZ and&lt;br /&gt;&lt;br /&gt;//  3.2410 -1.5374 -0.4986&lt;br /&gt;// -0.9692  1.8760  0.0416&lt;br /&gt;//  0.0556 -0.2040  1.0570   &lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;to convert back.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Here is how I do it:&lt;/span&gt;&lt;br /&gt;const float3x3 RGB2XYZ = {0.5141364, 0.3238786,  0.16036376,&lt;br /&gt;                         0.265068,  0.67023428, 0.06409157,&lt;br /&gt;                         0.0241188, 0.1228178,  0.84442666};                               &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Here is how I convert back:&lt;/span&gt;&lt;br /&gt;const float3x3 XYZ2RGB  = { 2.5651,-1.1665,-0.3986,&lt;br /&gt;                          -1.0217, 1.9777, 0.0439,&lt;br /&gt;                           0.0753, -0.2543, 1.1892};&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;You 
should definitely try out different ways to do this :-)&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4023451548648633185?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4023451548648633185/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4023451548648633185' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4023451548648633185'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4023451548648633185'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/04/rgb-xyz-conversion.html' title='RGB -&gt; XYZ conversion'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-23040888989115134</id><published>2008-04-14T15:34:00.000-07:00</published><updated>2008-04-14T15:36:48.891-07:00</updated><title type='text'>Ported my iPhone Engine to OS 2.0</title><content type='html'>&lt;span style="font-family:arial;"&gt;I spent three days last week porting the Oolong engine over to the latest iPhone / iPod touch OS. 
&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;a href="http://www.oolongengine.com/"&gt;&lt;span style="font-family:arial;"&gt;http://www.oolongengine.com&lt;/span&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;My main development device is still an iPod touch because I am worried about not being able to make phone calls anymore.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-23040888989115134?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/23040888989115134/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=23040888989115134' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/23040888989115134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/23040888989115134'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/04/ported-my-iphone-engine-to-os-20.html' title='Ported my iPhone Engine to OS 2.0'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4984982669618601767</id><published>2008-04-08T12:10:00.000-07:00</published><updated>2008-04-08T12:13:13.491-07:00</updated><title type='text'>Accepted for the iPhone Developer Program</title><content type='html'>&lt;span style="font-family:arial;"&gt;Whooo I am finally accepted! I have access to the iPhone developer program. Now I can start to port my Oolong Engine over :-)&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4984982669618601767?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4984982669618601767/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4984982669618601767' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4984982669618601767'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4984982669618601767'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/04/accepted-for-iphone-developer-program.html' title='Accepted for the iPhone Developer Program'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7846427075034514245</id><published>2008-03-25T20:03:00.000-07:00</published><updated>2008-03-25T20:11:40.367-07:00</updated><title type='text'>Some Great Links</title><content type='html'>&lt;span style="font-family:arial;"&gt;I just came across some cool links today while looking for material that shows multi-core programming and how to generate an indexed triangle list from a triangle soup.&lt;br /&gt;I did not know that you can set up a virtual Cell chip on your PC. This course looks interesting:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a href="http://www.cc.gatech.edu/~bader/CellProgramming.html"&gt;&lt;span style="font-family:arial;"&gt;http://www.cc.gatech.edu/~bader/CellProgramming.html&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;br /&gt;John Ratcliff's Code Suppository is a great place to find fantastic code snippets:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a href="http://www.codesuppository.blogspot.com/"&gt;&lt;span style="font-family:arial;"&gt;http://www.codesuppository.blogspot.com/&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;br /&gt;Here is a great paper to help with first steps in multi-core programming:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a href="http://www.digra.org/dl/db/06278.34239.pdf"&gt;&lt;span style="font-family:arial;"&gt;http://www.digra.org/dl/db/06278.34239.pdf&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;A general graphics programming course is available here:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://users.ece.gatech.edu/~lanterma/mpg/"&gt;http://users.ece.gatech.edu/~lanterma/mpg/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I will provide this URL to people who ask me about how to learn graphics programming.&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7846427075034514245?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7846427075034514245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7846427075034514245' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7846427075034514245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7846427075034514245'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/03/some-great-links.html' title='Some Great Links'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6177873158569097900</id><published>2008-03-16T18:40:00.000-07:00</published><updated>2008-05-31T09:44:38.449-07:00</updated><title type='text'>Light Pre-Pass Renderer</title><content type='html'>In June last year I had an idea for a new rendering design. I call it light pre-pass renderer.&lt;br /&gt;The idea is to fill up a Z buffer first and also store normals in a render target. This is like a G-Buffer with normals and Z values ... 
so compared to a deferred renderer there is no diffuse color, specular color, material index or position data stored in this stage.&lt;br /&gt;Next the light buffer is filled up with light properties. So the idea is to distinguish between light and material properties. If you look at a simplified lighting equation for one point light it looks like this:&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;Color = Ambient + Shadow * Att * (N.L * DiffColor * DiffIntensity * LightColor + R.V^n * SpecColor * SpecIntensity * LightColor)&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The light properties are:&lt;br /&gt;- N.L&lt;br /&gt;- LightColor&lt;br /&gt;- R.V^n&lt;br /&gt;- Attenuation&lt;br /&gt;&lt;br /&gt;So instead of rendering a whole lighting equation for each light into a render target, you render only the light properties into an 8:8:8:8 render target. You have four channels so you can render:&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;LightColor.r * N.L * Att&lt;br /&gt;LightColor.g * N.L * Att&lt;br /&gt;LightColor.b * N.L * Att&lt;br /&gt;R.V^n * N.L * Att&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;That means in this setup there is no dedicated specular color ... 
which is on purpose (you can extend it easily).&lt;br /&gt;Here is the source code for what I store in the light buffer.&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;half4 ps_main( PS_INPUT Input ) : COLOR&lt;br /&gt;{&lt;br /&gt;// Sample the G-Buffer: xy holds the normal, zw the packed depth&lt;br /&gt;half4 GBufferData = tex2D( G_Buffer, Input.texCoord );&lt;br /&gt;&lt;br /&gt;// Compute pixel position&lt;br /&gt;half Depth = UnpackFloat16( GBufferData.zw );&lt;br /&gt;float3 PixelPos = normalize(Input.EyeScreenRay.xyz) * Depth;&lt;br /&gt;&lt;br /&gt;// Compute normal&lt;br /&gt;half3 Normal;&lt;br /&gt;Normal.xy = GBufferData.xy*2-1;&lt;br /&gt;Normal.z = -sqrt(1-dot(Normal.xy,Normal.xy));&lt;br /&gt;&lt;br /&gt;// Computes light attenuation and direction&lt;br /&gt;float3 LightDir = (Input.LightPos - PixelPos)*InvSqrLightRange;&lt;br /&gt;half Attenuation = saturate(1-dot(LightDir / LightAttenuation_0, LightDir / LightAttenuation_0));&lt;br /&gt;LightDir = normalize(LightDir);&lt;br /&gt;&lt;br /&gt;// R.V == Phong&lt;br /&gt;float specular = pow(saturate(dot(reflect(normalize(-float3(0.0, 1.0, 0.0)), Normal), LightDir)), SpecularPower_0);&lt;br /&gt;&lt;br /&gt;float NL = dot(LightDir, Normal)*Attenuation; &lt;/span&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;&lt;br /&gt;return float4(DiffuseLightColor_0.x*NL, DiffuseLightColor_0.y*NL, DiffuseLightColor_0.z*NL, specular * NL); &lt;/span&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;}&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;After all lights are alpha-blended into the light buffer, you switch to forward rendering and reconstruct the lighting equation. 
In its simplest form this might look like this:&lt;br /&gt;&lt;br /&gt;&lt;span style=";font-family:courier new;font-size:85%;"  &gt;float4 ps_main( PS_INPUT Input ) : COLOR0&lt;br /&gt;{&lt;br /&gt;float4 Light = tex2D( Light_Buffer, Input.texCoord );&lt;br /&gt;float3 NLATTColor = float3(Light.x, Light.y, Light.z);&lt;br /&gt;float3 Lighting = NLATTColor + Light.www;&lt;br /&gt;&lt;br /&gt;return float4(Lighting, 1.0f);&lt;br /&gt;}&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;This is a direct competitor to the light indexed renderer idea described by Damian Trebilco in this &lt;a href="http://lightindexed-deferredrender.googlecode.com/files/LightIndexedDeferredLighting1.1.pdf"&gt;paper&lt;/a&gt;.&lt;br /&gt;I have a small example program that compares this approach to a deferred renderer but I have not compared it to Damian's approach. I believe his approach might be more flexible regarding a material system than mine but the Light Pre-Pass renderer does not need to do the indexing. It should even run on a seven-year-old ATI RADEON 8500 because you only have to do a Z pre-pass and store the normals upfront.&lt;br /&gt;&lt;br /&gt;The following screenshot shows four point lights. There is no restriction on the number of light sources:&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_2YU3pmPHKN4/R93MVCsl4bI/AAAAAAAAABA/pFmdfubodSo/s1600-h/LightPrePass.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5178519808419160498" style="margin: 0px auto 10px; display: block; text-align: center;" alt="" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/R93MVCsl4bI/AAAAAAAAABA/pFmdfubodSo/s400/LightPrePass.jpg" border="0" /&gt;&lt;/a&gt; The following screenshot shows the same scene running with a deferred renderer. 
There should not be any visual differences from the Light Pre-Pass Renderer:&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_2YU3pmPHKN4/R93NIysl4cI/AAAAAAAAABI/WBF812V3x8Y/s1600-h/DeferredRenderer.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5178520697477390786" style="margin: 0px auto 10px; display: block; text-align: center;" alt="" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/R93NIysl4cI/AAAAAAAAABI/WBF812V3x8Y/s400/DeferredRenderer.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The design here is very flexible and scalable. So I expect people to start from here and end up with quite different results. One of the challenges with this approach is to set up a good material system. You can store different values in the light buffer or use the values above and construct interesting materials. For example, a per-object specular highlight could be done by taking the value stored in the alpha channel and applying a power function to it, or by storing the power value in a different channel.&lt;br /&gt;Obviously my initial approach is only scratching the surface of the possibilities.&lt;br /&gt;P.S.: to implement a material system for this you can do two things: you can handle it like in a deferred renderer by storing a material id with the normal map ... maybe in the alpha channel, or you can reconstruct the diffuse and specular term in the forward rendering pass. The only thing you have to store to do this is N.L * Att in a separate channel. This way you can get back R.V^n by using the specular channel and dividing it by N.L * Att. So what you do is:&lt;br /&gt;&lt;br /&gt;(R.V^n * N.L * Att) / (N.L * Att)&lt;br /&gt;&lt;br /&gt;Those are actually values that represent all light sources.&lt;br /&gt;&lt;br /&gt;Here is a link to the slides of my UCSD &lt;a href="http://www.wolfgang-engel.info/RendererDesign.zip"&gt;Renderer Design&lt;/a&gt; presentation. 
They provide more detail.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6177873158569097900?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6177873158569097900/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6177873158569097900' title='43 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6177873158569097900'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6177873158569097900'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/03/light-pre-pass-renderer.html' title='Light Pre-Pass Renderer'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2YU3pmPHKN4/R93MVCsl4bI/AAAAAAAAABA/pFmdfubodSo/s72-c/LightPrePass.jpg' height='72' width='72'/><thr:total>43</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6419987061104308825</id><published>2008-03-07T11:10:00.000-08:00</published><updated>2008-03-07T14:50:24.105-08:00</updated><title type='text'>iPhone SDK so far</title><content type='html'>&lt;span style="font-family: arial;"&gt;Just setup a dev environment this morning with the iPhone  SDK ... overall it is quite disappointing for games :-). 
OpenGL ES is not supported in the emulator, and you can't run apps on the iPhone without OS 2.0 ... which has not been released so far. In other words, they have OpenGL ES examples but you can't run them anywhere. I hope I get access to the 2.0 file system somehow. Other than this, I now have the old and the new SDK set up on one machine and it works nicely.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Now I have to wait until I get access to the iPhone OS 2.0 ... what a pain.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6419987061104308825?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6419987061104308825/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6419987061104308825' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6419987061104308825'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6419987061104308825'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/03/iphone-sdk-so-far.html' title='iPhone SDK so far'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-285283714732109569</id><published>2008-03-02T20:59:00.000-08:00</published><updated>2008-03-03T10:40:52.339-08:00</updated><title type='text'>Predicting the Future in Game Graphics</title><content type='html'>&lt;span style="font-family:arial;"&gt;So I was thinking about the next 1 - 2 years of graphics programming in the game industry on the XBOX 360 and the PS3. I think we can see a few very strong trends that will persist over the next few years.&lt;br /&gt;&lt;br /&gt;HDR&lt;br /&gt;Rendering with high dynamic range is realized in two areas: in the renderer and in the source data == textures of objects&lt;br /&gt;On current high-end platforms people run the lighting in the renderer in gamma 1.0 and they are using the 10:10:10:2 format wherever available or an 8:8:8:8 render target format that uses a non-standard color format that supports a larger range of values (&gt; 1) and more deltas. Typically these are the LogLuv or L16uv color formats.&lt;br /&gt;There are big developments for the source art. id Software published an article on an 8-bit-per-pixel color format, stored in a DXT5 format, that has much better quality than the 4-bit-per-pixel DXT1 format that we usually use. Before that there were numerous attempts to use scale and bias values in the hacked DXT header to make better use of the available deltas in the texture, e.g. for rather dark textures. One of the challenges here was to make all this work with gamma 1.0.&lt;br /&gt;At GDC 2007 I suggested during Chas. Boyd's DirectX Future talk extending DX to support an 8-bit HDR format that also supports gamma 1.0 better. 
It would be great if they could come up with a better compression scheme than DXT in the future but until then we will try to make combinations of DXT1 + L16 or DXT5 hack scenarios work :-) or utilize id Software's solution.&lt;br /&gt;&lt;br /&gt;Normal Map Data&lt;br /&gt;Some of the most expensive data is normal map data. So far we are "mis-using" the DXT formats to compress vector data. If you generally store height data this opens up more options. Many future techniques like Parallax mapping or any "normal map blending" require height map data. So this is an area of practical interest :-) ... check out the normal vector talk of the GDC 2008 tutorial day I organized at &lt;/span&gt;&lt;a href="http://www.coretechniques.info/"&gt;&lt;span style="font-family:arial;"&gt;http://www.coretechniques.info/&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family:arial;"&gt;.&lt;br /&gt;&lt;br /&gt;Lighting Models&lt;br /&gt;Everyone is trying to find lighting models that can mimic a wider range of materials. The Strauss lighting model seems to be popular and some people come up with their own lighting models.&lt;br /&gt;&lt;br /&gt;Renderer Design&lt;br /&gt;To render opaque objects there are currently two renderer designs at opposite ends of the spectrum: the so-called deferred renderer and the so-called forward renderer. The idea of the deferred renderer design came up to allow a higher number of lights. The advantage of a higher number of lights has to be paid for with lower quality settings in other areas.&lt;br /&gt;So people are now starting to research new renderer designs that have the advantages of both designs but none of the disadvantages. There is a Light indexed renderer and I am working on a Light pre-pass renderer. New renderer designs will allow more light sources ... but what is a light source without shadow? ...&lt;br /&gt;&lt;br /&gt;Shadows&lt;br /&gt;Lots of progress was made with shadows. 
Cascaded Shadow maps are now the favorite way to split up shadow data along the view frustum. Nevertheless there are more ways to distribute the shadow resolution. This is an interesting area of research.&lt;br /&gt;The other big area is using probability functions to replace the binary depth comparison. Then the next big thing will be soft shadows that become softer as the distance between the occluder and the receiver grows.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Global Illumination&lt;br /&gt;This is the area with the biggest growth potential currently in games :-) Like screen-space ambient occlusion, which is super popular now because of Crysis, screen-space irradiance will offer lots of research opportunities.&lt;br /&gt;To target more advanced hardware, Carsten Dachsbacher's approach in ShaderX5 looks to me like a great starting point.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-285283714732109569?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/285283714732109569/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=285283714732109569' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/285283714732109569'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/285283714732109569'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/03/predicting-future-in-graphics.html' title='Predicting the Future in Game Graphics'/><author><name>Wolfgang 
Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7369401406293040911</id><published>2008-02-13T17:23:00.002-08:00</published><updated>2008-02-13T17:33:39.684-08:00</updated><title type='text'>Android</title><content type='html'>&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt;Just read the FAQ for Android. Here is the most important part:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;-----------------&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Can I write code for Android using C/C++? &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;No. Android applications are written using the Java programming language.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;-----------------&lt;br /&gt;So no easy game ports for this platform. Additionally the language will eat up so many cycles that good-looking 3D games do not make much sense. No business case for games then ... maybe they will start thinking about it :-) ... 
using Java also looks quite unprofessional to me but I heard other phone companies are doing this as well to protect their margin and keep control over the device.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7369401406293040911?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7369401406293040911/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7369401406293040911' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7369401406293040911'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7369401406293040911'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/02/android.html' title='Android'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5581956856273702926</id><published>2008-01-18T10:47:00.001-08:00</published><updated>2008-01-19T03:02:52.032-08:00</updated><title type='text'>gDEbugger for OpenGL ES</title><content type='html'>&lt;span style="font-family:arial;"&gt;So I decided to test the gDEbugger from graphicREMEDY for OpenGL ES. 
I got an error message indicating a second-chance exception, and there was not much I could do about it. I posted my problem in their online forum, but have not received any response so far.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I guess with the shrinking OpenGL market there is not much money in providing a debugger for this API. In games, fewer companies make PC games, and OpenGL is no longer used by any AAA title on Windows, the main PC platform.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I was hoping that they would target the upcoming OpenGL ES market, but this might still be in its infancy. If anyone knows a tool to debug OpenGL ES similar to PIX or GcmReplay, I would appreciate a hint. To debug I would work on the PC platform ... in other words I have a PC and an iPhone version of the game :-)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Update January 19th: graphicRemedy actually got back to me and asked me to send in source code ... 
very nice.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5581956856273702926?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5581956856273702926/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5581956856273702926' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5581956856273702926'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5581956856273702926'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/01/gdebugger-for-opengl-es.html' title='gDEbugger for OpenGL ES'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-231237985701114522</id><published>2008-01-08T13:38:00.000-08:00</published><updated>2008-01-18T11:01:08.508-08:00</updated><title type='text'>San Angeles Observation on the iPhone</title><content type='html'>&lt;span style="font-family: arial;" class="normaltextarial"&gt;I ported the San Angeles Observation demo from Jetro Lauha to the iPhone (&lt;a href="http://www.oolongengine.com/"&gt;www.oolongengine.com&lt;/a&gt;). 
You can find the original version here:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://iki.fi/jetro/" title="http://iki.fi/jetro/" target="_blank"&gt;http://iki.fi/jetro/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This demo throws about 60k polys onto the iPhone and runs quite slowly :-(. I will double-check with Apple that this is not due to a lame OpenGL ES implementation on the phone.&lt;br /&gt;I am now thinking about porting other stuff over to the phone or working on getting a more mature feature set for the engine ... porting is almost more fun, because you can show a result afterwards :-) ... let's see.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-231237985701114522?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/231237985701114522/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=231237985701114522' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/231237985701114522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/231237985701114522'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2008/01/san-angeles-observation-on-iphone.html' title='San Angeles Observation on the iPhone'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-6750729585960828071</id><published>2007-12-30T11:09:00.000-08:00</published><updated>2007-12-31T16:48:14.020-08:00</updated><title type='text'>Oolong Engine</title><content type='html'>&lt;span style="font-family:arial;"&gt;I renamed my iPhone / iPod touch engine to Oolong Engine and moved it to a new home. Its URL is now&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.oolongengine.com/"&gt;www.oolongengine.com&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I will now add a third-person camera model. This camera will be driven by the accelerometer and the touch screen.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-6750729585960828071?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/6750729585960828071/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=6750729585960828071' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6750729585960828071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/6750729585960828071'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/oolong-engine.html' title='Oolong Engine'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7621677642215888312</id><published>2007-12-26T13:42:00.000-08:00</published><updated>2007-12-26T14:17:02.714-08:00</updated><title type='text'>Animating Normal (Maps)</title><content type='html'>&lt;span style="font-family: arial;"&gt;There seems to be ongoing confusion about how to animate normal maps. The best answer to this is: you don't :-).&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The obvious problem is that you would have to stream in two normal maps and then modulate two normals. If you are on a console platform you just don't want to do this. So what would be a good way to animate a normal? You modulate height fields. Where both height fields have peaks, the result should also have a peak. Where one of the height fields is zero, the result should also be zero, independent of the other height field.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Usually, a normal map is formed by computing a relief of a height field (bump map) over a flat surface. 
Given the bump map’s height at the texture coordinates (u, v), the standard definition of the normal map is&lt;/span&gt;&lt;br /&gt;&lt;p style="font-family: arial;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LNmzr640I/AAAAAAAAAAM/1TRCaPNit28/s1600-h/HeightToNormalMap.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LNmzr640I/AAAAAAAAAAM/1TRCaPNit28/s320/HeightToNormalMap.gif" alt="" id="BLOGGER_PHOTO_ID_5148403390631043906" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;The height fields are multiplied to form a combined height field like this&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_2YU3pmPHKN4/R3LN9jr641I/AAAAAAAAAAU/nCDWevxCwpI/s1600-h/ModulateHeightMaps.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://1.bp.blogspot.com/_2YU3pmPHKN4/R3LN9jr641I/AAAAAAAAAAU/nCDWevxCwpI/s320/ModulateHeightMaps.gif" alt="" id="BLOGGER_PHOTO_ID_5148403781473067858" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To determine the normal vector of this height field according to the first equation, one needs the partial derivatives of this function. 
This is a simple application of the product rule:&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LONzr642I/AAAAAAAAAAc/iQao6_JLXXQ/s1600-h/ProductRule.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LONzr642I/AAAAAAAAAAc/iQao6_JLXXQ/s320/ProductRule.gif" alt="" id="BLOGGER_PHOTO_ID_5148404060645942114" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;And similarly for the partial derivative with respect to v. Thus:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LOjzr643I/AAAAAAAAAAk/Uo1vEMSknQ8/s1600-h/Partial+Derivaties+for+two+heightmaps.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LOjzr643I/AAAAAAAAAAk/Uo1vEMSknQ8/s320/Partial+Derivaties+for+two+heightmaps.gif" alt="" id="BLOGGER_PHOTO_ID_5148404438603064178" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="font-family: arial;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family: arial;"&gt;BTW: to recover the height field’s partial derivatives from the normal map we can use:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_2YU3pmPHKN4/R3LO_Tr644I/AAAAAAAAAAs/7l8ZKXteYEI/s1600-h/OneHeightFieldsPartialDerivative.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://4.bp.blogspot.com/_2YU3pmPHKN4/R3LO_Tr644I/AAAAAAAAAAs/7l8ZKXteYEI/s320/OneHeightFieldsPartialDerivative.gif" alt="" id="BLOGGER_PHOTO_ID_5148404911049466754" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7621677642215888312?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7621677642215888312/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7621677642215888312' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7621677642215888312'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7621677642215888312'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/animating-normal-maps.html' title='Animating Normal (Maps)'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_2YU3pmPHKN4/R3LNmzr640I/AAAAAAAAAAM/1TRCaPNit28/s72-c/HeightToNormalMap.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-328936581065349425</id><published>2007-12-26T13:27:00.000-08:00</published><updated>2007-12-26T14:17:21.012-08:00</updated><title type='text'>About Raytracing</title><content type='html'>&lt;span style="font-family: arial;"&gt;My friend Dean Calver published an article about raytracing that is full of wisdom. 
The title says it all &lt;/span&gt;&lt;a style="font-family: arial;" href="http://www.beyond3d.com/content/articles/94"&gt;Real-Time Ray Tracing: Holy Grail or Fool's Errand?&lt;/a&gt;&lt;span style="font-family: arial;"&gt;. This is straight to the point :-)&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-328936581065349425?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/328936581065349425/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=328936581065349425' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/328936581065349425'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/328936581065349425'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/about-raytracing.html' title='About Raytracing'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7008076333795664321</id><published>2007-12-26T13:21:00.000-08:00</published><updated>2007-12-26T14:17:37.936-08:00</updated><title type='text'>LogLuv HDR implementation in Heavenly Sword</title><content type='html'>&lt;span style="font-family: arial;"&gt;Heavenly Sword stores HDR data in 8:8:8:8 render targets. 
I talked to Marco about this before and saw a nice description in Christer Ericson's blog &lt;/span&gt;&lt;a style="font-family: arial;" href="http://realtimecollisiondetection.net/blog/?p=15"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;I came up with a similar idea that should be faster and a bit more hardware-friendly, using a new compression format that I call L16uv. The name more or less says it all :-)&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-7008076333795664321?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7008076333795664321/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7008076333795664321' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7008076333795664321'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7008076333795664321'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/logluv-hdr-implementation-in-heavenly.html' title='LogLuv HDR implementation in Heavenly Sword'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-3240158690442826620</id><published>2007-12-26T13:14:00.000-08:00</published><updated>2007-12-26T14:17:57.499-08:00</updated><title type='text'>Normal Map Data II</title><content type='html'>&lt;span style="font-family: arial;font-family:arial;" &gt;Here is one interesting normal data idea I missed. It is taken from &lt;/span&gt;&lt;a style="font-family: arial;" href="http://realtimecollisiondetection.net/blog/"&gt;Christer Ericson&lt;/a&gt;&lt;span style="font-family: arial;font-family:arial;" &gt; in his blog:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote style="font-family: arial;"&gt;One clever thing they do (as mentioned on these two slides) is to encode their normal maps so that you can feed either a DXT1 or DXT5 encoded normal map to a shader, and the shader doesn’t have to know. This is neat because it cuts down on shader permutations for very little shader cost. Their trick is for DXT1 to encode X in R and Y in G, with alpha set to 1. For DXT5 they encode X in alpha, Y in G, and set R to 1. Then in the shader, regardless of texture encoding format, they reconstruct the normal as X = R * alpha, Y = G, Z = Sqrt(1 - X^2 - Y^2).&lt;br /&gt;&lt;br /&gt;A DXT5-encoded normal map has much better quality than a DXT1-encoded one, because the alpha component of DXT5 is 8 bits whereas the red component of DXT1 is just 5 bits, but more so because the alpha component of a DXT5 texture is compressed independently from the RGB components (the three of which are compressed dependently for both DXT1 and DXT5) so with DXT5 we avoid co-compression artifacts. Of course, the cost is that the DXT5 texture takes twice the memory of a DXT1 texture (plus, on the PS3, DXT1 has some other benefits over DXT5 that I don’t think I can talk about). 
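&lt;/blockquote&gt;&lt;span style="font-family: arial;"&gt;The reconstruction described in the quote can be sketched in a few lines of Python. This is only a minimal sketch under assumptions: the function name reconstruct_normal is hypothetical, the channel values are assumed to arrive as floats in [0, 1], the usual remap from [0, 1] to [-1, 1] is assumed although the quote does not spell it out, and the clamp before the square root is my addition to guard against compression error:&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;
```python
import math


def reconstruct_normal(r, g, alpha):
    """Rebuild a tangent-space normal from a DXT1- or DXT5-style sample.

    r, g and alpha are texture channel values in [0, 1] (assumption).
    DXT1 layout: X in R, Y in G, alpha = 1.
    DXT5 layout: X in alpha, Y in G, R = 1.
    """
    # X = R * alpha works for both layouts because the unused channel is 1.
    x = 2.0 * (r * alpha) - 1.0   # assumed remap from [0, 1] to [-1, 1]
    y = 2.0 * g - 1.0
    # Clamp so compression error cannot make the radicand negative.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)


# The same normal stored in both layouts decodes to the same vector.
dxt1_normal = reconstruct_normal(0.75, 0.5, 1.0)   # X in R, alpha = 1
dxt5_normal = reconstruct_normal(1.0, 0.5, 0.75)   # X in alpha, R = 1
print(dxt1_normal == dxt5_normal)  # prints: True
```
&lt;/pre&gt;&lt;blockquote style="font-family: arial;"&gt;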
&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-3240158690442826620?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/3240158690442826620/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=3240158690442826620' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3240158690442826620'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/3240158690442826620'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/normal-map-data-ii.html' title='Normal Map Data II'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5744849265181851909</id><published>2007-12-25T11:48:00.000-08:00</published><updated>2007-12-26T14:18:12.340-08:00</updated><title type='text'>Normal Data</title><content type='html'>&lt;span style="font-family: arial;"&gt;Normal data is one of the more expensive assets of games. 
Creating normal data in ZBrush or Mudbox can easily cost a few million dollars.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Storing normal data in textures in a way that preserves the original data with the lowest error level is an art form that needs special attention.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;I am now aware of three ways to destroy normal data by storing it in a texture:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;1. Store the normal in a DXT1 compressed texture&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;2. Store the normal in a DXT5 compressed texture by storing the x value in alpha and the y value in the green channel .... and by storing some other color data in the red and blue channels.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;3. Store the normal in its original form (as a height map) in one color channel of a DXT1 compressed texture with two other color channels.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;They all have a common denominator: the DXT format was created to compress color data so that the resulting color is still perceived as similar. Perceiving the 16 vectors of a 4x4 block as similar follows different rules than perceiving 16 colors as similar. 
Therefore the best solutions so far for storing normals are to &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- not compress them at all&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- store y in the green channel of a DXT5 compressed texture and x in the alpha channel, and color the two empty channels black&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- use the DXN format that consists of two DXT5 compressed alpha channels&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- store a height map in an alpha channel of a DXT5 compressed texture and generate the normal out of the height map.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The DXT5 solutions and the DXN solution occupy 8 bits per normal. The height map solution occupies 4 bits per normal. It probably does not look as good as the 8-bits-per-normal solutions.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;There are lots of interesting areas regarding normals other than how they are stored. There are challenges when you want to scale, add, modulate, deform, blend or filter them. Then there is also anti-aliasing ... :-) ... 
food for thought.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-5744849265181851909?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5744849265181851909/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5744849265181851909' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5744849265181851909'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5744849265181851909'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/normal-data.html' title='Normal Data'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-4128006808097266842</id><published>2007-12-19T15:14:00.000-08:00</published><updated>2007-12-26T14:18:27.237-08:00</updated><title type='text'>Renderer Design</title><content type='html'>&lt;span style="font-family: arial;"&gt;Renderer design is an interesting area. I recommend starting with the lighting equation, splitting it up into a material part and a light part, and then moving on from there. You can then think about what data you need to render a huge number of direct lights and shadows (shadows are harder than lights) and how you handle the global illumination part. 
Especially the integration of global illumination and many lights should get you thinking for a while.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;Here is an example:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;1. render shadow data from the cascaded shadow maps into a shadow collector that collects indoor, outdoor, cloud shadow data&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;2. render at the same time world-space normals in the other three channels of the render target&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;3. render all lights into a light buffer (only the light source properties not the material properties)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;4. render all colors and the material properties into a render target while applying all the lights from the light buffer (here you stitch together the Blinn-Phong lighting model or whatever you use)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;5. do global illumination with the normal map&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;6. 
do PostFX&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/398682525365778708-4128006808097266842?l=diaryofagraphicsprogrammer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/4128006808097266842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=4128006808097266842' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4128006808097266842'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/4128006808097266842'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/renderer-design.html' title='Renderer Design'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-5776091825625606893</id><published>2007-12-18T10:15:00.000-08:00</published><updated>2007-12-26T14:18:44.217-08:00</updated><title type='text'>How to design a Material System</title><content type='html'>&lt;span style="font-family: arial;"&gt;Here is my take on how a good meta-material / material system should be written and how the art workflow works:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;I just came across the latest screenshots of Mass Effect, which will ship soon:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 
arial;"&gt;http://masseffect.bioware.com/gallery/&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The thing I found remarkable is the material / meta-material system. Skin looks like skin and then cloth has a completely different look and leather really looks like leather. Combined with great normal maps, Mass Effect looks like the game with the best-looking characters to date.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The best way to build a material / meta-material system is very close to what was done in Table-Tennis. I described the system in ShaderX4 some time before I joined Rockstar.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The idea is to dedicate to each material a specific *.fx file and also name the file accordingly. So you end up with eye.fx, skin.fx, metal.fx, brushed_metal.fx, leather.fx, water.fx etc. ... so you will want to end up with probably 15 - 20 *.fx files (they would hold different techniques for shadowed / unshadowed etc.). This low number of files makes shader switching a non-issue and also allows sorting objects according to their shaders, if this is a bottleneck. &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The *.fx files can be called meta-materials (this naming convention was inspired by a similar system that was used in the Medal of Honor games, as long as they were based on the Quake 3 engine).&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The material files hold the data while each of the meta-material files holds only code. So we distinguish between code (*.fx file) and data (*.mtl or something different). One of the things that can be done to reduce the data updates is to shadow the constant data and only update the data that changes. 
In pseudo-code this would be:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- draw the eyes of the first character in the game&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- apply the eye.fx file&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- load the data for this file from the material file&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- draw the eyes of the second character in the game&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- eye.fx is still applied, so we do not have to do it again ... shader code just stays on the graphics card and does not need to be reloaded&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- load the data file for this pair of eyes and upload to the graphics card only the data that has changed, found by comparing the shadowed copy of the data from the previous draw call with the data that needs to be loaded now&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- move on like this for all eyes&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- then move on with hair ...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The underlying idea of data shadowing and sorting by shaders is to reduce the number of changes that need to be made if you want to apply very detailed and very different surfaces to a character. 
In other words, if you set up a game with such a meta-material / material system, there is a high chance that you will find a performance difference between sorting by shaders and sorting by any other criterion.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;For me, the key elements of a truly next-gen looking shader architecture for a next-gen game are:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- having a low number of *.fx files plus good sorting of objects, both to reduce shader switching&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- letting the artists apply those *.fx files and change the settings, switching features on and off etc.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;- tailoring the low number of shaders very much to specific materials, so that they look very good (the opposite approach was taken in DOOM III, where everything consists of metal and plastic and therefore the same lighting/shader model is applied to everything)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;The low number of *.fx files and the intelligence we put into the creation of the effect files should give us a well-performing and good-looking game, and we are able to control this process and predict performance.&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/5776091825625606893/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=5776091825625606893' title='6 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/398682525365778708/posts/default/5776091825625606893'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/5776091825625606893'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/how-to-design-material-system.html' title='How to design a Material System'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-398682525365778708.post-7072212288323283910</id><published>2007-12-09T09:41:00.000-08:00</published><updated>2007-12-26T14:19:19.204-08:00</updated><title type='text'>Porting an Open-Source Engine to the iPhone?</title><content type='html'>&lt;span style="font-family: arial;"&gt;I evaluated several open-source engines:&lt;/span&gt;&lt;br /&gt;&lt;ul style="font-family: arial;"&gt;&lt;li&gt;Ogre: the architecture and design are not very performance-friendly. The way C++ is used makes the engine quite difficult to use and re-design. An example: each material has its own C++ file and there is an inheritance chain from a base class ... &lt;/li&gt;&lt;li&gt;Irrlicht: the Mac OS X version I tried looks like a Quake 3 engine. It also seems to lack lots of design elements of a modern 3D engine. Other than that, it looks quite good for a portable device. 
You might as well use the original Quake 3 engine then ...&lt;/li&gt;&lt;li&gt;Quake 3: this is obviously a very efficient game engine with rock-solid tools. I worked with this engine on the Medal of Honor series before, but I wanted a bit more flexibility and I wanted to target more advanced hardware.&lt;/li&gt;&lt;li&gt;Crystal Space: why is everything a plug-in? Can't get my head around this.&lt;/li&gt;&lt;li&gt;C4: this is one of my favourite engines, but it is closed source :-(&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;So I want to write my own based on the low-level framework I have in place now.&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://diaryofagraphicsprogrammer.blogspot.com/feeds/7072212288323283910/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=398682525365778708&amp;postID=7072212288323283910' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7072212288323283910'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/398682525365778708/posts/default/7072212288323283910'/><link rel='alternate' type='text/html' href='http://diaryofagraphicsprogrammer.blogspot.com/2007/12/porting-open-source-engine-to-iphone.html' title='Porting an Open-Source Engine to the iPhone?'/><author><name>Wolfgang Engel</name><uri>http://www.blogger.com/profile/11031097395025597662</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/_2YU3pmPHKN4/S33Rp8uu1TI/AAAAAAAAAd0/uE4Hw49vwCs/S220/n579488995_8879.jpg'/></author><thr:total>2</thr:total></entry></feed>
