Comments on Diary of a Graphics Programmer: Calculating Screen-Space Texture Coordinates for the 2D Projection of a Volume

Oh and I am sure you know, that consoles provide a...

2008-09-10T16:34:00.000-07:00

Oh, and I am sure you know that consoles (the D3D9-based ones, anyway) provide a render state to turn off this annoying half-pixel offset.

Hi Wolfgang,The "w" divide is done automatically i...

2008-09-10T16:09:00.000-07:00

Hi Wolfgang,

The "w" divide is done automatically in the texture2DProj call. I believe the D3D version is called tex2Dproj.

Sorry I just re-read that comment and realized non...

2008-09-10T15:33:00.000-07:00

Sorry I just re-read that comment and realized none of those thoughts were really complete. I was distracted.

For the g-buffer I am storing world-space normals using spherical coordinates. For the 8:8:8:8 target case, if you store normal.xy and reconstruct z, you need to know the sign of z, since it is really +/-sqrt(1 - dot(normal.xy, normal.xy)). So before I switched to spherical storage, I had to store things like this:
8:8:8:8
normal.xy_8_8|sign(z)_1|depthHi_7|depthLo_8

This added unpack time to retrieving the depth value, and the normal value. It also only gave me 15 bits to store depth, instead of 16. Switching to spherical coordinates got me back that bit, and removed another op from getting the depth value.
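To make the cost of that older layout concrete, here is a minimal CPU-side sketch in C++ of packing the sign of z together with a 15-bit depth into two bytes. The struct and function names are hypothetical and not taken from the original shader, and the exact rounding is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical layout following sign(z)_1 | depthHi_7 | depthLo_8.
struct PackedSignDepth {
    std::uint8_t hi; // bit 7: sign of normal.z, bits 0-6: high depth bits
    std::uint8_t lo; // low 8 depth bits
};

// depth01 is assumed to be linear depth in [0, 1].
inline PackedSignDepth packSignDepth(bool zNegative, float depth01) {
    assert(depth01 >= 0.0f && depth01 <= 1.0f);
    // Only 15 bits are left for depth once the sign bit is taken.
    const std::uint32_t d =
        static_cast<std::uint32_t>(std::lround(depth01 * 32767.0f));
    PackedSignDepth p;
    p.hi = static_cast<std::uint8_t>((zNegative ? 0x80u : 0x00u) | (d >> 8));
    p.lo = static_cast<std::uint8_t>(d & 0xFFu);
    return p;
}

// Unpacking needs an extra mask to separate the sign bit from the depth.
inline void unpackSignDepth(PackedSignDepth p, bool& zNegative, float& depth01) {
    zNegative = (p.hi & 0x80u) != 0;
    const std::uint32_t d =
        (static_cast<std::uint32_t>(p.hi & 0x7Fu) << 8) | p.lo;
    depth01 = static_cast<float>(d) / 32767.0f;
}

// Rebuild z from xy and the stored sign: z = +/- sqrt(1 - dot(xy, xy)).
inline float reconstructZ(float x, float y, bool zNegative) {
    const float zz = 1.0f - (x * x + y * y);
    const float z = std::sqrt(zz > 0.0f ? zz : 0.0f);
    return zNegative ? -z : z;
}
```

The extra mask in unpackSignDepth is the additional unpack work mentioned above, and the 32767 scale is where the sixteenth depth bit is lost.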

The actual format for the spherical is generated like this:
inline float2 cartesianToSpGPU( in float3 normalizedVec )
{
    float atanYX = atan2( normalizedVec.y, normalizedVec.x );
    float2 ret = float2( atanYX / PI, normalizedVec.z );

    return (ret + 1.0) * 0.5;
}

and retrieved like this:
inline float3 spGPUToCartesian( in float2 spGPUAngles )
{
    float2 expSpGPUAngles = spGPUAngles * 2.0 - 1.0;
    float2 scTheta;

    sincos( expSpGPUAngles.x * PI, scTheta.x, scTheta.y );
    float2 scPhi = float2( sqrt( 1.0 - expSpGPUAngles.y * expSpGPUAngles.y ), expSpGPUAngles.y );

    // Renormalization not needed
    return float3( scTheta.y * scPhi.x, scTheta.x * scPhi.x, scPhi.y );
}

It is slightly more expensive to re-construct than the cartesian, but I think (this may not be true) that because the light shaders use the surface normal last, the GPU can do the work whenever it has time.
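For anyone who wants to check the encoding offline, the two functions above can be ported to the CPU. This is only a sketch in C++: the Vec2/Vec3 types, the kPi constant, and the quantize8 helper (which imitates an 8-bit unorm channel) are assumptions and not part of the shader code.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

const float kPi = 3.14159265358979f;

// CPU port of cartesianToSpGPU: maps a unit vector to two values in [0, 1].
inline Vec2 cartesianToSp(Vec3 n) {
    // The input is expected to be normalized.
    assert(std::fabs(n.x * n.x + n.y * n.y + n.z * n.z - 1.0f) < 1e-3f);
    const float theta = std::atan2(n.y, n.x); // in [-pi, pi]
    return { (theta / kPi + 1.0f) * 0.5f, (n.z + 1.0f) * 0.5f };
}

// CPU port of spGPUToCartesian: expands back to a unit vector.
inline Vec3 spToCartesian(Vec2 e) {
    const float theta = (e.x * 2.0f - 1.0f) * kPi;
    const float z = e.y * 2.0f - 1.0f;
    // sin(phi) recovered from z = cos(phi); clamped against rounding error.
    const float r = std::sqrt(std::fmax(0.0f, 1.0f - z * z));
    return { std::cos(theta) * r, std::sin(theta) * r, z };
}

// Hypothetical helper: simulates storing a value in an 8-bit unorm channel.
inline float quantize8(float v) {
    return std::round(v * 255.0f) / 255.0f;
}
```

Passing the encoded pair through quantize8 before decoding gives a rough idea of the error an 8:8 target introduces; the result stays unit length by construction, which is why no renormalization is needed.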

On lower-end cards, the atan2 and sincos functions take longer. Some of the ATI boards with unified shaders have, per ALU, 4 shader cores that can't do transcendental functions and 1 that can. NVidia cards have 4 cores per ALU, and each can do all ops. For that case I encoded sincos and atan2 into A8 lookup textures, and it works better.

Wolfgang,(Pat Wilson from GarageGames)It doesn't r...

2008-09-10T15:04:00.000-07:00

Wolfgang,
(Pat Wilson from GarageGames)

It doesn't require a dedicated depth buffer. I am using these formats for g-buffers:

8:8:8:8
normal.theta|normal.phi|depthHi|depthLo

16:16:16:16
normal.theta|normal.phi|foo|depth

I chose this method for world-space reconstruction because it is very cheap: it requires only one mad (multiply-add) in the case of a full-screen quad.

The z-data that is stored is also very good because it is linear, and it is in the range 0..1 where 1 is zFar in camera space. I like integer formats over FP16 formats for the G-buffer because I can control the ranges of the data.
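As an illustration of the 8:8:8:8 case, a 16-bit linear depth can be split across the depthHi and depthLo channels roughly as follows. This is a C++ sketch under assumed rounding and naming; the actual shader code is not shown in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct DepthBytes { std::uint8_t hi, lo; };

// depth01 is assumed to be linear depth in [0, 1], where 1 is zFar.
inline DepthBytes encodeDepth16(float depth01) {
    assert(depth01 >= 0.0f && depth01 <= 1.0f);
    const std::uint32_t d =
        static_cast<std::uint32_t>(std::lround(depth01 * 65535.0f));
    return { static_cast<std::uint8_t>(d >> 8),
             static_cast<std::uint8_t>(d & 0xFFu) };
}

inline float decodeDepth16(DepthBytes b) {
    const std::uint32_t d = (static_cast<std::uint32_t>(b.hi) << 8) | b.lo;
    return static_cast<float>(d) / 65535.0f;
}

// Same decode, but starting from the [0, 1] values a sampler would return
// for two 8-bit unorm channels. It is a two-term weighted sum, so in a
// shader it could be written as a single dot().
inline float decodeDepthFromUnorm(float hiUnorm, float loUnorm) {
    return hiUnorm * (65280.0f / 65535.0f) + loUnorm * (255.0f / 65535.0f);
}
```

Because the stored value is linear and already in 0..1, no further remapping is needed before it is used to scale the eye ray.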

I haven't done enough profiling to know for sure, but I think that using an 8:8:8:8 g-target may hit light shader performance significantly (it is slower on high-bandwidth cards, but not as much on low-bandwidth cards). The first thing the light does is sample from the G-buffer, but then every subsequent thing that it does is dependent on knowing the depth.

Hi Pat,I think this is the Crytek approach that wa...

2008-09-10T12:44:00.000-07:00

Hi Pat,
I think this is the Crytek approach that was covered in a SIGGRAPH 2007 session by Carsten Wenzel. This looks very cool to me. Do you have to generate a dedicated depth buffer for this?

- Wolfgang

Hi Damian,Yes this is without the DX9 offset the s...

2008-09-10T12:38:00.000-07:00

Hi Damian,
Yes, apart from the DX9 offset this is the same. So you can consider it trivial, but in my specific case we forgot about the half-pixel offset :-~ so I had to figure out why light was leaking around a person :-) (we also use this to fetch shadow maps that are in screen space).

- Wolfgang

BTW: didn't you forget the divide by w? I would think there is something like

projectSpace.xy /= projectSpace.w

in there as well.

I am doing my world space reconstruction using mos...

2008-09-10T09:51:00.000-07:00

I am doing my world space reconstruction using mostly comments found here: http://forum.beyond3d.com/showthread.php?t=45628

To store, in HLSL:
float3 wsPos = IN.pos.xyz / IN.pos.w;
float depth = dot( vEye, wsPos - eyePos );

Where IN.pos comes from the VShader and is:
OUT.pos = mul( objToWorldMat, IN.position );

vEye is a shader constant: the world-space view vector, scaled to a length of 1/zFar.

eyePos is a shader constant, and is the world-space eye position

I am storing depth in 16 bits as an integer and this seems to be plenty.

To reconstruct:
float3 worldPos = eyePos + eyeRay * depth;

eyePos is a shader constant, world-space eye position.

eyeRay is:

-For a full-screen quad:

Calculate in vertex shader:
OUT.wsEyeRay = float4( IN.wsFrustCoord - eyePos, 1.0 );

Calculate in pixel shader:
OUT.wsEyeRay = float4( IN.normal - eyePos, 1.0 );

In the vertex shader, it is a full screen quad, and each vertex has the world-space co-ordinate of the far-frustum plane. I am calculating like this:

Point3F farFrustumCorners[4];
farFrustumCorners[0].set( frustLeft * zFarOverNear, zFar, frustBottom * zFarOverNear );
farFrustumCorners[1].set( frustLeft * zFarOverNear, zFar, frustTop * zFarOverNear );
farFrustumCorners[2].set( frustRight * zFarOverNear, zFar, frustTop * zFarOverNear );
farFrustumCorners[3].set( frustRight * zFarOverNear, zFar, frustBottom * zFarOverNear );

MatrixF camToWorld = thisFrame.worldToCamera;
camToWorld.inverse();

for( int i = 0; i < 4; i++ )
camToWorld.mulP( farFrustumCorners[i] );
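The store and reconstruct steps for the full-screen quad case can be checked on the CPU. The sketch below is C++ with hypothetical names (Vec3, storeDepth, eyeRayThrough, reconstruct). It assumes the interpolated eyeRay is the vector from the eye to the far-plane point lying on the ray through the shaded position, which is what interpolating the far-frustum corners minus eyePos should produce.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Store: depth = dot(vEye, wsPos - eyePos), with vEye = viewDir / zFar.
// viewDir is assumed to be the unit-length world-space view direction.
inline float storeDepth(Vec3 wsPos, Vec3 eyePos, Vec3 viewDir, float zFar) {
    const Vec3 vEye = viewDir * (1.0f / zFar);
    return dot(vEye, wsPos - eyePos);
}

// Stand-in for the interpolated far-frustum coordinate minus eyePos:
// the vector from the eye to the far plane along the ray through wsPos.
inline Vec3 eyeRayThrough(Vec3 wsPos, Vec3 eyePos, Vec3 viewDir, float zFar) {
    const Vec3 toPoint = wsPos - eyePos;
    const float viewZ = dot(viewDir, toPoint);
    assert(viewZ > 0.0f); // the point must be in front of the camera
    return toPoint * (zFar / viewZ);
}

// Reconstruct: worldPos = eyePos + eyeRay * depth (a single multiply-add).
inline Vec3 reconstruct(Vec3 eyePos, Vec3 eyeRay, float depth) {
    return eyePos + eyeRay * depth;
}
```

Since depth is the fraction of the way to the far plane along the view direction, scaling the eye-to-far-plane vector by it lands back on the original position.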

-For convex geometry:

In Pixel shader:
float3 eyeRay = getDistanceVectorToPlane( negFarPlaneDotEye, IN.wsPos.xyz / IN.wsPos.w, farPlane );

'negFarPlaneDotEye' is a shader constant which is:

-dot( worldSpaceFarPlane, eyePosition )

'farPlane' is a shader constant which is the world-space far-plane.

This function is from that thread:
inline float3 getDistanceVectorToPlane( in float negFarPlaneDotEye, in float3 direction, in float4 plane )
{
    float denum = dot( plane.xyz, direction.xyz );
    float t = negFarPlaneDotEye / denum;

    return direction.xyz * t;
}

-----

This works well for me. I am sure it can be optimized further.
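For reference, here is a CPU-side C++ sketch of getDistanceVectorToPlane that can be used to confirm the returned vector ends on the far plane. The Vec3/Vec4 types and the planeDotPoint helper are hypothetical, and the sketch assumes the direction passed in is the vector from the eye towards the shaded position.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Plane equation evaluated at a point: dot(plane.xyz, p) + plane.w.
inline float planeDotPoint(Vec4 plane, Vec3 p) {
    return plane.x * p.x + plane.y * p.y + plane.z * p.z + plane.w;
}

// Port of the HLSL function: scales 'direction' so that eye + result
// lies on 'plane'. negFarPlaneDotEye is -planeDotPoint(plane, eye).
inline Vec3 getDistanceVectorToPlane(float negFarPlaneDotEye,
                                     Vec3 direction, Vec4 plane) {
    const float denom = plane.x * direction.x
                      + plane.y * direction.y
                      + plane.z * direction.z;
    assert(denom != 0.0f); // direction must not be parallel to the plane
    const float t = negFarPlaneDotEye / denom;
    return { direction.x * t, direction.y * t, direction.z * t };
}
```

The value of t follows from solving dot(plane.xyz, eye + t * direction) + plane.w = 0, which is why only the constant -dot(plane, eye) and one dot product are needed per pixel.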

You should also mention that the half pixel offset...

2008-09-10T00:34:00.000-07:00

You should also mention that the half-pixel offset is specific to D3D9; D3D10, OpenGL, and consoles do not need it.

Here is the code I have been using for years to do this (seen in my Light Index Deferred Rendering code):

Vertex Shader:
projectSpace = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_Position = projectSpace;
projectSpace.xy = (projectSpace.xy + vec2(projectSpace.w)) * 0.5;

Fragment shader:

vec4 texValue = texture2DProj( TextureID, projectSpace);
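The arithmetic behind the two shader snippets above can be checked on the CPU. In this C++ sketch the type and function names are hypothetical; projectiveDivide stands in for the divide by the last coordinate component that texture2DProj performs on a vec4 coordinate.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Vertex-shader step: xy = (xy + w) * 0.5, leaving z and w untouched.
inline Vec4 toProjectiveTexCoord(Vec4 clip) {
    return { (clip.x + clip.w) * 0.5f, (clip.y + clip.w) * 0.5f, clip.z, clip.w };
}

// Stand-in for the divide done by the projective texture lookup.
inline Vec2 projectiveDivide(Vec4 c) {
    assert(c.w != 0.0f);
    return { c.x / c.w, c.y / c.w };
}

// Reference: normalized device coordinates remapped from [-1, 1] to [0, 1].
inline Vec2 ndcToTexCoord(Vec4 clip) {
    assert(clip.w != 0.0f);
    return { (clip.x / clip.w) * 0.5f + 0.5f, (clip.y / clip.w) * 0.5f + 0.5f };
}
```

Both paths give the same result because ((x + w) * 0.5) / w equals (x / w) * 0.5 + 0.5. Doing the scale and bias before the divide keeps the interpolated value linear, so the per-pixel work is left to the projective lookup.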