Friday, June 13, 2008

Stable Cascaded Shadow Maps

I really like Michal Valient's article "Stable Cascaded Shadow Maps". It is a very practical approach to making Cascaded Shadow Maps more stable.
What I also like about it is the ShaderX idea. I wrote an article in ShaderX5 describing a first implementation (btw., I have rewritten that three times since then), and Michal picks up from there and brings it to the next level.
There will now be a ShaderX7 article in which I describe a slight improvement to Michal's approach. Michal picks the right shadow map with a rather cool trick. Mine is a bit different, but it might be more efficient. To pick the right map, I send down to the shader the sphere that is constructed for each light view frustum. I then check whether the pixel is in the first sphere. If it is, I pick that shadow map; if it isn't, I go to the next sphere. If the pixel is not in any sphere, I early out by returning white.
At first sight it does not look like much of a trick, but if you think about the spheres lined up along the view frustum and the way they intersect, it is actually pretty efficient and fast.
On my target platforms, especially on the one that Michal likes a lot, this makes a difference.
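The sphere test described above can be sketched on the CPU as follows. This is only an illustration in Python, not the shader code from the article: the function name `select_cascade`, the sphere centers and the radii are made-up values, and a real implementation would run per pixel in the pixel shader.

```python
def select_cascade(world_pos, spheres):
    """Return the index of the first sphere containing world_pos.

    spheres is a list of (center, radius) tuples ordered from the
    closest cascade to the furthest. Returns None when the point is
    in no sphere, which corresponds to the early out (returning white).
    """
    for index, (center, radius) in enumerate(spheres):
        # Squared distance avoids a square root, as a shader would.
        dist_sq = sum((p - c) ** 2 for p, c in zip(world_pos, center))
        if dist_sq <= radius * radius:
            return index
    return None


# Hypothetical spheres lined up along the view direction (+Z here).
spheres = [
    ((0.0, 0.0, 5.0), 6.0),
    ((0.0, 0.0, 20.0), 12.0),
    ((0.0, 0.0, 60.0), 32.0),
]

print(select_cascade((0.0, 0.0, 3.0), spheres))
print(select_cascade((0.0, 0.0, 10.0), spheres))
print(select_cascade((0.0, 0.0, 15.0), spheres))
print(select_cascade((0.0, 0.0, 200.0), spheres))
```

Because the spheres overlap, a point that lies in two of them is assigned to the closer cascade, since that sphere is tested first.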

12 comments:

HellRaiZer said...

Hi,

I don't know if this is relevant to cascaded shadow maps, since i haven't had the chance to read the two articles you mentioned, so forgive my ignorance.

For PSSM with a maximum of 4 splits, in order to select the correct shadow map index, i'm doing something like this:

float4 comparison = eyeSpaceZ.xxxx < splitPositions;
float index = 4 - dot(comparison, comparison);

where eyeSpaceZ is the current pixel's Z in camera space and splitPositions holds the camera space distance for each split plane (x = first plane, y = second, etc.)

So if your splitting planes are perpendicular to the camera Z axis (i.e. parallel to the near and far planes), you can get the correct index in 3 asm instructions. I think this can be extended to more than 4 splits, but i haven't tested this case yet.
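To make the arithmetic of the two lines above concrete, here is a small CPU-side emulation in Python. It is only a sketch: the function name `split_index` and the split distances are hypothetical, and it assumes splitPositions stores the far distance of each split in increasing order.

```python
def split_index(eye_space_z, split_positions):
    """Emulate: comparison = z.xxxx < splits; index = 4 - dot(c, c)."""
    # Each component is 1.0 where the pixel is in front of that split plane.
    comparison = [1.0 if eye_space_z < s else 0.0 for s in split_positions]
    # Every component is 0 or 1, so dot(c, c) counts the planes in front of
    # which the pixel lies.
    dot = sum(c * c for c in comparison)
    return 4 - int(dot)


# Hypothetical split plane distances in camera space.
split_positions = [10.0, 30.0, 80.0, 200.0]

for z in (5.0, 20.0, 100.0, 250.0):
    print(z, split_index(z, split_positions))
```

With these values a pixel beyond the last plane yields index 4, so the caller would have to treat that as "no shadow map".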

Sorry if this is completely irrelevant to your method. I'll have to get those two books (or wait for ShaderX 7 for reading the new article :))

HellRaiZer

PS. I would have posted this to the GameDev.net thread, but i thought it would be off-topic, so i'm posting it here.

Wolfgang Engel said...

This looks like the same thing I used in the ShaderX5 article. The disadvantage of my implementation was that the split is always a plane that is parallel to the near and far plane of the camera. Your light view frustum splits are not necessarily parallel. So what you can use instead is a comparison against the top plane of each light view frustum, which works quite well, or you could do the sphere / pixel comparison I mentioned here. With both you do not lose so much precision.

HellRaiZer said...

So, i was missing something after all. :) Thanks for the clarifications.

HellRaiZer

John Hattan said...

Hey Wolfgang, this is John Hattan from gamedev.net. Sorry to pester you in your blog, but the email address on your gamedev articles appears to be dead. We're putting together a gamedev-themed project, and we'd like to use a couple of your articles.

If you'd email me at johnhattan@gamedev.net ASAP I can get you the particulars. Big thanks!

JayH said...

Hi,

This is my first post here :), just want to say I really love the ShaderX series and it has helped me greatly.

With an implementation based on ShaderX6, I used the shadow map texture coordinate (calculated from the world point after light projection) to determine which region the pixel is in. The disadvantage is that you will have to pay the cost of calculating the coordinates first (can be done in the vertex shader, I believe). The advantage is that you do not need to pass in any shader constants. Then one can do either a border check or a radius check against the center of the shadow map (which is always 0.5,0.5).

I thought about your method of passing the minimum enclosing sphere. If I am not mistaken, wouldn't it end up much like testing the coordinate against the center of the map (since the center of the sphere will be the center of the shadow map)? Perhaps I have misunderstood the method.
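A rough sketch of the border check and the radius check mentioned above, written in Python for illustration only. The function names and the sample coordinates are hypothetical, and the per-cascade texture coordinates are assumed to have been computed already.

```python
def inside_border(uv):
    """Border check: the coordinate lies within the [0, 1] x [0, 1] map."""
    u, v = uv
    return 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0


def inside_radius(uv, radius=0.5):
    """Radius check against the shadow map center (0.5, 0.5)."""
    u, v = uv
    return (u - 0.5) ** 2 + (v - 0.5) ** 2 <= radius * radius


def select_by_uv(uvs, test):
    """Return the first cascade whose coordinate passes the test, else None."""
    for index, uv in enumerate(uvs):
        if test(uv):
            return index
    return None


# Hypothetical coordinates of one pixel in three cascades, closest first.
uvs = [(1.4, 0.5), (0.7, 0.5), (0.55, 0.5)]
print(select_by_uv(uvs, inside_border))
print(select_by_uv(uvs, inside_radius))
```

The two checks differ near the corners of a map: a coordinate such as (0.95, 0.95) passes the border check but fails the radius check.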

Thanks,
Jay

JarkkoL said...

I don't know how the approach I use compares to this, but I use a depth buffer & stencil trick with a multi-pass approach for CSM/PSSM. I.e. you start from the closest shadow region and process the pixels within the region, marking them with a stencil bit. Then you process the next region for pixels which don't have the stencil bit set, etc. This relies on early depth/z cull and doesn't require any map selection because you always process only one map at a time, which is also good when you want to optimize memory use (e.g. on consoles).

Wolfgang Engel said...

So you collect the data by rendering for each map into a render target that uses the regular view with depth bounds / early Z culling on?
How do you combine the result of the four maps?
I take the four maps (which are in a texture atlas) and pick the right map based on the world position of the pixel relative to the bounding spheres that surround the light view frusta. World position is calculated from the depth buffer (actually it is part of the directional light calculation).

JarkkoL said...

I flip the Z-test (D3DCMP_GREATER) and, for each shadow region, render a fullscreen quad (in the case of PSSM; a sphere around the camera in the case of CSM) at the far distance of the region. While rendering the region I check with the stencil test whether a stencil bit is set and discard the pixel if so. If the stencil bit isn't set and the Z-test passes, I render the shadow for the pixel and set the stencil bit (i.e. effectively render and mark pixels within the region with the stencil bit).

So my flow is:
1) render shadow map for a region
2) render shadow for the region
3) loop to 1 until all regions are processed
Thus only a single shadow map is needed and there is no branching/map selection in the pixel shader. I have the code on-line (for PSVSM) if you want to check it out.
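The flow above can be simulated on the CPU to see which region each pixel ends up in. This Python sketch is an assumption-laden illustration, not the on-line code mentioned: the function name, the pixel depths and the region far distances are hypothetical, and rendering the shadow map itself is left out.

```python
def multipass_regions(pixel_depths, region_far):
    """Simulate the stencil-marked multi-pass flow, closest region first.

    Returns, per pixel, the index of the region that shadowed it,
    or None if no region covered it.
    """
    stencil = [False] * len(pixel_depths)
    assigned = [None] * len(pixel_depths)
    for region, far in enumerate(region_far):
        # Step 1 (render the shadow map for this region) is omitted here.
        for p, depth in enumerate(pixel_depths):
            if stencil[p]:
                continue  # stencil test: already handled by a closer region
            if far > depth:  # flipped Z-test (GREATER): quad lies behind pixel
                assigned[p] = region  # step 2: render shadow for this pixel
                stencil[p] = True     # mark the pixel as processed
    return assigned


# Hypothetical scene depths and far distances of four regions.
pixel_depths = [2.0, 15.0, 50.0, 500.0]
region_far = [10.0, 30.0, 80.0, 200.0]
print(multipass_regions(pixel_depths, region_far))
```

Each pixel is shaded exactly once, by the closest region whose far distance lies behind it, which is why no map selection is needed in the pixel shader.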

Wolfgang Engel said...

You increase the draw calls and the vertex throughput this way. I tried this in my very first implementation more than 3 years ago. On the hardware platforms I work on, this is not a good solution. I believe I know your source code. You also co-authored a GPU Gems 3 chapter, is that right?

JarkkoL said...

Well, the only difference is that instead of 1 fullscreen quad for PSSM there are 4 (for 4 shadow regions that is), so it's barely an issue. I think this is vastly compensated by the other benefits though.

I haven't co-authored in GPU Gems to my knowledge at least (:

Wolfgang Engel said...

<<<
I haven't co-authored in GPU Gems to my knowledge at least (:
<<<
Oh I thought you wrote a sample implementation of Cascaded Shadow Maps ... I probably mixed your name up.

JarkkoL said...

I gave this multi-pass approach some extra thought and you should be able to render shadows without swapping the depth test direction by rendering shadow regions from furthest to closest instead. I recall reading that swapping the test might invalidate high-z, though I don't know if it applies if you have depth writes disabled.