Shadow Filtering



Happy Friday! I am taking a break from global illumination to take care of various remaining odds and ends in Ultra Engine.

Variance shadow maps (VSMs) are a shadow map filtering technique that stores the depth and the squared depth at each pixel, then uses the resulting mean and variance to estimate how much of the filtered area is in shadow. GPU Gems 3 has a nice chapter on the technique. The end result is softer shadows that run faster. I was wondering where my variance shadow map code went, until I realized this is something I only prototyped in OpenGL and never implemented in Vulkan until now. Here's my first pass at variance shadow maps in Vulkan:


There are a few small issues, but they are no problem to work out. The blurring takes place before the scene render, in the shadow map itself, which is a floating point color texture instead of a depth texture. (This makes VSMs faster than normal shadow maps.) The seams you see on the edges in the shot above are caused by that blurring, but there's a way to fix that. If we store one sharp and one blurred image in the variance shadow map, we can interpolate between them based on distance from the shadow caster. Not only does this get rid of the ugly artifacts (say goodbye to shadow acne forever), but it also creates a realistic penumbra, as you can see in the shot of my original OpenGL implementation. Close to the shadow caster, the shadow is well-defined and sharp, but it gets much blurrier the farther it falls from the object:


Instead of blurring the near image after rendering, we can use MSAA to give it a fine-but-smooth edge. There is no such thing as an MSAA depth shadow sampler in GLSL, although I think there should be, and I have lobbied on behalf of this idea.


Finally, in my Vulkan implementation I used a compute shader instead of a fragment shader to perform the blurring. The advantage is that a compute shader can gather a bunch of samples and store them in memory, then access them to process a group of pixels at once. Instead of reading 9x9 pixels for each fragment, it can read a block of pixels and process them all together, performing the same number of image writes but far fewer reads:

// Read all required pixel samples for this workgroup's tile
// (x, y, coord, color, and samples are assumed to be declared earlier in the shader)
x = int(gl_WorkGroupID.x) * BATCHSIZE;
y = int(gl_WorkGroupID.y) * BATCHSIZE;
for (coord.x = max(x - EXTENTS, 0); coord.x < min(x + BATCHSIZE + EXTENTS, outsize.x); ++coord.x)
{
    for (coord.y = max(y - EXTENTS, 0); coord.y < min(y + BATCHSIZE + EXTENTS, outsize.y); ++coord.y)
    {
        // Offset by EXTENTS so the tile's border texels land at non-negative indices
        color = imageLoad(imagearrayCube[inputimage], coord);
        samples[coord.x - x + EXTENTS][coord.y - y + EXTENTS] = color;
    }
}
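
To put a rough number on the savings, here is a small CPU-side sketch in C++ of the same load-once-then-process idea. It is a stand-in, not the shader itself: the `Image` struct, the `blurTile` function, the tile size of 16, the unweighted box filter, and the clamp-to-edge handling are all assumptions chosen for the example, and the read counter simply tallies what would be `imageLoad()` calls.

```cpp
#include <algorithm>
#include <vector>

constexpr int BATCHSIZE = 16;                    // assumed tile size per workgroup
constexpr int EXTENTS   = 4;                     // blur radius: 9 taps per axis
constexpr int SIDE      = BATCHSIZE + 2 * EXTENTS;
constexpr int TAPS      = 2 * EXTENTS + 1;

// Stand-in for a storage image; load() counts simulated imageLoad() calls.
struct Image {
    int width, height;
    std::vector<float> texels;
    long long reads = 0;
    Image(int w, int h, float fill) : width(w), height(h), texels(w * h, fill) {}
    float load(int x, int y) { ++reads; return texels[y * width + x]; }
};

// Blur one BATCHSIZE x BATCHSIZE tile: read the tile plus its border once,
// then compute every output pixel from the local copy.
void blurTile(Image& src, std::vector<float>& dst, int tileX, int tileY) {
    float samples[SIDE][SIDE];
    const int x0 = tileX * BATCHSIZE, y0 = tileY * BATCHSIZE;
    for (int sy = 0; sy < SIDE; ++sy) {
        for (int sx = 0; sx < SIDE; ++sx) {
            // Clamp to the image edge (a simplification made by this sketch).
            int cx = std::min(std::max(x0 + sx - EXTENTS, 0), src.width - 1);
            int cy = std::min(std::max(y0 + sy - EXTENTS, 0), src.height - 1);
            samples[sy][sx] = src.load(cx, cy);
        }
    }
    for (int py = 0; py < BATCHSIZE; ++py) {
        for (int px = 0; px < BATCHSIZE; ++px) {
            const int ox = x0 + px, oy = y0 + py;
            if (ox >= src.width || oy >= src.height) continue;
            float sum = 0.0f;
            for (int ky = 0; ky < TAPS; ++ky)
                for (int kx = 0; kx < TAPS; ++kx)
                    sum += samples[py + ky][px + kx];
            dst[oy * src.width + ox] = sum / float(TAPS * TAPS);
        }
    }
}
```

With these assumed sizes, one tile costs 24 x 24 = 576 loads, while a per-pixel 9x9 gather over the same 16x16 tile would issue 16 x 16 x 81 = 20,736. How much of that difference shows up on real hardware depends on how well the texture cache already absorbs the redundant reads.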

This same technique will be used to make post-processing effects faster. I previously thought the speed of those would be pretty much the same in every engine, but now I see ways they can be improved significantly for a general-use speed increase. @klepto2 has been talking about the benefits of compute shaders for a while, and he is right, they are very cool. Most performance-intensive post-processing effects perform some kind of gather operation, so compute shaders can make a big improvement there.

One issue with conventional VSMs is that all objects must cast a shadow. Otherwise, an object that appears in front of a shadow caster will be dark. However, I added some math of my own to fix this problem, and it appears to work with no issues. So that's not something we need to worry about.

All around, variance shadow maps are a big win. They run faster, look better, and eliminate shadow acne, so they basically kill three birds with one stone.



Recommended Comments

The array might be unnecessary if the texture cache is already providing the same functionality. I am still testing this, and I am not getting good performance with imageStore() in a compute shader right now.


It looks like I overlooked the performance cost of blurring the shadow image. The jury is still out on whether the compute shader with imageStore() is faster than a fragment shader, but I think VSMs are going to be the high-quality option. Regular depth shadow maps have almost no performance cost on low-end hardware but don't look as nice.

The final scene render for VSMs is in fact faster, so if the shadow is static the VSM becomes faster, but redrawing a variance shadow map incurs a pretty significant cost, on low-end hardware at least.


I will get a better measure of performance if I test compute shaders with post-processing effects. It's a little difficult to get a good reading with this because you have several things going on.
