Finished Direct Lighting Step



While seeking a way to increase performance of octree ray traversal, I came across a lot of references to this paper:


Funnily enough, the first page of the paper perfectly describes my first two attempted algorithms. I started with a nearest neighbor approach and then implemented a top-down recursive design:


Bottom-Up Methods: Traversing starts at the first terminal node intersected by the ray. A process called neighbour finding is used to obtain the next terminal node from the current one [Glass84, Samet89, Samet90].

Top-Down Methods: These methods start from the root voxel (that is, from the one covering all others). Then a recursive procedure is used. From the current node, its direct descendants hit by the ray are obtained, and the process is (recursively) repeated for each of them, until terminal voxels are reached [Agate91, Cohen93, Endl94, Janse85, Garga93].

GLSL doesn't support recursive function calls, so I had to create a function that walks up and down the octree hierarchy without calling itself. This was an interesting challenge. You basically have to use a while loop and store your variables at each level in an array. Use a level integer to indicate the current level you are working at, and everything works out fine.

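// Iterative depth-first traversal. n[level] holds the next child to test at each level.
while (true)
{
    childnum = n[level];
    n[level]++; // advance so this child is not tested again
    childindex = svotnodes[nodeindex].child[childnum];
    if (childindex != 0)
    {
        // Center and bounds of the child voxel
        pos[level + 1] = pos[level] - qsize;
        pos[level + 1] += coffset[childnum] * hsize;
        bounds.min = pos[level + 1] - qsize;
        bounds.max = bounds.min + hsize;
        if (AABBIntersectsRay2(bounds, p0, dir))
        {
            if (level == maxlevels - 2)
            {
                // The child is a terminal voxel: stop at the first opaque hit
                if (SVOTNodeGetDiffuse(childindex).a > 0.5f) return true;
            }
            else
            {
                // Descend into the child
                parent[level] = nodeindex;
                nodeindex = childindex;
                level++;
                n[level] = 0;
                childnum = 0;
                size *= 0.5f;
                hsize = size * 0.5f;
                qsize = size * 0.25f;
            }
        }
    }
    // Ascend while all eight children at this level have been tested
    while (n[level] == 8)
    {
        level--;
        if (level == -1) return false;
        nodeindex = parent[level];
        childnum = n[level];
        size *= 2.0f;
        hsize = size * 0.5f;
        qsize = size * 0.25f;
    }
}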
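
To try the same idea outside a shader, here is a minimal CPU-side C++ sketch of the non-recursive pattern: per-level arrays for the current node, the next child to visit, and the voxel's min corner stand in for the call stack. The node layout (`Node`, `solid`), the slab test `rayHitsBox`, the child ordering by bit flags, and `makeDemoTree` are all assumptions made for illustration, not the engine's actual data structures.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical node layout: child index 0 means "no child"; node 0 is the root.
struct Node {
    std::array<std::uint32_t, 8> child{};
    bool solid = false; // only meaningful on leaf nodes
};

// Standard slab test: does the ray (origin o, direction d) hit the box?
bool rayHitsBox(Vec3 o, Vec3 d, Vec3 bmin, Vec3 bmax) {
    const float oo[3] = {o.x, o.y, o.z}, dd[3] = {d.x, d.y, d.z};
    const float lo[3] = {bmin.x, bmin.y, bmin.z}, hi[3] = {bmax.x, bmax.y, bmax.z};
    float tmin = 0.0f, tmax = std::numeric_limits<float>::max();
    for (int i = 0; i < 3; ++i) {
        if (dd[i] == 0.0f) { // ray parallel to this slab
            if (oo[i] < lo[i] || oo[i] > hi[i]) return false;
            continue;
        }
        float t0 = (lo[i] - oo[i]) / dd[i];
        float t1 = (hi[i] - oo[i]) / dd[i];
        if (t0 > t1) std::swap(t0, t1);
        tmin = std::max(tmin, t0);
        tmax = std::min(tmax, t1);
        if (tmin > tmax) return false;
    }
    return true;
}

constexpr int kMaxLevels = 16;

// Returns true if the ray hits any solid leaf. The root is level 0 and the
// leaves are level levels-1; requires 2 <= levels <= kMaxLevels.
bool rayHitsOctree(const std::vector<Node>& nodes, int levels,
                   Vec3 rootMin, float rootSize, Vec3 o, Vec3 d) {
    std::uint32_t node[kMaxLevels]; // node index at each level
    int next[kMaxLevels];           // next child to visit at each level
    Vec3 mn[kMaxLevels];            // min corner of the voxel at each level
    int level = 0;
    float size = rootSize;
    node[0] = 0; next[0] = 0; mn[0] = rootMin;
    while (true) {
        // Ascend while every child at this level has been visited.
        while (next[level] == 8) {
            if (level == 0) return false;
            --level;
            size *= 2.0f;
        }
        const int c = next[level]++;
        const std::uint32_t ci = nodes[node[level]].child[c];
        if (ci == 0) continue;
        const float h = size * 0.5f;
        // Bits 0, 1, 2 of the child number select the upper half in x, y, z.
        const Vec3 cmin = { mn[level].x + ((c & 1) ? h : 0.0f),
                            mn[level].y + ((c & 2) ? h : 0.0f),
                            mn[level].z + ((c & 4) ? h : 0.0f) };
        const Vec3 cmax = { cmin.x + h, cmin.y + h, cmin.z + h };
        if (!rayHitsBox(o, d, cmin, cmax)) continue;
        if (level + 1 == levels - 1) {
            if (nodes[ci].solid) return true; // terminal voxel
        } else {
            ++level; // descend
            node[level] = ci; next[level] = 0; mn[level] = cmin;
            size = h;
        }
    }
}

// Three-level demo tree: one solid leaf occupying the box [2,3]^3
// inside a root cube of size 4 whose min corner is the origin.
std::vector<Node> makeDemoTree() {
    std::vector<Node> nodes(3);
    nodes[0].child[7] = 1;
    nodes[1].child[0] = 2;
    nodes[2].solid = true;
    return nodes;
}
```

With the demo tree, a ray fired along +z through (2.5, 2.5) hits the solid leaf, while one through (3.5, 3.5) enters the occupied parent voxel, finds nothing, walks back up, and reports a miss.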

I made an attempt to implement the technique described in the paper above, but something was bothering me. The octree traversal was so slow that even if I was able to speed it up four times, it would still be slower than Leadwerks with a shadow map.

I can show you very simply why. If a shadow map is rendered with the triangle below, the GPU has to process just three vertices, but if we used voxel ray tracing, it would require about 90 octree traversals. I think we can assume the post-vertex pipeline triangle rasterization process is effectively free, because it's a fixed function feature GPUs have been doing since the dawn of time:


The train station model uses 4 million voxels in the shot below, but it has only about 40,000 vertices. For voxel direct lighting to be on par with shadow maps, a voxel traversal would have to be about 100 times faster than processing a single vertex. The numbers just don't make sense.


Basically, voxel shadows are limited by the surface area, and shadow maps are limited by the number of vertices. Big flat surfaces that cover a large area use very few vertices but would require many voxels to be processed. So for the direct lighting component, I think shadow maps are still the best approach. I know Crytek is claiming to get better performance with voxels, but my experience indicates otherwise.
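
The scaling argument can be put in rough numbers. This small C++ sketch counts the voxels needed to cover a flat square of a given side length, against the four vertices a quad needs regardless of size. The function name and the sizes used are illustrative assumptions, not measurements from the engine.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Number of voxels needed to cover a flat, axis-aligned square surface.
// The count grows with area: (side / voxelSize)^2.
std::uint64_t voxelsCoveringSquare(double side, double voxelSize) {
    const auto perSide = static_cast<std::uint64_t>(std::ceil(side / voxelSize));
    return perSide * perSide;
}

// A flat quad needs four vertices no matter how large it is.
constexpr std::uint64_t kQuadVertices = 4;
```

A 10-unit floor at a voxel size of 0.125 comes to 6,400 voxels against 4 vertices. Doubling the side quadruples the voxel count and leaves the vertex count unchanged.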

Another aspect of shadow maps I did not fully appreciate before is the fact they give high resolution when an object is near the light source, and low resolution further away. This is pretty close to how real light works, and would be pretty difficult to match with voxels, since their density does not increase closer to the light source.
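
As a rough illustration of that falloff, assume a spot light rendered with a perspective projection. The map then covers a width of 2 * distance * tan(fov / 2) at a given distance, so each texel's world-space footprint grows linearly with distance. The function name and parameters below are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// World-space width of one shadow map texel at a given distance from a
// perspective (spot) light. fovRadians is the full field of view.
double texelWorldSize(double distance, double fovRadians, int resolution) {
    return 2.0 * distance * std::tan(fovRadians * 0.5) / resolution;
}
```

For a 90 degree light and a 1024 map, a texel covers about 2/1024 units at distance 1 and ten times that at distance 10. A voxel grid keeps the same cell size at both distances.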


There are also issues with moving objects, skinned animation, tessellation, alpha discard, and vertex shader effects (waving leaves, etc.). All of these could be tolerated, but I'm sure shadow maps are much faster, so it doesn't make sense to continue on that route.

I feel I have investigated this pretty thoroughly and now I have a strong answer why voxels cannot replace shadow maps for the direct shadows. I also developed a few pieces of technology that will continue to be used going forward, like our own improved mesh voxelization and the sparse octree traversal routine (which will be used for reflections). And part of this forced me to implement Vulkan dynamic rendering, to get rid of render passes and simplify the code.

Voxel GI and reflections are still in the works, and I am farther along than ever now. Direct lighting is being performed on the voxel data, but now I am using the shadow maps to light the voxels. The next step is to downsample the lit voxel texture, then perform a GI pass, downsample again, and perform the second GI pass / light bounce. Because the octree is now sparse, we will be able to use a higher resolution with faster performance than the earlier videos I showed. And I hope to finally be able to show GI with a second bounce.
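
The downsample step can be sketched on the CPU as a 2x2x2 box filter. This is only an illustration under simplifying assumptions: a dense cubic grid of scalar light values stands in for the lit voxel texture, and `VoxelGrid`, `makeGrid`, and `downsample` are hypothetical names. The post does not describe the engine's actual texture format or filter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense cubic grid of scalar light values, n*n*n cells, x-fastest layout.
struct VoxelGrid {
    int n;
    std::vector<float> v;
    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * n + y) * n + x;
    }
    float& at(int x, int y, int z) { return v[index(x, y, z)]; }
    float at(int x, int y, int z) const { return v[index(x, y, z)]; }
};

VoxelGrid makeGrid(int n) {
    return VoxelGrid{n, std::vector<float>(static_cast<std::size_t>(n) * n * n, 0.0f)};
}

// Halve the resolution by averaging each 2x2x2 block. src.n must be even.
VoxelGrid downsample(const VoxelGrid& src) {
    VoxelGrid dst = makeGrid(src.n / 2);
    for (int z = 0; z < dst.n; ++z)
        for (int y = 0; y < dst.n; ++y)
            for (int x = 0; x < dst.n; ++x) {
                float sum = 0.0f;
                for (int i = 0; i < 8; ++i) // the eight source cells of this block
                    sum += src.at(2 * x + (i & 1), 2 * y + ((i >> 1) & 1), 2 * z + ((i >> 2) & 1));
                dst.at(x, y, z) = sum * 0.125f;
            }
    return dst;
}
```

Calling `downsample` twice on a 4x4x4 grid gives the 2x2x2 and 1x1x1 levels, which matches the downsample, GI pass, downsample, second GI pass order described above. The GI passes themselves are not shown.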



Recommended Comments


Since you're going back to shadow maps, are there any ideas for reducing shadow banding? I set the shadow map resolution for the spotlights in Cyclone to 1024 for a cleaner result, but there is still some banding if you look hard enough.

Also, you could only adjust the resolution if you had API access.


There must be a formula to calculate this exactly. The non-linear depth value makes it tricky to figure out. I wish GPUs supported a linear depth buffer.
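
For reference, the non-linearity itself can be written down. Assuming a standard perspective projection with a [0, 1] depth range (the Vulkan convention) and no reversed-Z, a view-space distance z between the near plane n and the far plane f is stored as d = f(z - n) / (z(f - n)). The helper names below are illustrative, and this is not a bias formula.

```cpp
#include <cassert>
#include <cmath>

// Depth buffer value in [0, 1] for a view-space distance z in [n, f].
double depthFromDistance(double z, double n, double f) {
    return f * (z - n) / (z * (f - n));
}

// Inverse mapping: recover the view-space distance from a stored depth value.
double distanceFromDepth(double d, double n, double f) {
    return n * f / (f - d * (f - n));
}
```

With n = 0.1 and f = 100, a surface at z = 0.2 already maps to roughly d = 0.5, so about half of the stored range is spent on the first 0.1 units past the near plane. That is why a constant depth offset behaves so differently near and far from the light.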


I think this is why most other engines support baking lightmaps, which I know you want to avoid. If you can't figure it out, I would say let the end user adjust the resolution of the shadow maps manually. Personally, I didn't see much of a difference above 1024, and I only bumped it if I had to.


I'm getting good results with an experimentally determined equation that considers the resolution of the texture and area it covers. The calculated offset is applied to the fragment position before multiplying it by the light projection matrix:


