  • entries
    941
  • comments
    5,894
  • views
    866,837

About this blog

Learn about game development technology

Entries in this blog

Map Viewer available to subscribers

A map viewer application is now available for beta subscribers. This program will load any Leadwerks map and let you fly around in it, so you can see the performance difference the new renderer makes. I am curious to hear what kind of results you see with it. The program has not been tested on all hardware yet, and functionality is limited.

Josh

Map Loading, Materials, Shaders, and other Details

I have map loading working now. The LoadMap() function has three overloads you can use:

    shared_ptr<Map> LoadMap(shared_ptr<World> world, const std::string filename);
    shared_ptr<Map> LoadMap(shared_ptr<World> world, const std::wstring filename);
    shared_ptr<Map> LoadMap(shared_ptr<World> world, shared_ptr<Stream> stream);

Instead of returning a boolean to indicate success or failure, the LoadMap() function returns a Map object. The Map object gives
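
A minimal sketch of the calling pattern this implies: the caller tests the returned pointer rather than a boolean. The `World` and `Map` structs and the body of `LoadMap` below are stand-ins written only so the example compiles by itself, and the idea that a failed load yields an empty `shared_ptr` is an assumption, not something the excerpt states. `TryLoad` is a hypothetical helper name.

```cpp
#include <memory>
#include <string>

// Stand-in types so the sketch compiles on its own; the real engine
// classes are not shown in the excerpt.
struct World {};
struct Map {};

// Hypothetical stub with the same shape as the std::string overload.
// Assumption: a failed load yields an empty shared_ptr.
std::shared_ptr<Map> LoadMap(std::shared_ptr<World> world, const std::string filename)
{
    if (!world || filename.empty()) return nullptr;
    return std::make_shared<Map>();
}

// Caller-side pattern: test the returned pointer instead of a bool.
bool TryLoad(std::shared_ptr<World> world, const std::string& filename)
{
    auto map = LoadMap(world, filename);
    return map != nullptr;
}
```

Returning the object instead of a flag lets the caller keep the result of the load around for later use, while still getting a simple success check from the pointer itself.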

Josh

Clustered Forward Rendering - Fun with Light Types

By modifying the spotlight cone attenuation equation I created an area light, with shadow. And here is a working box light. The difference here is the box light uses orthographic projection and doesn't have any fading on the edges, since these are only meant to shine into windows. If I scale the box light up and place it up in the sky, it kind of looks like a directional light. And it kind of is, except a directional light would either use 3-4 different box lights set at rad

Josh

Clustered Forward Rendering - Multiple Light Types

I added spotlights to the forward clustered renderer. It's nothing too special, but it does demonstrate multiple light types working within a single pass. I've got all the cluster data and the light index list packed into one texture buffer now. GPU data needs to be aligned to 16 bytes because everything is built around vec4 data. Consequently, some of the code that handles this stuff is really complicated. Here's a sample of some of the code that packs all this data into an array.
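
The code sample from the original post is not included in this excerpt. As a rough illustration of the idea only, here is a sketch of packing light records into an array of 16-byte slots. The `Light` field layout, the two-slot-per-light scheme, and the names `Vec4`, `PackLight`, and `PackLights` are all assumptions made for this example, not the engine's actual format.

```cpp
#include <vector>

// One 16-byte slot, matching the size of a GLSL vec4.
struct Vec4 { float x, y, z, w; };
static_assert(sizeof(Vec4) == 16, "Vec4 must be exactly 16 bytes");

// Hypothetical light record; the field layout is an assumption.
struct Light {
    float position[3];
    float range;
    float color[3];
    float coneAngle; // unused for point lights
};

// Append one light as two vec4 slots, so every record starts on a
// 16-byte boundary within the array.
void PackLight(const Light& light, std::vector<Vec4>& out)
{
    out.push_back({ light.position[0], light.position[1], light.position[2], light.range });
    out.push_back({ light.color[0], light.color[1], light.color[2], light.coneAngle });
}

// Pack a whole list of lights into one contiguous array for upload.
std::vector<Vec4> PackLights(const std::vector<Light>& lights)
{
    std::vector<Vec4> out;
    out.reserve(lights.size() * 2);
    for (const Light& light : lights) PackLight(light, out);
    return out;
}
```

Filling the fourth component of each slot with a scalar (range, cone angle) avoids wasted padding, which is one reason this kind of packing code tends to get intricate.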

Josh

Multiple Shadows

Texture arrays are a feature that allows you to pack multiple textures into a single one, as long as they all use the same format and size. In reality, this is just a convenience feature that packs all the textures into a single 3D texture. It allows things like cubemap lookups with a 3D texture, but the implementation is sort of inconsistent. In reality it would be much better if we were just given 1000 texture units to use. However, these can be used to pack all scene shadow maps into a single

Josh

Multisampled Shadowmaps

Because variance shadow maps allow us to store pre-blurred shadow maps, they also allow us to take advantage of multisampled textures. MSAA is a technique that renders extra pixels around the target pixel and averages the results. This can help bring out fine lines that are smaller than a pixel onscreen, and it also greatly reduces jagged edges. I wanted to see how well this would work for rendering shadow maps, and to see if I could reduce the ragged edge appearance that shadow maps are sometimes pr

Josh

Realistic Penumbras

Shadows with a constant softness along their edges have always bugged me. Real shadows look like this. Notice the shadow becomes softer the further away it gets from the door frame. Here is a mockup of roughly what that shadow looks like with a constant softness around it. It looks so fake! How does this effect happen? There's not really any such thing as a light that emits entirely from a single point. The closest thing would be a very small bulb, but that still has volume. Bec

Josh

Variance Shadow Maps

After a couple days of work I got point light shadows working in the new clustered forward renderer. This time around I wanted to see if I could get a more natural look for shadow edges, as well as reduce or eliminate shadow acne. Shadow acne is an effect that occurs when the resolution of the shadow map is too low, and incorrect depth comparisons start being made with the lit pixels: By default, any shadow mapping algorithm will look like this, because not every pixel onscreen has an exact matc
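
For readers unfamiliar with the technique named in the title, here is a sketch of the commonly published variance shadow map test, which replaces the hard depth comparison with a bound from Chebyshev's one-sided inequality. This is the textbook formulation and not necessarily what the post's renderer does; the function name and the `minVariance` clamp value are assumptions for illustration.

```cpp
#include <algorithm>

// Upper bound on the fraction of light reaching a receiver at depth
// `receiverDepth`, given the filtered moments stored in the shadow map:
// mean = E[d] and meanSq = E[d^2].
float ChebyshevUpperBound(float mean, float meanSq, float receiverDepth,
                          float minVariance = 1e-5f)
{
    // Receiver is at or in front of the average occluder depth: fully lit.
    if (receiverDepth <= mean) return 1.0f;

    // Variance of the depth distribution, clamped to avoid division by zero.
    float variance = std::max(meanSq - mean * mean, minVariance);

    // One-sided Chebyshev bound: variance / (variance + distance^2).
    float d = receiverDepth - mean;
    return variance / (variance + d * d);
}
```

Because the result is a smooth value between 0 and 1 rather than a pass/fail comparison, the stored moments can be blurred or filtered like ordinary texture data.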

Josh

Clustered Forward Rendering Victory

I got the remaining glitches worked out, and the deal is that clustered forward rendering works great. It has more flexibility than deferred rendering and it performs a lot faster. This means we can use a better materials and lighting system, and at the same time have faster performance, which is great for VR especially. The video below shows a scene with 50 lights working with fast forward rendering. One of the last things I added was switching from a fixed grid size of 16x16x16 t

Josh

Clustered Forward Rendering Progress

In order to get the camera frustum space divided up correctly, I first implemented a tiled forward renderer, which just divides the screen up into a 2D grid. After working out the math with this, I was then able to add the third dimension and make an actual volumetric data structure to hold the lighting information. It took a lot of trial and error, but I finally got it working. This screenshot shows the way the camera frustum is divided up into a cubic grid of 16x16x16 cells. Red an
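
The step from a 2D tile grid to a 3D cell grid can be sketched as an index calculation. The example below maps a normalized screen position and a view-space depth to a flat cell index in a 16x16x16 grid. The linear spacing of depth slices is an assumption made to keep the sketch short (many implementations use exponential slices), and `ClusterIndex` and `kGrid` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

constexpr int kGrid = 16; // 16x16x16 cells, as described in the post

// Map a normalized screen position (u, v in [0,1]) and a view-space depth
// to a flat cell index. Assumption: depth slices are spaced linearly
// between the near and far planes.
int ClusterIndex(float u, float v, float depth, float nearZ, float farZ)
{
    assert(farZ > nearZ);

    // Convert a [0,1] coordinate to a cell number, clamped to the grid.
    auto cell = [](float t) {
        int i = static_cast<int>(t * kGrid);
        return std::max(0, std::min(i, kGrid - 1));
    };

    int x = cell(u);
    int y = cell(v);
    int z = cell((depth - nearZ) / (farZ - nearZ));

    // Flatten (x, y, z) so that x varies fastest.
    return (z * kGrid + y) * kGrid + x;
}
```

Dropping the `z` term gives the 2D tiled case, which is why working out the tiled version first is a natural stepping stone.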

Josh

Clustered Forward Rendering - First Performance Metrics

I was able to partially implement clustered forward rendering. At this time, I have not divided the camera frustum up into cells and I am just handing a single point light to the fragment shader, but instead of a naive implementation that would just upload the values in a shader uniform, I am going the route of sending light IDs in a buffer. I first tried texture buffers because they have a large maximum size and I already have a GPUMemBlock class that makes them easy to work with. Becau

Josh

Taking Care of Business

This is about financial stuff, and it's not really your job to care about that, but I still think this is cool and wanted to share it with you. People are buying stuff on our website, and although the level of sales is much lower than Steam it has been growing. Unlike Steam, sales through our website are not dependent on a third party and cannot be endangered by flooded marketplaces, strange decisions, and other random events. Every customer I checked who used a credit card has kept it

Josh

Clustered Forward Rendering

I decided I want the voxel GI system to render direct lighting on the graphics card, so in order to make that happen I need working lights and shadows in the new renderer. Tomorrow I am going to start my implementation of clustered forward rendering to replace the deferred renderer in the next game engine. This works by dividing the camera frustum up into sectors, as shown below. A list of visible lights for each cell is sent to the GPU. If you think about it, this is really another v

Josh

Voxel Cone Tracing Part 5 - Hardware Acceleration

I was having trouble with cone tracing and decided to first try a basic GI algorithm based on a pattern of raycasts. Here is the result: You can see this is pretty noisy, even with 25 raycasts per voxel. Cone tracing uses an average sample, which eliminates the noise problem, but it does introduce more inaccuracy into the lighting. Next I wanted to try a more complex scene and get an estimate of performance. You may recognize the voxelized scene below as the "Sponza" scene freque

Josh

Voxel Cone Tracing Part 4 - Direct Lighting

Now that we can voxelize models, enter them into a scene voxel tree structure, and perform raycasts, we can finally start calculating direct lighting. I implemented support for directional and point lights, and I will come back and add spotlights later. Here we see a shadow cast from a single directional light: And here are two point lights, one red and one green. Notice the distance falloff creates a color gradient across the floor: The idea here is to first calculate direc

Josh

Voxel Cone Tracing Part 3 - Raycasting

I added a raycast function to the voxel tree class and now I can perform raycasts between any two positions. This is perfect for calculating direct lighting. Shadows are calculated by performing a raycast between the voxel position and the light position, as shown in the screenshot below. Fortunately the algorithm seems to work great and there are no gaps or cracks in the shadow: Here is the same scene using a voxel size of 10 centimeters: If we move the light a little lower

Josh

Voxel Cone Tracing Part 2 - Sparse Octree

At this point I have successfully created a sparse octree class and can insert voxelized meshes into it. An octree is a way of subdividing space into eight blocks at each level of the tree: A sparse octree doesn't create the subnodes until they are used. For voxel data, this can save a lot of memory. It was difficult to get the rounding and all the math completely perfect (and it has to be completely perfect!) but now I have a nice voxel tree that can follow the camera around and
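
To make the "sparse" part concrete, here is a small sketch in which child nodes are allocated only along the path to an inserted voxel. It is not the engine's class: the integer-coordinate layout, the power-of-two cube size, and the names `OctreeNode`, `Insert`, and `CountNodes` are assumptions for illustration.

```cpp
#include <array>
#include <cassert>
#include <memory>

struct OctreeNode {
    std::array<std::unique_ptr<OctreeNode>, 8> children; // all empty until needed
    bool filled = false;
};

// Insert a voxel at integer coordinates (x, y, z) inside a cube of side
// `size` (a power of two) whose minimum corner is the origin. Child nodes
// are allocated only along the path to the voxel.
void Insert(OctreeNode& node, int x, int y, int z, int size)
{
    assert(size >= 1 && x >= 0 && y >= 0 && z >= 0 && x < size && y < size && z < size);
    if (size == 1) { node.filled = true; return; }

    // Pick one of the eight octants: one bit per axis.
    int half = size / 2;
    int index = (x >= half ? 1 : 0) | (y >= half ? 2 : 0) | (z >= half ? 4 : 0);

    auto& child = node.children[index];
    if (!child) child.reset(new OctreeNode()); // created on first use

    // Recurse with coordinates made relative to the chosen octant.
    Insert(*child, x % half, y % half, z % half, half);
}

// Count allocated nodes, including the node passed in.
int CountNodes(const OctreeNode& node)
{
    int count = 1;
    for (const auto& child : node.children)
        if (child) count += CountNodes(*child);
    return count;
}
```

For an 8x8x8 volume, a fully populated tree would hold 1 + 8 + 64 + 512 = 585 nodes, while inserting a single voxel into an empty sparse tree allocates only the three nodes below the root.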

Josh

Voxel Cone Tracing

I've begun working on an implementation of voxel cone tracing for global illumination. This technique could potentially offer a way to perform real-time indirect lighting on the entire scene, as well as real-time reflections that don't depend on having the reflected surface onscreen, as screen-space reflection does. I plan to perform the GI calculations all on a background CPU thread, compress the resulting textures using DXTC, and upload them to the GPU as they are completed. This means t

Josh

Three improvements I made to Leadwerks Game Engine 5 today

First, I was experiencing some crashes due to race conditions. These are very very bad, and very hard to track down. The problems were being caused by reuse of thread returned objects. Basically, a thread performs some tasks, returns an object with all the processed data, and then once the parent thread is done with that data it is returned to a pool of objects available for the thread to use. This is pretty complicated, and I found that when I switched to just creating a new return object each

Josh

Threaded Animation

The animation update routine has been moved into its own thread now where it runs in the background as you perform your game logic. We can see in the screenshot below that animation updates for 1025 characters take about 20 milliseconds on average. (Intel graphics below, otherwise it would be 1000 FPS lol.) In Leadwerks 4 this would automatically mean that your max framerate would be 50 FPS, assuming nothing else in the game loop took any time at all. Because of the asynchronous threa

Josh

Three Types of Optimization

In designing the new engine, I have found that there are three distinct types of optimization.

Streamlining

This is refinement. You make small changes and try to gain a small amount of performance. Typically, this is done as a last step before releasing code. The process can be ongoing, but suffers from diminishing returns after a while. When you eliminate unnecessary math based on guaranteed assumptions you are streamlining code. For example, a 4x4 matrix multiplication can skip the calc

Josh

Animation Tweening

Leadwerks 5 uses a different engine architecture with a game loop that runs at either 30 (default) or 60 updates per second. Frames are passed to the rendering thread, which runs at an independent framerate that can be set to 60, 90, or unlimited. This is great for performance but there are some challenges in timing. In order to smooth out the motion of the frames, the results of the last two frames received are interpolated between. Animation is a big challenge for this. There could potentially
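
The interpolation between the last two received frames can be sketched for a single position value as follows. This is a minimal illustration, not the engine's code: plain linear interpolation, the clamping of the blend factor, and the names `Vec3`, `Lerp`, and `TweenPosition` are all assumptions.

```cpp
struct Vec3 { float x, y, z; };

// Linear interpolation between two positions.
Vec3 Lerp(const Vec3& a, const Vec3& b, float t)
{
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Blend the two most recent game-loop frames for display.
// `timeSinceLatest` is the time in seconds since the newer frame arrived;
// `updateInterval` is the game-loop step (1/30 or 1/60 of a second).
Vec3 TweenPosition(const Vec3& previous, const Vec3& latest,
                   float timeSinceLatest, float updateInterval)
{
    float t = timeSinceLatest / updateInterval;
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f; // hold the latest frame if the next one is late
    return Lerp(previous, latest, t);
}
```

With this scheme the renderer always shows a state somewhere between the two most recent updates, trading a small amount of latency for smooth motion at any display rate.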

Josh

First Animation Metrics

I got skinned animation working in the new renderer, after a few failed attempts that looked like something from John Carpenter's The Thing. I set up a timer and updated a single animation on a model 10,000 times. Animation consists of two phases. First, all animations are performed to calculate the local position and quaternion rotation. Second, 4x4 matrices are calculated for the entire hierarchy in global space and copied into an array of floats. To test this, I placed this code inside the ma

Josh

Animation in Leadwerks 5

The design of Leadwerks 4 was meant to be flexible and easy to use. In Leadwerks 5, our foremost design goals are speed and scalability. In practical terms that means that some options are going to go away in order to give you bigger games that run faster. I'm working out the new animation system. There are a few different ways to approach this. In situations like this I find it is best to start by deciding the desired outcome and then figuring out how to achieve that. So what do we want?

Josh

Second Performance Test: nearly 400% faster!

After observing the behavior of the previous test, I rearranged the threading architecture for even more massive performance gains. This build runs at speeds in excess of 400 FPS with 100,000 entities... on Intel integrated graphics! I've had more luck with concurrency in design than parallelism. (Images below are taken from here.) Splitting the octree recursion up into separate threads produced only modest gains. It's difficult to optimize because the sparse octree is unpredicta

Josh
