Blog Entries posted by Josh

  1. Josh
    There are two aspects of GLTF files that slow down loading. First, the vertex data is not stored in the exact same layout and format as our vertex structure. I found the performance difference for this was pretty small, even for large models, so I was willing to let it go. Tangents can take a bit longer to build, but those are usually included in the model if they are needed.
    The second issue is the triangle tree structure which is used for raycasting (the pick commands). I found the difference in debug mode was really significant when using a large (two million poly) mesh. Here are the numbers. The big concern is building the pick tree in debug mode, which takes two minutes using this model.
    GLTF Loading Speed (milliseconds)
    Release Mode
    • Vertices only: 1249
    • Vertices + build tangents: 1276
    • Vertices + build collision: 5563
    Debug Mode
    • Vertices only: 3758
    • Vertices + build tangents: 12453
    • Vertices + build collision: 120,000
    I was able to improve this speed significantly by saving vertex and pick tree data to a cache file after processing.
    Release Mode
    • Cached data: 1233
    Debug Mode
    • Cached data: 2707
    This is sort of a halfway step between using GLTF natively and converting to our faster-loading GMF2 file format.
    I like the idea of keeping all files in easily-accessible formats like DDS and GLTF, but there are some big questions. Where and how should the mesh cache files be stored? A published game will not necessarily have write access for files in its own install folder, so any files written should be in the user Documents or AppData folders. How should the GLTF file be identified? By checksum? By full file path? The full file path will change when the game is installed on another computer. Should the mesh cache files be shipped with the game, or generated the first time it is run? Do we even need to worry about caching when release mode is used? It seems like only debug mode has a significant performance problem, so maybe this should be something that only gets used during development. I need to think about this.
  2. Josh
    A big update is now available for beta subscribers!
    Multipass Rendering
    You can use several cameras to increase the depth range or mix 2D and 3D graphics.
    2D Graphics
    As discussed earlier, 2D graphics are now supported using persistent 2D objects.
    Depth Sorting
    Multiple layers of transparency will now render in correct order, with lighting in each layer. Note that I used an empty script called "null.lua" to prevent the glass surfaces here from being collapsed at load.

    Coroutine Sequences
    Lua coroutine sequences are working now. Lua coroutines can be added to an entity and they will be updated each frame, in order:
    function Start()
        self:AddCoroutine(self.MoveToPoint, 10, 0, 0)
        self:AddCoroutine(self.MoveToPoint, 10, 0, 10)
        self:AddCoroutine(self.MoveToPoint, 0, 0, 10)
    end
    Key / Action Bindings
    Key bindings are now working for Lua scripts or C++ actors. You supply a default key code and an action name. The action name is not implemented yet, but later it will allow the player to remap their keys.
    Localization
    Along with text rendering and Unicode support, you can now load language definition sets to automatically replace your text when you call CreateText.
    auto lang = LoadLanguage("Config/Localization/German.json");
    SetLanguage(lang);
    Language files are just a bunch of JSON key pairs:
    { "Hello": "Guten Tag", "How are you?": "Wie gehts?", "I must have an apple.": "Ich muss einen Apfel haben." } Vertical fonts are not currently supported.
    Notification Boxes and File Dialogs
    In anticipation of our new editor, these have been added.
  3. Josh
    Previously, we saw how the new renderer can combine multiple cameras and even multiple worlds in a single render to combine 3D and 2D graphics. During the process of implementing Z-sorting for multiple layers of transparency, I found that Vulkan does in fact respect rasterization order. That is, objects are in fact drawn in the same order you provide draw calls to a command buffer.
    Furthermore, individual primitives (polygons) are also rendered in the order they are stored in the index buffer:
    Now if you were making a 2D game with 1000 zombie sprites onscreen, you would undoubtedly want to use 2D-in-3D rendering with an orthographic camera. Batching and depth discard would give you much faster performance as the number of objects goes up. However, the 2D aspect of most games is relatively simple, with only a dozen or so 2D sprites making up the user interface. Given that 2D graphics are not normally going to be much of a bottleneck, and that the biggest performance savings we have achieved was in making text a static object, I decided to rework the 2D rendering system into something that was a little simpler to use.
    Sprites are no longer a 3D entity, but are a new type of pure 2D object. They act in a similar way as entities with position, rotation, and scale commands, but they only use 2D coordinates:
    //Create a sprite
    auto sprite = CreateSprite(world, 100, 100);

    //Make it blue
    sprite->SetColor(0, 0, 1);

    //Position in upper-left corner of screen
    sprite->SetPosition(10, 10);
    Sprites have a handle you can set. By default this is in the upper-left corner of the sprite, but you can change it to recenter them. Sprites can also be rotated around the Z axis:
    //Center the handle
    sprite->SetHandle(0.5, 0.5);

    //Rotate around the center
    sprite->SetRotation(45);
    SVG vector images are great for 2D drawing and GUIs because they can scale for different display resolutions. We support these as well, with an optional scale value the image can be rasterized at.
    auto sprite = LoadSprite(world, "tiger.svg", 0, 2.0);
    Text is now just another type of sprite:
    auto text = CreateSprite(world, font, L"Hello, how are you today?\nI am fine.", 72, TEXT_LEFT);
    These sprites are all displayed within the same world as the 3D rendering, so unlike what I previously wrote about...
    You do not have to create extra cameras or worlds just to draw 2D graphics. (If you are doing something advanced then the multi-camera method I previously described is a good option, but you have to have very demanding needs for it to make a difference.) The regular old screen coordinates you are used to apply here (coordinate [0,0] is the top-left corner). By default, sprites are drawn in the order they are created. However, I definitely see a need for additional control here and I am open to ideas. Should there be a sprite order value, a MoveToFront() method, or a system of different layers? I'm not sure yet.
    I'm also not sure how per-camera sprites will be controlled. At this time sprites are stored in a per-world list, but we will want some 2D elements to only appear on some cameras.
    I am going to try to get an update out soon with these features so you can try them out yourself.
  4. Josh
    Previously I described how multiple cameras can be combined in the new renderer to create an unlimited depth buffer. That discussion led into multi-world rendering and 2D drawing. Surprisingly, there is a lot of overlap between these features, and it makes sense to solve all of it at one time.
    Old 2D rendering systems are designed around the idea of storing a hierarchy of state changes. The renderer would crawl through the hierarchy and perform commands as it went along, rendering all 2D elements in the order they should appear. It made sense for the design of the first graphics cards, but this style of rendering is really inefficient on modern graphics hardware. Today's hardware works best with batches of objects, using the depth buffer to handle which object appears on top. We don't sort 3D objects back-to-front because it would be monstrously inefficient, so why should 2D graphics be any different?
    We can get much better results if we use the same fast rendering techniques we use for 3D graphics and apply it to 2D shapes. After all, the only difference between 3D and 2D rendering is the shape of the camera projection matrix. For this reason, Turbo Engine will use 2D-in-3D rendering for all 2D drawing. You can render a pure 2D scene by setting the camera projection mode to orthographic, or you can create a second orthographic camera and render it on top of your 3D scene. This has two big implications:
    • Performance will be incredibly fast. I predict 100,000 uniquely textured sprites will render pretty much instantaneously. In fact, anyone making a 2D PC game who is having trouble with performance will be interested in using Turbo Engine.
    • Advanced 3D effects will be possible that we aren't used to seeing in 2D. For example, lighting works with 2D rendering with no problems, as you can see below. Mixing of 3D and 2D elements will be possible to make inventory systems and other UI items. Particles and other objects can be incorporated into the 2D display.
    The big difference you will need to adjust to is there are no 2D drawing commands. Instead you have persistent objects that use the same system as the 3D rendering.
    Sprites
    The primary 2D element you will work with is the Sprite entity, which works the same as the 3D sprites in Leadwerks 4. Instead of drawing rectangles in the order you want them to appear, you will use the Z position of each entity and let the depth buffer take care of the rest, just like we do with 3D rendering. I am also adding support for animation frames and other features, and these can be used with 2D or 3D rendering.

    Rotation and scaling of sprites is of course trivial. You could even use effects like distance fog! Add a vector joint to each entity to lock the Z axis in the same direction and Newton will transform into a nice 2D physics system.
    Camera Setup
    By default, with a zoom value of 1.0 an orthographic camera maps so that one meter in the world equals one screen pixel. We can position the camera so that world coordinates match screen coordinates, as shown in the image below.
    auto camera = CreateCamera(world);
    camera->SetProjectionMode(PROJECTION_ORTHOGRAPHIC);
    camera->SetRange(-1, 1);
    iVec2 screensize = framebuffer->GetSize();
    camera->SetPosition(screensize.x * 0.5, -screensize.y * 0.5);
    Note that unlike screen coordinates in Leadwerks 4, world coordinates point up in the positive direction.

    We can create a sprite and reset its center point to the upper left hand corner of the square like so:
    auto sprite = CreateSprite(world);
    sprite->mesh->Translate(0.5, -0.5, 0);
    sprite->mesh->Finalize();
    sprite->UpdateBounds();
    And then we can position the sprite in the upper left-hand corner of the screen and scale it:
    sprite->SetColor(1, 0, 0);
    sprite->SetScale(200, 50);
    sprite->SetPosition(10, -10, 0);
    This would result in an image something like this, with precise alignment of screen pixels:

    Here's an idea: Remember the opening sequence in Super Metroid on SNES, when the entire world starts tilting back and forth? You could easily do that just by rotating the camera a bit.
    Displaying Text
    Instead of drawing text with a command, you will create a text model. This is a series of rectangles of the correct size with their texture coordinates set to display a letter for each rectangle. You can include a line return character in the text, and it will create a block of multiple lines of text in one object. (I may add support for text made out of polygons at a later time, but it's not a priority right now.)
    shared_ptr<Model> CreateText(shared_ptr<World> world, shared_ptr<Font> font, const std::wstring& text, const int size)
    The resulting model will have a material with the rasterized text applied to it, shown below with alpha blending disabled so you can see the mesh background. Texture coordinates are used to select each letter, so the font only has to be rasterized once for each size it is used at:

    Every piece of text you display needs to have a model created for it. If you are displaying the framerate or something else that changes frequently, then it makes sense to create a cache of models you use so your game isn't constantly creating new objects. If you wanted, you could modify the vertex colors of a text model to highlight a single word.
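    Here is a minimal sketch of such a cache, keyed by the string alone for brevity (a real cache would include the font and size in the key); the names here are my own:
    #include <map>
    #include <string>

    std::map<std::wstring, shared_ptr<Model>> textcache;

    shared_ptr<Model> GetCachedText(shared_ptr<World> world, shared_ptr<Font> font, const std::wstring& text, const int size)
    {
        //Return the existing model if this string was already created
        auto it = textcache.find(text);
        if (it != textcache.end()) return it->second;

        //Otherwise create a new text model and remember it
        auto model = CreateText(world, font, text, size);
        textcache[text] = model;
        return model;
    }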

    And of course all kinds of spatial transformations are easily achieved.

    Because the text is just a single textured mesh, it will render very fast. This is a big improvement over the DrawText() command in Leadwerks 4, which performs one draw call for each character.
    The font loading command no longer accepts a size. You load the font once and a new image will be rasterized for each text size the engine requests internally:
    auto font = LoadFont("arial.ttf");
    auto text = CreateText(foreground, font, "Hello, how are you today?", 18);
    Combining 2D and 3D
    By using two separate worlds we can control which items the 3D camera draws and which items the 2D camera draws. We need to use a second camera so that 2D elements are rendered in a second pass with a fresh new depth buffer. (The foreground camera will be rendered on top of the perspective camera, since it is created after it.)
    //Create main world and camera
    auto world = CreateWorld();
    auto camera = CreateCamera(world);
    auto scene = LoadScene(world, "start.map");

    //Create world for 2D rendering
    auto foreground = CreateWorld();
    auto fgcam = CreateCamera(foreground);
    fgcam->SetProjection(PROJECTION_ORTHOGRAPHIC);
    fgcam->SetClearMode(CLEAR_DEPTH);
    fgcam->SetRange(-1, 1);
    auto UI = LoadScene(foreground, "UI.map");

    //Combine rendering
    world->Combine(foreground);

    while (true)
    {
        world->Update();
        world->Render(framebuffer);
    }
    Overall, this will take more work to set up and get started with than the simple 2D drawing in Leadwerks 4, but the performance and additional control you get are well worth it. This whole approach makes so much sense to me, and I think it will lead to some really cool possibilities.
    As I have explained elsewhere, performance has replaced ease of use as my primary design goal. I like the results I get with this approach because I feel the design decisions are less subjective.
  5. Josh
    Current generation graphics hardware only supports up to a 32-bit floating point depth buffer, and that isn't adequate for large-scale rendering because there isn't enough precision to make objects appear in the correct order and prevent z-fighting.

    After trying out a few different approaches I found that the best way to support large-scale rendering is to allow the user to create several cameras. The first camera should have a range of 0.1-1000 meters, the second would use the same near / far ratio and start where the first one left off, with a depth range of 1000-10,000 meters. Because the ratio of near to far ranges is what matters, not the actual distance, the numbers can get very big very fast. A third camera could be added with a range out to 100,000 kilometers!
    The trick is to set the new Camera::SetClearMode() command to make it so only the furthest-range camera clears the color buffer. Additional cameras clear the depth buffer and then render on top of the previous draw. You can use the new Camera::SetOrder() command to ensure that they are drawn in the order you want.
    auto camera1 = CreateCamera(world);
    camera1->SetRange(0.1, 1000);
    camera1->SetClearMode(CLEAR_DEPTH);
    camera1->SetOrder(1);

    auto camera2 = CreateCamera(world);
    camera2->SetRange(1000, 10000);
    camera2->SetClearMode(CLEAR_DEPTH);
    camera2->SetOrder(2);

    auto camera3 = CreateCamera(world);
    camera3->SetRange(10000, 100000000);
    camera3->SetClearMode(CLEAR_COLOR | CLEAR_DEPTH);
    camera3->SetOrder(3);
    Using this technique I was able to render the Earth, sun, and moon to scale. The three objects are actually sized correctly, at the correct distances. You can see that from Earth orbit the sun and moon appear roughly the same size. The sun is much bigger, but also much further away, so this is exactly what we would expect.

    You can also use these features to render several cameras in one pass to show different views. For example, we can create a rear-view mirror easily with a second camera:
    auto mirrorcam = CreateCamera(world);
    mirrorcam->SetParent(maincamera);
    mirrorcam->SetRotation(0, 180, 0);
    mirrorcam->SetClearMode(CLEAR_COLOR | CLEAR_DEPTH);

    //Set the camera viewport to only render to a small rectangle at the top of the screen:
    mirrorcam->SetViewport(framebuffer->GetSize().x / 2 - 200, 10, 400, 50);
    This creates a "picture-in-picture" effect like what is shown in the image below:

    Want to render some 3D HUD elements on top of your scene? This can be done with an orthographic camera:
    auto uicam = CreateCamera(world);
    uicam->SetClearMode(CLEAR_DEPTH);
    uicam->SetProjectionMode(PROJECTION_ORTHOGRAPHIC);
    This will make 3D elements appear on top of your scene without clearing the previous render result. You would probably want to move the UI camera far away from the scene so only your HUD elements appear in the last pass.
  6. Josh
    There are still a lot of things left to do. Now that I have very large-scale rendering working, people want to fill it up with very big terrains. A special system will be required to handle this, which adds another layer to the terrain system. Also, I want to resume work on the voxel GI system, as I feel it offers much better results for the cost than ray tracing does. There are a few odds and ends like AI navigation and cascaded shadow maps to finish up.
    I am planning to have the engine more or less finished in the spring, and then begin work on the new editor. Our workflow isn't going to change much. The new editor is just going to be a more refined version of what we already have, although it is a completely new program written from scratch, this time in C++.
    It's kind of overwhelming but I have confidence in the whole direction and strategy of this new product.
  7. Josh
    A new build is available on the beta branch on Steam.
    • Updated to Visual Studio 2019.
    • Updated to the latest versions of the OpenVR and Steamworks SDKs.
    • Fixed object tracking with seated VR mode. Note that the way seated VR works now is a little different: you need to use the VR orientation instead of just positioning the camera (see below).
    • Added VR::SetRotation() so you can rotate the world around in VR. The VRPlayer script has rotation added to the left controller. Press the touchpad to turn left and right. Any arbitrary rotation will work, including roll and pitch.
    Here are the offset commands:
    static void VR::SetOffset(const Vec3& position);
    static void VR::SetOffset(const float x, const float y, const float z);
    static void VR::SetOffset(const float x, const float y, const float z, const float pitch, const float yaw, const float roll);
    static void VR::SetOffset(const Vec3& position, const Vec3& rotation);
    static void VR::SetOffset(const Vec3& position, const Quat& rotation);
    static Vec3 VR::GetOffset();
    static Vec3 VR::GetRotation();
    static Quat VR::GetQuaternion();
  8. Josh
    If you were a user of BlitzMax in the past, you will love these convenience commands in Turbo Engine:
    int Notify(const std::wstring& title, const std::wstring& message, shared_ptr<Window> window = nullptr, const int icon = 0)
    int Confirm(const std::wstring& title, const std::wstring& message, shared_ptr<Window> window = nullptr, const int icon = 0)
    int Proceed(const std::wstring& title, const std::wstring& message, shared_ptr<Window> window = nullptr, const int icon = 0)
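    A quick sketch of how these might be used. I am assuming the return value is non-zero when the user confirms, which is how the BlitzMax equivalents behaved:
    Notify(L"Setup", L"Installation complete.");

    //Assumption: a non-zero return value means the user clicked Yes
    if (Confirm(L"Quit", L"Are you sure you want to quit?"))
    {
        //...shut the game down here
    }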
  9. Josh
    A huge update is available for Turbo Engine Beta.
    • Hardware tessellation adds geometric detail to your models and smooths out sharp corners.
    • The new terrain system is live, with fast performance, displacement, and support for up to 255 material layers.
    • Plugins are now working, with sample code for loading MD3 models and VTF textures.
    • Shader families eliminate the need to specify a lot of different shaders in a material file.
    • Support for multiple monitors and better control of DPI scaling.
    Notes:
    • Terrain currently has cracks between LOD stages, as I have not yet decided how I want to handle this.
    • Tessellation has some "shimmering" effects at some resolutions.
    • Terrain may display a wire grid on parts.
    • Directional lights are supported but cast no shadows.
    • Tested on Nvidia and AMD; did not test on Intel.
    Subscribers can get the latest beta in the private forum here.
  10. Josh
    New commands in Turbo Engine will add better support for multiple monitors. The new Display class lets you iterate through all your monitors:
    for (int n = 0; n < CountDisplays(); ++n)
    {
        auto display = GetDisplay(n);
        Print(display->GetPosition()); //monitor XY coordinates
        Print(display->GetSize()); //monitor size
        Print(display->GetScale()); //DPI scaling
    }
    The CreateWindow() function now takes a parameter for the monitor to create the window on / relative to.
    auto display = GetDisplay(0);
    Vec2 scale = display->GetScale();
    auto window = CreateWindow(display, "My Game", 0, 0, 1280.0 * scale.x, 720.0 * scale.y, WINDOW_TITLEBAR | WINDOW_RESIZABLE);
    The WINDOW_CENTER style can be used to center the window on one display.
    You can use GetDisplay(DISPLAY_PRIMARY) to retrieve the main display. This will be the same as GetDisplay(0) on systems with only one monitor.
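    Putting those pieces together, something like this should open a centered, DPI-scaled window on the main monitor (a sketch; combining WINDOW_CENTER with the other styles this way is my assumption):
    //Open a centered window on the primary display, scaled for its DPI setting
    auto display = GetDisplay(DISPLAY_PRIMARY);
    Vec2 scale = display->GetScale();
    auto window = CreateWindow(display, "My Game", 0, 0, 1280.0 * scale.x, 720.0 * scale.y, WINDOW_TITLEBAR | WINDOW_CENTER);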
  11. Josh
    Texture loading plugins allow us to move some big libraries like FreeImage into separate optional plugins, and it also allows you to write plugins to load new texture and image formats.
    To demonstrate the feature, I have added a working VTF plugin to the GMF2 SDK. This plugin will load Valve texture files used in Source Engine games like Half-Life 2 and Portal. Here are the results, showing a texture loaded directly from VTF format and also displayed in Nem’s nifty VTFEdit tool.

    Just like we saw with the MD3 model loader, loading a texture with a plugin is done by first loading the plugin, and then the texture. (Make sure you keep a handle to the plugin or it will be automatically unloaded.)
    auto vtfloader = LoadPlugin("Plugins/VTF.dll");
    auto tex = LoadTexture("Materials/HL2/gordon.vtf");
    This will be included in the next update to the Turbo Game Engine, and at that point there is nothing stopping you from creating your own plugins. I hope someone smart will write a plugin for Crunch .crn files, which could cut the size of your game's distribution files down by half.
    Unlike the GMF2 model format, which provides fast loading of large model files, my internal texture format ("GTF2") has no advantage over other texture formats, and I do not plan to make standalone files in this format. I recommend using DDS files for textures, using BC7 compression for most images and BC5 for normal maps.
  12. Josh
    I wanted to work on something a bit easier before going back into voxel ray tracing, which is another difficult problem. "Something easier" was terrain, and it ended up consuming the entire month of August, but I think you will agree it was worthwhile.
    In Leadwerks Game Engine, I used clipmaps to pre-render the terrain around the camera to a series of cascading textures. You can read about the implementation here:
    This worked very well with the hardware we had available at the time, but it did result in some blurriness in the terrain surface at far distances. We had some really severe hardware restrictions when this was invented, so it was the best solution then. I also did some experiments with tessellation, but a finished version was never released.
    New Terrain System
    Vulkan gives us a lot more freedom to follow our dreams. When designing a new system, I find it useful to come up with a list of attributes I care about, and then look for the engineering solution that best meets those needs.
    Here's what we want:
    • Unlimited number of texture layers
    • Pixel-perfect resolution at any distance
    • Support for tessellation, including physics that match the tessellated surface
    • Fast performance independent of the number of texture layers (more layers should not slow down the renderer)
    Hardware tessellation is easy to make a basic demo for, but it is hard to turn it into a usable feature, so I decided to attack this first. You can read my articles about the implementation below. Once I got the system worked out for models, it was pretty easy to carry that over to terrain.
    So then I turned my attention to the basic terrain system. In the new engine, terrain is a regular old entity. This means you can move it, rotate it, and even flip it upside down to make a cave level. Ever wonder what a rotated terrain looks like?

    Now you know.
    You can create multiple terrains, instead of just having one terrain per world like in Leadwerks. If you just need a little patch of terrain in a mostly indoor scene, you can create one with exactly the dimensions you want and place it wherever you like. And because terrain is running through the exact same rendering path as models, shadows work exactly the same.
    Here is some of the terrain API, which will be documented in the new engine:
    Here is some of the terrain API, which will be documented in the new engine:
    shared_ptr<Terrain> CreateTerrain(shared_ptr<World> world, const int tilesx, const int tilesy, const int patchsize = 32, const int LODLevels = 4)
    shared_ptr<Material> Terrain::GetMaterial(const int x, const int y, const int index = 0)
    float Terrain::GetHeight(const int x, const int y, const bool global = true)
    void Terrain::SetHeight(const int x, const int y, const float height)
    void Terrain::SetSlopeConstraints(const float minimum, const float maximum, const float range, const int layer)
    void Terrain::SetHeightConstraints(const float minimum, const float maximum, const float range, const int layer)
    int Terrain::AddLayer(shared_ptr<Material> material)
    void Terrain::SetMaterial(const int index, const int x, const int y, const float strength = 1.0, const int threadindex = 0)
    Vec3 Terrain::GetNormal(const int x, const int y)
    float Terrain::GetSlope(const int x, const int y)
    void Terrain::UpdateNormals()
    void Terrain::UpdateNormals(const int x, const int y, const int width, const int height)
    float Terrain::GetMaterialStrength(const int x, const int y, const int index)
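    To give a sense of how this fits together, here is a sketch that creates and paints a small terrain using the API above. LoadMaterial and the 128x128 heightfield resolution are assumptions for illustration:
    //Create a 4x4-tile terrain
    auto terrain = CreateTerrain(world, 4, 4);

    //Add a material layer (LoadMaterial is an assumed command)
    auto rock = LoadMaterial("Materials/Terrain/rock.mat");
    int rocklayer = terrain->AddLayer(rock);

    //Shape and paint the terrain with arbitrary example values
    for (int x = 0; x < 128; ++x)
    {
        for (int y = 0; y < 128; ++y)
        {
            terrain->SetHeight(x, y, sinf(x * 0.1f) * cosf(y * 0.1f));
            terrain->SetMaterial(rocklayer, x, y, 1.0);
        }
    }

    //Recalculate normals after editing heights
    terrain->UpdateNormals();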
    What I came up with is flexible: it can be used in three ways.
    1. Create one big terrain split up into segments (like Leadwerks Engine does, except non-square terrains are now supported).
    2. Create small patches of terrain to fit in a specific area.
    3. Create many terrains and tile them to simulate very large areas.
    Updating Normals
    I spent almost a full day trying to calculate terrain normals in local space. When they were scaled up with a non-uniform scale, the PN Quads started to produce waves. I finally realized that normals cannot really be scaled. The scaled vector, even if normalized, is not the correct normal. (The mathematically correct approach is to transform normals by the inverse transpose of the transformation matrix, which is why a non-uniform scale changes them.) I searched for some information on this issue, but the only thing I could find is a few mentions of an article called "Abnormal Normals" by someone named Eric Haines, and it seems the original article has gone down the memory hole. In retrospect it makes sense if I picture the normal vectors rotating instead of shifting along each axis. So the bottom line is that the normals for any surface have to be recalculated if a non-uniform scale is used.
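    For reference, here is a minimal sketch of the standard fix, using GLM types for illustration (the engine's own math classes would differ):
    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_inverse.hpp>

    //Transform a normal correctly under non-uniform scale: use the inverse
    //transpose of the upper-left 3x3 of the model matrix, then renormalize.
    glm::vec3 TransformNormal(const glm::mat4& model, const glm::vec3& normal)
    {
        glm::mat3 normalmatrix = glm::inverseTranspose(glm::mat3(model));
        return glm::normalize(normalmatrix * normal);
    }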
    I'm doing more things on the CPU in this design because the terrain system is more complex, and because it's a lot harder to get Vulkan to do anything. I might move it over to the GPU in the future, but for right now I will stick with the CPU. I used multithreading to improve performance by a lot.
    Physics
    Newton Dynamics provides a way to dynamically calculate triangles for collision. This will be used to calculate a high-res collision mesh on the fly for physics. (For future development.) Something similar could probably be done for the picking system, but that may not be worthwhile.
    Materials
    At first I thought I would implement a system where one terrain vertex just has one material, but it quickly became apparent that this would result in very "square" patterns, and that per-vertex blending between multiple materials would be needed. You can see below the transitions between materials form a blocky pattern.

    So I came up with a more advanced system that gives nice smooth transitions between multiple materials, but is still very fast:

    The new terrain system supports up to 256 different materials per terrain. I've worked out a system that runs fast no matter how many material layers you use, so you don't have to be concerned at all about using too many layers. You will run out of video memory before you run out of options.
    Each layer uses a PBR material with full support for metalness, roughness, and reflections. This allows a wider range of materials, like slick shiny obsidian rocks and reflective ice. When combined with tessellation, it is possible to make snow that actually looks like snow.

    Instancing
    Like any other entity, terrain can be copied or instantiated. If you make an instance of a terrain, it will use the same height, material, normal, and alpha data as the original. When the new editor arrives, I expect that will allow you to modify one terrain and see the results appear on the other instance immediately. A lot of "capture the flag" maps have two identical sides facing each other, so this could be good for that.

    Final Shots
    Loading up "The Zone" with a single displacement map added to one material produced some very nice results.



    The new terrain system is very flexible, looks great, and runs fast. (Tessellation requires a high-end GPU, but can be disabled.) I think this is one of the features that will make people very excited about using the new Turbo Game Engine when it comes out.
  13. Josh
    Multithreading is very useful for processes that can be split into a lot of parallel parts, like image and video processing. I wanted to speed up normal updates for the new terrain system, so I added a new thread creation function that accepts any function as input. This lets me use std::bind with it, the same way I have been using it to send instructions between threads:
    shared_ptr<Thread> CreateThread(std::function<void()> instruction);
    The terrain normal update function has two overloads. One accepts parameters for the exact area to update; if no parameters are supplied, the entire terrain is updated:
    virtual void UpdateNormals(const int x, const int y, const int width, const int height);
    virtual void UpdateNormals();
    This is what the second overloaded function looked like before:
    void Terrain::UpdateNormals()
    {
        UpdateNormals(0, 0, resolution.x, resolution.y);
    }
    And this is what it looks like now:
    void Terrain::UpdateNormals()
    {
        const int MAX_THREADS_X = 4;
        const int MAX_THREADS_Y = 4;
        std::array<shared_ptr<Thread>, MAX_THREADS_X * MAX_THREADS_Y> threads;
        Assert((resolution.x / MAX_THREADS_X) * MAX_THREADS_X == resolution.x);
        Assert((resolution.y / MAX_THREADS_Y) * MAX_THREADS_Y == resolution.y);
        for (int y = 0; y < MAX_THREADS_Y; ++y)
        {
            for (int x = 0; x < MAX_THREADS_X; ++x)
            {
                threads[y * MAX_THREADS_X + x] = CreateThread(std::bind((void(Terrain::*)(int, int, int, int))&Terrain::UpdateNormals, this,
                    x * resolution.x / MAX_THREADS_X,
                    y * resolution.y / MAX_THREADS_Y,
                    resolution.x / MAX_THREADS_X,
                    resolution.y / MAX_THREADS_Y));
            }
        }
        for (auto thread : threads)
        {
            thread->Resume();
        }
        for (auto thread : threads)
        {
            thread->Wait();
        }
    }
    Here are the results, using a 2048x2048 terrain. You can see that multithreading dramatically reduced the update time. Interestingly, four threads run more than four times faster than a single thread. It looks like 16 threads is the sweet spot, at least on this machine, with a 10x improvement in performance.

  14. Josh
    The GMF2 data format gives us fine control to enable fast load times. Vertex and index data are stored in the exact same format the GPU uses, so everything is read straight into video memory. There are some other optimizations I have wanted to make for a long time, and our use of big CAD models makes this a perfect time to implement these improvements.
    Precomputed BSP Trees
    For raycast (pick) operations, a BSP structure needs to be constructed from the mesh data. In Leadwerks Engine this structure is computed after a model is loaded. The "Bober Station" model from The Zone is a 46,000-poly model. Here are the build times for its BSP structure:
    • Release: 500 milliseconds
    • Debug: 6143 milliseconds
    This is not our biggest mesh. Let's assume a two million poly mesh, which is actually something we come across with CAD data. With this size model our load times increase 40x:
    • Release: 20 seconds
    • Debug: 4 minutes
    Those numbers are not good. To reduce load times I have embedded precomputed BSP structures into the GMF2 format so that they can be loaded from the file. I don't have any numbers yet because there is currently no GMF1-to-GMF2 pipeline, but I believe this will significantly reduce load times, especially for large models.
    I am interested in the idea of replacing the BSP float vertex values with shorts to reduce memory, but I'm not going to worry about it right now.
    Static Meshes
    I am also making the engine dump vertex and index data from memory by default when a mesh is sent to the GPU. This will reduce memory usage, since you normally don't need a copy of the vertex data in memory for any reason.
    void Mesh::Finalize(const bool makestatic)
    {
        GameEngine::Get()->renderingthreadmanager->AddInstruction(std::bind(&RenderMesh::Modify, rendermesh, vertices, indices));
        if (makestatic)
        {
            if (collider == nullptr) UpdateCollider();
            vertices.clear();
            vertices.shrink_to_fit();
            indices.clear();
            indices.shrink_to_fit();
        }
    }
    This change reduced our system memory usage in "The Zone" by 40 MB.
  15. Josh
    My work with the MD3 format and its funny short vertex positions made me think about vertex array sizes. (Interestingly, the Quake 2 MD2 format uses a single char for vertex positions!)
    Smaller vertex formats run faster, so it made sense to look into this. Here was our vertex format before, weighing in at 72 bytes:
    struct Vertex
    {
        Vec3 position;
        float displacement;
        Vec3 normal;
        Vec2 texcoords[2];
        Vec4 tangent;
        unsigned char color[4];
        unsigned char boneweights[4];
        unsigned char boneindices[4];
    };
    According to the OpenGL wiki, it is okay to replace the normals with a signed char. And if this works for normals, it will also work for tangents, since they are just another vector.
    I also implemented half floats for the texture coordinates.
    Here is the vertex structure now at a mere 40 bytes, about half the size:
    struct Vertex
    {
        Vec3 position;
        signed char normal[3];
        signed char displacement;
        unsigned short texcoords[4];
        signed char tangent[4];
        unsigned char color[4];
        unsigned char boneweights[4];
        unsigned char boneindices[4];
    };
    Everything works with no visible loss of quality. Half-floats for the position would reduce the size by an additional 6 bytes, but would likely incur two bytes of padding, since the structure would no longer be aligned to four bytes like most GPUs prefer. So it would really only save four bytes, which is not worth it for half the precision.
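    As a sanity check, a compile-time assertion can confirm the structure is tightly packed at the expected size (a small sketch; Vec3 is assumed to be three packed floats):
    //All members have an alignment of 4 bytes or less, so no padding is added:
    //position 0-11, normal 12-14, displacement 15, texcoords 16-23,
    //tangent 24-27, color 28-31, boneweights 32-35, boneindices 36-39.
    static_assert(sizeof(Vertex) == 40, "Vertex structure is not tightly packed");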
    Another interesting thing I might work into the GMF format is Draco mesh compression by Google. This is only for shrinking file sizes, and does not lead to any increased performance. The bulk of your game data is going to be in texture files, so this isn't too important but might be a nice extra.
  16. Josh
    The GMF2 SDK has been updated with tangents and bounds calculation, object colors, and save-to-file functionality.
    The GMF2 SDK is a lightweight engine-agnostic open-source toolkit for loading and saving of game models. The GMF2 format can work as a standalone file format or simply as a data format for import and export plugins. This gives us a protocol we can pull model data into and export model data from, and a format that loads large data much faster than alternative file formats.
    Here is an example showing how to construct a GMF2 model and save it to a file:
    //Create a GMF file
    GMFFile* file = new GMFFile;

    //Create a model
    GMFNode* node = new GMFNode(file, GMF_TYPE_MODEL);

    //Set the orientation
    node->SetMatrix(1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1);

    //Set the object color
    node->SetColor(0,0,1,1);

    //Add one LOD level
    GMFLOD* lod = node->AddLOD();

    //Create a triangle mesh and add it to the LOD. (Meshes can be shared.)
    GMFMesh* mesh = new GMFMesh(file, 3);
    lod->AddMesh(mesh);

    //Add vertices
    mesh->AddVertex(-0.5,0.5,0, 0,0,1, 0,0);
    mesh->AddVertex(0.5,0.5,0, 0,0,1, 1,0);
    mesh->AddVertex(0.5,-0.5,0, 0,0,1, 1,1);
    mesh->AddVertex(-0.5,-0.5,0, 0,0,1, 0,1);

    //Add triangles
    mesh->AddPolygon(0,1,2);
    mesh->AddPolygon(0,2,3);

    //Build vertex tangents (optional)
    mesh->UpdateTangents();

    //Save data to file
    file->Save("out.gmf");

    //Cleanup - this will get rid of all created objects
    delete file;
    You can get the GMF2 SDK on GitHub.
  17. Josh

    Articles
    Not a lot of people know about this, but back in 2001 Discreet (before the company was purchased by Autodesk) released a free version of 3ds max for modding games. Back then game file formats and tools were much more highly specialized than today, so each game required a "game pack" to customize the gmax interface to support that game. I think the idea was to charge the game developer money to add support for their game. Gmax supported several titles including Quake 3 Arena and Microsoft Flight Simulator, but was later discontinued.
    I personally love the program because it's basically a stripped-down version of 3ds max, with only the features you need for hard surface modeling.

    Gmax still survives today, apparently in the custody of Turbosquid. You can download it here. You need the gmax 1.2 installer and the Tempest game pack for Quake 3. You will also need to request a free registration key from Turbosquid. After installing gmax, you can simply find the MD3 export plugin ("md3exp.dle") in the Tempest game pack download and copy it to your "C:\gmax\plugins" directory to enable export. There is also an optional MD3 import script by Chris Cookson, which is uploaded here for safekeeping:
    q3-md3.ms
    With the plugin system in our new engine I was able to add support for loading Quake 3 MD3 models, so you can export your gmax models and load them up in the new engine. However, there are some restrictions. The MD3 file format uses compressed vertex positions, so your vertex positions have a limited range and resolution. Additionally, there are restrictions on what you can do with the gmax program, so take a look at the licensing terms before you do anything. Still, it's a fun program to have and this is a nice feature to play around with.
  18. Josh
    I wanted to get a working example of the plugin system together. One thing led to another, and...well, this is going to be a very different blog.
    GMF2
    The first thing I had to do was figure out a way for plugins to communicate with the main program. I considered using a structure they both share in memory, but those always inevitably get out of sync when either the structure changes or there is a small difference between the compilers used for the program and the DLL. This scared me, so I went with a standard file format to feed the data from the plugin to the main program.
    GLTF is a pretty good format, but it has some problems that made me look for something else to use in our model loader plugins:
    • It's text-based and loads slower.
    • It's extremely complicated. There are 3-4 different versions of the file format and many, many options to split it across multiple files, binary, text, or binary-to-text encoding.
    • It's meant for web content, not PC games.
    • Tons of missing functionality is added through a weird plugin system. For example, DDS is supported through a plugin, but a backup PNG has to be included in case the loader doesn't support the extension.
    The GMF file format was used in Leadwerks. It's a custom fast-loading chunk-based binary file format. GMF2 is a simpler flat binary format updated with some extra features:
    • Vertices are stored in a single array ready to load straight into GPU memory.
    • Vertex displacement is supported.
    • Compressed bitangents.
    • Quad and line meshes are supported.
    • LOD
    • Materials and textures can be packed into the format or loaded from external files.
    • PBR and Blinn-Phong materials
    • Mesh bounding boxes are stored, so they don't have to be calculated at load time. This means the vertex array never has to be iterated through, making large meshes load faster.
    I am not sure yet if GMF2 will be used for actual files or if it is just going to be an internal data format for plugins to use. GLTF will continue to be supported, but the format is too much of a mess to use for plugin data.
    Here's a cool five-minute logo:

    The format looks something like this:
    char[4] ID "GMF2" int version 200 int root //ID of the root entity int texture_count //number of textures int textures_pos //file position for texture array int materials_count //number of materials int materials_pos //file position for materials int nodes_count //number of nodes int nodes_pos //file position for nodes As you can see, it is really easy to read and really easy to write. There's enough complexity in this already. I'm not bored. I don't need to introduce unnecessary complexity into the design just so I can show off. There are real problems that need to be solved and making a "tricky" file format is not one of them.
    In Leadwerks 2, we had a bit of code called the "GMF SDK". This was written in BlitzMax, and allowed construction of a GMF file with easy commands. I've created new C++ code to create GMF2 files:
    //Create the file
    GMFFile* file = new GMFFile;

    //Create a model
    node = new GMFNode(file, GMF_TYPE_MODEL);

    //Add an LOD level
    GMFLOD* lod = node->AddLOD();

    //Add a mesh
    GMFMesh* mesh = lod->AddMesh(3); //triangle mesh

    //Add vertices
    mesh->AddVertex(0,0,0, 0,0,1, 0,0, 0,0, 255,255,255,255);
    mesh->AddVertex(0,0,1, 0,0,1, 0,0, 0,0, 255,255,255,255);
    mesh->AddVertex(0,1,1, 0,0,1, 0,0, 0,0, 255,255,255,255);

    //Add a triangle
    mesh->AddIndice(0);
    mesh->AddIndice(1);
    mesh->AddIndice(2);
    Once your GMFFile is constructed you can save it into memory with one command. The Turbo Plugin SDK is a little more low-level than the engine, so it includes a MemWriter class to help with this, since the engine Stream class is not present.
    As a test I am writing a Quake 3 MD3 import plugin and will provide the project and source as an example of how to use the Turbo Plugin SDK.

    Packages
    The ZIP virtual file system from Leadwerks is being expanded and formalized. You can load a Package object to add new virtual files to your project. These will also be loadable from the editor, so you can add new packages to a project, and the files will appear in the asset browser and file dialogs. (Package files are read-only.) Plugins will allow packages to be loaded from file formats besides ZIP, like Quake WADs or Half-Life GCF files. Notice we keep all our loaded items in variables or arrays because we don't want them to get auto-deleted.
    //Load Quake 3 plugins
    auto pk3reader = LoadPlugin("Plugins/PK3.dll");
    auto md3loader = LoadPlugin("Plugins/MD3.dll");
    auto bsploader = LoadPlugin("Plugins/Q3BSP.dll");
    Next we load the game package files straight from the Quake game directory. This is just like the package system from Leadwerks.
    //Load Quake 3 game packages
    std::wstring q3apath = L"C:/Program Files (x86)/Steam/steamapps/common/Quake 3 Arena/baseq3";
    auto dir = LoadDir(q3apath);
    std::vector<shared_ptr<Package> > q3apackages;
    for (auto file : dir)
    {
        if (Lower(ExtractExt(file)) == L"pk3")
        {
            auto pak = LoadPackage(q3apath + L"/" + file);
            if (pak) q3apackages.push_back(pak);
        }
    }
    Now we can start loading content directly from the game.
    //Load up some game content from Quake!
    auto head = LoadModel(world, "models/players/bitterman/head.md3");
    auto upper = LoadModel(world, "models/players/bitterman/upper.md3");
    auto lower = LoadModel(world, "models/players/bitterman/lower.md3");
    auto scene = LoadScene(world, "maps/q3ctf2.bsp");
    Modding
    I have a soft spot for modding because that is what originally got me into computer programming and game development. I was part of the team that made "Checkered Flag: Gold Cup" which was a spin-off on the wonderful Quake Rally mod:
    I expect in the new editor you will be able to browse through game files just as if they were uncompressed in your project file, so the new editor can act as a modding tool, for those who are so inclined. It's going to be interesting to see what people do with this. We can configure the new editor to run a launch script that handles map compiling and then launches the game. All the pieces are there to make the new editor a tool for modding games, like Discreet's old Gmax experiment.

    I am going to provide official support for Quake 3 because all the file formats are pretty easy to load, and because gmax can export to MD3 and it would be fun to load Gmax models. Other games can be supported by adding plugins.
    So here are some of the things the new plugin system will allow:
    • Load content directly from other games and use it in your own game. I don't recommend using copyrighted game assets for commercial projects, but you could make a "mod" that replaces the engine and requires the player to own the original game. You could probably safely use content from any Valve games and release a commercial game on Steam.
    • Use the editor as a tool to make Quake or other maps.
    • Add plugin support for new file formats.
    This might all be a silly waste of time, but it's a way to get the plugin system working, and if we can make something flexible enough to build Quake maps, well, that is a good test of the robustness of the system.
  19. Josh
    I started to implement quads for tessellation, and at that point the shader system reached the point of being unmanageable. Rendering an object to a shadow map and to a color buffer are two different processes that require two different shaders. Turbo introduces an early Z-pass which can use another shader, and if variance shadow maps are not in use this can be a different shader from the shadow shader. Rendering with tessellation requires another set of shaders, with one different set for each primitive type (isolines, triangles, and quads). And then each one of these needs a masked and opaque option, if alpha discard is enabled.
    All in all, there are currently 48 different shaders a material could use based on what is currently being drawn. This is unmanageable.
    To handle this I am introducing the concept of a "shader family". This is a JSON file that lists all possible permutations of a shader. Instead of setting lots of different shaders in a material, you just set the shader family once:
    shaderFamily: "PBR.json" Or in code:
    material->SetShaderFamily(LoadShaderFamily("PBR.json"));
    The shader family file is a big JSON structure that contains all the different shader modules for each different rendering configuration. Here are the partial contents of my PBR.json file:
    { "turboShaderFamily" : { "OPAQUE": { "default": { "base": { "vertex": "Shaders/PBR.vert.spv", "fragment": "Shaders/PBR.frag.spv" }, "depthPass": { "vertex": "Shaders/Depthpass.vert.spv" }, "shadow": { "vertex": "Shaders/Shadow.vert.spv" } }, "isolines": { "base": { "vertex": "Shaders/PBR_Tess.vert.spv", "tessellationControl": "Shaders/Isolines.tesc.spv", "tessellationEvaluation": "Shaders/Isolines.tese.spv", "fragment": "Shaders/PBR_Tess.frag.spv" }, "shadow": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "Shaders/DepthPass_Isolines.tesc.spv", "tessellationEvaluation": "Shaders/DepthPass_Isolines.tese.spv" }, "depthPass": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "DepthPass_Isolines.tesc.spv", "tessellationEvaluation": "DepthPass_Isolines.tese.spv" } }, "triangles": { "base": { "vertex": "Shaders/PBR_Tess.vert.spv", "tessellationControl": "Shaders/Triangles.tesc.spv", "tessellationEvaluation": "Shaders/Triangles.tese.spv", "fragment": "Shaders/PBR_Tess.frag.spv" }, "shadow": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "Shaders/DepthPass_Triangles.tesc.spv", "tessellationEvaluation": "Shaders/DepthPass_Triangles.tese.spv" }, "depthPass": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "DepthPass_Triangles.tesc.spv", "tessellationEvaluation": "DepthPass_Triangles.tese.spv" } }, "quads": { "base": { "vertex": "Shaders/PBR_Tess.vert.spv", "tessellationControl": "Shaders/Quads.tesc.spv", "tessellationEvaluation": "Shaders/Quads.tese.spv", "fragment": "Shaders/PBR_Tess.frag.spv" }, "shadow": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "Shaders/DepthPass_Quads.tesc.spv", "tessellationEvaluation": "Shaders/DepthPass_Quads.tese.spv" }, "depthPass": { "vertex": "Shaders/DepthPass_Tess.vert.spv", "tessellationControl": "DepthPass_Quads.tesc.spv", "tessellationEvaluation": "DepthPass_Quads.tese.spv" } } } } } A shader family file can indicate a root to inherit values from. The Blinn-Phong shader family pulls settings from the PBR file and just switches some of the fragment shader values.
    { "turboShaderFamily" : { "root": "PBR.json", "OPAQUE": { "default": { "base": { "fragment": "Shaders/Blinn-Phong.frag.spv" } }, "isolines": { "base": { "fragment": "Shaders/Blinn-Phong_Tess.frag.spv" } }, "triangles": { "base": { "fragment": "Shaders/Blinn-Phong_Tess.frag.spv" } }, "quads": { "base": { "fragment": "Shaders/Blinn-Phong_Tess.frag.spv" } } } } } If you want to implement a custom shader, this is more work because you have to define all your changes for each possible shader variation. But once that is done, you have a new shader that will work with all of these different settings, which in the end is easier. I considered making a more complex inheritance / cascading schema but I think eliminating ambiguity is the most important goal in this and that should override any concern about the verbosity of these files. After all, I only plan on providing a couple of these files and you aren't going to need any more unless you are doing a lot of custom shaders. And if you are, this is the best solution for you anyways.
    Consequently, the baseShader, depthShader, etc. values in the material file definition are going away. Leadwerks .mat files will always use the Blinn-Phong shader family, and there is no way to change this without creating a material file in the new JSON material format.
    The shader class is no longer derived from the Asset class because it doesn't correspond to a single file. Instead, it is just a dumb container. A ShaderModule class derived from the Asset class has been added, and this does correspond with a single .spv file. But you, the user, won't really need to deal with any of this.
    The result of this is that one material will work with tessellation enabled or disabled; with quad, triangle, or line meshes; and with animated meshes. I also added an optional parameter in the CreatePlane(), CreateBox(), and CreateQuadSphere() commands that will create these primitives out of quads instead of triangles, as shown in the sketch below. The main reason for supporting quad meshes is that the tessellation is cleaner when quads are used. (Note that Vulkan still displays quads in wireframe mode as if they are triangles. I think the renderer probably converts them to normal triangles after the tessellation stage.)
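    For example, something like this should create a box built from quads (a sketch; the exact name and position of the optional parameter are my assumption):
    //Create a 1x1x1 box out of quads so tessellation subdivides it more cleanly
    auto box = CreateBox(world, 1.0, 1.0, 1.0, 1, 1, true); //last argument: use quads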


    I was also able to implement PN Quads, which is a quad version of the Bezier curve that PN Triangles add to tessellation.



    Basically all the complexity is being packed into the shader family file so that these decisions only have to be made once instead of thousands of times for each different material.
  20. Josh
    With tessellation now fully implemented, I was very curious to see how it would perform when applied to arbitrary models. With tessellation, vertices act like control points for a Bezier mesh that is subdivided dynamically in screen space. Could tessellation be used to add new details to any low-poly model?
    Here is a low-res character model with a pointy head and obvious sharp edges all around his silhouette:


    When tessellation is enabled, the sharp edges go away and the mesh magically appears more detailed:


    Let's take a look at the "Doom Seeker" enemy model from Doom 3. The original model is 640 triangles:


    Tessellation gets rid of the blocky outline.

    And displacement adds new details to the model.


    What about other shapes? Can this technique be used to turn any low-poly model into a high-res version? Here is a tire model from "The Zone" DLC. Even with displacement set to zero on edge vertices, cracks still appear.

    So my conclusion is that tessellation is fantastic for closed organic shapes. In fact, I think it totally replaces separate LOD meshes for rocks, characters, and other organic shapes. However, it is not something you should use everywhere indiscriminately.
  21. Josh
    Previously I talked about the technical details of hardware tessellation and what it took to make it truly useful. In this article I will talk about some of the implications of this feature and the more advanced ramifications of baking tessellation into Turbo Game Engine as a first-class feature.
    Although hardware tessellation has been around for a few years, we don't see it used in games that often. There are two big problems that need to be overcome.
    1. We need a way to prevent cracks from appearing along edges.
    2. We need to display a consistent density of triangles on the screen. Too many polygons is a big problem.
    I think these issues are the reason you don't really see much use of tessellation in games, even today. However, I think my research this week has created new technology that will allow us to make use of tessellation as an every-day feature in our new Vulkan renderer.
    Per-Vertex Displacement Scale
    Because tessellation displaces vertices, any discrepancy in the distance or direction of the displacement, or any difference in the way neighboring polygons are subdivided, will result in cracks appearing in the mesh.

    To prevent unwanted cracks in mesh geometry I added a per-vertex displacement scale value. I packed this value into the w component of the vertex position, which was not being used. When the displacement strength is set to zero along the edges the cracks disappear:

    Segmented Primitives
    With the ability to control displacement on a per-vertex level, I set about implementing more advanced model primitives. The basic idea is to split up faces so that the edge vertices can have their displacement scale set to zero to eliminate cracks. I started with a segmented plane. This is a patch of triangles with a user-defined size and resolution. The outer-most vertices have a displacement value of 0 and the inner vertices have a displacement of 1. When tessellation is applied to the plane the effect fades out as it reaches the edges of the primitive:

    I then used this formula to create a more advanced box primitive. Along the seam where the edges of each face meet, the displacement smoothly fades out to prevent cracks from appearing.

    The same idea was applied to make segmented cylinders and cones, with displacement disabled along the seams.


    Finally, a new QuadSphere primitive was created using the box formula, and then normalizing each vertex position. This warps the vertices into a round shape, creating a sphere without the texture warping that spherical mapping creates.

    It's impressive how much tessellation and displacement improve these simple shapes. Here is the full list of available commands:
    shared_ptr<Model> CreateBox(shared_ptr<World> world, const float width = 1.0);
    shared_ptr<Model> CreateBox(shared_ptr<World> world, const float width, const float height, const float depth, const int xsegs = 1, const int ysegs = 1);
    shared_ptr<Model> CreateSphere(shared_ptr<World> world, const float radius = 0.5, const int segments = 16);
    shared_ptr<Model> CreateCone(shared_ptr<World> world, const float radius = 0.5, const float height = 1.0, const int segments = 16, const int heightsegs = 1, const int capsegs = 1);
    shared_ptr<Model> CreateCylinder(shared_ptr<World> world, const float radius = 0.5, const float height = 1.0, const int sides = 16, const int heightsegs = 1, const int capsegs = 1);
    shared_ptr<Model> CreatePlane(shared_ptr<World> world, const float width = 1, const float height = 1, const int xsegs = 1, const int ysegs = 1);
    shared_ptr<Model> CreateQuadSphere(shared_ptr<World> world, const float radius = 0.5, const int segments = 8);
    Edge Normals
    I experimented a bit with edges and got some interesting results. If you round the corner by setting the vertex normal to point diagonally, a rounded edge appears.

    If you extend the displacement scale beyond 1.0 you can get a harder extended edge.

    This is something I will experiment with more. I think CSG brush smooth groups could be used to make some really nice level geometry.
    Screen-space Tessellation LOD
    I created an LOD calculation formula that attempts to segment polygons into a target size in screen space. This provides a more uniform distribution of tessellated polygons, regardless of the original geometry. Below are two cylinders created with different segmentation settings, with tessellation disabled:

And now here are the same meshes with tessellation applied. Although the less-segmented cylinder has more stretched triangles, both are made up of triangles of roughly the same size.

    Because the calculation works with screen-space coordinates, objects will automatically adjust resolution with distance. Here are two identical cylinders at different distances.

    You can see they have roughly the same distribution of polygons, which is what we want. The same amount of detail will be used to show off displaced edges at any distance.

    We can even set a threshold for the minimum vertex displacement in screen space and use that to eliminate tessellation inside an object and only display extra triangles along the edges.

This allows you to simply set a target polygon size in screen space without adjusting any per-mesh properties. This method could have prevented the problems Crysis 2 had with polygon density. This also solves the problem that prevented me from using tessellation for terrain. The per-mesh tessellation settings I worked on a couple of days ago will be removed, since they are not needed.
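I won't paste the actual shader here, but the core of the calculation can be sketched as follows: project each patch edge into screen space and divide its pixel length by the target polygon size. All names and parameters below are illustrative, not the engine's real interface:

#include <algorithm>
#include <cmath>

// Sketch: compute a tessellation factor for one patch edge from its
// projected length in pixels. The endpoints are assumed to already be
// in normalized device coordinates (after the perspective divide).
float EdgeTessFactor(const float x0, const float y0, const float x1, const float y1,
                     const float screenwidth, const float screenheight,
                     const float targetpolygonsize, const float maxtess)
{
    // Convert the NDC delta to a pixel distance.
    const float dx = (x1 - x0) * 0.5f * screenwidth;
    const float dy = (y1 - y0) * 0.5f * screenheight;
    const float pixels = std::sqrt(dx * dx + dy * dy);
    // Split the edge into segments roughly targetpolygonsize pixels long.
    return std::clamp(pixels / targetpolygonsize, 1.0f, maxtess);
}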
    Parallax Mapping Fallback
    Finally, I added a simple parallax mapping fallback that gets used when tessellation is disabled. This makes an inexpensive option for low-end machines that still conveys displacement.
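The math behind this fallback is just a texture coordinate shift along the tangent-space view direction. A minimal sketch of the idea (the shader version operates on sampled heightmap values; the names here are illustrative):

// Sketch: classic parallax offset. (viewx, viewy, viewz) is the
// normalized tangent-space view direction, height is the sampled
// displacement in the 0..1 range, and scale matches the material's
// displacement scale.
void ParallaxOffset(float& u, float& v,
                    const float viewx, const float viewy, const float viewz,
                    const float height, const float scale)
{
    // Shift the texcoord toward the viewer in proportion to surface height.
    u -= viewx / viewz * height * scale;
    v -= viewy / viewz * height * scale;
}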

    Next I am going to try processing some models that were not designed for tessellation and see if I can use tessellation to add geometric detail to low-poly models without any cracks or artifacts.
  22. Josh
Now that I have all the Vulkan knowledge I need, and most work is being done with GLSL shader code, development is moving faster. Before starting voxel ray tracing, another hard problem, I decided to work on some *relatively* easier things for a few days. I want tessellation to be an everyday feature in the new engine, so I decided to work out a useful implementation of it.
While there are a ton of examples out there showing how to split a triangle up into smaller triangles, useful discussion of practical in-game tessellation techniques is much rarer. I think this is because there are several problems to solve before this feature can really be made practical.
    Tessellation Level
The first problem is deciding how much to tessellate an object. Tessellation should use a single detail level per set of primitives being drawn, because cracks will appear when you apply displacement if you try to use a different tessellation level for each polygon. I solved this with per-mesh settings for the tessellation parameters.
Note: In Leadwerks Game Engine, a model is an entity with one or more surfaces. Each surface has a vertex array, an index array, and a material. In Turbo Game Engine, a model contains one or more LODs, and each LOD can have one or more meshes. A mesh object in Turbo is like a surface object in Leadwerks.
We are not used to per-mesh settings. In fact, the material was previously the only parameter a mesh contained other than vertex and index data. But for tessellation parameters, that is exactly what we need, because the density of the mesh polygons gives us an idea of how detailed the tessellation should be. This command has been added:
void Mesh::SetTessellation(const float detail, const float nearrange, const float farrange)

Here is what the parameters do:
detail: the higher the detail, the more the polygons are split up. Default is 16.
nearrange: the distance below which tessellation stops increasing. Default is 1.0 meters.
farrange: the distance below which tessellation starts increasing. Default is 20.0 meters.

This command gives you the ability to set properties that will produce a roughly equal distribution of polygons in screen space. For convenience, a similar command has been added to the model class, which will apply the settings to all meshes in the model.
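For example, a hypothetical usage with the default values listed above (the model path is illustrative):

auto model = LoadModel(world, "Models/rocks.gltf");
// Full tessellation detail within one meter, fading out to none beyond twenty meters.
model->SetTessellation(16.0f, 1.0f, 20.0f);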
Culling
    Another problem is culling offscreen polygons so the GPU doesn't have to process millions of extra vertices. I solved this by testing if all control vertices lie outside one edge of the camera frustum. (This is not the same as testing if any control point is inside the camera frustum, as I have seen suggested elsewhere. The latter will cause problems because it is still possible for a triangle to be visible even if all its corners are outside the view.) To account for displacement, I also tested the same vertices with maximum displacement applied.
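In code, the test looks something like the sketch below: a patch is culled only when every control point (and every maximally displaced control point) falls outside the same frustum plane. The Plane type and layout here are illustrative:

#include <array>
#include <vector>

struct Plane { float nx, ny, nz, d; }; // plane equation: dot(n, p) + d, normal facing into the frustum

float PlaneDistance(const Plane& plane, const float x, const float y, const float z)
{
    return plane.nx * x + plane.ny * y + plane.nz * z + plane.d;
}

// Sketch: returns true only if all points lie outside a single frustum
// plane. Testing planes one at a time avoids incorrectly culling patches
// whose corners are offscreen but whose interior is still visible.
bool PatchOutsideFrustum(const std::array<Plane, 6>& frustum,
                         const std::vector<std::array<float, 3>>& points)
{
    for (const auto& plane : frustum)
    {
        bool alloutside = true;
        for (const auto& p : points)
        {
            if (PlaneDistance(plane, p[0], p[1], p[2]) >= 0.0f)
            {
                alloutside = false;
                break;
            }
        }
        if (alloutside) return true; // the whole patch is behind this one plane
    }
    return false;
}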
Surface Displacement
To control the amount of displacement, a scale property has been added to the displacementTexture object scheme:
    "displacementTexture": { "file": "./harshbricks-height5-16.png", "scale": 0.05 } A Boolean value called "tessellation" has also been added to the JSON material scheme. This will tell the engine to choose a shader that uses tessellation, so you don't need to explicitly specify a shader file. Shadows will also be rendered with tessellation, unless you explicitly choose a different shadow shader.
    Here is the result:

    Surface displacement takes the object scale into account, so if you scale up an object the displacement will increase accordingly.
    Surface Curvature
    I also added an implementation of the PN Triangles technique. This treats a triangle as control points for a Bezier curve and projects tessellated curved surfaces outwards.
     
You can see below that the PN Triangles technique eliminates the pointy edges of the sphere.


The effect is good, although it is more computationally expensive, and if a strong displacement map is in use you can't really see a difference. Since vertex positions are changed but texture coordinates are still interpolated with the same function, texture coordinates can appear distorted. To counter this, texture coordinates would need to be recalculated from the new vertex positions.
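For reference, here is a sketch of the classic PN-triangle construction from the Vlachos et al. paper, evaluated on the CPU for clarity. The engine's shader version differs in its details:

// Sketch: evaluate a PN-triangle surface point at barycentric
// coordinates (u, v, w), with u + v + w = 1, from the corner positions
// p1..p3 and corner normals n1..n3 of the original flat triangle.
struct Vec3
{
    float x, y, z;
    Vec3 operator+(const Vec3& b) const { return { x + b.x, y + b.y, z + b.z }; }
    Vec3 operator-(const Vec3& b) const { return { x - b.x, y - b.y, z - b.z }; }
    Vec3 operator*(const float s) const { return { x * s, y * s, z * s }; }
};

float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 EvaluatePNTriangle(const Vec3& p1, const Vec3& p2, const Vec3& p3,
                        const Vec3& n1, const Vec3& n2, const Vec3& n3,
                        const float u, const float v, const float w)
{
    // Edge control points: each one-third point is pulled toward the
    // tangent plane of its nearest corner.
    auto ctrl = [](const Vec3& pa, const Vec3& pb, const Vec3& na)
    {
        return (pa * 2.0f + pb - na * Dot(pb - pa, na)) * (1.0f / 3.0f);
    };
    const Vec3 b210 = ctrl(p1, p2, n1), b120 = ctrl(p2, p1, n2);
    const Vec3 b021 = ctrl(p2, p3, n2), b012 = ctrl(p3, p2, n3);
    const Vec3 b102 = ctrl(p3, p1, n3), b201 = ctrl(p1, p3, n1);

    // Center control point: the average of the edge points, pushed away
    // from the flat triangle's centroid.
    const Vec3 e = (b210 + b120 + b021 + b012 + b102 + b201) * (1.0f / 6.0f);
    const Vec3 c = (p1 + p2 + p3) * (1.0f / 3.0f);
    const Vec3 b111 = e + (e - c) * 0.5f;

    // Cubic Bezier triangle evaluation.
    return p1 * (u * u * u) + p2 * (v * v * v) + p3 * (w * w * w)
         + b210 * (3.0f * u * u * v) + b120 * (3.0f * u * v * v)
         + b201 * (3.0f * u * u * w) + b102 * (3.0f * u * w * w)
         + b021 * (3.0f * v * v * w) + b012 * (3.0f * v * w * w)
         + b111 * (6.0f * u * v * w);
}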
    EDIT:
    I found a better algorithm that doesn't produce the texcoord errors seen above.


    Finally, a global tessellation factor has been added that lets you scale the effect for different hardware levels:
void SetTessellationDetail(const float detail)

The default setting is 1.0, so you can use this to scale the detail up or down any way you like.
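For example, a low-end graphics preset might simply halve the detail everywhere:

// Hypothetical quality preset: halve tessellation detail globally.
SetTessellationDetail(0.5f);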
  23. Josh
    Shadow maps are now supported in the new Vulkan renderer, for point, spot, and box lights!

Our new forward renderer eliminates all the problems that deferred renderers have with transparency, so shadows and lighting work great with transparent surfaces. Transparent objects even receive lighting from their back face automatically!

There is some shadow acne, which I am not going to leave alone right now, because I want to try some ideas to eliminate it completely so that you never have to adjust offsets or other settings. I also want to take another look at variance shadow maps, as these can produce much better results than depth map shadows. I also noticed some seams along the edges of point light shadows.
    Another interesting thing to note is that the new renderer handles light and shadows with orthographic projection really well.

    A new parameter has also been added to the JSON material loader. You can add a scale factor inside the normalTexture block to make normal maps appear bumpier. The default value is 1.0:
    "normalTexture": { "file": "./glass_dot3.tex", "scale": 2.0 } Also, the Context class has been renamed to "Framebuffer". Use CreateFramebuffer() instead of CreateContext().
  24. Josh
    After about four days of trying to get render-to-texture working in Vulkan, I have everything working except...it doesn't work. No errors, no clue what is wrong, but the renderer is not even clearing the depth attachment, which is why the texture read shown here is flat red.

    There's not much else to say right now. I will keep trying to find the magic combination of cryptic obscure settings it takes to make Vulkan do what I want.
This is very hard stuff, but once I have it working I think that gives me all the functionality I need to finish the Vulkan renderer. Post-processing effects are just another render-to-texture problem. Now, the Vulkan renderer is much more strict than OpenGL, and I think the way post-processing will work is with an automatically created blur texture and a single post-process shader. It's very wasteful to render a bunch of different passes for each effect, so instead I think we will just have some different effects files and a shader that can include whatever effects you want. In fact, the post-process pass can probably be done as a third Vulkan subpass, which would be very efficient.
  25. Josh
    Now that we have lights working in our clustered forward Vulkan renderer (same great technique the latest DOOM games are using) I am starting to implement shadow maps. The first issue that came up was managing render-to-texture when the texture might still be in use rendering the previous frame. At first I thought multiple shadowmaps would be needed per light, like a double-buffering system, but that would double the number of shadow textures and video memory. Instead, I created a simple object pool which stores spare shadowmaps that can be swapped around and used as they are needed.
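The pool itself doesn't need to be complicated. Here is a minimal sketch of the idea; the ShadowMap type and the frame-counting rule (wait two frames before reuse, matching a double-buffered swapchain) are hypothetical simplifications:

#include <cstdint>
#include <memory>
#include <vector>

// Placeholder for the real object holding the Vulkan image and views.
struct ShadowMap
{
    explicit ShadowMap(const int size) : size(size) {}
    int size;
};

// Sketch: a pool of reusable shadow map textures. A map is only handed
// out again once the GPU can no longer be reading from it.
class ShadowMapPool
{
public:
    std::shared_ptr<ShadowMap> Get(const int size, const uint64_t currentframe)
    {
        for (auto& entry : maps)
        {
            // Reuse a map of the right size that is no longer in flight.
            if (entry.map->size == size && entry.lastusedframe + 2 <= currentframe)
            {
                entry.lastusedframe = currentframe;
                return entry.map;
            }
        }
        // Nothing free: allocate a new map and add it to the pool.
        maps.push_back({ std::make_shared<ShadowMap>(size), currentframe });
        return maps.back().map;
    }

private:
    struct Entry
    {
        std::shared_ptr<ShadowMap> map;
        uint64_t lastusedframe = 0;
    };
    std::vector<Entry> maps;
};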
    It turns out I already have pretty much all the code I need because Vulkan's swapchain creation works exactly the same way, by rendering to a series of images. I worked through the code slowly and came up with this by the end of the day, which runs successfully without throwing any validation errors:
bool RenderBuffer::Initialize(shared_ptr<RenderContext> context, const int width, const int height, const int depth, const int colorcomponents, const bool depthcomponent, const int samples)
{
    this->device = GameEngine::Get()->renderingthreadmanager->instance;

    int totalcomponents = colorcomponents;
    if (depthcomponent) totalcomponents++;

    std::fill(colortexture.begin(), colortexture.end(), nullptr);
    depthtexture = nullptr;

    // Create color images
    for (int i = 0; i < colorcomponents; ++i)
    {
        colortexture[i] = make_shared<RenderTexture>();
        colortexture[i]->Initialize(VK_IMAGE_TYPE_2D, device->chaininfo.imageFormat, width, height, 1, 0, false, -1, -1, -1, -1, 1, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT);
        colorimages[i] = colortexture[i]->vkimage;
    }

    // Create depth image
    if (depthcomponent)
    {
        depthtexture = make_shared<RenderTexture>();
        depthtexture->Initialize(VK_IMAGE_TYPE_2D, device->depthformat, width, height, 1, samples, false, -1, -1, -1, -1, 1, VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT);
    }

    // Create image views
    imageviews.resize(totalcomponents);
    VkImageViewCreateInfo createInfo = {};
    createInfo.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
    createInfo.viewType = VK_IMAGE_VIEW_TYPE_2D;
    createInfo.components.r = VK_COMPONENT_SWIZZLE_IDENTITY;
    createInfo.components.g = VK_COMPONENT_SWIZZLE_IDENTITY;
    createInfo.components.b = VK_COMPONENT_SWIZZLE_IDENTITY;
    createInfo.components.a = VK_COMPONENT_SWIZZLE_IDENTITY;
    createInfo.subresourceRange.baseMipLevel = 0;
    createInfo.subresourceRange.levelCount = 1;
    createInfo.subresourceRange.baseArrayLayer = 0;
    createInfo.subresourceRange.layerCount = 1;

    // Create color image views
    for (int i = 0; i < colorcomponents; i++)
    {
        createInfo.image = colorimages[i];
        createInfo.format = device->chaininfo.imageFormat;
        createInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        VkAssert(vkCreateImageView(device->device, &createInfo, nullptr, &imageviews[i]));
    }

    // Create depth image view
    if (depthcomponent)
    {
        createInfo.image = depthtexture->vkimage;
        createInfo.format = depthtexture->format;
        createInfo.subresourceRange.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
        VkAssert(vkCreateImageView(device->device, &createInfo, nullptr, &imageviews[colorcomponents]));
    }

    // Create framebuffer
    VkFramebufferCreateInfo framebufferInfo = {};
    framebufferInfo.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
    framebufferInfo.renderPass = device->renderpass->pass;
    framebufferInfo.attachmentCount = totalcomponents; // one attachment per image view, not a hardcoded 2
    framebufferInfo.pAttachments = imageviews.data();
    framebufferInfo.width = width;
    framebufferInfo.height = height;
    framebufferInfo.layers = 1;
    VkAssert(vkCreateFramebuffer(device->device, &framebufferInfo, nullptr, &framebuffer));

    return true;
}

The blog cover image isn't the Vulkan renderer; I just found a random image of Leadwerks shadows.