
Blog Entries posted by Josh

  1. Josh
    I am experimenting with a system for creating a sequence of actions using Lua coroutines. This allows you to define a bunch of behavior at startup and let the game just run without having to keep track of a lot of states.
    You can add coroutines to entities and they will be executed in order. The first one will complete, and then the next one will start.
    A channel parameter allows you to have separate stacks of commands so you can have multiple sequences running on the same object. For example, you might have one channel that controls entity colors while another channel is controlling position.
    function Script:Start()
        local MotionChannel = 0
        local ColorChannel = 1
        local turnspeed = 1
        local colorspeed = 3

        --Rotate back and forth at 1 degree / second
        self:AddCoroutine(MotionChannel, ChangeRotation, 0, 45, 0, turnspeed)
        self:AddCoroutine(MotionChannel, ChangeRotation, 0, -45, 0, turnspeed)
        self:LoopCoroutines(MotionChannel) --keeps the loop going instead of just running once

        --Flash red and black every 3 seconds
        self:AddCoroutine(ColorChannel, ChangeColor, 1, 0, 0, 1, colorspeed)
        self:AddCoroutine(ColorChannel, ChangeColor, 0, 0, 0, 1, colorspeed)
        self:LoopCoroutines(ColorChannel) --keeps the loop going instead of just running once
    end

    There's no Update() function! Where do the coroutine functions come from? These can be in the script itself, or they can be general-use functions loaded from another script. For example, you can see an example of a MoveToPoint() coroutine function in this thread.
    The same script could be created using an Update function but it would involve a lot of stored states. I started to write it out actually for this blog, but then I said "ah screw it, I don't want to write all that" so you will have to use your imagination.
    Now if you can imagine a game like the original Warcraft, you might have a script function like this that is called when the player assigns a peasant to collect wood:
    function Script:CollectWood()
        self:ClearCoroutines(0)
        self:AddCoroutine(0, self.GoToForestAndFindATree)
        self:AddCoroutine(0, self.ChopDownTree)
        self:AddCoroutine(0, self.GoToCastle)
        self:AddCoroutine(0, self.Wait, 6000)
        self:AddCoroutine(0, self.DepositWood, 100)
        self:LoopCoroutines(0)
    end

    I wonder if there is some way to create a sub-loop, so that if the NPC gets distracted they carry out some actions, then return to the sequence they were in before, at the same point in the sequence.
    Of course this would work really well for cutscenes or any other type of one-time sequence of events.
  2. Josh
    A new beta is available. In this build I cleaned up a lot of internal stuff. I removed some parts of the engine that I want to redesign in order to clean up the source.
    JSON material files loaded from MDL files are now supported.
    Added ActiveWindow() command. If the game window is not the foreground window, this will return null.
    The Steamworks integration and all dependent classes are temporarily removed. There's a lot of stuff in there I don't intend to use in the future, like all the Workshop functions, and I want to reintegrate it one piece at a time in a neater fashion. The good features like P2P networking will definitely be included in the future.
    File IO finished
    The file read and write commands now use global functions exclusively and internally use Unicode strings. You can still call these functions with a regular std::string, but internally it will just be converted to a wide string. The zip file read support is removed temporarily so I can rethink its design.
    Key and mouse event binding
    Since I am consciously making the decision to design the new engine for intermediate and expert users instead of beginners, it occurred to me that the MouseHit and KeyHit functions are flawed, since they rely on a global state and will cause problems if two pieces of code check for the same key or button. So I removed them and came up with this system:
    self:BindKey(KEY_E, self.Interact)
    self:BindKey(KEY_SPACE, self.Jump)
    self:BindMouseButton(MOUSE_LEFT, self.Throw)

    This works exactly as you would expect, by calling the entity script function when the key or mouse button is pressed. Naturally a key release event would also be useful. Perhaps a BindKeyRelease() function is the best way to do this? The KeyDown() / MouseDown() functions are still available, since they do not have the problems the Hit() commands do. The same technique will work with C++ actors, though it is not yet implemented.
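    Since the C++ side is not yet implemented, the following is only a sketch of how the same bindings might look on an actor; the method names and signatures here are my assumptions, not final API:

    // Hypothetical C++ actor bindings - names and signatures are assumptions
    class PlayerActor : public Actor
    {
    public:
        void Attach()
        {
            // Call the given member function when the key / button is pressed
            BindKey(KEY_E, &PlayerActor::Interact);
            BindKey(KEY_SPACE, &PlayerActor::Jump);
            BindMouseButton(MOUSE_LEFT, &PlayerActor::Throw);
        }

        void Interact() { /* open doors, press buttons, etc. */ }
        void Jump() { /* apply an upward impulse */ }
        void Throw() { /* spawn and launch a projectile */ }
    };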
    This is undoubtedly more complicated than a simple MouseHit() command but it is better for people as their code gets more complex. Instead of hurting the experience for advanced users, I am going to force beginners to adjust to the correct way of doing things.
    This design is not final. I think there are other ways programmers might want to bind events. I would like to hear your ideas. I am willing to create a more complicated system if it means it is more useful. This is a big change in my idea of good design.
  3. Josh
    This is a good time to write about some very broad changes I expect to come about over the next year in our community as our new engine "Turbo" arrives. Turbo Game Engine, as the name suggests, offers really fast performance using a groundbreaking Vulkan-based renderer, which is relevant to everyone but particularly beneficial for VR developers who struggle to keep their framerates up using conventional game engines. I want to help get you onboard with some of the ideas that I myself am processing.
    Less emphasis on how-to tutorials, more emphasis on API documentation
    The new engine assumes you are either an artist or a programmer, and if you are a programmer you already know basic C++ or Lua. More attention will be paid to precisely documenting how commands behave. There will be a more strict division between supported and unsupported features. There will be less "guessing" what the user is trying to do, and more formal documentation saying "if you do X then Y will occur". For example, every entity creation function requires the world object be explicitly supplied in the creation command, instead of hiding this away in a global state. There will not be tutorials explaining what a variable is or teaching basic programming concepts.
    More responsiveness to user requests, especially for programming features
    Leadwerks 4 features have been in a semi-frozen state for a while now. Although many new features have been added, I have not wanted to create breaking changes, and have been reluctant to introduce things that might create new bugs, because I knew an entire new infrastructure for future development was on the way. With the new engine I will be more receptive to suggestions that make the engine better. One example would be an animation events system that lets users set a point in an animation where an event is called. These changes need to be implemented within the design philosophy of the new engine. For example, I would use an Actor class method to call the event function rather than a raw pointer. Emphasis should be placed on what is practical and useful for competent programmers and artists, and how everything fits into the overall design.
    Fewer attempts at hand-holding for new developers
    The new engine will not attempt to teach children to make their own MMORPG. Our marketing materials will not even suggest this is possible. The new engine will deliver performance faster than any other game engine in the world, period. Consequently, I think the community will gain a lot more advanced users, and though some of them will not even interact on the forum, I do think you will see more organic creativity and quality. In its own way, the new engine actually is quite a lot easier to work with, but the sales pitch is not going to emphasize that; it will just be something people discover as they use it. I love seeing all the weird and cool creations that come from people who are completely new to game development, but the people who were new to game development and did well with Leadwerks had a lot of natural talent. Instead of trying to come up with a magic combination of features and tutorials to turn novices into John Carmack, we are going to rely on the product benefits to draw them in and expect them to get up to speed quickly. Discussions should be about what is best for intermediate / expert users, not trying to figure out what beginners want. Ease of use is subjective, and I feel we have hit the point of diminishing returns chasing after it. If beginners want to jump in and learn, that is great, but it is not our reason for existing.
    Stronger focus on the core essentials
    At the time of this writing, there are only eight entity types in the beta of the new engine. We can't win based on number of features, but we can do the core essentials much better than anyone else. Our new Vulkan renderer offers performance that developers (especially VR developers) can't live without. Models, lights, and rendering are the core features I want to focus on, and these can be expanded by the end user. For example, a custom particle system with support for all kinds of behaviors could easily be created with the model class and a few custom shaders, without breaking the performance that makes this engine valuable. Our new technology is very well thought out and will give us a stable base for a long time. I am planning a plugin / extensions system, because it's best for this to be integrated in the core design, but you should not expect it to be very useful for a couple of years. Plugin systems require huge network effects to offer anything valuable. We can only reach that kind of scale by offering something else unique that no one can match us on. Fortunately, we have something. It's right in the name.
    More formal support for good standards
    Vulkan has turned out to be a very good move. I don’t think anyone realizes how big a deal GLTF support is yet: you can download thousands of models from Sketchfab and other sources and load them right now with no adjustments. I may join the Khronos consortium and I have some ideas for additional useful GLTF extensions. I'm using JSON for a lot of files and it's great. DDS will be our main texture file format. There are more good standards today than there were ten years ago, and I will adopt the ones that fit our goals.
    Different type of new user appearing
    With Leadwerks, the average new user appears on the forum and says “hey, I want to make a game but I don’t really know how, please tell me what I need to know.” With the new engine I think it will be more like “hey, I’m more or less an expert already, I know exactly what I want to make, please tell me what I need to know.” I expect them to have less tolerance for bugs, undefined behavior, or undocumented features, and at the same time I think it will be easier to have frank discussions about exactly what developers need.
    In very general terms that is how I want to focus things. I think everyone here will adjust to this more strict and well-defined approach and end up liking it a lot better.
  4. Josh
    A new update is available for beta subscribers.
    What's new
    Added support for strip lights. To create these, just call CreateLight(world, LIGHT_STRIP). The entity scale on the Z axis will determine the length of the line, and the outer range will determine the radius in which light shows.
    Added new properties to the JSON material scheme. "textureScroll" is a float value that can animate a texture to make it smoothly move. "textureScrollRotation" is an angle that controls which direction the texture moves. An example material is included.
    Renamed "albedoMap", "normalMap", "emissionMap", "displacementMap", and "brdfMap" to "baseTexture", "normalTexture", "emissionTexture", "displacementTexture", and "brdfTexture" in the JSON material scheme.
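    As a rough usage sketch: the CreateLight(world, LIGHT_STRIP) call is from the notes above, while the SetScale / SetRange / SetColor calls are assumptions based on how the other light types work, not confirmed API:

    // Sketch: create a strip light and stretch it along a wall
    auto light = CreateLight(world, LIGHT_STRIP);
    light->SetScale(0.1, 0.1, 4.0);  // Z scale sets the length of the strip
    light->SetRange(6.0);            // outer range sets the radius the light covers
    light->SetColor(1.0, 0.5, 0.25);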
  5. Josh
    It's always fun when I can do something completely new that people have never seen in a game engine. I've had the idea for a while to create a new light type for light strips, and I got to implement this today. The new engine has taken a tremendous amount of effort to get working over two years, but as development continues I think I will become much more responsive to your suggestions since we have a very strong foundation to build on now.
    Using this test scene provided by @reepblue, you can see how this new light type looks and behaves. They are great for placing along walls, but what really interested me was the idea of calculating specular lighting not from a single point, but in a different way. I thought if I could figure out the math I would get a realistic reflection on the ground, and it worked!

    The reflection on the floor is actually the specular component of the light. We are used to thinking of specular reflections as a little white circle that moves around, but the light doesn't have to be coming from a single point. Some calculations in the shader can be used to determine the closest point to the light strip and use that for reflections. The net effect is that a long bar appears on the floor, matching the length of the light. This is not a screen-space effect or a cubemap. When you look down at the floor the specular component is still there shining back at you. Every surface is using the same exact equation, but it appears very different on the walls, the ceiling, and the floor due to the different angles.
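    To illustrate the idea (this is the general math, not the engine's actual shader code), the key step is finding the closest point on the light's line segment to the shaded surface and using that as the light position for the specular term:

    #include <cmath>

    // General closest-point-on-segment math behind the strip-light specular idea
    struct Vec3 { float x, y, z; };

    static Vec3 Sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
    static Vec3 Add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3 Mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
    static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

    // Returns the point on segment [a, b] closest to p. Using this point as the
    // light position stretches the highlight into a bar matching the strip.
    Vec3 ClosestPointOnSegment(Vec3 a, Vec3 b, Vec3 p)
    {
        Vec3 ab = Sub(b, a);
        float t = Dot(Sub(p, a), ab) / Dot(ab, ab);
        t = std::fmax(0.0f, std::fmin(1.0f, t)); // clamp to the ends of the strip
        return Add(a, Mul(ab, t));
    }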

    Even a surface facing opposite the light will correctly reflect it back to the camera.

    In this image, I created a small green strip light that looks like a laser. There is no visible laser beam, but if there was it would appear above the soft green lighting. The hard line on the ground is actually the specular reflection of the light. You can see it reflecting off the sphere as well.

    The new Vulkan renderer also supports box lights, which are a directional light with a defined boundary, and I have an idea for one more type of light.
  6. Josh
    A new update is available for beta subscribers. Transparent materials are now supported. Unlike the old deferred renderer, our new clustered forward renderer supports transparency really really well! You can add these in a JSON material file with a Boolean property called "transparent" set to true:
    "transparent": true There are no separate blend modes now, since pre-multiplied alpha allows alpha and additive blending in a single pass. This is actually a really simple technique but for some reason all the information about it online are these horrible academic examples that don't show any clear benefit. It wasn't until I thought "how do I make shiny glass?" that I actually started looking for a way to do this.
    The command Material::SetTransparent(true) replaces the old SetBlendMode(blendmode) function. GLTF materials with blending will be loaded automatically. Per-object Z-sorting is not yet supported, but transparent groups of objects will always be rendered on top of opaque objects automatically, so you don't actually need to enable sorting unless you expect to have two layers of transparency visible somewhere.

    We haven't seen a lot of transparency in games since the mid-2000s, because deferred rendering, the best lighting technique available at the time, did not handle it well. I think the capabilities of our new renderer will open up a lot of possibilities to create games that look different from anything in the past.
    You can get access to the beta and private forum right now for just $5.
  7. Josh
    An update for the beta of the new engine is now available with the following changes:
    GLTF loader is now working for most models. A large collection of GLTF files are available online for free from many sources, and they can be loaded right into the engine without any adjustment for materials or textures. Single-file GLB files also work.
    Added support for the GLTF extension KHR_materials_pbrSpecularGlossiness.
    Disabled PNG loader gamma correction.
    world->SetSkybox(texture) can now be used to make PBR reflections appear. (The sky will not yet be visible, though.) I'm going to try to use a voxel GI system for further reflections, and not use environment probes at all.
    Window::GetWidth(), Window::GetHeight(), Context::GetWidth(), and Context::GetHeight() are removed. Use GetSize() instead. It will return an iVec2 object with x and y components.
    JSON material files are changed slightly in order to accommodate additional per-texture settings. None of these work yet, but you can see where it is going:

    "albedoMap":
    {
        "file": "./Rough-rockface1_Base_Color.jpg",
        "filter": "linear",
        "tilingU": "repeat",
        "tilingV": "repeat"
    },

    So before, if you had this:
    "albedoMap": "./Rough-rockface1_Base_Color.jpg" Just change it to this:
    "albedoMap": {"file": "./Rough-rockface1_Base_Color.jpg"} And then it will work.
    The screenshot below is a GLTF model loaded and rendered with Vulkan. Note that this model uses baked lighting that is already included in the model, but the fact that I can download these things and have them appear correctly with no adjustments is great.

    You can get access to the beta and private forum right now for just $5.
  8. Josh
    A new beta update is available for subscribers. What's new?
    Lighting
    Point and spot lights are now supported in the new Vulkan renderer, with either PBR or Blinn-Phong lighting. Lighting is controlled by the shader in the material file. There are two main shaders you can use, "Shaders/PBR.spv" and "Shaders/Blinn-Phong.spv". See below for more details.

    JSON Materials
    Materials can now be loaded from JSON files. I am currently using the .json file extension instead of "mat", "mtl", or something else. If you load a scene and a JSON file is available with the same name as a material in that scene, the material will be loaded from a JSON file instead of the Leadwerks 4 .mat files. For example, you can create a JSON file named "brick01.json", place it in the same folder as "brick01.mat" and the new engine will load the JSON material if the brick material is used in a scene. However, it is not necessary to do this as the engine can also load Leadwerks 4 material files.
    A Turbo JSON material file looks like this. The string tokens are more or less locked in now and it is safe to start using them.
    { "turboMaterialDef": { "color": [ 1, 1, 1, 1 ], "emission": [ 0, 0, 0 ], "metallic": 0, "roughness": 0.6, "doubleSided": false, "blend": false, "albedoMap": "./concrete_clean_diff.tex", "normalMap": "./concrete_clean_dot3.tex", "metallicRoughnessMap": "", "emissionMap": "", "baseShader": "Shaders/PBR.spv", "shadowShader": "Shaders/Shadow.spv", "depthShader": "Shaders/DepthPass.spv" } } You can also indicate a shader for the new engine to use in an old Leadwerks 4 material file by adding a text line like this to the .mat file:
    baseshader="Shaders/myshader.spv" You do not need to specify a shader unless you are using a custom shader. JSON material files, by default, will use the PBR shader. Leadwerks 4 material files, by default, will use the Blinn-Phong shader.
    BC5 / BC7 Texture Compression
    A ton of new compression formats have been added, including the BC7 and BC5 formats, which provide better quality than DXT compression. Visual Studio 2019 actually has some good built-in DDS tools, although the BC7 compressor is very slow. A sample material is provided using DDS textures (see "Materials/Rough-rockface1.json").
    Lua Commands
    A set of simple global Lua commands has been added.
    template<typename T> void LuaSetGlobal(const std::string& name, T var)
    template<typename T> void LuaPushObject(const std::string& name, T var)
    template<typename T> T LuaToObject(const int index = -1)
    int LuaCollectGarbage(const int what = LUA_GCCOLLECT, const int data = 0);
    void LuaPushString(const std::string& s);
    void LuaPushNumber(const double n);
    void LuaPushBoolean(const bool b);
    void LuaPushNil();
    void LuaPushValue(const int index = -1);
    bool LuaIsTable(const int index = -1);
    bool LuaIsNumber(const int index = -1);
    bool LuaIsString(const int index = -1);
    bool LuaIsBoolean(const int index = -1);
    bool LuaIsObject(const int index = -1);
    bool LuaIsNil(const int index = -1);
    bool LuaIsFunction(const int index = -1);
    bool LuaToBoolean(const int index = -1);
    std::string LuaToString(const int index = -1);
    double LuaToNumber(const int index = -1);
    int LuaType(const int index = -1);
    int LuaGetField(const std::string& name, const int index = -1);
    int LuaGetTable(const std::string& name, const int index = -1);
    int LuaGetGlobal(const std::string& name);
    void LuaSetField(const std::string& name, const int index = -1);
    void LuaSetTable(const int index = -1);
    void LuaPop(const int levels = 1);
    void LuaRemove(const int index = -1);
    void LuaSetStackSize(const int sz);
    int LuaGetStackSize();
    void LuaNewTable();

    This makes our code simpler and more readable:
    #include "Turbo.h" using namespace Turbo; int main(int argc, const char *argv[]) { //Create a window auto window = CreateWindow("MyGame", 0, 0, 1280, 720); //Create a rendering context auto context = CreateContext(window); //Set some variables in the script environment LuaSetGlobal("mainwindow", window); LuaSetGlobal("maincontext", context); //Create the world auto world = CreateWorld(); //Load a scene auto scene = LoadScene(world, "Maps/start.map"); //Show off a PBR material auto sphere = CreateSphere(world); auto mtl = LoadMaterial("Materials/Rough-rockface1.json"); sphere->SetMaterial(mtl); sphere->Move(0, 1, 0); sphere->SetScale(2); while (window->KeyHit(KEY_ESCAPE) == false and window->Closed() == false) { world->Update(); world->Render(context); } return 0; } You can gain access to the beta and support development by subscribing for just $5.
  9. Josh
    I now have point and spot lights working (without shadows) in the Vulkan renderer. Here are the results, with both physically-based rendering (PBR) and Blinn-Phong shaders. Without the IBL contribution it's not terribly impressive, but this is progress.


  10. Josh
    Vulkan is pretty wonderful because I can take all the optimal techniques I worked out in OpenGL and it just makes everything much faster. I've successfully completed the implementation of early Z-pass, which is important for our lighting system. We are using a forward clustered renderer, similar to the technique id Software's new DOOM games use. Because the fragment shader is fairly intensive, a depth pre-pass is rendered to ensure we only process each screen pixel once.
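    In outline, the two passes differ only in their depth state (a sketch of the general early-Z technique, not our exact pipeline code):

    // Pass 1: depth pre-pass - no color writes, minimal shader, fills the depth buffer
    VkPipelineDepthStencilStateCreateInfo prepass = {};
    prepass.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO;
    prepass.depthTestEnable = VK_TRUE;
    prepass.depthWriteEnable = VK_TRUE;
    prepass.depthCompareOp = VK_COMPARE_OP_LESS;

    // Pass 2: shading - the expensive fragment shader only runs where depth matches,
    // so each screen pixel is processed exactly once
    VkPipelineDepthStencilStateCreateInfo shading = {};
    shading.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO;
    shading.depthTestEnable = VK_TRUE;
    shading.depthWriteEnable = VK_FALSE;
    shading.depthCompareOp = VK_COMPARE_OP_EQUAL;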

    This technique also easily supports transparency with multiple layers of shadows.

    Vulkan has a feature called subpasses specifically designed for this type of functionality. It's really fantastic to have this kind of fine control over the hardware, even if it does involve some pretty convoluted code.
    You can read more about this rendering technique below.
     
  11. Josh
    I have basic point lights working in the Vulkan renderer now. There are no shadows or any type of reflections yet. I need to work out how to set up a depth pre-pass. In OpenGL this is very simple, but in Vulkan it requires another complicated mess of code. Once I do that, I can add in other light types (spot, box, and directional) and pull in the PBR lighting shader code. Then I will add support for a cubemap skybox and reflections, and then I will upload another update to the beta.

    Shadows will use variance shadow maps by default. With these, all objects must cast a shadow, but our renderer is so fast that this is not a problem. I've had very good results with these in earlier experiments.
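    For background, a variance shadow map stores depth and depth squared so the map can be filtered and blurred like an ordinary texture, and the shadow test becomes Chebyshev's inequality instead of a hard comparison. This is the standard VSM math, written in C++ for illustration rather than taken from our shaders:

    #include <algorithm>

    // Standard variance shadow map test. 'mean' and 'meansq' are the filtered
    // depth and depth-squared values sampled from the shadow map.
    float VSMVisibility(float mean, float meansq, float fragdepth)
    {
        if (fragdepth <= mean) return 1.0f;       // in front of the occluders: fully lit
        float variance = meansq - mean * mean;    // sigma^2 = E[x^2] - E[x]^2
        variance = std::max(variance, 0.00001f);  // guard against divide-by-zero
        float d = fragdepth - mean;
        return variance / (variance + d * d);     // Chebyshev upper bound on visibility
    }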
    I then want to complete my work on voxel-based global illumination and reflections. I looked into Nvidia RTX ray tracing, but the performance is awful even with a GeForce RTX 2080. My voxel approach should provide good results with fast performance.
    Once these features are in place, I may release the new engine on Steam as a programming SDK, until the new editor is ready.
  12. Josh
    The beta of our new game engine has been updated with a new renderer built with the Vulkan graphics API, and all OpenGL code has been removed. Vulkan provides us with low-overhead rendering that delivers a massive increase in rendering performance. Early benchmarks indicate as much as a 10x improvement in speed over the Leadwerks 4 renderer.
    The new engine features a streamlined API with modern C++ features and an improved binding library for Lua. Here's a simple C++ program in Turbo:
    #include "Turbo.h" using namespace Turbo; int main(int argc, const char *argv[]) { //Create a window auto window = CreateWindow("MyGame", 0, 0, 1280, 720); //Create a rendering context auto context = CreateContext(window); //Set some Lua variables VirtualMachine::lua->set("mainwindow", window); VirtualMachine::lua->set("maincontext", context); //Create the world auto world = CreateWorld(); //Load a scene auto scene = LoadScene(world, "Maps/start.map"); while (window->KeyHit(KEY_ESCAPE) == false and window->Closed() == false) { world->Update(); world->Render(context); } return 0; } Early adopters can get access to beta builds with a subscription of just $4.99 a month, which can be canceled at any time. New updates will come more frequently now that the basic renderer is working.
  13. Josh
    The Vulkan renderer now supports new texture compression formats that can be loaded from DDS files. I've updated the DDS loader to support newer versions of the format with new features.
    BC5 is a format ATI invented (originally called ATI2 or 3Dc) which is a two-channel compressed format specifically designed for storing normal maps. This gives you better quality normals than what DXT compression (even with the DXT5n swizzle hack) can provide.
    BC7 is interesting because it uses the same size as DXT5 images but provides much higher quality results. The compression algorithm is also very slow, sometimes taking ten minutes to compress a single texture! Intel claims to have a fast-ish compressor for it, but I have not tried it yet. Protip: you can open DDS files in newer versions of Visual Studio and select the compression format there.
    Here is a grayscale gradient showing uncompressed, DXT5, BC7 UNORM, and BC7 SNORM formats. You can see BC7 UNORM and SNORM have far fewer artifacts than DXT, but are still not quite identical to the original image.





    The original image is 256 x 256, giving the following file sizes:
    Uncompressed: 341 KB (256 x 256 x 4 bytes, plus about a third more for mipmaps)
    DXT1: 42.8 KB (12.6% of the original size)
    DXT5, BC7: 85.5 KB (25% of the original size)

    I was curious what would happen if I zipped up some of the files, although this is only a minor concern. I guessed that BC7 would not compress as well with ZIP, since it is a more complicated algorithm:

    DXT5: 16.6 KB
    BC7 UNORM: 34 KB
    BC7 SNORM: 42.4 KB

    Based on the results above, I would probably still use uncompressed images for skyboxes and gradients, but anything else can benefit from this format. DXT compression looks like a blocky green mess by comparison.
    I was curious to see how much of a difference the BC5 format made for normal maps so I made some similar renders for normals. You can see below that the benefits for normal maps are even more extreme. The BC5 compressed image is indistinguishable from the original while the DXT5n image has clear artifacts.



    In conclusion, these new formats in the Vulkan renderer, when used properly, will provide compression without visible artifacts.
  14. Josh
    I have now worked out how the new engine will handle multiple shaders. The renderer groups meshes (renamed from "surfaces" in Leadwerks) by shader. A single draw call renders many batches of instances, with different materials applied. It's a very advanced and complex system, so something that was simple before, changing the shader, now requires a lot of code to make work! You can see here that the barbed wire is using an alpha-discard shader that removes pixels, while the rest of the scene uses the normal default shader.

    A new material file format will be implemented using the JSON data format. To indicate a shader in an existing Leadwerks material file for the new engine you can add a line of code like this to the file:
    newshader = "Shaders/myshader.spv" This will make it so the new engine can load the new shader, while the old engine will still see the old shader, in case you are using this in Leadwerks Editor.
    Although this new system took a mountain of code to get working, I am starting to feel the tremendous power it offers. The Zone is now rendering 10x faster than it does in Leadwerks.
  15. Josh
    I now have different materials with textures working in Vulkan. The API allows us to access every loaded texture in any shader, although some Intel chips have limitations and will require a fallback. This is interesting because some of our design decisions in Leadwerks 4 were made because we had a limit of 16 textures a shader could access. Terrain clipmaps were a good solution to this problem, but since that limitation no longer exists, it may be time to revisit this design. We could, for example, implement a shader that can access any loaded texture and use a single RGBA texture to indicate which texture should be used for each terrain point. This would allow up to 256 different layers. Best of all, the number of texture layers would have no effect on speed; it would run at the same speed no matter how many textures you use. In fact, the whole idea of "layers" is obsolete and not descriptive at all of what is happening. This would also eliminate the blurriness and weird filtering that can occur with clipmaps, and give us pixel-perfect terrain at any distance.


    An application has been uploaded for beta subscribers which will load and display any Leadwerks map with the new Vulkan renderer.
    Our new lighting system will work seamlessly with Vulkan. However, before I continue with lighting I want to resolve some problems in the current scope of the renderer. I've worked out a huge piece of the core renderer design now, and I think things will get easier soon.
     
  16. Josh
    In Turbo (Leadwerks 5) all asset types have a list of asset loader objects for loading different file formats. There are a number of built-in loaders for different file formats, but you can add your own by deriving the AssetLoader class or creating a script-based loader. Another new feature is that any scripts in the "Scripts/Start" folder get run when your game starts. Put those together, and you can add support for a new model or texture file format just by dropping a script in your project.
    The following script can be used to add support for loading RAW image files as a model heightmap.
    function LoadModelRaw(stream, asset, flags)
        --Calculate and verify heightmap size - expects 2 bytes per terrain point, power-of-two sized
        local datasize = stream:GetSize()
        local pointsize = 2
        local points = datasize / pointsize
        if points * pointsize ~= datasize then return nil end
        local size = math.sqrt(points)
        if size * size ~= points then return nil end
        if math.pow(2, math.log(size) / math.log(2)) ~= size then return nil end

        --Create model
        local modelbase = ModelBase(asset)
        modelbase.model = CreateModel(nil)
        local mesh = modelbase.model:AddMesh()

        --Build mesh from height data
        local x, y, height, v
        local textureScale = 4
        local terrainHeight = 100
        for x = 1, size do
            for y = 1, size do
                height = stream:ReadUShort() / 65536
                v = mesh:AddVertex(x, height * terrainHeight, y, 0,1,0, x/textureScale, y/textureScale, 0,0, 1,1,1,1)
                if x > 1 and y > 1 then
                    mesh:AddTriangle(v, v - size - 1, v - size)
                    mesh:AddTriangle(v, v - 1, v - size - 1)
                end
            end
        end

        --Finalize the mesh
        mesh:UpdateBounds()
        mesh:UpdateNormals()
        mesh:UpdateTangents()
        mesh:Lock()

        --Finalize the model
        modelbase.model:UpdateBounds()
        modelbase.model:SetShape(CreateShape(mesh))
        return true
    end

    AddModelLoader(LoadModelRaw)

    Loading a heightmap is just like loading any other model file:
    auto model = LoadModel(world, "Models/Terrain/island.r16");

    This will provide a temporary solution for terrain until the full system is finished.
  17. Josh
    Having completed a hard-coded rendering pipeline for one single shader, I am now working to create a more flexible system that can handle multiple material and shader definitions. If there's one way I can describe Vulkan, it's "take every single possible OpenGL setting, put it into a structure, and create an immutable cached object based on those settings that you can then use and reuse". This design is pretty rigid, but it's one of the reasons Vulkan is giving us an 80% performance increase over OpenGL. Something as simple as disabling backface culling requires recreation of the entire graphics pipeline, and I think this option is going away. The only thing we use it for is the underside of tree branches and fronds, so that light appears to shine through them, but that is not really correct lighting. If you shine a flashlight on the underside of the palm frond it won't brighten the surface if we are just showing the result of the backface lighting.

    A more correct way to do this would be to calculate the lighting for the surface normal and for the reverse vector, and then add the results together for the final color. In order to give the geometry faces in both directions, a plugin could be added that adds reverse triangles for all the faces of a selected part of the model in the model editor. At first the design of Vulkan feels restrictive, but I also appreciate the fact that it has a design goal other than "let's just do what feels good".
    Using indirect drawing in Vulkan, we can create batches of batches, sorted by shader. This feature is also available in OpenGL, and in fact is used in our vegetation rendering system. Of course, the code for all this is quite complex. Draw commands, instance IDs, material IDs, entity 4x4 matrices, and material data all have to be uploaded to the GPU in memory buffers, some of which are more or less static, some of which are updated each frame, and some of which are updated for each new visibility set. It is complicated stuff, but after some time I was able to get it working. The screenshot below shows a scene with five unique objects being drawn in one single draw call, accessing two different materials with different diffuse colors. That means an entire complex scene like The Zone can be rendered in one or just a few passes, with the GPU treating all geometry as if it were a single collapsed object, even as different objects are hidden and shown. Everyone knows that instanced rendering is faster than rendering unique objects, but at some point the number of batches can get high enough to be a bottleneck. Indirect rendering batches the batches to eliminate this slowdown.
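    For those curious, the core of indirect drawing is a GPU buffer filled with one draw structure per batch. The structure and draw call below are the actual Vulkan API; the values assigned are placeholders for data a culling pass would compute:

    // One of these is written into the indirect buffer for each batch
    VkDrawIndexedIndirectCommand cmd = {};
    cmd.indexCount    = meshindexcount;  // indices in this mesh
    cmd.instanceCount = instancecount;   // number of visible copies
    cmd.firstIndex    = meshfirstindex;  // offset into the merged index buffer
    cmd.vertexOffset  = meshbasevertex;  // offset into the merged vertex buffer
    cmd.firstInstance = batchfirstid;    // lets the shader find per-instance data

    // Then the whole visible set is submitted with a single call:
    vkCmdDrawIndexedIndirect(commandbuffer, indirectbuffer, 0, drawcount,
        sizeof(VkDrawIndexedIndirectCommand));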

    This is one of the features that will help our new renderer run an order of magnitude faster, for high-performance VR and regular 3D games.
  18. Josh
    I finally got a textured surface rendering in Vulkan so we now have officially surpassed StarFox (SNES) graphics:

    Although StarFox did have distance fog.

    Vulkan uses a sort of "baked" graphics pipeline. Each surface you want to render uses an object you have to create in code that contains all material, texture, shader, and other settings. There is no concept of "just change this one setting" like in OpenGL. Consequently, the new renderer may be a bit more rigid than what Leadwerks 4 uses, in the interest of speed. For example, the idea of 2D drawing commands you call each frame is absolutely a no-go. (This was likely anyway, due to the multithreaded design.) A better approach for that would be to use persistent 2D primitive objects you create and destroy. I won't lose any sleep over this because our overarching design goal is performance.
    Right now I have everything hard-coded and am using only one shader and one texture, in a single graphics pipeline object. Next I need to make this more dynamic so that a new graphics pipeline can be created whenever a new combination of settings is needed. A graphics pipeline object corresponds pretty closely to a material. I am leaning towards storing a lot of settings we presently store in texture files in material files instead. This does also resolve the problem of storing these extra settings in a DDS file. Textures become more of a dumb image format while material settings are used to control them. Vulkan is a "closer to the metal" API and that may pull the engine in that direction a bit. That's not bad.
    I like using JSON data for file formats, so the new material files might look something like this:
    { "material": { "color": "1.0, 1.0, 1.0, 1.0", "albedoMap": { "file": "brick01_albedo.dds", "addressModeU": "repeat", "addressModeV": "repeat", "addressModeW": "repeat", "filter": "linear" }, "normalMap": { "file": "brick01_normal.dds", "addressModeU": "repeat", "addressModeV": "repeat", "addressModeW": "repeat", "filter": "linear" }, "metalRoughnessMap": { "file": "brick01_metalRoughness.dds", "addressModeU": "repeat", "addressModeV": "repeat", "addressModeW": "repeat", "filter": "linear" }, "emissiveMap": { "file": "brick01_emissive.dds", "addressModeU": "repeat", "addressModeV": "repeat", "addressModeW": "repeat", "filter": "linear" } } } Of course getting this to work in Vulkan required another mountain of code, but I am starting to get the hang of it.
  19. Josh
    It is now possible to compile shaders into a single self-contained file that can be loaded by any Vulkan program, but it's not obvious how this is done. After poking around for a while, I found all the pieces I needed to put this together.
    Compiling
    First, you need to compile each shader stage from a source code file into a precompiled SPIR-V file. There are several tools available to do this, but I prefer GLSlangValidator because it supports the Google #include extension. Put your vertex shader code in a text file named "shader.vert" and your pixel shader code in a file called "shader.frag". Create a .bat file in the same directory with the following contents:
    glslangValidator.exe "shader.vert" -V -o "vert.spv"
    glslangValidator.exe "shader.frag" -V -o "frag.spv"

    Run the bat file and two .spv files will be saved.
    Linking
    Now we want to combine our two files representing different shader stages into a single file. This is done with the link tool from Khronos. Add the following lines to your .bat file to compile the two .spv files into one. It will also delete the existing files to clean things up a little:
    spirv-link "vert.spv" "frag.spv" -o "shader.spv"
    del "vert.spv"
    del "frag.spv"

    This will save a single file named "shader.spv" that you can load as one shader module and use for different stages in Vulkan.
    Here are the required executables and a .bat file:
    BuildShader.zip
    Parsing
    If you always use vertex and fragment stages then there is no problem, but what if the combined .spv file contains other stages, or is missing a fragment stage? We can easily account for this with a minimal SPIR-V file parser. We're not going to include any big bloated libraries to do this because we only need some basic information about what stages are contained in the shader. Fortunately, the SPIR-V specification is pretty simple and it doesn't take much code to extract the information we want:
    std::string entrypointname[6];
    auto stream = ReadFile(L"Shaders/shader.spv");

    // Parse SPIR-V header
    Assert(stream->ReadInt() == 0x07230203); // magic number
    int version = stream->ReadInt();
    int genmagnum = stream->ReadInt();
    int bound = stream->ReadInt();
    int reserved = stream->ReadInt();

    bool stages[6] = { false, false, false, false, false, false };

    // Instruction stream
    while (stream->Ended() == false)
    {
        int pos = stream->GetPos();
        unsigned int bytes = stream->ReadUInt();
        int opcode = LOWORD(bytes);
        int wordcount = HIWORD(bytes);
        if (opcode == 15) // OpEntryPoint
        {
            int executionmodel = stream->ReadInt();
            Assert(executionmodel >= 0);
            if (executionmodel < 6)
            {
                stream->ReadInt(); // entry point
                stages[executionmodel] = true;
                entrypointname[executionmodel] = stream->ReadString();
            }
        }
        stream->Seek(pos + wordcount * 4);
    }

    This code even retrieves the entry point name for each stage, so you can be sure you are loading the shader correctly.
    Here are the different shader stages from the SPIR-V specification:
    0: Vertex
    1: TessellationControl
    2: TessellationEvaluation
    3: Geometry
    4: Fragment
    5: GLCompute

    That's it! We now have a standard single-file shader format for Vulkan programs. Your code for creating these will look something like this:
    VkShaderModule shadermodule;

    // Create shader module
    VkShaderModuleCreateInfo shaderCreateInfo = {};
    shaderCreateInfo.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
    shaderCreateInfo.codeSize = bank->GetSize();
    shaderCreateInfo.pCode = reinterpret_cast<const uint32_t*>(bank->buf);
    VkAssert(vkCreateShaderModule(device->device, &shaderCreateInfo, nullptr, &shadermodule));

    // Create vertex stage info
    VkPipelineShaderStageCreateInfo vertShaderStageInfo = {};
    vertShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    vertShaderStageInfo.stage = VK_SHADER_STAGE_VERTEX_BIT;
    vertShaderStageInfo.module = shadermodule;
    vertShaderStageInfo.pName = entrypointname[0].c_str();

    VkPipelineShaderStageCreateInfo fragShaderStageInfo = {};
    if (stages[4])
    {
        // Create fragment stage info
        fragShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
        fragShaderStageInfo.stage = VK_SHADER_STAGE_FRAGMENT_BIT;
        fragShaderStageInfo.module = shadermodule;
        fragShaderStageInfo.pName = entrypointname[4].c_str();
    }

    // Create your graphics pipeline...
  20. Josh
    I was going to write about my thoughts on Vulkan, about what I like and don't like, what could be improved, and what ramifications this has for developers and the industry. But it doesn't matter what I think. This is the way things are going, and I have no say in that. I can only respond to these big industry-wide changes and make it work to my advantage. Overall, Vulkan does help us, in both a technical and business sense. That's as much as I feel like explaining.

    Beta subscribers can try the demo out here:
    This is the code it takes to add a depth buffer to the swap chain:
    //----------------------------------------------------------------
    // Depth attachment
    //----------------------------------------------------------------
    auto depthformat = VK_FORMAT_D24_UNORM_S8_UINT;
    VkImage depthimage = nullptr;

    VkImageCreateInfo image_info = {};
    image_info.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    image_info.pNext = NULL;
    image_info.imageType = VK_IMAGE_TYPE_2D;
    image_info.format = depthformat;
    image_info.extent.width = chaininfo.imageExtent.width;
    image_info.extent.height = chaininfo.imageExtent.height;
    image_info.extent.depth = 1;
    image_info.mipLevels = 1;
    image_info.arrayLayers = 1;
    image_info.samples = VK_SAMPLE_COUNT_1_BIT;
    image_info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    image_info.usage = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT;
    image_info.queueFamilyIndexCount = 0;
    image_info.pQueueFamilyIndices = NULL;
    image_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
    image_info.flags = 0;
    vkCreateImage(device->device, &image_info, nullptr, &depthimage);

    VkMemoryRequirements memRequirements;
    vkGetImageMemoryRequirements(device->device, depthimage, &memRequirements);

    VmaAllocation allocation = {};
    VmaAllocationInfo allocinfo = {};
    VmaAllocationCreateInfo allocCreateInfo = {};
    allocCreateInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
    VkAssert(vmaAllocateMemory(GameEngine::Get()->renderingthreadmanager->instance->allocator, &memRequirements, &allocCreateInfo, &allocation, &allocinfo));
    VkAssert(vkBindImageMemory(device->device, depthimage, allocinfo.deviceMemory, allocinfo.offset));

    VkImageView depthImageView;
    VkImageViewCreateInfo view_info = {};
    view_info.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
    view_info.pNext = NULL;
    view_info.image = depthimage;
    view_info.format = depthformat;
    view_info.components.r = VK_COMPONENT_SWIZZLE_R;
    view_info.components.g = VK_COMPONENT_SWIZZLE_G;
    view_info.components.b = VK_COMPONENT_SWIZZLE_B;
    view_info.components.a = VK_COMPONENT_SWIZZLE_A;
    view_info.subresourceRange.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
    view_info.subresourceRange.baseMipLevel = 0;
    view_info.subresourceRange.levelCount = 1;
    view_info.subresourceRange.baseArrayLayer = 0;
    view_info.subresourceRange.layerCount = 1;
    view_info.viewType = VK_IMAGE_VIEW_TYPE_2D;
    view_info.flags = 0;
    VkAssert(vkCreateImageView(device->device, &view_info, NULL, &depthImageView));

    VkAttachmentDescription depthAttachment = {};
    depthAttachment.format = depthformat;
    depthAttachment.samples = VK_SAMPLE_COUNT_1_BIT;
    depthAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    depthAttachment.storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    depthAttachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    depthAttachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    depthAttachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    depthAttachment.finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;

    VkAttachmentReference depthAttachmentRef = {};
    depthAttachmentRef.attachment = 1;
    depthAttachmentRef.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;

    VkPipelineDepthStencilStateCreateInfo depthStencil = {};
    depthStencil.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO;
    depthStencil.depthTestEnable = VK_TRUE;
    depthStencil.depthWriteEnable = VK_TRUE;
    depthStencil.depthCompareOp = VK_COMPARE_OP_LESS;
    depthStencil.depthBoundsTestEnable = VK_FALSE;
    depthStencil.minDepthBounds = 0.0f;
    depthStencil.maxDepthBounds = 1.0f;
    depthStencil.stencilTestEnable = VK_FALSE;
    depthStencil.front = {};
    depthStencil.back = {};

    I was hoping I would put a month into it and be up to speed with where we were with OpenGL, but it is much more complicated than that. Using Vulkan is going to be tough, but we will get through it, and I think the benefits will be worthwhile:

    Vulkan makes our new renderer 80% faster.
    Better compatibility (Mac, Intel on Linux).
    There's a lot of demand for Vulkan products, thanks to Khronos and Valve's promotion.
  21. Josh
    Vulkan gives us explicit control over the way data is handled in system and video memory. You can map a buffer into system memory, modify it, and then unmap it (giving it back to the GPU) but it is very slow to have a buffer that both the GPU and CPU can access. Instead, you can create a staging buffer that only the CPU can access, then use that to copy data into another buffer that can only be read by the GPU. Because the GPU buffer may be in-use at the time you want to copy data to it, it is best to insert the copy operation into a command buffer, so it happens after the previous frame is rendered. To handle this, we have a pool of transfer buffers which are retrieved by a command buffer when needed, then released back into the pool once that command buffer is finished drawing. A fence is used to tell when the command buffer completes its operations.
    One issue we came across with OpenGL in Leadwerks was when data was uploaded to the GPU while it was still being accessed to render a frame. You could actually see this on some cards when playing my Asteroids3D game. There was no mechanism in OpenGL to synchronize memory, so the best you could do was put data transfers at the start of your rendering code, and hope that there was enough of a delay before your drawing actually started that the memory copying had completed. With the super low-overhead approach of Vulkan rendering, this problem would become much worse. To deal with this, Vulkan uses explicit memory management with something called pipeline barriers. When you add a command into a Vulkan command buffer, there is no guarantee what order those commands will be executed in, and pipeline barriers allow you to create a point where certain commands must be executed before other ones can begin.
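    Concretely, the copy plus barrier recorded into the command buffer looks something like this. The calls are the standard Vulkan API; the buffer handles and size are placeholders rather than our actual variable names:

    // Copy from the CPU-visible staging buffer to the GPU-only buffer,
    // then insert a barrier so vertex fetch waits for the transfer to finish.
    VkBufferCopy region = {};
    region.srcOffset = 0;
    region.dstOffset = 0;
    region.size = datasize;
    vkCmdCopyBuffer(commandbuffer, stagingbuffer, gpubuffer, 1, &region);

    VkBufferMemoryBarrier barrier = {};
    barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
    barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;        // what must finish first
    barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT; // what must wait
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.buffer = gpubuffer;
    barrier.offset = 0;
    barrier.size = VK_WHOLE_SIZE;
    vkCmdPipelineBarrier(commandbuffer,
        VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_VERTEX_INPUT_BIT,
        0, 0, nullptr, 1, &barrier, 0, nullptr);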
    Here is the order of operations:
    1. Start recording a new command buffer.
    2. Retrieve a staging buffer from the pool and remove it from the pool.
    3. Copy data into the staging buffer.
    4. Insert a command to copy from the staging buffer to the GPU buffer.
    5. Insert a pipeline barrier to make sure data is transferred before drawing begins.
    6. Execute the command buffer.
    7. When the fence is completed, move all staging buffers back into the staging buffer pool.

    In the new game engine, we have several large buffers to store the following data:
    Mesh vertices
    Mesh indices
    Entity 4x4 matrices (and other info)
    A list of visible entity IDs
    Visible light information
    Skeleton animation data

    I found this data tends to fall into two categories.
    Some data is large and only some of it gets updated each frame. This includes entity 4x4 matrices, skeleton animation data, and mesh vertex and index data.

    Other data tends to be smaller and only concerns visible objects. This includes visible entity IDs and light information. This data is updated completely each time a new visibility set arrives.

    The first type of data requires data buffers that can be resized, because they can be very large, and more objects or data might be added at any time. For example, the vertex buffer contains all vertices that exist, in all meshes the user creates or loads. If a new mesh is loaded that requires space greater than the buffer capacity, a new buffer must be created, then the full contents of the old buffer are copied over, directly in GPU memory. A new pipeline barrier is inserted to ensure the data transfer to the new buffer is finished, and then the additional data is copied.
    The second type of data is a bit simpler. If the existing buffer is not big enough, a new bigger buffer is created. Since the entire contents of the buffer are uploaded with each new visibility set, there is no need to copy any existing data from the old buffer.
    I currently have about 2500 lines of Vulkan-specific code. Calling this "boilerplate" is disingenuous, because it is really specific to the way you set your renderer up, but the core mesh rendering system I first implemented in OpenGL is working and I will soon begin adding support for textures.
     
  22. Josh
    I've now got the Vulkan renderer drawing multiple different models in one single pass. This is done by merging all mesh geometry into a single vertex and index buffer and using indirect drawing. I implemented this originally in OpenGL and was able to translate the technique over to Vulkan. This can allow an entire scene to be drawn in just one or a few draw calls, which will make a tremendous improvement in performance in complex scenes like The Zone. In that scene in Leadwerks, the slow step is the rendering routine on the CPU churning through thousands of OpenGL commands, and this design effectively eliminates that entire bottleneck.

    There is no depth buffer in use in the above image, so some triangles appear on top of others they are behind.
    Vulkan provides a lot of control when transferring memory into VRAM, and as a result we saw an 80% performance improvement over OpenGL in our first performance comparison. I have set up a system that uses staging buffers to transfer bits of memory from the CPU into shared memory buffers on the GPU. Another interesting capability is the ability to transfer multiple chunks of data between buffers in just one command.
    However, that control comes at a cost of complexity. At the moment, the above code works fine on Intel graphics but crashes on my discrete Nvidia card. This makes sense because of the way Vulkan handles memory. You have to explicitly synchronize memory yourself using a pipeline barrier. Since Intel graphics just uses system memory I don't think it will have any problems with memory synchronization like a discrete card will.
    That will be the next step, and it is really a complex topic, but my usage of it will be limited, so I think in the end my own code will turn out to be pretty simple. I expect Vulkan 2.0 will probably introduce a lot of simplified paths that will become the default, because this stuff is really just too hard for both beginners and experts. There’s no reason for memory to not be synced automatically and you’re just playing with fire otherwise.
  23. Josh
    Using my box test of over 100,000 boxes, I can compare performance in the new engine using OpenGL and Vulkan side by side. The results are astounding.
    Our new engine uses extensive multithreading to perform culling and rendering on separate threads, bringing down the time the GPU sits around waiting for the CPU to nearly zero.
    Hardware: Nvidia GeForce GTX 1070 (notebook)
    OpenGL: ~380 FPS

    Vulkan: 700+ FPS. FRAPS does not work with Vulkan, so the only FPS counter I have is the Steam one, and the text is very small.

    Vulkan clearly alleviates the data transfer bottleneck the OpenGL version experiences. I am not using a depth buffer in the Vulkan renderer yet, and I expect that will further increase the speed. I'm very happy with these results and I think exclusively relying on Vulkan in the future, together with our new engine designed for modern graphics hardware, will give us great outcomes. 
     
  24. Josh
    Following this tutorial, I have managed to add uniform buffers into my Vulkan graphics pipeline. Since each image in the swapchain has a different graphics pipeline object, and uniform buffers are tied to a pipeline, you end up uploading all the data three times every time it changes. OpenGL might be doing something like this under the hood, but I am not sure this is a good approach. There are three ways to get data to a shader in Vulkan. Push constants are synonymous with GLSL uniforms, although much more restrictive. Uniform buffers are the same as in OpenGL, and this is where I store light data in the clustered forward renderer. Shader storage buffers are the slowest to update, but they can have a very large capacity, usually as big as your entire VRAM. I have the first two working now. Below you can see a range of instanced boxes being rendered with an offset for each instance, which is being read from a uniform buffer using the instance ID as the array index.
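    For comparison, recording a push constant update is a single call with no buffer management at all. This sketch uses the standard Vulkan API; the struct contents are a placeholder, and keep in mind the spec only guarantees 128 bytes of push constant space:

    // Push constants: tiny per-draw data written straight into the command buffer
    struct PushData { float matrix[16]; };
    PushData push = {};
    vkCmdPushConstants(commandbuffer, pipelinelayout,
        VK_SHADER_STAGE_VERTEX_BIT, 0, sizeof(PushData), &push);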

    To prove that Vulkan can render more than just boxes, here is a model loaded from GLTF format and rendered in Vulkan:

    Figuring out the design of the new renderer using OpenGL was very smart. I would not have been able to invent it if I had jumped straight into Vulkan.
  25. Josh
    The Vulkan graphics API is unbelievably complex. To create a render context, you must create a series of images for the front and back buffers (you can create three for triple-buffering). This is called a swap chain. Now, Vulkan operates on the principle of command buffers, which are lists of commands that get sent to the GPU. Guess what? The target image is part of the command buffer! So for each image in your swap chain, you need to maintain a separate command buffer. If anything changes in your program, like the camera clear color, you have to recreate the command buffers...all of them! But some of them will still be in use at the time your frame begins, so you need to store a flag that says "recreate this command buffer when it is time to start rendering with this image / command buffer".
    The whole thing is really bad, but admitting that there is any practical limit to how complex an API should be opens a developer up to ridicule. I make complex technologies easy to use for a living, so I'm just calling it out: this is garbage design. Vulkan is actually good for me because it means fewer people can figure out how to make a game engine, but it's really ridiculous. Khronos has stated that they expect semi-standard open-source code to arise to address these complexities, but I don't see that happening. I'm not going to touch something like AMD's V-EZ because it's just another layer of code that might stop being supported at any time. As a result of the terrible design, Vulkan is going to continue to struggle to gain adoption, and we are now entering an era where the platform holders are in a fight with the application developers about who is responsible for writing graphics drivers.
    I really like some aspects of Vulkan. SPIR-V shaders are great, and I am very glad to be rid of OpenGL's implicit global states, FBO creation, strange resource sharing, and so on. But nobody needs detailed access to the swap chain. Nobody needs to manage their own synchronization. That's what we have graphics drivers for.
    Anyways, here is my test application. The screen will change color when you press the space key, which involves re-creation of the command buffers. The Vulkan stuff is 1300 lines of code.
    vktest3.zip
    The good thing is that although the initial setup is prohibitive, this stuff tends to get compartmentalized away as I add more capabilities, so it gets easier as time goes on. This is very difficult stuff but we will be better off once I get through this.