Blog Comments posted by nick.ace

    Website refresh and Leadwerks 5

    I'm not really sure what you mean by shaders being handled very differently per vendor. Sure, vendors can make architecture-specific optimizations (like reordering instructions to space out texture lookups and improve instruction-level parallelism), but that should be about it. I'm also not sure you can say NVidia handles this best. The GPU performance in the initial benchmarks could be due to many things, including suboptimal image layouts and poor memory management.

    Yeah, you're not going to get much speedup from multithreaded OpenGL drivers. You should be able to see this if you run an OpenGL program, though it might depend on the vendor. However, drivers can do GLSL compilation on multiple threads and can create command buffers ahead of time, since OpenGL typically has several frames of latency. So it should be possible to cache these command buffers. How much this is done in practice (if at all), I'm not sure, and it varies per vendor.

    Website refresh and Leadwerks 5

    @Crazycarpet I think we're arguing the same point: you need to synchronize somewhere.

    12 hours ago, Crazycarpet said:

    As for the drivers for Vulkan, they're only easier to write because SPIR-V is a binary language, whereas GLSL and some other shading languages may interpret the standards differently. Khronos provided a new version of glslang that ensures your GLSL is conformant and compiles it to binary.

    Your point about drivers being easier to write because of SPIR-V shouldn't be the reason. GLSL is pretty clearly defined, and you can easily write a shader that compiles on all platforms because of this. The compilation from GLSL to SPIR-V isn't that difficult, as you can see from human-readable SPIR-V disassembly. SPIR-V was made so that shader compilation on each platform would be faster (for some games this is a big deal) and so that other languages (e.g., HLSL) could compile down to it. A common trend with Vulkan is letting the application come up with abstractions. The driver still has to convert SPIR-V to its internal GPU format.
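    (For reference, the offline compile step with the Khronos reference compiler is a one-liner; the file names here are just examples:

        glslangValidator -V shader.frag -o shader.frag.spv

    The -V flag targets Vulkan-flavored SPIR-V, and the shader stage is inferred from the file extension.)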

    OpenGL drivers are often multithreaded and do this command buffer generation behind the scenes. One of the problems is that they don't know what commands the application will send next, so driver writers take it upon themselves to create heuristics that guess what patterns of commands will come. If you look at driver release notes, they often cite performance speedups for specific games; that's because the vendors tune heuristics for those games, something indie developers don't really have access to. Vulkan seeks to remove this disconnect, and it largely does if you are familiar with the API.

    There are other things that go on in drivers as well, such as how state changes are handled and how memory is managed. Again, these are heuristic-based. For state changes, for example, certain state combinations can be cached in OpenGL drivers. Vulkan makes it the application's responsibility to cache these (with immutable pipeline objects). Maybe a cached state combination gets discarded in OpenGL and is needed again, so there will be a hitch; you may have known that combination would be needed again, but the driver doesn't.

    The problem is that the implementation of certain things might involve more changes than you would expect, and OpenGL gives you few opportunities to work around this. For example, changing blend modes could force your shaders to be recompiled behind the scenes. Yes, your shaders, because many architectures implement blending as programmable.
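    To make that concrete, here's a rough sketch of how Vulkan bakes blend state into the immutable pipeline object (standard alpha blending as the example; the rest of the pipeline setup is elided):

        #include <vulkan/vulkan.h>

        VkPipelineColorBlendAttachmentState blend = {};
        blend.blendEnable         = VK_TRUE;
        blend.srcColorBlendFactor = VK_BLEND_FACTOR_SRC_ALPHA;
        blend.dstColorBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
        blend.colorBlendOp        = VK_BLEND_OP_ADD;
        blend.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                                    VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;

        VkPipelineColorBlendStateCreateInfo blendState =
            {VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO};
        blendState.attachmentCount = 1;
        blendState.pAttachments    = &blend;
        // ...fill out the rest of VkGraphicsPipelineCreateInfo and call
        // vkCreateGraphicsPipelines() once. Changing blend mode at draw time
        // then means binding a different pre-built pipeline, so any shader
        // patching happens at creation, not mid-frame.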

    And don't forget about validation, because OpenGL tries to prevent undefined behavior :). Validation is expensive since you have to account for edge cases that many applications will never run into.

    Website refresh and Leadwerks 5

    5 minutes ago, Josh said:

    The Doom 2016 benchmarks show slightly worse performance in Vulkan vs. OpenGL on Nvidia hardware.

    Yes, at 4K resolution, where the API has little to do with it because you are so GPU-limited. Also, considering those drivers had only been out for 3-4 months and the devs were probably still learning Vulkan, I don't think that's a fair conclusion. Their Vulkan backend was also structured basically like their OpenGL engine (I know because they gave a talk about it at a Vulkan conference). Vulkan drivers are also much easier to write, so there will likely be fewer problems going forward.

    Website refresh and Leadwerks 5

    On 5/23/2017 at 3:13 PM, Crazycarpet said:

    A programmer can make a renderer with Vulkan that never locks a mutex by simply having 1 (or 2) command pools per thread.

    Not sure you can do this for any benefit. Yes, you can have command pools per thread, but you still need to synchronize the submission of command buffers or you get undefined behavior. Building command buffers is what the multithreading is intended for.
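    A minimal sketch of what I mean (device, queue, and queue family index assumed to exist already): recording is lock-free with one pool per thread, but submission to a shared queue still has to be synchronized:

        #include <vulkan/vulkan.h>
        #include <mutex>

        // Each worker thread creates its own pool and records into command
        // buffers allocated from it, so recording needs no locks.
        VkCommandPool MakeThreadLocalPool(VkDevice device, uint32_t queueFamily) {
            VkCommandPoolCreateInfo info = {VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO};
            info.queueFamilyIndex = queueFamily;
            VkCommandPool pool = VK_NULL_HANDLE;
            vkCreateCommandPool(device, &info, nullptr, &pool);
            return pool;
        }

        // Access to a single VkQueue must be externally synchronized.
        std::mutex submitMutex;

        void SubmitCommandBuffer(VkQueue queue, VkCommandBuffer cb) {
            VkSubmitInfo submit = {VK_STRUCTURE_TYPE_SUBMIT_INFO};
            submit.commandBufferCount = 1;
            submit.pCommandBuffers    = &cb;
            std::lock_guard<std::mutex> lock(submitMutex);
            vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
        }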

    On 5/23/2017 at 5:52 PM, Josh said:

    Notice there are no Vulkan-exclusive games in existence.  It's always just an experimental feature people add on to another renderer.

    Microsoft has a solid working API that reliably does everything Vulkan does, except run on Linux, and Apple is not interested in Vulkan.  So really I would say DX12 is the future for a long time.  We will see.

    Are there any games that support only DX12? And no, DX12 runs only on Windows 10, so Microsoft is neglecting a huge chunk of its own consumer base.

    On 5/23/2017 at 3:10 PM, Josh said:

    One will gather up batches of surfaces to be rendered, and the other thread will continuously rendering them.  This is basically how command buffer APIs work, as I understand it.  I don't think Vulkan will offer a big performance increase over the architecture I have in mind.  It's only AMD cards that get a significant boost, and OpenGL is still the only cross-platform graphics API for Windows, Mac, and Linux.

    No, that is not how command buffer APIs work. The point of a command buffer is to bind render passes, descriptor sets, and pipelines, issue draws, etc. It is basically a list of commands that is "precompiled," in a way. For instance, a post-processing stack would benefit from this since it rarely changes.
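    For instance, a post-process pass can be recorded once and then resubmitted every frame; a sketch, with the pipeline and render-pass begin info assumed to be created already:

        #include <vulkan/vulkan.h>

        // Record the "precompiled" command list once; the resulting command
        // buffer can be submitted every frame as long as nothing changes.
        void RecordPostProcess(VkCommandBuffer cb,
                               const VkRenderPassBeginInfo& rpBegin,
                               VkPipeline pipeline) {
            VkCommandBufferBeginInfo begin = {VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
            vkBeginCommandBuffer(cb, &begin);
            vkCmdBeginRenderPass(cb, &rpBegin, VK_SUBPASS_CONTENTS_INLINE);
            vkCmdBindPipeline(cb, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
            vkCmdDraw(cb, 3, 1, 0, 0); // fullscreen triangle
            vkCmdEndRenderPass(cb);
            vkEndCommandBuffer(cb);
        }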

    Vulkan offers improvements over OpenGL in many areas. Your deferred renderer with MSAA could be improved by using input attachments, which would substantially reduce memory bandwidth; subpasses help with this too. Vulkan doesn't validate at runtime by default (validation is moved into optional layers), and validation was a big performance hit in OpenGL. Vulkan has immutable state in the form of pipeline objects, again a big performance improvement. Notice that none of these even involve multithreading, so regardless of how you design your renderer, you will be able to benefit from Vulkan's features. I'm not sure how you can say only AMD cards benefit from Vulkan when the gains are largely CPU-side (i.e., driver improvements). You can even download NVidia's demos if you don't believe me.
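    As a rough sketch of the input-attachment idea (indices and layouts are illustrative, dependencies elided): the G-buffer subpass writes an attachment that the lighting subpass reads back per-pixel, without a round trip through memory as a sampled texture:

        #include <vulkan/vulkan.h>

        VkAttachmentReference gbufferAsColor = {0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL};
        VkAttachmentReference gbufferAsInput = {0, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL};

        VkSubpassDescription subpasses[2] = {};
        // Subpass 0: fill the G-buffer.
        subpasses[0].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;
        subpasses[0].colorAttachmentCount = 1;
        subpasses[0].pColorAttachments    = &gbufferAsColor;
        // Subpass 1: lighting reads the G-buffer as an input attachment,
        // which on tiled GPUs can stay in on-chip memory.
        subpasses[1].pipelineBindPoint    = VK_PIPELINE_BIND_POINT_GRAPHICS;
        subpasses[1].inputAttachmentCount = 1;
        subpasses[1].pInputAttachments    = &gbufferAsInput;
        // ...plus the lighting subpass's own color attachment and the
        // VkSubpassDependency between the two, omitted here.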

    I'm not saying you should use Vulkan; use whatever you like. But the reasons you're dismissing it have no backing.

    Website refresh and Leadwerks 5

    The multithreading sounds great! I think the plugins will add a lot of power to what you can do with the editor.

    Why put culling and rendering on separate threads, though? Culling really shouldn't take that long (and this doesn't reduce latency in any way for VR): https://www.gamedev.net/topic/675170-frustum-culling-question/
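    For a sense of scale, a sphere-vs-frustum test is six dot products per object; a hypothetical sketch, assuming normalized plane equations:

        struct Plane { float nx, ny, nz, d; }; // nx*x + ny*y + nz*z + d = 0

        bool SphereVisible(const Plane planes[6],
                           float cx, float cy, float cz, float radius) {
            for (int i = 0; i < 6; ++i) {
                float dist = planes[i].nx * cx + planes[i].ny * cy
                           + planes[i].nz * cz + planes[i].d;
                if (dist < -radius) return false; // fully outside one plane: culled
            }
            return true; // inside or intersecting the frustum
        }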

    I really think you shouldn't dismiss Vulkan with superficial arguments, though. You should get a better understanding of what Vulkan can do (what problems with OpenGL it solves). Khronos isn't "throwing in the towel"; I'm not sure where you got that from. Khronos is made up of members from the GPU vendors, who collaborated on the specification, so it's not some random standard. It's been one year since Vulkan was released; how do you expect game studios to completely rewrite their engines in that amount of time?

    On 5/23/2017 at 6:45 PM, Josh said:

    They have a dozen specifications, all with lots of logos from all around the tech industry, and OpenGL has been their only hit.

    OpenGL isn't even their most popular graphics API (OpenGL ES ships on virtually every mobile device), let alone their most popular API overall. Also, Vulkan is much more portable than OpenGL ever was. Have you heard of OpenCL, the API used in Autodesk Maya and Adobe Photoshop? Not to mention their other specialty APIs (OpenGL SC, OpenVX, etc.).

    On 5/23/2017 at 6:45 PM, Josh said:

    They're not even trying to get people to adopt Vulkan anymore, now they are trying to make a wrapper for DX12 and Metal.  So really shouldn't "Khronos 3D Portability Standard" be considered the future?

    No, they are proposing an abstraction layer over DX12, Metal, and Vulkan. Vulkan supports more Windows versions than DX12 does, so the abstraction layer mostly exists to work around DX12's and Metal's limitations. If they were "throwing in the towel," why are Google and Valve so invested in Vulkan? Why is Vulkan a first-class API in OpenVR? And why would the Khronos Group propose merging OpenCL with Vulkan?

  1. @Crazycarpet

    Why wouldn't you expect to see big drops at the beginning? Seconds per frame scales linearly, but frames per second is its inverse, so the drop is large at first.
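    To put numbers on it: at 1 ms/frame (1000 fps), one extra millisecond of work drops you to 2 ms/frame = 500 fps; at 33 ms/frame (~30 fps), the same extra millisecond drops you to 34 ms/frame = ~29.4 fps. The identical linear cost shows up as a 500 fps drop at one end of the curve and a 0.6 fps drop at the other.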

     

    @Josh

    It's good to see the bounding-box recalculation being better designed. Have you tried keeping an internal tree (in array form) using structs for bones instead of classes? That should speed up parent-transform lookups immensely.
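    Something like this is what I have in mind; a rough sketch, with Mat4 standing in for the engine's matrix type and the layout assumption that parents precede children:

        struct Mat4 { float m[16]; };                 // stand-in 4x4 matrix type
        Mat4 operator*(const Mat4& a, const Mat4& b); // composition, assumed defined elsewhere

        struct Bone {
            int  parent;  // index into the same array, -1 for the root
            Mat4 local;   // transform relative to the parent
            Mat4 world;   // filled in by the pass below
        };

        // One linear pass over a flat array: every parent lookup is an index,
        // not a pointer chase through heap-allocated class objects.
        void UpdateSkeleton(Bone* bones, int count) {
            for (int i = 0; i < count; ++i)
                bones[i].world = (bones[i].parent < 0)
                    ? bones[i].local
                    : bones[bones[i].parent].world * bones[i].local;
        }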

  2. The interior lighting and design look great!

     

    If it becomes prohibitive, keep a pooling system and only unload mesh and material objects you don't need to reuse. Of course, then you would need to implement your own map-loading logic and use a single map, since the map file format isn't documented; that seems like a lot of work. I think a streaming system would be beneficial here, so you could load content as you get close to it.
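    The pooling side could be as simple as a ref-counted cache; this is only a sketch, with Mesh and LoadMeshFromDisk standing in for whatever the engine actually uses:

        #include <map>
        #include <memory>
        #include <string>

        struct Mesh;                                                      // engine type, stand-in
        std::shared_ptr<Mesh> LoadMeshFromDisk(const std::string& path);  // hypothetical loader

        std::map<std::string, std::shared_ptr<Mesh>> g_meshPool;

        // Reuse a resident mesh when possible; load (and remember) it otherwise.
        std::shared_ptr<Mesh> AcquireMesh(const std::string& path) {
            auto it = g_meshPool.find(path);
            if (it != g_meshPool.end()) return it->second; // no disk hit
            auto mesh = LoadMeshFromDisk(path);
            g_meshPool[path] = mesh;
            return mesh;
        }

        // Drop only the entries nothing outside the pool still references.
        void PurgeUnused() {
            for (auto it = g_meshPool.begin(); it != g_meshPool.end(); ) {
                if (it->second.use_count() == 1) it = g_meshPool.erase(it);
                else ++it;
            }
        }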

  3. Each line segment is made up of two vertices (pixels) right now, so you have 8 vertices. If you share the vertices between line segments, then you will never have gaps. You need to treat the vertex positions (such as (0,9)) as continuous values rather than discrete values.

     

    [attached image: pixel grid diagram showing the offset (blue) rectangle with yellow top/left and red bottom/right fills]

     

    One way to simplify this is to offset the coordinates by (.5, .5), as this is what GPUs have done in the past. The blue rectangle shows the offset coordinates. The yellow fill indicates that a pixel is covered by the top/left; the red indicates the bottom/right. It's not super important for 3D, but for 2D UI it makes more sense.

     

    Rules:

    • We take the floor of the filled-in pixel value, except at the end of a line segment.
    • If the line segment is the bottom or right edge, we take the ceiling instead.

    Under these rules it's mathematically impossible for gaps or overlaps to form (see the sketch below). For rasterizing 3D triangles, barycentric coordinates are used to generalize these rules, but the same idea applies. Either way, you get crisp, accurate edges. :)
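    In code, the same idea is the usual half-open convention; here's a 1D sketch (SetPixel is a hypothetical plot call):

        #include <cmath>

        void SetPixel(int x, int y); // provided by the renderer (hypothetical)

        // Fill the pixels whose centers fall in the half-open span [x0, x1).
        // Two spans that meet at the same coordinate can then never overlap
        // and never leave a gap.
        void FillSpan(float x0, float x1, int y) {
            int first = (int)std::ceil(x0 - 0.5f); // first pixel center >= x0
            int last  = (int)std::ceil(x1 - 0.5f); // exclusive end
            for (int x = first; x < last; ++x)
                SetPixel(x, y);
        }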

  4. If you choose to go the non-global route, why not just use an abstract-coordinate shared vertex between line segments? Then you would only have 4 vertices instead of 8, and this issue would be avoided. Then you just offset the top/left by one, or the bottom/right by one.

  5. Why would it be any slower? There are many benchmarks showing substantial speed improvements (50-90%).

     

    Quote:

    Over time it would have the potential to be faster and provide me with more low-level access to the hardware, although the 1.0 spec doesn't allow anything that OpenGL can't already do.

     

    That's not true at all. I just looked at the spec. There is a vast amount of new material.

     

    Examples:

    -You couldn't multithread rendering with OpenGL, but the current Vulkan spec has commands for recording command buffers on multiple threads.

    -You can assign work to individual cards, so you could now do VRAM stacking in SLI if you wanted. There should be many other uses for this too.

    -Compute shaders. Yes, these are in OpenGL 4.3, but Leadwerks right now targets 4.0.

    -Precompiled shaders. I imagine this would improve game start-up time (see the sketch at the end of this post).

     

    Unless the drivers are so screwed up that they are functionless, I can't see how you would get worse performance. Are there any specifics as to why you think this would be the case?
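    On the precompiled-shaders point, handing a .spv binary to the driver takes only a few lines; a sketch, with reading the file into the vector assumed to happen elsewhere:

        #include <vulkan/vulkan.h>
        #include <cstdint>
        #include <vector>

        // Create a shader module directly from SPIR-V words: no GLSL parsing
        // or in-driver front-end work at game start-up.
        VkShaderModule MakeShaderModule(VkDevice device, const std::vector<uint32_t>& spirv) {
            VkShaderModuleCreateInfo info = {VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO};
            info.codeSize = spirv.size() * sizeof(uint32_t); // size in bytes
            info.pCode    = spirv.data();
            VkShaderModule module = VK_NULL_HANDLE;
            vkCreateShaderModule(device, &info, nullptr, &module);
            return module;
        }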

  6. I agree with cassius. I like the rename of the Indie Edition, but why not call the Professional Edition something like Leadwerks Game Engine: C++ Add-on? For other software, a professional edition usually includes extra support, more features, and sometimes even a different license. Just my 2 cents.

  7. The interface and graphics look so crisp! You and Rick have been doing an awesome job so far! Can't wait for a video!

     

    @Rick, are you loading assets at runtime via Lua or C++? And are you streaming them in somehow (possibly on another thread)? Just curious how you are approaching this.

  8. Quote:

    I was going to skip the GDC, but I then got invited to the Valve party. I guess they're hiring the band Glitch Mob to play at Folsom Nightclub, which is kind of awesome.

    Congrats! I guess they are not into the California rap scene ;)

     

    I like YouGroove's suggestion, though. Maybe even allow other users to edit and add to tutorials that are out of date or incomplete?

  9. I haven't used LE for about a week or so, but I've found that editor performance has gotten a lot worse. I only get about 8 fps in the perspective view, whereas I was getting much more (about 60 fps) before. On a positive note, my game seems to have gotten a speedup, so this seems very strange. Is anyone else having this problem?
