Second Performance Test: nearly 400% faster!


Josh

After observing the behavior of the previous test, I rearranged the threading architecture for even bigger performance gains. This build runs at speeds in excess of 400 FPS with 100,000 entities... on Intel integrated graphics!

Image1.thumb.jpg.0ca7079accccac555bec07f378bd64e8.jpg

I've had more luck with concurrency than with parallelism in this design. (Images below are taken from here.)

Splitting the octree recursion up into separate threads produced only modest gains. It's difficult to optimize because the sparse octree is unpredictable.

concurrency-vs-parallelism-2.png.ef122c5d6445bfea18048bc5a4e947e0.png
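As a rough illustration of the parallel approach described above, the sketch below hands each top-level subtree of a tree to its own worker and merges the results. It is not the engine's code: `Node`, `collect_visible`, `collect_visible_parallel`, and the `visible` flag (a stand-in for a real frustum test) are all hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical sparse octree node; children may be missing or empty."""
    entities: list = field(default_factory=list)
    children: list = field(default_factory=list)
    visible: bool = True  # assumption: stands in for a real visibility test


def collect_visible(node):
    """Serial recursion: gather entities from every visible node."""
    if not node.visible:
        return []
    found = list(node.entities)
    for child in node.children:
        found.extend(collect_visible(child))
    return found


def collect_visible_parallel(root, workers=4):
    """Give each top-level subtree to a worker, then merge in child order."""
    if not root.visible:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_subtree = list(pool.map(collect_visible, root.children))
    found = list(root.entities)
    for part in per_subtree:
        found.extend(part)
    return found
```

When one subtree holds most of the entities, the worker assigned to it does most of the work while the others sit idle, which is one way an unpredictable sparse tree can limit the gain. In CPython the global interpreter lock also prevents these threads from executing bytecode simultaneously, so this shows the structure of the split rather than a speedup.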

Splitting different parts of the engine up into multiple threads did result in a massive performance boost.

concurrency-vs-parallelism-1.png.b04c1a22d05e544e0e6ea88de2f901f9.png
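The concurrent layout described above can be sketched as a pipeline in which each part of the engine runs on its own thread and passes work to the next through a queue. The stage names (`game_logic`, `culling`, `rendering`), the placeholder entity data, and the even/odd visibility test are assumptions made for this example, not the engine's actual architecture.

```python
import queue
import threading

STOP = object()  # sentinel telling the next stage to shut down


def game_logic(frames, to_culling):
    """Stage 1: produce a list of entities for each frame."""
    for frame in range(frames):
        entities = [(frame, i) for i in range(4)]  # placeholder entity data
        to_culling.put(entities)
    to_culling.put(STOP)


def culling(to_culling, to_render):
    """Stage 2: keep only the entities judged visible."""
    while (entities := to_culling.get()) is not STOP:
        visible = [e for e in entities if e[1] % 2 == 0]  # placeholder test
        to_render.put(visible)
    to_render.put(STOP)


def rendering(to_render, drawn):
    """Stage 3: 'draw' each visible list by recording it."""
    while (visible := to_render.get()) is not STOP:
        drawn.append(visible)


def run_pipeline(frames=3):
    """Run all three stages on their own threads and return what was drawn."""
    to_culling = queue.Queue(maxsize=2)
    to_render = queue.Queue(maxsize=2)
    drawn = []
    stages = [
        threading.Thread(target=game_logic, args=(frames, to_culling)),
        threading.Thread(target=culling, args=(to_culling, to_render)),
        threading.Thread(target=rendering, args=(to_render, drawn)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return drawn
```

Each stage does one kind of work from start to finish, so no stage has to be subdivided, and the bounded queues keep a fast stage from running far ahead of a slow one.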

The same test in Leadwerks 4 runs at about 9 FPS, making Leadwerks 5 more than 45 times faster under heavy loads like this.

Alpha subscribers can try the test out here.

5 Comments

When you're doing this threading, it's really more about the processor than the graphics card, isn't it? Are these threads on the CPU or GPU?

CPU. The rendering code is already very optimized and this is about eliminating all overhead on the CPU side.

As each thread is doing a smaller subset of the work, you are probably getting more cache hits. Are you also using thread affinity on your busiest threads to stop them from context switching?

 

I got the culling time down to an insanely low amount, and it would actually be much slower if I split it up into multiple threads:

 

Sorry, I assumed that each task was on a thread, running independently. I was suggesting that if you had more active threads than CPUs, you would experience contention for those CPUs. You would see this in the performance stats as context switches, each of which causes the current context to be saved and another loaded. If this happens a lot, you are losing useful CPU processing power.
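A minimal sketch of the two checks suggested in this comment: capping the number of busy threads at the number of logical CPUs, and reading the process's context-switch counters. The function names `worker_budget` and `context_switches` are hypothetical, and the counters come from the standard `resource` module, which is available on Unix-like systems only.

```python
import os

try:
    import resource  # Unix only; absent on Windows
except ImportError:
    resource = None


def worker_budget(requested_threads):
    """Cap the number of busy threads at the number of logical CPUs."""
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return min(requested_threads, cpus)


def context_switches():
    """Return (voluntary, involuntary) context switches so far, or None."""
    if resource is None:
        return None
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw
```

Sampling `context_switches()` before and after a frame and comparing the involuntary count gives a rough sign of whether threads are being preempted because there are more of them than CPUs.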
