
Possible Memleak with Instantiate


klepto2

The code below is a reduced version of a much more complex problem. It reproduces the issue in an extreme way.

In the production-ready code the memory increases more slowly, but it still slows the program down and eventually leads to an out-of-memory exception.

Things to notice in the program:

  • Memory usage increases quickly
  • The displayed instance count is much higher than it should be
    • The program instantiates 100 planes, so the actual count should be 101 (including the original plane)
    • In the loop the instances are cleared and re-instantiated every frame.

#include "UltraEngine.h"
#include "ComponentSystem.h"
//#include "Steamworks/Steamworks.h"

using namespace UltraEngine;

// Returns the working set size of the current process in bytes, or 0 on failure.
SIZE_T PrintMemoryInfo()
{
    auto myHandle = GetCurrentProcess();
    // Filled in with the process' memory usage details
    PROCESS_MEMORY_COUNTERS pmc;
    // Return the working set size (bytes) if the query succeeds
    if (GetProcessMemoryInfo(myHandle, &pmc, sizeof(pmc)))
        return pmc.WorkingSetSize;
    else
        return 0;
}


int main(int argc, const char* argv[])
{
    
#ifdef STEAM_API_H
    if (not Steamworks::Initialize())
    {
        RuntimeError("Steamworks failed to initialize.");
        return 1;
    }
#endif

    RegisterComponents();

    auto cl = ParseCommandLine(argc, argv);
    
    //Load FreeImage plugin (optional)
    auto fiplugin = LoadPlugin("Plugins/FITextureLoader");

    //Get the displays
    auto displays = GetDisplays();

    //Create a window
    auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);

    if (!AttachConsole(ATTACH_PARENT_PROCESS))
    {
        if (AllocConsole())
        {
            freopen("conin$", "r", stdin);
            freopen("conout$", "w", stdout);
            freopen("conout$", "w", stderr);
        }
    }
    else
    {
        auto consoleHandleOut = GetStdHandle(STD_OUTPUT_HANDLE);
        auto consoleHandleIn = GetStdHandle(STD_INPUT_HANDLE);
        auto consoleHandleErr = GetStdHandle(STD_ERROR_HANDLE);
        if (consoleHandleOut != INVALID_HANDLE_VALUE) {
            freopen("conout$", "w", stdout);
            setvbuf(stdout, NULL, _IONBF, 0);
        }
        if (consoleHandleIn != INVALID_HANDLE_VALUE) {
            freopen("conin$", "r", stdin);
            setvbuf(stdin, NULL, _IONBF, 0);
        }
        if (consoleHandleErr != INVALID_HANDLE_VALUE) {
            freopen("conout$", "w", stderr);
            setvbuf(stderr, NULL, _IONBF, 0);
        }
    }

    //Create a framebuffer
    auto framebuffer = CreateFramebuffer(window);

    //Create a world
    auto world = CreateWorld();

    auto camera = CreateCamera(world);
    camera->SetClearColor(0.125);
    camera->SetFov(70);
    camera->Move(0, 2, -8);

    //Create light
    auto light = CreateDirectionalLight(world);
    light->SetRotation(45, 35, 0);
    light->SetColor(2);

 
    world->RecordStats(true);

    shared_ptr<Entity> main_instance = CreatePlane(world, 10.0,10.0, 256,256);
    vector<shared_ptr<Entity>> instances;


    //Main loop
    while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
    {
        instances.clear();

        for (int i = 0; i < 100; i++)
        {
            instances.push_back(main_instance->Instantiate(world));
        }

        world->Update();
        world->Render(framebuffer);

#ifdef STEAM_API_H
        Steamworks::Update();
#endif
        window->SetText("Instances: " +String(world->renderstats.instances) + " MEM: " + String(PrintMemoryInfo() / 1024) + " kb");
    }

#ifdef STEAM_API_H
    Steamworks::Shutdown();
#endif

    return 0;
}

 

  • Intel® Core™ i7-8550U @ 1.80 Ghz 
  • 16GB RAM 
  • INTEL UHD Graphics 620
  • Windows 10 Pro 64-Bit-Version

The number of instances drawn will normally be higher than the number of instances that exist in view:

1 depth prepass + 1 main pass + 3 for each directional light.

That would account for about 500 instances, and your app is reporting 1000. The count probably includes some things that never actually get drawn, but I will investigate.

I can confirm the memory increase. It should not be too hard to figure out what is failing to release.

My job is to make tools you love, with the features you want, and performance you can't live without.


Okay, I can confirm that twice as many meshes were being drawn as needed. :wacko: This was an easy fix.

I am trying to track down the source of the memory increase. It does not appear to have anything to do with the rendering thread.


With this simple example, which keeps the entities out of any world, we can see that memory usage is very stable.

#include "UltraEngine.h"

using namespace UltraEngine;

int main(int argc, const char* argv[])
{
    auto displays = GetDisplays();
    auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);
    auto framebuffer = CreateFramebuffer(window);
    auto world = CreateWorld();

    shared_ptr<Entity> main_instance = CreatePivot(nullptr);
    vector<shared_ptr<Entity>> instances;

    while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
    {
        instances.clear();

        for (int i = 0; i < 100; i++)
        {
            instances.push_back(main_instance->Instantiate(nullptr));
        }

        world->Update();
        world->Render(framebuffer);

        window->SetText("MEM: " + String(GetMemoryUsage() / 1024 / 1024) + " mb");
    }
    return 0;
}

 


Okay, two more things I found:

  • I added a global list of entities, but since it stores weak pointers, it won't get cleaned up when large numbers of entities are deleted until the user calls GetEntities(). I added an internal call to this method in the world Update method, so it will always trim dead entities from the list.
  • Your example is adding entities faster than the rendering thread can keep up. The entities are added each frame, but the rendering thread keeps them around a little bit longer, so it is being overwhelmed. There is a renderable entity limit of 65536 (some entities take up more than one slot, so it can be a little less than this). This is because the engine stores entity IDs as unsigned short integers on the GPU.

There is a tendency for things to grow a little when the number of items fluctuates, because of how memory resizing is implemented. STL vectors, for example, get bigger when they need to, but they don't release memory if they are resized to a smaller size, on the idea that they may need to grow again. In the same manner, I never make GPU storage buffers smaller. I just let them grow as needed, and if the application needs less space, I keep the extra memory as a reserve. As long as the memory graph in Visual Studio looks flat after a few moments, you are good.

I will check to see if there is anything else I can improve and then do another build and upload these fixes.

#include "UltraEngine.h"

using namespace UltraEngine;

int main(int argc, const char* argv[])
{
    //Get the displays
    auto displays = GetDisplays();

    //Create a window
    auto window = CreateWindow("Ultra Engine", 0, 0, 1280 * displays[0]->scale, 720 * displays[0]->scale, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);

    //Create a framebuffer
    auto framebuffer = CreateFramebuffer(window);

    //Create a world
    auto world = CreateWorld();

    auto camera = CreateCamera(world);
    camera->SetClearColor(0.125);
    camera->SetFov(70);
    camera->Move(0, 2, -8);

    shared_ptr<Entity> main_instance = CreatePlane(world, 10.0, 10.0, 256, 256);
    //shared_ptr<Entity> main_instance = CreateModel(world);
    vector<shared_ptr<Entity>> instances;

    int n = 0;

    //Main loop
    while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
    {
        instances.clear();

        ++n;
        if (n == 100)
        {
            n = 0;
            for (int i = 0; i < 100; i++) instances.push_back(main_instance->Instantiate(world));
        }

        world->Update();
        world->Render(framebuffer);

        window->SetText("MEM: " + String(GetMemoryUsage() / 1024) + " kb");
    }

    return 0;
}

 


  • 1 month later...

@Josh I need to reopen this. The top sample uses a lot of memory again (0.9.5), and after a short time it also produces INVALID_VALUE errors.

I know the rendering is async, but the instance count always says 505 instances instead of 100 (maybe +1 for the camera).

 
