
klepto2

Developers
  • Posts

    854

Posts posted by klepto2

  1. What I have found is that the IDs are stored in the Render... classes (RenderMaterial, RenderTexture, etc.), all in private properties. There is a hackish way to access them (creating a separate class and modifying the original headers), but every update overwrites the modified headers.

    I would also like to see Texture ID access added, which might be useful for PostEffects.

    Maybe some static class extensions for the IDSystem, like:

    IDSystem::GetMaterialID(mat) 

    IDSystem::GetTextureID(tex)

    ...
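
    A minimal sketch of how such accessors could look, assuming the engine declared the accessor class a friend of the render classes. Everything here is hypothetical: the RenderTexture and RenderMaterial classes below are empty stand-ins, not the engine's real types, and the id type is a guess.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the engine's internal render classes.
class RenderTexture
{
public:
    explicit RenderTexture(std::uint32_t id) : id_(id) {}
private:
    std::uint32_t id_;      // private, as described in the post
    friend class IDSystem;  // only the accessor class may read it
};

class RenderMaterial
{
public:
    explicit RenderMaterial(std::uint32_t id) : id_(id) {}
private:
    std::uint32_t id_;
    friend class IDSystem;
};

// Hypothetical static accessors, named after the suggestion above.
class IDSystem
{
public:
    static std::uint32_t GetTextureID(const RenderTexture& tex) { return tex.id_; }
    static std::uint32_t GetMaterialID(const RenderMaterial& mat) { return mat.id_; }
};
```

    With this shape the ids stay private to user code, and no header needs to be modified by hand after an update.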

    • Like 1
    • Upvote 1
  2. I forgot to add this for those who just want to see how the shader creation is done:

    /// <summary>
    /// Sample buffer structure.
    /// The padding is needed because in Vulkan the buffer passed to a shader needs an alignment of 16 bytes.
    /// </summary>
    struct SampleComputeParameters
    {
        float size;
        float padding1;
        float padding2;
        float padding3;
        Vec4 color;
    };
    
    	SampleComputeParameters sampleParameters;
        sampleParameters.size = 64.0;
        sampleParameters.color = Vec4(1.0, 0.0, 0.0, 1.0);
       
        // Create a first compute shader that uses a uniform buffer
        // Normally you would use uniform buffers for data that does not change; this is just a showcase
        // and shows that the data can still be updated at runtime
        auto sampleComputePipeLine_Unifom = ComputeShader::Create("Shaders/Compute/simple_test.comp.spv");
        auto targetTexture_uniform = CreateTexture(TEXTURE_2D, 512, 512, TEXTURE_RGBA32, {}, 1, TEXTURE_STORAGE, TEXTUREFILTER_LINEAR);
        // Now we define the descriptor layout; the binding is resolved by the order in which the items are added
        sampleComputePipeLine_Unifom->AddTargetImage(targetTexture_uniform); // Setting up a target image --> binding 0
        sampleComputePipeLine_Unifom->AddUniformBuffer(&sampleParameters, sizeof(SampleComputeParameters), false); // Setting up a uniform buffer --> binding 1
    
        // Create a second compute shader that uses push constants
        // This is the better way to pass dynamic data
        auto sampleComputePipeLine_Push = ComputeShader::Create("Shaders/Compute/simple_test_push.comp.spv");
        auto targetTexture_push = CreateTexture(TEXTURE_2D, 512, 512, TEXTURE_RGBA32, {}, 1, TEXTURE_STORAGE, TEXTUREFILTER_LINEAR);
        sampleComputePipeLine_Push->AddTargetImage(targetTexture_push);
        sampleComputePipeLine_Push->SetupPushConstant(sizeof(SampleComputeParameters)); // Currently used to initialize the pipeline, may change in the future
    
        // For demonstration, the push-based shader is executed continuously
        // The push-constant data is passed here
        sampleComputePipeLine_Push->BeginDispatch(world, targetTexture_push->GetSize().x / 16.0, targetTexture_push->GetSize().y / 16.0, 1, false, ComputeHook::TRANSFER, &sampleParameters, sizeof(SampleComputeParameters));
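
    A minimal sketch of what the padding achieves and how the group counts can be derived. It assumes the engine's Vec4 is four tightly packed floats (the Vec4 below is only a stand-in), and GroupCount is a hypothetical helper that is not part of the sample. In the shader's block layout a vec4 must start on a 16-byte boundary, so the C++ struct has to place color at offset 16.

```cpp
#include <cstddef>

// Stand-in for the engine's Vec4 (assumed: four packed floats).
struct Vec4 { float x, y, z, w; };

// Same member order as the struct in the sample above.
struct SampleComputeParameters
{
    float size;
    float padding1;
    float padding2;
    float padding3;
    Vec4 color;
};

// The shader-side vec4 sits at offset 16, so the CPU-side struct must match.
static_assert(offsetof(SampleComputeParameters, color) == 16, "color must start at byte 16");
static_assert(sizeof(SampleComputeParameters) == 32, "struct must be 32 bytes");

// Hypothetical helper: number of work groups needed to cover `pixels`
// with groups of `localSize` invocations, rounding up.
constexpr int GroupCount(int pixels, int localSize)
{
    return (pixels + localSize - 1) / localSize;
}
```

    For the 512 x 512 textures in the sample, GroupCount(512, 16) gives the same 32 groups as the plain division. For sizes that are not a multiple of 16, rounding up launches extra invocations, so the shader would then need a bounds check before imageStore.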

    And this is how the push-shader code looks:

    #version 450
    
    #extension GL_GOOGLE_include_directive : enable
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    
    
    layout (local_size_x = 16, local_size_y = 16) in;
    layout (set = 0, binding = 0, rgba32f) uniform image2D resultImage;
    
    layout (push_constant) uniform Constants
    {
    	float size;
    	vec4 color;
    } params;
    
    
    void main()
    {	
        vec2 tex_coords = floor(vec2(gl_GlobalInvocationID.xy) / params.size);
        float mask = mod(tex_coords.x + mod(tex_coords.y, 2.0), 2.0);
    	imageStore(resultImage,ivec2(gl_GlobalInvocationID.xy) , mask * params.color);
    }
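
    As a CPU-side reference for the shader above, here is a small sketch that computes the same checkerboard mask per pixel. CheckerMask is a hypothetical name; the logic just mirrors the two lines in main(). For the non-negative values used here, std::fmod gives the same result as GLSL's mod.

```cpp
#include <cmath>

// Returns 1.0f where the shader writes params.color and 0.0f where it
// writes black, for pixel (x, y) and a tile edge length of `size`.
float CheckerMask(int x, int y, float size)
{
    // floor(vec2(gl_GlobalInvocationID.xy) / params.size)
    const float tx = std::floor(static_cast<float>(x) / size);
    const float ty = std::floor(static_cast<float>(y) / size);

    // mod(tex_coords.x + mod(tex_coords.y, 2.0), 2.0)
    return std::fmod(tx + std::fmod(ty, 2.0f), 2.0f);
}
```

    With size = 64, neighbouring 64 x 64 tiles alternate between 0 and 1, which is the pattern the compute shader stores into the target image.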

     

    • Thanks 1
  3. OK, here it is: https://github.com/klepto2/UltraComputeShaderSample

    This is a first version of my ComputeShader implementation; it has not been cleaned up much yet.

    It contains a small sample demonstrating the usage of uniform buffers and push constants.


    Note: some things are still missing. Currently you can only write to textures, not to uniform buffers.

    Also, I plan to make the descriptor layout creation more Vulkan-based and reusable. The shader modules will be reusable across multiple shaders as well, and you will be able to choose which entry point to use.

    • Like 2
    • Thanks 1
  4. Try it like this:

    class NewWidget : public Widget
    {
    protected:
        virtual void Draw(const int x, const int y, const int width, const int height)
        {
            if (blocks.size() < 4)
            {
                for (int i = 0; i < 4; i++)
                {
                    auto px = LoadPixmap("https://raw.githubusercontent.com/UltraEngine/Documentation/master/Assets/Materials/Ground/dirt01.dds");
                    int block = AddBlock(px, iVec2(i * 64, 0), Vec4(1));
                    blocks[block].size = iVec2(64, 64);
                }
            }
        }
    
    public:
        static shared_ptr<NewWidget> create(const int x, const int y, const int width, const int height, shared_ptr<Widget> parent, const int style = 0)
        {
            struct Struct : public NewWidget {
            };
            auto instance = std::make_shared<Struct>();
            instance->Initialize("", x, y, width, height, parent, style);
            return instance;
        }
    
        NewWidget()
        {
            
            
        }
    };

     

    • Like 1
    • Thanks 1
  5. Hi, sorry that I wasn't here lately. I will check my compute shader implementation against the latest version and provide a small sample together with it.

    The idea behind the callbacks is that Josh maintains all shader and texture code, optimized for his internal rendering code (which makes absolute sense). So to get access to the Vulkan instance etc., you hook into the TRANSFER or RENDER hook (due to the architecture, these objects are not initialized at startup but on the fly, when the first Vulkan code is used), and in these hooks you need to set up your shader pipeline with the Vulkan API yourself (you can load the shader module with the Ultra Engine command and use that as a base).

    While I am preparing the sample, here is some pseudocode:

    class ComputeDispatchInfo : public Object
    {
    public:
    	shared_ptr<ComputeDispatchInfo> Self;
    	shared_ptr<ComputeShader> ComputeShader;
    	int Tx;
    	int Ty;
    	int Tz;
    	shared_ptr<World> World;
    	void* pushConstants = nullptr;
    	size_t pushConstantsSize = 0;
    	int pushConstantsOffset = 0;
    	ComputeHook hook = ComputeHook::RENDER;
    	int callCount = 0;
    };
    
    // RENDER/TRANSFER HOOK
    void BeginComputeShaderDispatch(const UltraEngine::Render::VkRenderer& renderer, shared_ptr<Object> extra)
    {
    	auto info = extra->As<ComputeDispatchInfo>();
    	if (info != nullptr)
    	{
    			info->ComputeShader->Dispatch(renderer.commandbuffer, info->Tx, info->Ty, info->Tz, info->pushConstants, info->pushConstantsSize, info->pushConstantsOffset);	
    	}
    }
    
    void ComputeShader::init(VkDevice device)
    {
    	if (!_initialized)
    	{
    		_computePipeLine = make_shared<ComputePipeline>();
    
    		initLayout(device);
    
    		VkComputePipelineCreateInfo info = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO };
    		info.stage.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    		info.stage.stage = VK_SHADER_STAGE_COMPUTE_BIT;
    		info.stage.module = _shaderModule->GetHandle();
    		info.stage.pName = "main";
    		info.layout = _computePipeLine->pipelineLayout;
    
    		VK_CHECK_RESULT(vkCreateComputePipelines(device, VK_NULL_HANDLE, 1, &info, nullptr, &_computePipeLine->pipeline));
    
    		initLayoutData(device);
    
    		_initialized = true;
    	}
    }
      
    void ComputeShader::BeginDispatch(shared_ptr<World> world, int tx, int ty, int tz, bool oneTime, ComputeHook hook, void* pushData, size_t pushDataSize, int pushDataOffset)
    {
    	auto info = make_shared<ComputeDispatchInfo>();
    	info->ComputeShader = this->Self()->As<ComputeShader>();
    	info->Tx = tx;
    	info->Ty = ty;
    	info->Tz = tz;
    	info->World = world;
    	info->Self = info;
    	info->pushConstants = pushData;
    	info->pushConstantsSize = pushDataSize;
    	info->pushConstantsOffset = pushDataOffset;
    	info->hook = hook;
    
    	if (ComputeShader::DescriptorPool == nullptr)
    	{
    		ComputeShader::DescriptorPool = make_shared<ComputeDescriptorPool>(world);
    	}
    	switch (hook)
    	{
    	case ComputeHook::RENDER:
    		world->AddHook(HookID::HOOKID_RENDER, BeginComputeShaderDispatch, info, !oneTime);
    		break;
    	case ComputeHook::TRANSFER:
    		world->AddHook(HookID::HOOKID_TRANSFER, BeginComputeShaderDispatch, info, !oneTime);
    		break;
    	}
    }
    
    void ComputeShader::Dispatch(VkCommandBuffer cBuffer, int tx, int ty, int tz, void* pushData, size_t pushDataSize, int pushDataOffset)
    {
    	auto manager = UltraEngine::Core::GameEngine::Get()->renderingthreadmanager;
    
    	VkDevice device = manager->device->device;
    
    	_timestampQuery->Init(manager->device->physicaldevice, manager->device->device);
    
    	_timestampQuery->Reset(cBuffer);
    
    	//just for testing
    	
    	vector<VkImageMemoryBarrier> barriers;
    
    	bool barrierActive = false;
    	bool isValid = true;
    	auto path = _shaderModule->GetPath();
    
    	for (int index = 0; index < _bufferData.size(); index++)
    	{
    		if (_bufferData[index]->Texture != nullptr && _bufferData[index]->IsWrite)
    		{
    			/*auto rt = TextureMemberAccessor::GetRenderTexture(_bufferData[index]->Texture);
    			for (int miplevel = 0; miplevel < rt->miplevels; miplevel++)
    			{
    				auto layout = rt->GetLayout(miplevel);
    				if (layout == VK_IMAGE_LAYOUT_UNDEFINED)
    				{
    					return;
    				}
    			}*/
    
    			u_int layercount = 1;
    			if (_bufferData[index]->Texture->GetType() == TEXTURE_CUBE)
    			{
    				layercount = 6;
    			}
    
    			vks::tools::setImageLayout(cBuffer, _bufferData[index]->Texture->GetImage(), VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_GENERAL,
    				{ VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps()-1,VK_REMAINING_MIP_LEVELS, 0, layercount },VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT);
    
    			vks::tools::insertImageMemoryBarrier(cBuffer, _bufferData[index]->Texture->GetImage(), 0, VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_SHADER_READ_BIT,
    				VK_IMAGE_LAYOUT_UNDEFINED,
    				VK_IMAGE_LAYOUT_GENERAL,
    				VK_PIPELINE_STAGE_TRANSFER_BIT,
    				VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    				{ VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps() - 1,VK_REMAINING_MIP_LEVELS, 0, layercount });
    			//VkImageMemoryBarrier imageMemoryBarrier = {};
    
    			//u_int layercount = 1;
    			//if (_bufferData[index]->Texture->GetType() == TEXTURE_CUBE)
    			//{
    			//	layercount = 6;
    			//}
    			//imageMemoryBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    			//imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    			//imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_GENERAL;
    			//imageMemoryBarrier.image = _bufferData[index]->Texture->GetImage();
    			//imageMemoryBarrier.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps()-1,VK_REMAINING_MIP_LEVELS, 0, layercount};
    
    			//// Acquire barrier for compute queue
    			//imageMemoryBarrier.srcAccessMask = 0;
    			//imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
    			//imageMemoryBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    			//imageMemoryBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    			//vkCmdPipelineBarrier(
    			//	cBuffer,
    			//	VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
    			//	VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    			//	0,
    			//	0, nullptr,
    			//	0, nullptr,
    			//	1, &imageMemoryBarrier);
    
    			//barriers.push_back(imageMemoryBarrier);
    			//barrierActive = true;
    			break;
    		}
    
    	}
    
    	//initializes the layout and Writedescriptors
    	init(device);
    
    	//updates the uniform buffer data when needed
    	updateData(device);
    
    	// Bind the compute pipeline.
    	vkCmdBindPipeline(cBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, _computePipeLine->pipeline);
    
    	// Bind descriptor set.
    	vkCmdBindDescriptorSets(cBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, _computePipeLine->pipelineLayout, 0, 1,
    		&_computePipeLine->descriptorSet, 0, nullptr);
    
    	if (pushData != nullptr)
    	{
    		vkCmdPushConstants(cBuffer, _computePipeLine->pipelineLayout, VK_SHADER_STAGE_COMPUTE_BIT, pushDataOffset, pushDataSize, pushData);
    	}
    
    	_timestampQuery->write(cBuffer, 0);
    	// Dispatch compute job.
    	vkCmdDispatch(cBuffer, tx, ty, tz);
    
    	_timestampQuery->write(cBuffer, 1);
    	
    
    	if (barrierActive)
    	{
    		for (int i = 0; i < barriers.size(); i++)
    		{
    			// Release barrier from compute queue
    			barriers[i].srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
    			barriers[i].dstAccessMask = 0;
    			barriers[i].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    			barriers[i].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    			vkCmdPipelineBarrier(
    				cBuffer,
    				VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
    				VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
    				0,
    				0, nullptr,
    				0, nullptr,
    				1, &barriers[i]);
    		}
    	}
    }
      

    Note: Much more is needed, as you have to manage your own descriptor sets, VkImages, etc. The idea is that the non-Ultra Engine extensions are almost completely unaware of the internal rendering.
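
    The hook mechanism described above can be sketched without any engine or Vulkan code. This is only an illustration under assumptions: HookList, DispatchInfo and CountingDispatch are hypothetical names, and the `repeat` flag is assumed to play the role of the `!oneTime` argument passed to AddHook in the pseudocode.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload, loosely modelled on ComputeDispatchInfo.
struct DispatchInfo
{
    int tx = 0, ty = 0, tz = 0;
    int callCount = 0;
};

using Hook = std::function<void(const std::shared_ptr<DispatchInfo>&)>;

// Hypothetical registry: stores callbacks plus their extra data and
// drops one-time hooks after their first run.
class HookList
{
    struct Entry
    {
        Hook callback;
        std::shared_ptr<DispatchInfo> extra;
        bool repeat;
    };
    std::vector<Entry> entries_;

public:
    void Add(Hook callback, std::shared_ptr<DispatchInfo> extra, bool repeat)
    {
        entries_.push_back({ std::move(callback), std::move(extra), repeat });
    }

    // Would be called by the renderer once per frame.
    void Run()
    {
        std::vector<Entry> keep;
        for (auto& e : entries_)
        {
            e.callback(e.extra);
            if (e.repeat) keep.push_back(std::move(e));
        }
        entries_ = std::move(keep);
    }

    std::size_t Size() const { return entries_.size(); }
};

// Stand-in for BeginComputeShaderDispatch: only counts its invocations.
void CountingDispatch(const std::shared_ptr<DispatchInfo>& info)
{
    if (info != nullptr) ++info->callCount;
}
```

    A hook added with repeat = false runs exactly once, which corresponds to a one-time dispatch; a repeating hook runs on every call to Run().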

    • Thanks 2
  6. I like the idea, but I doubt this will be done, as it would be in-game only. The native GUI interface relies on the native drawing methods provided by the OS. This is why you can't use the texture block when you add an interface to a window.

    An alternative would be some kind of tiled pixmap block for this purpose: you define the margins needed, and the block itself creates this kind of behaviour. It would be hard to animate, though (at least performance-wise), as we can't use 3D textures.

    I have an idea which I will try tomorrow :)

    • Like 1
  7. #define ALLOCEVENTID EventId(10001 + __COUNTER__) 
    #define RegisterEventEnum(T) \
    bool operator==(const T& t, const EventId& g) { return static_cast<EventId>(t) == g; }     \
    bool operator==(const EventId& g, const T& t) { return static_cast<EventId>(t) == g; }     \
                                                                                               \
    bool operator!=(const T& t, const EventId& g) { return static_cast<EventId>(t) != g; }     \
    bool operator!=(const EventId& g, const T& t) { return static_cast<EventId>(t) != g; }     \
    \
    EventId& operator<<=(EventId& g, T t) { g = static_cast<EventId>(t); return g; } \
    T& operator<<=(T& t, EventId g) { t = static_cast<T>(g); return t; }; \
    
    
    enum CustomEvent
    {
        TEST_1 = ALLOCEVENTID,
        TEST_2 = ALLOCEVENTID,
    };
    
    RegisterEventEnum(CustomEvent);

    Try this.

    [Edit] Forget it, it is not working. Switch statements work out of the box, but functions will not work without proper casting.
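
    A small sketch of the explicit-cast route mentioned in the edit. The EventId below is a stand-in with made-up values and an assumed fixed underlying type of int (the engine's real enum is not reproduced here), and ToEventId and IsCustomEvent are hypothetical helpers.

```cpp
// Stand-in for the engine's EventId; values and underlying type are assumptions.
enum EventId : int { EVENT_NONE = 0, EVENT_WIDGETACTION = 1 };

// Custom events in their own enum, as in the post.
enum CustomEvent : int { TEST_1 = 10001, TEST_2 = 10002 };

// One explicit conversion point instead of scattered casts.
constexpr EventId ToEventId(CustomEvent e)
{
    return static_cast<EventId>(e);
}

// A function that only accepts EventId. Passing TEST_1 directly would not
// compile, because there is no implicit conversion between unrelated enums.
bool IsCustomEvent(EventId id)
{
    return id == ToEventId(TEST_1) || id == ToEventId(TEST_2);
}
```

    Call sites then read IsCustomEvent(ToEventId(TEST_2)), which keeps the cast visible but in a single place.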

  8. #include "UltraEngine.h"
    
    using namespace UltraEngine;
    
    int main(int argc, const char* argv[])
    {
        //Get the displays
        auto displays = GetDisplays();
    
        //Create a window
        auto window = CreateWindow("Ultra Engine", 0, 0, 800, 600, displays[0], WINDOW_TITLEBAR | WINDOW_RESIZABLE);
    
        //Create User Interface
        auto ui = CreateInterface(window);
        auto sz = ui->root->ClientSize();
    
        //Create widget
        auto panel = CreatePanel(50, 50, sz.x - 100, sz.y - 100, ui->root);
        panel->SetColor(0, 0, 0, 1);
        panel->SetLayout(1, 1, 1, 1);
    
        auto pixmap = CreatePixmap(256, 256);
        pixmap->Fill(0xFF0000FF);
        panel->SetPixmap(pixmap);
    
        auto btnChangeColor1 = CreateButton("Change Color without fill", 50, 10, 200, 30, ui->root);
        auto btnChangeColor2 = CreateButton("Change Color with fill", 260, 10, 200, 30, ui->root);
    
        while (true)
        {
            const Event ev = WaitEvent();
            switch (ev.id)
            {
            case EVENT_WIDGETACTION:
            {
                if (ev.source == btnChangeColor1)
                {
                    int color = Rgba(Random(255), Random(255), Random(255), 255);
                    for (int x = 0; x < pixmap->size.width; x++)
                        for (int y = 0; y < pixmap->size.height; y++)
                            pixmap->WritePixel(x, y, color);
                }
                else if (ev.source == btnChangeColor2)
                {
                    pixmap->Fill(Rgba(Random(255), Random(255), Random(255), 255));
                }
    
                panel->Paint();
                break;
            }
    
            case EVENT_WINDOWCLOSE:
                return 0;
                break;
            }
        }
        return 0;
    }

    I am currently experimenting with redirecting the Scintilla rendering to a pixmap and found a small bug.

    The pixmap shown on the panel (the same happens when using just a WidgetBlock) is only updated when you use the Fill method. Any other pixel manipulation has no visible effect when pixmaps are used in the UI. I assume the underlying Bitmap object is not being updated.

    A nice solution would be a way to mark the pixmap as dirty, as I need to use memcpy for performance reasons, and then the pixmap cannot know whether it has changed or not.
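
    A minimal sketch of the dirty-flag idea, independent of the engine. TrackedPixels is a hypothetical wrapper, not an Ultra Engine class: bulk writes set a flag, and the consumer (for example the widget's paint code) checks and clears it before deciding whether to refresh its native bitmap.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical RGBA pixel store with an explicit dirty flag.
class TrackedPixels
{
public:
    TrackedPixels(int width, int height)
        : data_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 4, 0)
    {
    }

    // Fast bulk update via memcpy; the copy is clamped to the buffer size
    // and the buffer is marked as changed afterwards.
    void CopyFrom(const std::uint8_t* src, std::size_t bytes)
    {
        if (bytes > data_.size()) bytes = data_.size();
        if (bytes == 0) return;
        std::memcpy(data_.data(), src, bytes);
        dirty_ = true;
    }

    // Returns true once per change and resets the flag.
    bool ConsumeDirty()
    {
        const bool wasDirty = dirty_;
        dirty_ = false;
        return wasDirty;
    }

    const std::vector<std::uint8_t>& Data() const { return data_; }

private:
    std::vector<std::uint8_t> data_;
    bool dirty_ = false;
};
```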

  9. The pixmap and texture blocks use the alpha value of the pixmap or texture itself. I do something like this to render alpha-based pixmaps:

    auto data = _scintillaRenderTarget->pixels->Data();
    
    				for (i = 0; i < _scintillaRenderTarget->pixels->GetSize(); i += 4)
    				{
    					index = i;
    					B = data[index]; G = data[index + 1]; R = data[index + 2];
    					data[index] = R; data[index + 1] = G; data[index + 2] = B;
    					data[index + 3] = char(color[WIDGETCOLOR_BACKGROUND].a * 255);
    				}

    But I agree, the source alpha should be multiplied by the destination alpha when used in widgets.
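
    The loop above, written as a self-contained function for reference. This is a sketch under assumptions: the buffer is taken to hold 4 bytes per pixel with red and blue in swapped order, and SwapRedBlueAndSetAlpha is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swaps the first and third channel of every 4-byte pixel and overwrites
// the fourth channel with a constant alpha given in the range [0, 1].
void SwapRedBlueAndSetAlpha(std::vector<std::uint8_t>& pixels, float alpha)
{
    if (alpha < 0.0f) alpha = 0.0f;
    if (alpha > 1.0f) alpha = 1.0f;
    const auto a = static_cast<std::uint8_t>(alpha * 255.0f);

    // i + 3 < size() skips any incomplete trailing pixel.
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4)
    {
        std::swap(pixels[i], pixels[i + 2]);
        pixels[i + 3] = a;
    }
}
```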

    • Upvote 1
  10. #include "UltraEngine.h"
    
    using namespace UltraEngine;
    
    int main(int argc, const char* argv[])
    {
        //Get the displays
        auto displays = GetDisplays();
    
        //Create a window
        auto window = CreateWindow("Ultra Engine", 0, 0, 1280, 720, displays[0], WINDOW_CENTER | WINDOW_TITLEBAR);
    
        //Create a world
        auto world = CreateWorld();
    
        //Create a framebuffer
        auto framebuffer = CreateFramebuffer(window);
    
        //Create a camera
        auto camera = CreateCamera(world);
        camera->SetClearColor(0.125);
        camera->SetFov(70);
        camera->SetPosition(0, 0, -3);
    
        auto texture = CreateTexture(TEXTURE_2D, 512, 512);
        auto pixmap = CreatePixmap(512, 512);
    
        //Main loop
        while (window->Closed() == false and window->KeyDown(KEY_ESCAPE) == false)
        {
            texture->SetPixels(pixmap);
            world->Update();
            world->Render(framebuffer);
        }
        return 0;
    }

    Calling Texture::SetPixels (with either a pixmap or a buffer) every frame leads to rapidly rising memory usage.

  11. When the KTX2 plugin is loaded before the FreeImage plugin, the Pixmap::Save method throws an exception in the KTX2 plugin when trying to save a jpg or png file:

    #include "UltraEngine.h"
    
    using namespace UltraEngine;
    
    #define BUG
    
    int main(int argc, const char* argv[])
    {
    #ifdef BUG
        auto plg_2 = LoadPlugin("Plugins/KTX2TextureLoader");
        auto plg_1 = LoadPlugin("Plugins/FITextureLoader");
    #else
        auto plg_1 = LoadPlugin("Plugins/FITextureLoader");
        auto plg_2 = LoadPlugin("Plugins/KTX2TextureLoader");
    #endif
    
        auto pixmap = CreatePixmap(256, 256);
        pixmap->Save("test.jpg");
        pixmap->Save("test.png");
    
        return 0;
    }

     

  12. Keep in mind that threads are managed by the operating system, and depending on the OS some threads can be blocked or used by other programs. Also, a modern CPU normally supports something called Hyper-Threading, which means you can normally run MaxThreads() * 2 threads in parallel, but this depends on the CPU.

    Normally the OS thread management is highly optimized, so you could push 100 or 1000 threads at once and the OS will handle the execution order by itself. So don't worry too much about the actual CPU usage or the number of threads you push to the CPU.
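
    A short standard C++ sketch of this point, using std::thread instead of the engine's thread API. RunManyThreads and LogicalCores are hypothetical helper names. Note that std::thread::hardware_concurrency() is only a hint: it typically reports logical cores and may return 0 when the value is unknown.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hint for how many threads can really run in parallel (0 if unknown).
unsigned LogicalCores()
{
    return std::thread::hardware_concurrency();
}

// Starts `threadCount` threads at once, regardless of the core count,
// and returns how many of them ran to completion.
int RunManyThreads(int threadCount)
{
    std::atomic<int> finished{ 0 };
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(threadCount));

    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back([&finished] { finished.fetch_add(1); });

    // The OS scheduler decides the execution order; we only wait for all.
    for (auto& t : threads)
        t.join();

    return finished.load();
}
```

    Starting far more threads than there are cores still works; the scheduler simply time-slices them.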

    • Thanks 1
  13. Small addition: This might not apply to the int values in this case; they are just used for simplicity. On most common hardware, aligned int reads and writes are atomic and should work without a lock for reading, although C++ only guarantees this for std::atomic. More complex objects can of course behave differently and may need a read and write mutex or other kinds of memory barriers.
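
    For plain integers, standard C++ offers std::atomic as the portable way to get lock-free reads and writes. A minimal sketch (CountWithAtomic is a hypothetical helper, not engine code):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Several threads increment one shared counter without a mutex.
// With std::atomic<int> no increment is lost.
int CountWithAtomic(int threadCount, int incrementsPerThread)
{
    std::atomic<int> counter{ 0 };
    std::vector<std::thread> threads;

    for (int t = 0; t < threadCount; ++t)
    {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i)
                counter.fetch_add(1);  // atomic read-modify-write
        });
    }

    for (auto& th : threads)
        th.join();

    return counter.load();
}
```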

  14. Normally, I would use a mutex for writing and reading. 

    Sample:

    With only read mutex:

    Thread A : Writes to node x the value 1 --> Just begins writing

    Thread B : Locks the Mutex and reads the value 0 and unlocks the mutex --> Thread A hasn't finished writing the 1 into the memory

    Thread A : Finishes

    Thread C : Locks the mutex, reads the value 1 and unlocks the mutex --> Thread A has finished writing the 1 into memory

    The read results might get out of sync.

    With read and write mutex:

    Thread A : Locks the mutex and writes to node x the value 1 --> Just begins writing

    Thread B : Waits for the unlocking of the mutex

    Thread A : Unlocks the Mutex: --> Finished writing

    Thread B : Locks the mutex and Reads the value 1 from memory and unlocks the mutex afterwards

    Thread C : Locks the Mutex and reads the value 1  from memory and unlocks the mutex afterwards 

    The results are always in sync.

    The read and write approach is of course much slower than just locking the read. You need to keep the locked sections as small as possible and maybe optimize them to lock only when it is really necessary.
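
    The "read and write mutex" variant can be sketched in standard C++ as follows. GuardedValue and StressTest are hypothetical names. The value consists of two fields that must always match, so a reader that sees them differ would have caught a half-finished write.

```cpp
#include <mutex>
#include <thread>

// A value made of two fields that must always be equal.
class GuardedValue
{
public:
    // The writer holds the lock for the whole update.
    void Write(int v)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        a_ = v;
        b_ = v;
    }

    // The reader takes the same lock, so it never sees a partial write.
    bool ReadConsistent() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return a_ == b_;
    }

private:
    mutable std::mutex mutex_;
    int a_ = 0;
    int b_ = 0;
};

// One writer and one reader run concurrently; returns true if the reader
// never observed the two fields out of sync.
bool StressTest(int iterations)
{
    GuardedValue value;
    bool consistent = true;

    std::thread writer([&value, iterations] {
        for (int i = 0; i < iterations; ++i) value.Write(i);
    });
    std::thread reader([&value, &consistent, iterations] {
        for (int i = 0; i < iterations; ++i)
            if (!value.ReadConsistent()) consistent = false;
    });

    writer.join();
    reader.join();
    return consistent;
}
```

    The locked sections cover only the two assignments and the comparison, which keeps the time spent holding the mutex short.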

     

     

     

     

  15. This is how I have done it with the ShaderWatcher:

    void UltraEngine::Utilities::Shader::ShaderWatcher::RunPreprocess(vector<shared_ptr<ShaderFile>> files)
    {
    	int time = Millisecs();
    	Print("Preprocessing... (" + WString(files.size()) + " Shaders)");
    
    	vector<shared_ptr<Thread>> threads;
    
    	for (auto f : files)
    	{
    		threads.push_back(CreateThread(bind(ThreadPreprocess,_compiler, f), true));
    	}
    
    	for (auto t : threads)
    		t->Wait();
    
    	Print("Preprocessing finished... (" + WString(Millisecs() - time) + "ms)");
    }

    A semaphore or mutex isn't needed here, as no resources are shared between the threads. A mutex is a good way to sync access to specific functions which are not thread-safe, e.g. Print. Semaphores (technically a mutex is just a specialized version of a semaphore) can be used for syncing as well, but also to limit the maximum number of threads executing in parallel.
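
    A sketch of the "limit the number of parallel threads" use of a semaphore, in standard C++ rather than the engine's thread API. SimpleSemaphore and PeakConcurrency are hypothetical names; the semaphore is built from a mutex and a condition variable so it does not depend on C++20.

```cpp
#include <algorithm>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore.
class SimpleSemaphore
{
public:
    explicit SimpleSemaphore(int count) : count_(count) {}

    void Acquire()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void Release()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Starts `jobs` threads but lets at most `limit` of them work at the same
// time. Returns the highest number of simultaneously working threads seen.
int PeakConcurrency(int jobs, int limit)
{
    SimpleSemaphore slots(limit);
    std::mutex statsMutex;
    int active = 0;
    int peak = 0;

    std::vector<std::thread> threads;
    for (int i = 0; i < jobs; ++i)
    {
        threads.emplace_back([&] {
            slots.Acquire();
            {
                std::lock_guard<std::mutex> lock(statsMutex);
                ++active;
                peak = std::max(peak, active);
            }
            // Placeholder for the real work, e.g. preprocessing one shader.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            {
                std::lock_guard<std::mutex> lock(statsMutex);
                --active;
            }
            slots.Release();
        });
    }

    for (auto& t : threads)
        t.join();

    return peak;
}
```

    All threads are still created up front, as in RunPreprocess above; the semaphore only limits how many of them are doing work at any moment.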

    • Thanks 1