
Compute Shaders


SpiderPig

I'd like to try using these now.  Is this the way to use them? I found this here.

//Load compute shader
auto module = LoadShaderModule("Shaders/Compute/test.comp.spv");
auto shader = CreateShader();
shader->SetModule(module, SHADER_COMPUTE);

//Create work group
int workercount = 8;
auto workgroup = CreateWorkgroup(shader, workercount, workercount, workercount);

I couldn't find CreateWorkgroup() in the API though.


No, it's really weird stuff. You call World::AddHook with HOOKID_RENDER or HOOKID_TRANSFER as the argument, along with your function. It will pass a class to your function that contains a lot of Vulkan structures, and then it's up to you.


My job is to make tools you love, with the features you want, and performance you can't live without.


The transfer command buffer is executed once. The rendering command buffer may be reused several times.

If something is an action that gets performed once, use the TRANSFER hook ID. If it is normal rendering, it should be in a RENDER hook.


Yes, that is the way it's supposed to work.

When you use a hook, you aren't calling Vulkan commands that get executed immediately. You are usually recording Vulkan commands into a command buffer that is executed at a later time. The transfer command buffer only gets executed once, and the rendering command buffer may be executed several times. This is the way Vulkan works: everything is stored in a command buffer, then the whole command buffer is executed at once.


Hi, sorry that I haven't been here lately. I will check my compute shader implementation against the latest version and provide a small sample together with my implementation.

The idea behind the callbacks is that Josh maintains all shader and texture code optimized for his internal rendering code (which makes absolute sense). So to get access to the Vulkan instance etc., you hook into the TRANSFER or RENDER hook (due to the architecture, these objects are not initialized at startup but on the fly, when the first Vulkan code is used), and in these hooks you need to set up your shader pipeline with the Vulkan API yourself (you can load the shader module with the Ultra Engine command and use that as a base).

 

While i am preparing the sample, here is some pseudo code:

class ComputeDispatchInfo : public Object
{
public:
	shared_ptr<ComputeDispatchInfo> Self;
	shared_ptr<ComputeShader> ComputeShader;
	int Tx;
	int Ty;
	int Tz;
	shared_ptr<World> World;
	void* pushConstants = nullptr;
	size_t pushConstantsSize = 0;
	int pushConstantsOffset = 0;
	ComputeHook hook = ComputeHook::RENDER;
	int callCount = 0;
};

// RENDER/TRANSFER HOOK
void BeginComputeShaderDispatch(const UltraEngine::Render::VkRenderer& renderer, shared_ptr<Object> extra)
{
	auto info = extra->As<ComputeDispatchInfo>();
	if (info != nullptr)
	{
			info->ComputeShader->Dispatch(renderer.commandbuffer, info->Tx, info->Ty, info->Tz, info->pushConstants, info->pushConstantsSize, info->pushConstantsOffset);	
	}
}

void ComputeShader::init(VkDevice device)
{
	if (!_initialized)
	{
		_computePipeLine = make_shared<ComputePipeline>();

		initLayout(device);

		VkComputePipelineCreateInfo info = { VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO };
		info.stage.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
		info.stage.stage = VK_SHADER_STAGE_COMPUTE_BIT;
		info.stage.module = _shaderModule->GetHandle();
		info.stage.pName = "main";
		info.layout = _computePipeLine->pipelineLayout;

		VK_CHECK_RESULT(vkCreateComputePipelines(device, VK_NULL_HANDLE, 1, &info, nullptr, &_computePipeLine->pipeline));

		initLayoutData(device);

		_initialized = true;
	}
}
  
void ComputeShader::BeginDispatch(shared_ptr<World> world, int tx, int ty, int tz, bool oneTime, ComputeHook hook, void* pushData, size_t pushDataSize, int pushDataOffset)
{
	auto info = make_shared<ComputeDispatchInfo>();
	info->ComputeShader = this->Self()->As<ComputeShader>();
	info->Tx = tx;
	info->Ty = ty;
	info->Tz = tz;
	info->World = world;
	info->Self = info;
	info->pushConstants = pushData;
	info->pushConstantsSize = pushDataSize;
	info->pushConstantsOffset = pushDataOffset;
	info->hook = hook;

	if (ComputeShader::DescriptorPool == nullptr)
	{
		ComputeShader::DescriptorPool = make_shared<ComputeDescriptorPool>(world);
	}
	switch (hook)
	{
	case ComputeHook::RENDER:
		world->AddHook(HookID::HOOKID_RENDER, BeginComputeShaderDispatch, info, !oneTime);
		break;
	case ComputeHook::TRANSFER:
		world->AddHook(HookID::HOOKID_TRANSFER, BeginComputeShaderDispatch, info, !oneTime);
		break;
	}
}

void ComputeShader::Dispatch(VkCommandBuffer cBuffer, int tx, int ty, int tz, void* pushData, size_t pushDataSize, int pushDataOffset)
{
	auto manager = UltraEngine::Core::GameEngine::Get()->renderingthreadmanager;

	VkDevice device = manager->device->device;

	_timestampQuery->Init(manager->device->physicaldevice, manager->device->device);

	_timestampQuery->Reset(cBuffer);

	//just for testing
	
	vector<VkImageMemoryBarrier> barriers;

	bool barrierActive = false;
	bool isValid = true;
	auto path = _shaderModule->GetPath();

	for (int index = 0; index < _bufferData.size(); index++)
	{
		if (_bufferData[index]->Texture != nullptr && _bufferData[index]->IsWrite)
		{
			/*auto rt = TextureMemberAccessor::GetRenderTexture(_bufferData[index]->Texture);
			for (int miplevel = 0; miplevel < rt->miplevels; miplevel++)
			{
				auto layout = rt->GetLayout(miplevel);
				if (layout == VK_IMAGE_LAYOUT_UNDEFINED)
				{
					return;
				}
			}*/

			u_int layercount = 1;
			if (_bufferData[index]->Texture->GetType() == TEXTURE_CUBE)
			{
				layercount = 6;
			}

			vks::tools::setImageLayout(cBuffer, _bufferData[index]->Texture->GetImage(), VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_GENERAL,
				{ VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps()-1,VK_REMAINING_MIP_LEVELS, 0, layercount },VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT);

			vks::tools::insertImageMemoryBarrier(cBuffer, _bufferData[index]->Texture->GetImage(), 0, VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_SHADER_READ_BIT,
				VK_IMAGE_LAYOUT_UNDEFINED,
				VK_IMAGE_LAYOUT_GENERAL,
				VK_PIPELINE_STAGE_TRANSFER_BIT,
				VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
				{ VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps() - 1,VK_REMAINING_MIP_LEVELS, 0, layercount });
			//VkImageMemoryBarrier imageMemoryBarrier = {};

			//u_int layercount = 1;
			//if (_bufferData[index]->Texture->GetType() == TEXTURE_CUBE)
			//{
			//	layercount = 6;
			//}
			//imageMemoryBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
			//imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
			//imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_GENERAL;
			//imageMemoryBarrier.image = _bufferData[index]->Texture->GetImage();
			//imageMemoryBarrier.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, (u_int)_bufferData[index]->Texture->CountMipmaps()-1,VK_REMAINING_MIP_LEVELS, 0, layercount};

			//// Acquire barrier for compute queue
			//imageMemoryBarrier.srcAccessMask = 0;
			//imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
			//imageMemoryBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
			//imageMemoryBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
			//vkCmdPipelineBarrier(
			//	cBuffer,
			//	VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
			//	VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
			//	0,
			//	0, nullptr,
			//	0, nullptr,
			//	1, &imageMemoryBarrier);

			//barriers.push_back(imageMemoryBarrier);
			//barrierActive = true;
			break;
		}

	}

	//initializes the layout and Writedescriptors
	init(device);

	//updates the uniform buffer data when needed
	updateData(device);

	// Bind the compute pipeline.
	vkCmdBindPipeline(cBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, _computePipeLine->pipeline);

	// Bind descriptor set.
	vkCmdBindDescriptorSets(cBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, _computePipeLine->pipelineLayout, 0, 1,
		&_computePipeLine->descriptorSet, 0, nullptr);

	if (pushData != nullptr)
	{
		vkCmdPushConstants(cBuffer, _computePipeLine->pipelineLayout, VK_SHADER_STAGE_COMPUTE_BIT, pushDataOffset, pushDataSize, pushData);
	}

	_timestampQuery->write(cBuffer, 0);
	// Dispatch compute job.
	vkCmdDispatch(cBuffer, tx, ty, tz);

	_timestampQuery->write(cBuffer, 1);
	

	if (barrierActive)
	{
		for (int i = 0; i < barriers.size(); i++)
		{
			// Release barrier from compute queue
			barriers[i].srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
			barriers[i].dstAccessMask = 0;
			barriers[i].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
			barriers[i].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
			vkCmdPipelineBarrier(
				cBuffer,
				VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
				VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
				0,
				0, nullptr,
				0, nullptr,
				1, &barriers[i]);
		}
	}
}
  

Note: Much more is needed, since you have to manage your own descriptor sets, VkImages, etc. The idea is that non-Ultra Engine extensions are almost completely unaware of the internal rendering.

  • Intel® Core™ i7-8550U @ 1.80 Ghz 
  • 16GB RAM 
  • INTEL UHD Graphics 620
  • Windows 10 Pro 64-Bit-Version

Ok, Here it is: https://github.com/klepto2/UltraComputeShaderSample

This is a first version of my ComputeShader implementation; it has not been cleaned up much yet.

It contains a small sample demonstrating the usage of uniform buffers and push constants.


Note: some things are missing: You can currently only write to textures, not to uniform buffers.

Also, I plan to make the descriptor layout creation more Vulkan-based and reusable. The shader modules will also be reusable across multiple shaders, and you will be able to choose which entry point is used.


I just forgot: for those who only want to take a quick look, here is how the shader creation is done:

/// <summary>
/// Sample buffer structure.
/// The padding is needed because a vec4 member must start on a 16-byte boundary
/// in the buffer layout the shader expects, so color has to sit at offset 16.
/// </summary>
struct SampleComputeParameters
{
    float size;
    float padding1;
    float padding2;
    float padding3;
    Vec4 color;
};

    SampleComputeParameters sampleParameters;
    sampleParameters.size = 64.0;
    sampleParameters.color = Vec4(1.0, 0.0, 0.0, 1.0);
   
    // Create a first computeshader for uniform buffer usage
    // Normally you will use uniform buffers for data which is not changing, this is just for showcase 
    // and shows that the data can still be updated at runtime
    auto sampleComputePipeLine_Unifom = ComputeShader::Create("Shaders/Compute/simple_test.comp.spv");
    auto targetTexture_uniform = CreateTexture(TEXTURE_2D, 512, 512, TEXTURE_RGBA32, {}, 1, TEXTURE_STORAGE, TEXTUREFILTER_LINEAR);
    // Now we define the descriptor layout; the binding is resolved by the order in which the items are added
    sampleComputePipeLine_Unifom->AddTargetImage(targetTexture_uniform); // Setting up a target image --> binding 0
    sampleComputePipeLine_Unifom->AddUniformBuffer(&sampleParameters, sizeof(SampleComputeParameters), false); // Setting up a uniform buffer --> binding 1

    // Create a second compute shader for push constant usage
    // This is the better way to pass dynamic data
    auto sampleComputePipeLine_Push = ComputeShader::Create("Shaders/Compute/simple_test_push.comp.spv");
    auto targetTexture_push = CreateTexture(TEXTURE_2D, 512, 512, TEXTURE_RGBA32, {}, 1, TEXTURE_STORAGE, TEXTUREFILTER_LINEAR);
    sampleComputePipeLine_Push->AddTargetImage(targetTexture_push);
    sampleComputePipeLine_Push->SetupPushConstant(sizeof(SampleComputeParameters)); // Currently used to initialize the pipeline, may change in the future

    // For demonstration the push-based shader is executed continuously
    // The push-constant data is passed here
    sampleComputePipeLine_Push->BeginDispatch(world, targetTexture_uniform->GetSize().x / 16.0, targetTexture_uniform->GetSize().y / 16.0, 1, false, ComputeHook::TRANSFER, &sampleParameters, sizeof(SampleComputeParameters));

And this is how the push-shader code looks:

#version 450

#extension GL_GOOGLE_include_directive : enable
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable


layout (local_size_x = 16, local_size_y = 16) in;
layout (set = 0, binding = 0, rgba32f) uniform image2D resultImage;

layout (push_constant) uniform Contants
{
	float size;
	vec4 color;
} params;


void main()
{	
    vec2 tex_coords = floor(vec2(gl_GlobalInvocationID.xy) / params.size);
    float mask = mod(tex_coords.x + mod(tex_coords.y, 2.0), 2.0);
	imageStore(resultImage,ivec2(gl_GlobalInvocationID.xy) , mask * params.color);
}

 


  • 6 months later...

You might need to change

ComputeHook::TRANSFER

to

ComputeHook::RENDER

since the behaviour of the hooks changed internally a few updates ago.


Thanks, I did this in BeginDispatch():

switch (hook)
{
case ComputeHook::RENDER:
	world->AddHook(HookID::HOOKID_RENDER, BeginComputeShaderDispatch, info, !oneTime);
	break;
case ComputeHook::TRANSFER:
	world->AddHook(HookID::HOOKID_RENDER, BeginComputeShaderDispatch, info, !oneTime);
	break;
}

It doesn't crash now and the cubes spin but remain pink.



You need to copy the material, shader, and maybe the plugin folders from an existing project. The sample is based on a very old beta.

You also need to remove the LoadShaderFamily part in the section where the materials are loaded (or rename pbr.json to pbr.fam).

I have debugged the app with nvidia nsight and the textures are generated correctly.
