Regular buffers
The way rendering resources are managed by VTK internally imposed some constraints on the design of the compute API. The goal of this document is to explain the order / flow in which WebGPU resources are managed by the compute API and why it was made that way.
Let’s start with a usage example of the compute API when not integrated in an existing rendering pipeline:
// 1)
vtkNew<vtkWebGPUComputeBuffer> inputBuffer;
inputBuffer->SetGroup(0);
inputBuffer->SetBinding(0);
inputBuffer->SetMode(...);
inputBuffer->SetData(...);
vtkNew<vtkWebGPUComputePipeline> computePipeline;
computePipeline->SetShaderSource(...);
computePipeline->SetShaderEntryPoint(...);
// 2)
computePipeline->AddBuffer(inputBuffer);
// ...
// 3)
computePipeline->Dispatch();
vtkWebGPUComputeBuffer objects only hold buffer references and parameters; the real buffer-setup work is done in AddBuffer() and Dispatch(). AddBuffer() uses the information contained in the vtkWebGPUComputeBuffer to create the buffer on the device and upload the given data (for input buffers). The buffers (the wgpu::Buffer as well as the vtkWebGPUComputeBuffer) are added to their respective lists, kept as member variables of the vtkWebGPUComputeAPI. These two lists are mainly used by ReadBufferFromGPU(): the Buffers list is used to retrieve the size of the buffer, and the wgpu::Buffer itself is needed in the mapping process.
// Adding the buffers to the lists
this->Buffers.push_back(buffer);
this->WGPUBuffers.push_back(wgpuBuffer);
The AddBuffer() method also creates the BindGroupLayoutEntry and BindGroupEntry associated with
the buffer. These entries will be used when creating the BindGroupLayout and BindGroup in Dispatch():
// Creating the layout entry and the bind group entry for this buffer. These entries will be used
// later when creating the bind groups / bind group layouts
AddBindGroupLayoutEntry(buffer->GetGroup(), buffer->GetBinding(), buffer->GetMode());
AddBindGroupEntry(wgpuBuffer, buffer->GetGroup(), buffer->GetBinding(), buffer->GetMode(), 0);
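To make the "record now, create later" idea concrete, here is a minimal sketch in plain C++ with no WebGPU dependency. SketchLayoutEntry and SketchEntryRecorder are hypothetical names, and storing the entries per group in a std::map is an assumption made for illustration, not necessarily how the compute API stores them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for what is recorded per buffer (not a wgpu type).
struct SketchLayoutEntry
{
  std::uint32_t Binding;
  int Mode; // placeholder for the buffer mode
};

// Hypothetical recorder: entries are stored per group, nothing is created on a device.
class SketchEntryRecorder
{
public:
  void AddBindGroupLayoutEntry(std::uint32_t group, std::uint32_t binding, int mode)
  {
    this->LayoutEntries[group].push_back(SketchLayoutEntry{ binding, mode });
  }

  // Number of distinct groups, i.e. how many bind groups a later dispatch would build.
  std::size_t GetGroupCount() const { return this->LayoutEntries.size(); }

  // Number of entries recorded for one group (0 if the group was never used).
  std::size_t GetEntryCount(std::uint32_t group) const
  {
    auto it = this->LayoutEntries.find(group);
    return it == this->LayoutEntries.end() ? 0 : it->second.size();
  }

private:
  std::map<std::uint32_t, std::vector<SketchLayoutEntry>> LayoutEntries;
};
```

Recording two buffers in group 0 and one in group 1 leaves the recorder with two groups, which is the information a later dispatch needs to build one bind group per group.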
Dispatch() not only dispatches the compute shader but also finishes the creation of the pipeline:
if (!Initialized)
{
CreateShaderModule();
CreateBindGroups();
CreateComputePipeline();
Initialized = true;
}
The BindGroups are created in the Dispatch() call because it is only when the user dispatches
the compute shader that we can be sure that all the buffers have been given, and thus that we can actually
create the BindGroups (and the pipeline that goes with them).
The shader is compiled and the pipeline is created in the same Dispatch() call.
The takeaway is that some operations must wait until Dispatch() is called,
because this is the only moment where we can be sure the user has provided all the information we
need to fully set up the pipeline. An alternative would have been to provide a Finish() method that
the user calls before Dispatch(). However, this would be a non-intuitive additional step for the user,
and having to call a Finish() function would expose the internal workings of the API
(namely that some operations need to be completed before a dispatch is possible).
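The design choice described above (finish the setup on the first Dispatch() instead of requiring a separate Finish() call) can be sketched without any GPU code. SketchLazyPipeline is a hypothetical class; its counters exist only to make the behavior observable and are not part of the real API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pipeline that defers its setup until the first Dispatch().
class SketchLazyPipeline
{
public:
  // Only records the buffer; the set of buffers is still open at this point.
  void AddBuffer(const std::string& name) { this->BufferNames.push_back(name); }

  void Dispatch()
  {
    if (!this->Initialized)
    {
      // The user has dispatched, so the buffer list is assumed complete:
      // this is where bind groups and the pipeline would be created.
      this->BoundBufferCount = this->BufferNames.size();
      ++this->SetupCount;
      this->Initialized = true;
    }
    ++this->DispatchCount;
  }

  std::size_t GetBoundBufferCount() const { return this->BoundBufferCount; }
  int GetSetupCount() const { return this->SetupCount; }
  int GetDispatchCount() const { return this->DispatchCount; }

private:
  std::vector<std::string> BufferNames;
  std::size_t BoundBufferCount = 0;
  int SetupCount = 0;
  int DispatchCount = 0;
  bool Initialized = false;
};
```

Calling Dispatch() twice after adding two buffers runs the setup once and sees both buffers, without the caller ever invoking an explicit finalization step.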
Render buffers
We use the same logic (complete the setup of the pipeline once everything is configured) when the compute pipeline is integrated in a rendering pipeline:
// Getting the WebGPUMapper to access the point attribute render buffers
vtkWebGPUPolyDataMapper* webGPUMapper = vtkWebGPUPolyDataMapper::SafeDownCast(mapper);
// 1)
int bufferGroup = 0, bufferBinding = 0;
int uniformsGroup = 0, uniformsBinding = 1;
vtkSmartPointer<vtkWebGPUComputeRenderBuffer> pointColorsRenderBuffer =
webGPUMapper->AcquirePointAttributeComputeRenderBuffer(
vtkWebGPUPolyDataMapper::PointDataAttributes::COLORS, bufferGroup, bufferBinding,
uniformsGroup, uniformsBinding);
vtkNew<vtkWebGPUComputePipeline> myPipeline;
// 2)
myPipeline->AddRenderBuffer(pointColorsRenderBuffer);
// ...
// ...
// ...
// 3)
renWin->Render();
1) The role of AcquirePointAttributeComputeRenderBuffer() (or the cell attribute equivalent) is to set up the vtkWebGPUComputeRenderBuffer (which is only a parameter holder, same as vtkWebGPUComputeBuffer) so that it is ready to be created once we have all the information required. The buffer (which is returned by the function) is also added to a list of vtkWebGPUComputeRenderBuffer held by the WebGPUMapper that will be used later.
2) AddRenderBuffer() also acts as a setup step. No creation is done here (contrary to AddBuffer(), which does some buffer creation / data upload) because we do not have the data attributes buffer of the mapper yet. This data attribute buffer is created only when rendering, so that is where most of the work is done.
3) Calling the render window’s Render() function will in turn end up calling vtkWebGPURenderer::DeviceRender(). It is the DeviceRender() method that will actually create the device buffers by calling UpdateComputePipelines(), which in turn calls UpdateComputeBuffers():
void vtkWebGPURenderer::DeviceRender()
{
// 1)
// mappers prepare geometry SSBO and pipeline layout.
this->UpdateGeometry();
this->CreateBuffers(); // 2)
this->UpdateBufferData(); // 3)
// 4)
this->UpdateComputePipelines();
// 5)
this->ComputePass();
// ...
}
1) The actual data buffer that contains the point/cell attributes is created by the UpdateGeometry() call.
2) CreateBuffers() creates the transform / light buffers of the scene.
3) UpdateBufferData() uploads the transform / light data to their buffers.
4) Now that the mapper data buffer has been created, it is possible to completely set up the render buffer (previously added to a list held by the mapper) and add it to the pipeline. This is done by UpdateComputePipelines(). This function loops over all the pipelines that have yet to be configured and sets their vtkWebGPUConfiguration to be the same as the one from the vtkWebGPURenderWindow. This is necessary because a compute pipeline that uses buffers created by the wgpu::Device of the render window has to use the same wgpu::Device (and adapter). For each pipeline configured this way (that is, each pipeline that had not been set up before), we also loop through all the actors of the renderer. For each actor, we retrieve the render buffers that have yet to be set up and we set them up. The setup of each buffer can now be completed since we have access to the size of the mapper’s data (point/cell attributes) buffer, which was not created until now. Once the render buffer is set up, it is added to its compute pipeline with vtkWebGPUComputePipeline::SetupRenderBuffer():
void vtkWebGPURenderer::UpdateComputePipelines()
{
for (vtkSmartPointer<vtkWebGPUComputePipeline> computePipeline : this->NotSetupComputePipelines)
{
computePipeline->SetWGPUConfiguration(webGPURenderWindow->GetWGPUConfiguration());
this->UpdateComputeBuffers(computePipeline);
}
}
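// Abridged excerpt: the parameter list and the loop over the actors of the renderer are omitted.
// wgpuMapper stands for the vtkWebGPUPolyDataMapper of the actor currently being processed.
void vtkWebGPURenderer::UpdateComputeBuffers()
{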
for (vtkSmartPointer<vtkWebGPUComputeRenderBuffer> renderBuffer : wgpuMapper->GetComputeRenderBuffers())
{
vtkWebGPUPolyDataMapper::PointDataAttributes bufferAttribute = renderBuffer->GetBufferAttribute();
// Setup of the render buffer
renderBuffer->SetMode(vtkWebGPUComputeBuffer::BufferMode::READ_WRITE_COMPUTE_STORAGE);
renderBuffer->SetByteSize(wgpuMapper->GetPointAttributeByteSize(bufferAttribute));
renderBuffer->SetRenderBufferOffset(wgpuMapper->GetPointAttributeByteOffset(bufferAttribute) / sizeof(float));
renderBuffer->SetRenderBufferElementCount(wgpuMapper->GetPointAttributeByteSize(bufferAttribute) / wgpuMapper->GetPointAttributeElementSize(bufferAttribute));
renderBuffer->SetWGPUBuffer(wgpuMapper->GetPointDataWGPUBuffer(bufferAttribute));
// Setup done, the render buffer can be added to the pipeline
vtkWebGPUComputePipeline* associatedPipeline = renderBuffer->GetAssociatedPipeline();
associatedPipeline->SetupRenderBuffer(renderBuffer);
}
}
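The offset and element-count arithmetic in the excerpt above is easy to check on paper. The sketch below reproduces only that arithmetic; SketchAttributeRange and the numbers used in the usage note are hypothetical, and the sketch assumes 4-byte floats.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "the sketch assumes 4-byte floats");

// Hypothetical description of where one attribute lives inside the mapper's data buffer.
struct SketchAttributeRange
{
  std::size_t ByteOffset;      // start of the attribute, in bytes
  std::size_t ByteSize;        // total size of the attribute, in bytes
  std::size_t ElementByteSize; // size of one element (e.g. one color), in bytes
};

// Offset expressed in floats, as done with "byte offset / sizeof(float)" in the excerpt.
constexpr std::size_t OffsetInFloats(const SketchAttributeRange& range)
{
  return range.ByteOffset / sizeof(float);
}

// Number of elements, as done with "byte size / element size" in the excerpt.
constexpr std::size_t ElementCount(const SketchAttributeRange& range)
{
  return range.ByteSize / range.ElementByteSize;
}
```

For an attribute that starts at byte 48, spans 96 bytes and uses 16 bytes per element, the offset is 48 / 4 = 12 floats and the element count is 96 / 16 = 6.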
After all the render buffers have been added to their corresponding pipelines (done only once on the first frame), the pipelines can be dispatched:
void vtkWebGPURenderer::DeviceRender()
{
// ...
this->ComputePass();
// ...
}
void vtkWebGPURenderer::ComputePass()
{
// Executing the compute pipelines before the rendering so that the
// render can take the compute pipelines results into account
for (vtkWebGPUComputePipeline* pipeline : this->ComputePipelines)
{
pipeline->Dispatch();
}
}
The bind groups of the pipelines will be created by the Dispatch() call as explained
in the ‘Regular buffers’ section and everything will be in order.
So overall, some parts of the setup are done immediately while other parts are done
only when rendering a frame / calling Dispatch() because we may not have all the required
pieces of information until the first frame is rendered / Dispatch() is called.
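The flow of this section (pipelines wait in a "not set up" list, receive the render window's device on the first frame, and are then dispatched every frame) can be sketched as follows. All Sketch* types are hypothetical, and clearing the pending list after the first frame is an assumption that matches the "done only once on the first frame" remark above.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the device owned by the render window.
struct SketchDevice
{
};

class SketchPendingPipeline
{
public:
  void SetDevice(std::shared_ptr<SketchDevice> device) { this->Device = std::move(device); }
  std::shared_ptr<SketchDevice> GetDevice() const { return this->Device; }
  void Dispatch() { ++this->DispatchCount; }
  int GetDispatchCount() const { return this->DispatchCount; }

private:
  std::shared_ptr<SketchDevice> Device;
  int DispatchCount = 0;
};

class SketchRenderer
{
public:
  explicit SketchRenderer(std::shared_ptr<SketchDevice> device)
    : Device(std::move(device))
  {
  }

  void AddComputePipeline(std::shared_ptr<SketchPendingPipeline> pipeline)
  {
    this->ComputePipelines.push_back(pipeline);
    this->NotSetupComputePipelines.push_back(pipeline);
  }

  // One frame: finish any pending setup, then run the compute pass.
  void Render()
  {
    this->UpdateComputePipelines();
    for (const auto& pipeline : this->ComputePipelines)
    {
      pipeline->Dispatch();
    }
  }

  int GetSetupCount() const { return this->SetupCount; }

private:
  void UpdateComputePipelines()
  {
    for (const auto& pipeline : this->NotSetupComputePipelines)
    {
      // Share the renderer's device so buffers created by it are usable by the pipeline.
      pipeline->SetDevice(this->Device);
      ++this->SetupCount;
    }
    // Assumption: a pipeline is only set up once, so the pending list is emptied.
    this->NotSetupComputePipelines.clear();
  }

  std::shared_ptr<SketchDevice> Device;
  std::vector<std::shared_ptr<SketchPendingPipeline>> ComputePipelines;
  std::vector<std::shared_ptr<SketchPendingPipeline>> NotSetupComputePipelines;
  int SetupCount = 0;
};
```

Rendering two frames with one pipeline sets it up once and dispatches it twice, and the pipeline ends up holding the same device object as the renderer.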
Buffer/Texture registration
Because it may be useful to use a buffer created for a compute pass in another compute pass, every time a buffer (or texture or any object) is added to a compute pass, it is also registered in the compute pipeline associated with the compute pass. This is so that next time the same buffer is added to another compute pass, it is found in the “registry” of the compute pipeline and can be reused for that compute pass without creating a new WebGPU object.
This registry bookkeeping is done with the two maps of vtkWebGPUComputePipeline:
std::unordered_map<vtkSmartPointer<vtkWebGPUComputeBuffer>, wgpu::Buffer> RegisteredBuffers;
std::unordered_map<vtkSmartPointer<vtkWebGPUComputeTexture>, wgpu::Texture> RegisteredTextures;
They both map a vtkWebGPUComputeXXX to its wgpu::XXX equivalent because we want to be able to find the wgpu::XXX object from the vtkWebGPUComputeXXX object.
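A minimal sketch of such a registry, using std::shared_ptr in place of vtkSmartPointer and a plain struct in place of wgpu::Buffer, could look like this. SketchComputeBuffer, SketchGPUBuffer, SketchBufferRegistry and GetOrCreate() are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Hypothetical parameter holder, playing the role of vtkWebGPUComputeBuffer.
struct SketchComputeBuffer
{
  std::size_t ByteSize;
};

// Hypothetical device-side handle, playing the role of wgpu::Buffer.
struct SketchGPUBuffer
{
  int Id;
};

class SketchBufferRegistry
{
public:
  // Returns the device object already registered for this buffer, or creates and registers one.
  SketchGPUBuffer GetOrCreate(const std::shared_ptr<SketchComputeBuffer>& buffer)
  {
    auto found = this->RegisteredBuffers.find(buffer);
    if (found != this->RegisteredBuffers.end())
    {
      return found->second; // reuse: no new device object is created
    }
    SketchGPUBuffer created{ this->NextId++ };
    this->RegisteredBuffers.emplace(buffer, created);
    return created;
  }

  // One map entry exists per device object created so far.
  std::size_t GetCreationCount() const { return this->RegisteredBuffers.size(); }

private:
  std::unordered_map<std::shared_ptr<SketchComputeBuffer>, SketchGPUBuffer> RegisteredBuffers;
  int NextId = 1;
};
```

Adding the same buffer object from two compute passes goes through the lookup twice but creates the device object only once, which is the reuse described above.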
Compute passes
Compute passes are contained in a std::vector in a compute pipeline. A new vtkWebGPUComputePass
is added to the vector when vtkWebGPUComputePipeline::CreateComputePass() is called. Every compute
pass in a compute pipeline uses the device of the compute pipeline. This is so that buffers/textures created
for a compute pass can be reused by other compute passes. This wouldn’t be possible if a different device was
used for each compute pass (as WebGPU objects are not shareable between devices).
The Update() method of vtkWebGPUComputePipeline is responsible for executing the work queued by
compute passes on the GPU and waiting for the completion of the work. Because all the compute passes use
the device of the compute pipeline, Dispatch(), ReadFromGPU() and other such calls queue their work
onto the device of the compute pipeline. The Update() call can then naturally access the work
queued by the compute passes and wait for the completion of their execution.
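The relationship between passes, the shared device and Update() can be illustrated with a small, synchronous sketch. The Sketch* types are hypothetical and the queue is a plain vector of callables; real WebGPU work is submitted to a device queue and completes asynchronously, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical device: it only owns a list of pending work items.
struct SketchQueueDevice
{
  std::vector<std::function<void()>> PendingWork;
};

class SketchComputePass
{
public:
  explicit SketchComputePass(std::shared_ptr<SketchQueueDevice> device)
    : Device(std::move(device))
  {
  }

  // Queues the work on the shared device; nothing runs yet.
  void Dispatch(std::function<void()> work)
  {
    this->Device->PendingWork.push_back(std::move(work));
  }

private:
  std::shared_ptr<SketchQueueDevice> Device;
};

class SketchPassPipeline
{
public:
  SketchPassPipeline()
    : Device(std::make_shared<SketchQueueDevice>())
  {
  }

  // Every pass created here receives the pipeline's device.
  std::shared_ptr<SketchComputePass> CreateComputePass()
  {
    auto pass = std::make_shared<SketchComputePass>(this->Device);
    this->Passes.push_back(pass);
    return pass;
  }

  // Runs everything queued so far, in submission order, and leaves the queue empty.
  void Update()
  {
    std::vector<std::function<void()>> work;
    work.swap(this->Device->PendingWork);
    for (auto& task : work)
    {
      task();
    }
  }

  std::size_t GetPendingCount() const { return this->Device->PendingWork.size(); }

private:
  std::shared_ptr<SketchQueueDevice> Device;
  std::vector<std::shared_ptr<SketchComputePass>> Passes;
};
```

Because both passes push onto the same device, a single Update() on the pipeline sees the work of every pass, which would not be the case if each pass owned its own device.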