/r/vulkan
News, information and discussion about Khronos Vulkan, the high performance cross-platform graphics API.
Vulkan is the next step in the evolution of graphics APIs. Developed by Khronos, the current maintainers of OpenGL, it aims to reduce driver complexity and give application developers finer control over memory allocation and code execution on GPUs and parallel computing devices.
The Khronos Group has announced the release of Vulkan 1.4, the latest version of its cross-platform 3D graphics and compute API. Vulkan 1.4 integrates and mandates support for many proven features into its core specification, expanding the functionality that is consistently available to developers, greatly simplifying application development and deployment across multiple platforms.
The Vulkan 1.4 specification consolidates numerous previously optional extensions and features into the core API and raises minimum hardware limits. Many of these were defined in the Vulkan Roadmap 2022 and 2024 milestones and associated profiles, including:
Learn more: https://khr.io/vulkan14
Or another way to phrase the question might be: Does host coherent memory get implicitly transferred over the bus upon CPU write, or on GPU read? I'd guess GPU read, unless there is some automatic device side caching.
Or yet another: Is it better to persistently map host-cached memory or not? (Where the host writes and reads but the device just reads. Is there a downside or consideration?)
Background... I have some CPU code that writes an image into a host allocated buffer. That buffer is mapped to a VkBuffer via VkImportMemoryHostPointerInfoEXT. The reason for importing the host pointer is to avoid an extra staging step for the host to device copy. The type of compatible memory is VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT | VK_MEMORY_PROPERTY_HOST_CACHED_BIT. So far, this is fine and appears to work well. (I don't care for the )
Now I'd like to add some additional steps where the CPU also reads from that buffer as well as writes to it. This means I want CPU read and write cache performance. Later (after CPU processing) I want to copy this buffer into a device tiling-optimal image for display. I'm trying to determine if CPU read and write creates a problem. Perhaps it's better not to map (vkMapMemory) this memory? (Since it is already host allocated, mapping is not necessary for the current use.)
Checking my understanding here.
Heaps
The Vulkan spec is very general in this area. There are a huge number of options.
The Vulkan spec says there's some number of heaps for each implementation. There's no indication in the spec of how many. One? Two? 65535? I gather from this 2018 GDC presentation that there are very few, rarely more than three. Apparently there is rarely if ever more than one heap of a given type. Is that correct? The main types seem to be unshared CPU memory, unshared device memory, and various slow shared variants which may or may not be supported. Or the other extreme, the integrated graphics case, where everything is in one memory system. Are those pretty much the real world options, or are there other variants?
The Vulkan spec describes allocate and free functions. But the GDC presentation indicates these are very limited, or at least were back in 2018. The number of allocations is limited; that presentation suggests 4K. (Where does that number come from? Can it be read from the Vulkan API?) So you can't just allocate space for each texture with its own Vulkan allocate call, I think. The general idea seems to be to allocate big blocks (256MB was suggested) and then subdivide them with some kind of suballocator. Is that correct? Any comments on memory fragmentation problems?
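For what it's worth, the 4K figure corresponds to VkPhysicalDeviceLimits::maxMemoryAllocationCount, which can be read via vkGetPhysicalDeviceProperties; the spec only guarantees it is at least 4096. The block-plus-suballocator idea itself is just offset arithmetic. Below is a minimal, Vulkan-free sketch of it (freeing, bufferImageGranularity, and per-memory-type blocks are deliberately omitted; real allocators like VMA handle those):

```cpp
#include <cassert>
#include <cstdint>

// Round an offset up to the next multiple of `alignment`. Alignment must be a
// power of two, which VkMemoryRequirements::alignment always is.
uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Minimal bump suballocator over one large block (e.g. one 256 MB
// vkAllocateMemory result). Each alloc() returns the offset you would pass
// to vkBindBufferMemory / vkBindImageMemory.
struct BumpSubAllocator {
    uint64_t blockSize;
    uint64_t head = 0;

    // Returns the aligned offset, or UINT64_MAX if the block is exhausted.
    uint64_t alloc(uint64_t size, uint64_t alignment) {
        uint64_t offset = alignUp(head, alignment);
        if (offset + size > blockSize) return UINT64_MAX;
        head = offset + size;
        return offset;
    }
};
```

Fragmentation is exactly why real allocators replace the bump pointer with free lists or buddy schemes, but the binding arithmetic stays the same.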
Finding out how much device local memory is available was apparently hard back in 2018. Is that fixed? What's best practice today on getting a lot of device memory but not locking up the system because you grabbed all of it and nothing else can run?
Spilling from device memory to slower CPU memory accessed via the PCI bus is apparently something some Vulkan implementations can do. Or will do without being asked. When that happens, there's a big performance drop. How is that detected, prevented, or managed?
Is there something I should read that's more current than that 2018 presentation but covers the same material? Thanks.
I'm new to Vulkan, and decided to take on a ray tracing project to learn the API. Currently I have a bug where my TLAS is not being built correctly. I am completely stumped. According to Nsight graphics, my BLAS is being built fine.
However, it shows my TLAS contains no instances. From what I can see, I have provided the correct info in AccelerationStructureBuildGeometryInfo and AccelerationStructureBuildRangeInfoKHR to the buildAccelerationStructuresKHR function when building the TLAS (I have compared to Sascha Willems' raytracingbasic example). Here are some of the relevant fields Nsight shows for the TLAS build info:
pInfos:
    type: VK_ACCELERATION_STRUCTURE_TYPE_TOP_LEVEL_KHR
    flags: VkBuildAccelerationStructureFlagsKHR(VK_BUILD_ACCELERATION_STRUCTURE_PREFER_FAST_TRACE_BIT_KHR)
    mode: VK_BUILD_ACCELERATION_STRUCTURE_MODE_BUILD_KHR
    srcAccelerationStructure: VK_NULL_HANDLE
    dstAccelerationStructure: 0x2860e200000000bb
    geometryCount: 1
    pGeometries:
        geometryType: VK_GEOMETRY_TYPE_INSTANCES_KHR
        geometry:
            instances:
                arrayOfPointers: VK_FALSE
                data:
                    deviceAddress: 0xc4bec900000000b0
        flags: VkGeometryFlagsKHR(VK_GEOMETRY_OPAQUE_BIT_KHR)
    scratchData:
        deviceAddress: 0x88693900000000c0
ppBuildRangeInfos:
    ppBuildRangeInfos[0]:
        ppBuildRangeInfos[0][0]:
            primitiveCount: 1
            primitiveOffset: 0
            firstVertex: 0
            transformOffset: 0
I also have a pipeline barrier between the BLAS and TLAS build commands. I do not think it's a synchronisation issue, as I have also tried coarser synchronisation without success (BLAS and TLAS built in separate submits with a fence in between).
Nsight also shows the instance buffer contents are as they should be:
transform: [1.0000, 0.0000, 0.0000, 0.0000] [0.0000, 1.0000, 0.0000, 0.0000] [0.0000, 0.0000, 1.0000, 0.0000]
instanceCustomIdx: 0
mask: 255
instanceSBTOffset: 0
flags: 1
asReference: 0xE0001F5200
I obtained the acceleration structure reference using getAccelerationStructureAddressKHR. It does not correspond to the buffer address for the BLAS, and I can't find the AS address in Nsight. Not sure if that is suspicious. The instance buffer should be alive while the TLAS is being built.
Repo: https://github.com/arrebarritra/vulkan-raytracer
The relevant code is here: https://pastebin.com/1cx0PYSC
The code is mostly vulkan-hpp + a few of my own abstractions. Some details which might be good to know about the code:
Hopefully it is easy enough to read. Would appreciate any help!
Edit: trying to fix tables but reddit is not complying :(
Does anyone know a X Server that allows you to run nested Vulkan applications?
So far I've looked at Xephyr and Xvfb. Both of them only support GL, though.
Thank you so much for any help :)
Hi all, I have this problem: I want to render to the depth buffer and sample from it at the same time. I know this is not possible, so what I do is copy the depth buffer to another texture before rendering and then sample from that texture. My question: is there a smarter way to solve this problem without copying the depth to another depth texture each time?
Thanks!
I was revisiting my boilerplate code and noticed a TODO note to myself to check for separate Graphics and Presentation queues (on Windows and maybe Linux).
Is this supported now?
Supposedly Turnip is a driver for Adreno GPUs (Android) that replaces the system driver, but only for the application that uses it. This is something I don't understand: how can a driver be loaded as a shared library and perform the same functions as a driver? Shouldn't this be impossible to do in user mode? Applications like Citra and Yuzu offered the option to load a custom Vulkan driver.
Hi everyone, I'm working on an implementation of batch rendering using SDL3 GPU API with Vulkan backend.
I'm trying to reproduce the performance of https://github.com/re-esper/BunnyMarkGame which on my machine is at frametimes of 5ms for 1M sprites (~190-200 fps). My implementation has frametimes twice as long for the same number of sprites (~100-110 fps).
Even when doing nothing, only acquiring a command buffer and a swapchain texture but not clearing the screen, the window idles with frametimes of 0.3 ms, while the benchmark above has frametimes of 0.1 ms when it's doing more, like clearing the screen and rendering the basic imgui UI (but no sprites). This suggests to me that there is some form of persistent overhead/latency somewhere. I checked SDL's backend and it looks fine, with no glaring mistakes, so I'm very confused about this.
Here is what RenderDoc reports:
Same number of instances, same texture, same number of total draw calls per frame, yet one has draw calls that take twice as long. My implementation is not even doing rotation or scaling. I checked my CPU side and it can build a command buffer for a frame in far less than 1 ms, so it shouldn't be CPU bound. What's going on?
Hey! I have started picking up Vulkan again, but it's got me thinking: what are some entry-level positions or career paths where Vulkan is used?
I would love to hear from more experienced folks about what it's like and what kind of projects you get to work on. What's the coolest stuff you get to do with Vulkan?
Hello everyone! I am building a neural network from scratch in C++ and was wondering which of the two would best tackle the task?
My computer is far from a beast in computing/graphics power, so I would like to get the highest performance out of it. I have some experience writing a 3D graphics renderer in Vulkan, so I am aware that the coding overhead sucks, but that is not a problem: I am shooting for the most performance out of my program, so coding overhead is not a factor in my decision.
Some additional information about my driver specs:
Hi,
as a beginner in Vulkan programming I am following this tutorial:
In chapter 2, a compute shader is introduced and the result should be drawn to the screen. (https://vkguide.dev/docs/new_chapter_2/vulkan_shader_code/)
But I only see a black screen.
I went through the chapter twice and see no difference between the tutorial and my local code. ( https://github.com/Seim2k17/SolarSystem3DV/tree/solEngine/src/engine )
Could this be a hardware issue?
Can someone help me find out what's wrong?
I have validation layers activated and captured a frame with RenderDoc, but at this time I am not able to interpret the output ...
project:
https://github.com/Seim2k17/SolarSystem3DV/tree/solEngine
Thanks
Renderdoc-capture: (Linux)
https://github.com/Seim2k17/SolarSystem3DV/blob/solEngine/_captures/rdoc_capture_sol_blackscreen.rdc
I use an optional extension in my GLSL shader and compile it to SPIR-V using an automated CMake script. In the shader file, the source code is guarded by the extension's availability (e.g. #extension GL_KHR_extension_name : enable, with the code enclosed in #if GL_KHR_extension_name == 1 ... #endif).
I want to produce SPIR-V files with the extension availability varying, using my existing CMake script. That is, I want to control this extension usage via CLI parameters. Note that glslc currently assumes all extensions are enabled. How should I do this?
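One common workaround, since glslc's -D flag forwards macro definitions to the GLSL preprocessor: drive your own macro from CMake and guard both the #extension directive and the code with that macro, rather than testing the built-in GL_KHR_* macro. A sketch, where SHADER_USE_KHR_FOO, USE_KHR_FOO, and shader.comp are all made-up placeholder names:

```cmake
# Hypothetical option controlling the feature from the CLI:
#   cmake -DSHADER_USE_KHR_FOO=ON ..
option(SHADER_USE_KHR_FOO "Compile shaders with the GL_KHR_foo code path" OFF)

set(GLSLC_FLAGS "")
if(SHADER_USE_KHR_FOO)
    # glslc -D adds a preprocessor macro definition, like a C compiler's -D.
    list(APPEND GLSLC_FLAGS -DUSE_KHR_FOO=1)
endif()

add_custom_command(
    OUTPUT  ${CMAKE_BINARY_DIR}/shader.spv
    COMMAND glslc ${GLSLC_FLAGS} ${CMAKE_SOURCE_DIR}/shader.comp
            -o ${CMAKE_BINARY_DIR}/shader.spv
    DEPENDS ${CMAKE_SOURCE_DIR}/shader.comp)
```

In the shader, wrap the optional path (including the #extension line itself) in #if defined(USE_KHR_FOO) ... #endif, so the extension directive is only seen when the macro is passed.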
I am trying to send a 1000x1000 image to the GPU for rendering at runtime. I have tried what the following errors suggest, but to no avail.
I get the following error:
Error:Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x1d41b4b7a70, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x1d41b4b7a70[] expects VkImage 0x521e2f0000001f86[] (subresource: aspectMask 0x1 array layer 0, mip level 1) to be in layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED.
Followed by the error:
Error:Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x1d42b04e300, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x1d42b04e300[] expects VkImage 0x521e2f0000001f86[] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL.
Create Image Code:
bool vk::Vulkan_Buffers::createImage(PhysicalDevice& physicalDevice, loadObject& objectToLoad, VkImage& image, VkDeviceMemory& memory, dt::vec2i imageDimentions, uint32_t mipMapLevels, VkImageUsageFlags usage, VkImageTiling tiling, Console& console) {
    VkImageCreateInfo imageCreateInfo{};
    imageCreateInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    imageCreateInfo.imageType = VK_IMAGE_TYPE_2D;
    imageCreateInfo.extent.width = imageDimentions.x;
    imageCreateInfo.extent.height = imageDimentions.y;
    imageCreateInfo.extent.depth = 1;
    imageCreateInfo.mipLevels = mipMapLevels;
    imageCreateInfo.arrayLayers = 1;
    imageCreateInfo.format = VK_FORMAT_R8G8B8A8_SRGB;
    imageCreateInfo.tiling = tiling;
    imageCreateInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    imageCreateInfo.usage = usage;
    imageCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
    imageCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
    imageCreateInfo.flags = 0;
    if (vkCreateImage(physicalDevice.logicalDevice.handle, &imageCreateInfo, nullptr, &image) == VK_SUCCESS) {
        console.printSucsess("Vulkan Image created");
    }
    else {
        console.printError("Vulkan Image Failed to be created");
    }
    VkMemoryRequirements memoryRequirements;
    vkGetImageMemoryRequirements(physicalDevice.logicalDevice.handle, image, &memoryRequirements);
    VkMemoryAllocateInfo allocInfo{};
    allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize = memoryRequirements.size;
    allocInfo.memoryTypeIndex = findMemoryType(memoryRequirements.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, physicalDevice);
    if (vkAllocateMemory(physicalDevice.logicalDevice.handle, &allocInfo, nullptr, &memory) != VK_SUCCESS) {
        console.printError("Memory failed to be allocated");
    }
    vkBindImageMemory(physicalDevice.logicalDevice.handle, image, memory, 0);
    return true;
}
Create Texture Buffer:
bool vk::Vulkan_Buffers::createTextureBuffer(PhysicalDevice& physicalDevice, SDL_Surface* surface, VkImage& image, uint32_t mipMapLevels, Console& console) {
    VkBuffer stagingBuffer;
    VkDeviceMemory stagingBufferMemory;
    size_t imageSize = (sizeof(((Uint32*)surface->pixels)[0])) * (surface->w * surface->h);
    createBuffer(physicalDevice, imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer, stagingBufferMemory, console);
    void* data;
    vkMapMemory(physicalDevice.logicalDevice.handle, stagingBufferMemory, 0, imageSize, 0, &data);
    memcpy(data, surface->pixels, imageSize);
    //transition to the correct image format
    Vulkan_Image vulkanImageHandle;
    transitionImageLayout(physicalDevice, image, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, console);
    copyBufferToImage(physicalDevice, stagingBuffer, image, static_cast<uint32_t>(surface->w), static_cast<uint32_t>(surface->h), console);
    transitionImageLayout(physicalDevice, image, VK_FORMAT_R8G8B8A8_SRGB, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL, mipMapLevels, console);
    vulkanImageHandle.generateMipMaps(physicalDevice, image, mipMapLevels, dt::vec2i(surface->w, surface->h), console);
    return true;
}
Transition Image Layout:
void vk::Vulkan_Buffers::transitionImageLayout(PhysicalDevice& physicalDevice, VkImage image, VkFormat format, VkImageLayout oldLayout, VkImageLayout newLayout, uint32_t mipMapLevels, Console& console) {
    Vulkan_CommandBuffers commandBuffersHandle;
    VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
    VkImageMemoryBarrier imageMemoryBarrier{};
    imageMemoryBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    imageMemoryBarrier.oldLayout = oldLayout;
    imageMemoryBarrier.newLayout = newLayout;
    imageMemoryBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    imageMemoryBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    imageMemoryBarrier.image = image;
    imageMemoryBarrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    imageMemoryBarrier.subresourceRange.baseMipLevel = 0;
    imageMemoryBarrier.subresourceRange.levelCount = mipMapLevels;
    imageMemoryBarrier.subresourceRange.baseArrayLayer = 0;
    imageMemoryBarrier.subresourceRange.layerCount = 1;
    VkPipelineStageFlags sourceStage;
    VkPipelineStageFlags destinationStage;
    if (oldLayout == VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL && newLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
        imageMemoryBarrier.srcAccessMask = 0;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        sourceStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
        destinationStage = VK_PIPELINE_STAGE_TRANSFER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.graphicsCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else if (oldLayout == VK_IMAGE_LAYOUT_GENERAL && newLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.computeCommandPool);
        imageMemoryBarrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        sourceStage = VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT;
        destinationStage = VK_PIPELINE_STAGE_TRANSFER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.computeCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else if (oldLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL && newLayout == VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.computeCommandPool);
        imageMemoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        sourceStage = VK_PIPELINE_STAGE_TRANSFER_BIT;
        destinationStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.computeCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else if (oldLayout == VK_IMAGE_LAYOUT_UNDEFINED && newLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
        imageMemoryBarrier.srcAccessMask = 0;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        sourceStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
        destinationStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.graphicsCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else if (oldLayout == VK_IMAGE_LAYOUT_UNDEFINED && newLayout == VK_IMAGE_LAYOUT_GENERAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
        imageMemoryBarrier.srcAccessMask = 0;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
        sourceStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
        destinationStage = VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.graphicsCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else if (oldLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL && newLayout == VK_IMAGE_LAYOUT_GENERAL) {
        VkCommandBuffer commandBuffer = commandBuffersHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
        imageMemoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        sourceStage = VK_PIPELINE_STAGE_TRANSFER_BIT;
        destinationStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT;
        vkCmdPipelineBarrier(commandBuffer, sourceStage, destinationStage, 0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
        commandBuffersHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.graphicsCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    }
    else {
        console.printError("Layout transition is not supported");
    }
}
generateMipMaps code:
bool vk::Vulkan_Image::generateMipMaps(PhysicalDevice& physicalDevice, VkImage& vkimage, uint32_t mipMapLevels, dt::vec2i dimentions, Console& console) {
    VkFormatProperties formatProperties;
    vkGetPhysicalDeviceFormatProperties(physicalDevice.handle, VK_FORMAT_R8G8B8A8_SRGB, &formatProperties);
    // note the parentheses: !x & flag tests the wrong thing
    if (!(formatProperties.linearTilingFeatures & VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT)) {
        console.printError("Linear Blitting is not supported");
    }
    Vulkan_CommandBuffers commandBufferHandle;
    VkCommandBuffer commandBuffer = commandBufferHandle.beginSingleTimeCommands(physicalDevice, physicalDevice.logicalDevice.graphicsCommandPool);
    VkImageMemoryBarrier barrier{};
    barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    barrier.image = vkimage;
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
    barrier.subresourceRange.baseArrayLayer = 0;
    barrier.subresourceRange.layerCount = 1;
    barrier.subresourceRange.levelCount = 1;
    int32_t mipWidth = dimentions.x;
    int32_t mipHeight = dimentions.y;
    for (uint32_t i = 1; i < mipMapLevels; i++) {
        barrier.subresourceRange.baseMipLevel = i - 1;
        barrier.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
        barrier.newLayout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
        barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
        barrier.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
        vkCmdPipelineBarrier(commandBuffer, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier);
        VkImageBlit blit{};
        blit.srcOffsets[0] = { 0, 0, 0 };
        blit.srcOffsets[1] = { mipWidth, mipHeight, 1 };
        blit.srcSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        blit.srcSubresource.mipLevel = i - 1;
        blit.srcSubresource.baseArrayLayer = 0;
        blit.srcSubresource.layerCount = 1;
        blit.dstOffsets[0] = { 0, 0, 0 };
        blit.dstOffsets[1] = { mipWidth > 1 ? mipWidth / 2 : 1, mipHeight > 1 ? mipHeight / 2 : 1, 1 };
        blit.dstSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        blit.dstSubresource.mipLevel = i;
        blit.dstSubresource.baseArrayLayer = 0;
        blit.dstSubresource.layerCount = 1;
        vkCmdBlitImage(commandBuffer, vkimage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, vkimage, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &blit, VK_FILTER_LINEAR);
        barrier.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
        barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
        barrier.srcAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
        barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        vkCmdPipelineBarrier(commandBuffer, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier);
        if (mipWidth > 1) mipWidth /= 2;
        if (mipHeight > 1) mipHeight /= 2;
    }
    barrier.subresourceRange.baseMipLevel = mipMapLevels - 1;
    barrier.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
    barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
    vkCmdPipelineBarrier(commandBuffer, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, 0, 0, nullptr, 0, nullptr, 1, &barrier);
    commandBufferHandle.endSingleTimeCommands(physicalDevice, commandBuffer, physicalDevice.logicalDevice.graphicsCommandPool, physicalDevice.logicalDevice.queueFamilies[physicalDevice.logicalDevice.graphicsQueueFamily].queues[0].handle, console);
    return true;
}
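One side note on the halving loop above: a full mip chain for a W x H image has floor(log2(max(W, H))) + 1 levels, and creating or transitioning a different number of levels than the image actually has is a common source of layout validation errors like the ones quoted. A standalone sketch of that arithmetic:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Number of levels in a full mip chain: keep halving (rounding down,
// clamping at 1) until a 1x1 level is reached.
uint32_t fullMipLevels(uint32_t width, uint32_t height) {
    return static_cast<uint32_t>(
        std::floor(std::log2(std::max(width, height)))) + 1;
}
```

For a 1000x1000 image this gives 10 levels; every barrier and copy then has to account for all of them.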
Hello,
I have a problem updating a UBO buffer in the fragment stage. Are there any rules on constructing a UBO buffer when it is used in different stages? The UBO and PCO in the vertex stage work fine and the fragment output is OK, but it seems that the UBO buffer in the fragment stage is not reflected in the shader.
I'm pretty sure my pipeline layout is correct and my descriptor writes seem fine as well. Any hint where to look if a UBO buffer doesn't seem to be reflected in the shader?
TIA.
Hey folks,
I got basic shadow mapping working. But it's... basic. Variance Shadow Maps is a technique that promises affordable soft shadows while offering solutions to common problems like Shadow Acne, or Peter Panning. So I started working on it.
My current setup has one D32_SFLOAT z-buffer for each frame in flight (of which I have 2). To implement Variance Shadow Maps:
I created an R32G32B32A32_SFLOAT color image as attachment (2x for frames in flight) to store the depth and depth-squared values. Apparently, GPUs don't like R32G32, so 2 channels are wasted. This is a huge investment already. EDIT: The GPU does like R32G32, mistake on my side. See comments below.
Then I noticed that my shadow map is in draw order, not in depth order, and it seems obvious now, but I still need the D32_SFLOAT z-buffer to get proper depth testing. (This is also because the depth values are supposed to be "linear", i.e., fragment-to-light distance, and not the typical non-linear z-buffer distance.)
In order to get soft shadows, I need Gaussian blurring passes. Since this cannot happen on the same texture, I need another R32G32B32A32_SFLOAT texture (for each frame in flight) to do the blurring: shadow map -> temp texture blur pass X -> shadow map blur pass Y.
Finally, the article proposes to use MSAA for the shadow maps, so let's say 4xMSAA for making my point.
To summarize (for 2 frames in flight) I have the following comparison:
- Plain shadow mapping: 2x D32_SFLOAT texture (total 2 SFLOAT channels).
- VSM: 2x D32_SFLOAT (2 channels), 4x R32G32B32A32_SFLOAT (16 channels), 4x memory for MSAA (total 72 SFLOAT channels).
This difference seems intense. And that is just for each light I want to cast shadows. Am I missing something?
Let's say you have a bunch of textures and material parameters. How do you assign those to triangles? So far I only know how to pass information per vertex. I could pass the information about which texture and material to use per vertex, but then I would have to store redundant information, so surely there has to be some better method, right?
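One common approach (the names below are illustrative, not from any particular engine): group triangles by material into contiguous index ranges and issue one draw per range, supplying the material index once per draw (as a push constant, a per-instance attribute, or an index into a bindless material buffer) instead of per vertex. A sketch of the grouping step:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Instead of storing a material id on every vertex, sort/group triangles by
// material and issue one draw (with one material index) per contiguous
// range. The shader then uses that index to fetch textures and parameters.
struct DrawRange {
    uint32_t firstIndex;   // first index-buffer entry of the range
    uint32_t indexCount;   // number of indices in the range
    uint32_t materialId;   // which material/texture set the range uses
};

// triMaterial[i] = material of triangle i (each triangle = 3 indices).
std::vector<DrawRange> buildDrawRanges(const std::vector<uint32_t>& triMaterial) {
    std::vector<DrawRange> ranges;
    for (uint32_t t = 0; t < triMaterial.size(); ++t) {
        if (!ranges.empty() && ranges.back().materialId == triMaterial[t]) {
            ranges.back().indexCount += 3;  // extend the current range
        } else {
            ranges.push_back({t * 3, 3, triMaterial[t]});
        }
    }
    return ranges;
}
```

Each resulting range maps to one vkCmdDrawIndexed call, so the material id is paid once per range, not once per vertex.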
I'm pretty new to Vulkan, so I'm currently following this tutorial and also this YouTube tutorial. However, I'm using Hyprland on Wayland on Arch Linux, and after running the same code (which I copied) I can't see any new window open. I don't think there is a problem with their code; rather, I suspect there are some special requirements on my system that I don't know about. Thank you for your help!
Hi, I've recently decided to give the NVK driver a try and I'll admit it works very well most of the time on my RTX 3060 Max-Q. However, I'm experiencing a bug with some vulkan applications that causes them to render only the window border with nothing within it. The best way to reproduce this bug is to run vkcube-wayland as it is the most widely available piece of software that has this bug. Weirdly, the normal vkcube works perfectly and, according to hyprctl (I'm using Hyprland), is running without xwayland. If anyone experienced this bug, it would be very nice to exchange some ideas about it.
In bindless mode, if a shader uses an invalid descriptor index, what happens in these cases?
(Why? I'm looking into designing a Rust interface and need to know what does and doesn't have to be checked for safety.)
Is it acceptable to bind a graphics pipeline multiple times using different push constants? Does Vulkan copy the push constants at each bind or do I need to hang on to them in memory until it's done with them? i.e. can I just overwrite the same struct in memory for each binding of a given pipeline, or should I be buffering all of the PCs for pipeline binds?
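On the copy question: the Vulkan spec specifies that the data passed to vkCmdPushConstants is consumed by the call itself, i.e. the bytes are recorded into the command buffer at record time, so overwriting the same CPU-side struct between calls is fine. A toy (deliberately non-Vulkan) model of that copy-at-record behavior:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model of copy-at-record semantics: like vkCmdPushConstants, record()
// copies the bytes into the "command buffer" immediately, so the caller may
// overwrite its source struct right after the call returns.
struct ToyCommandBuffer {
    std::vector<std::vector<uint8_t>> recorded;

    void record(const void* data, size_t size) {
        const uint8_t* p = static_cast<const uint8_t*>(data);
        recorded.emplace_back(p, p + size);  // copy by value at record time
    }
};
```

So reusing one struct per pipeline bind is safe; buffering all the push-constant values yourself is unnecessary.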
Hi, I want to find out as much as possible about the event, as my friend and I are thinking of going to Vulkanised 2025. The problem is that, aside from the conference agenda shown on the page, we don't know what to expect from the event or what we should know beforehand. The passes are pretty expensive, but we would still like to go.
- What is it like?
- What are networking sessions like?
- Most importantly is there food? Or do we leave the venue to get food?
Hello, I have two questions regarding vkUpdateDescriptorSets and Push Constants.
It basically says that beginning recording is vkBeginCommandBuffer.
But it seems like I can write something like the following and everything works fine. Why?
BeginCmdBuffer();
// before begin render pass, after BeginCmdBuffer
// shouldn't this be the recording state mentioned in the doc?
vkUpdateDescriptorSets();
BeginRenderPass();
BindPipeline();
BindDescSets();
Draw();
EndRenderPass();
Once I move vkUpdateDescriptorSets() inside BeginRenderPass(), validation layer complains.
It works asynchronously and seems handier than vkUpdateDescriptorSets.
Since I'm working on a Vulkan API implementation ( class Vulkan : public GraphicsApiBase ) that has things I'm pretty sure are specific to the company I'm working at, I'm mostly looking for a description of what I'd need to do: which kinds of variables I'd need to look for and change, whether I'd need to declare and initialize some things, maybe pseudocode. Anyway, context:
I have a window that's my current only render target, created with handle HWND hWnd0 = CreateWindowEx(...) using HINSTANCE hInstance.
I'd like to have a second window, created using HWND hWnd1 = CreateWindowEx(...) with the same window class as hWnd0, and I'd like to be able to alternate between rendering to hWnd0 and hWnd1. In essence, after rendering to hWnd0, I'd like to be able to switch the render target to hWnd1, and vice versa.