/r/vulkan


News, information and discussion about Khronos Vulkan, the high performance cross-platform graphics API.

Vulkan is the next step in the evolution of graphics APIs. Developed by Khronos, the current maintainers of OpenGL, it aims to reduce driver complexity and give application developers finer control over memory allocation and code execution on GPUs and parallel computing devices.


Vulkan Subreddit Scope

This subreddit is aimed at developers and end users, with a strong focus on the development of the Vulkan API itself, the development of applications that use it, and the state of available implementations.

Vulkan Resources


Tutorials


Books


Related subreddits

/r/vulkan

19,012 Subscribers

2

Deferred with Dynamic Rendering

I'm trying to set up a deferred pipeline with dynamic rendering, but I just found out that input attachments aren't supported. There's the dynamic rendering local read extension, but I don't want to rely solely on that at the moment. Are there any examples of deferred pipelines built with dynamic rendering?

2 Comments
2024/05/12
10:25 UTC

2

Host synchronization when not specified as external in the spec

Some API functions have a note in the spec saying that they must be externally synchronized. Does that mean that the ones without such a note are safe to call from multiple threads? For example, the spec for vkCreateBuffer doesn't say it must be externally synchronized, while it obviously writes something to the VkInstance. Is there a mutex (or some fancy non-blocking thread-safe thing) in the driver?
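For context, the Vulkan spec's threading chapter says that parameters not marked as externally synchronized must be protected by the implementation, so such calls are safe from multiple threads; it is handles like a shared VkQueue passed to vkQueueSubmit that the application itself must guard. A minimal sketch of that application-side guarding, with a stand-in FakeQueue in place of a real VkQueue:

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// FakeQueue stands in for VkQueue; the counter stands in for driver state
// that vkQueueSubmit would mutate without any internal locking.
struct FakeQueue {
    uint64_t submissions = 0;
};

class GuardedQueue {
    FakeQueue queue_;
    std::mutex mutex_;  // the "external synchronization" the spec requires
public:
    void submit() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++queue_.submissions;  // stands in for vkQueueSubmit(queue_, ...)
    }
    uint64_t submissions() {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.submissions;
    }
};

// Hammer the shared queue from several threads; with the mutex in place
// every submission is counted exactly once.
uint64_t submit_from_threads(GuardedQueue& q, int threads, int perThread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] { for (int i = 0; i < perThread; ++i) q.submit(); });
    for (auto& th : pool) th.join();
    return q.submissions();
}
```

Here std::mutex plays the role the spec leaves to the application; a real renderer might instead give each thread its own queue or funnel submissions through a single thread.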

2 Comments
2024/05/11
19:29 UTC

1

Post Deferred Renderpass depth attachment is upside down

Hi!

Doing a deferred rendering engine with all the bells and whistles - tiled/clustered forward is coming later.

I'm using a framebuffer abstraction similar to a render graph API - essentially allowing for framebuffer constructions that depend on other framebuffer instances.

I've successfully finished the G-buffer geometry pass and the deferred pass (with a light-culling compute pass in between). After the deferred lighting pass I would like to be able to draw the lights I submitted to my engine from client code - maybe as emissive meshes or as a default renderer cube (?).

Anyway - for this final light geometry pass I attach the output of the deferred pass (colour and depth) as load operations. My vision was to implement the blitted-lights idea from the Deferred Shading chapter of LearnOpenGL.

As context, I use an inverse Z-buffer with minDepth and maxDepth set to 1.0 and 0.0 respectively, and GLM_FORCE_DEPTH_ZERO_ONE. I also perform viewport inversion a la Sascha Willems. I perform a pre-depth pass which is then attached to the G-buffer pass to reduce overdraw and to allow for light culling later on. This is the depth output I use as the attachment for the light pass.

The issue is that the light render pass gets the depth buffer attached upside down. I've tried rendering the light pass "non-inverted" (x=0, y=0, width=width, height=height), inverting that render pass's shaders' y-coordinates, and other things.

I can't seem to understand how to flip the depth attachment at this point, since I'm not sampling it as a texture, it is already loaded as an attachment.

A possible solution, I realised when writing this post, would maybe be to attach it as a fragment shader texture and sample instead, so that I can reject based on gl_FragDepth in the fragment shader? Maybe?
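That sample-and-reject idea could be sketched roughly as follows, assuming the depth image is bound as a combined image sampler at a hypothetical binding 4; flipping Y at sample time undoes the viewport inversion, and with reverse-Z a fragment is occluded when its depth is less than the stored depth:

```glsl
#version 450

layout(binding = 4) uniform sampler2D u_Depth; // hypothetical binding

layout(location = 0) in vec2 v_UV;
layout(location = 0) out vec4 o_Color;

void main()
{
    // Sample with Y flipped to undo the viewport inversion of the earlier pass.
    float sceneDepth = texture(u_Depth, vec2(v_UV.x, 1.0 - v_UV.y)).r;

    // Reverse-Z: larger depth means closer, so a light fragment behind the
    // stored scene depth has a smaller depth value.
    if (gl_FragCoord.z < sceneDepth)
        discard;

    o_Color = vec4(1.0); // light geometry colour goes here
}
```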

Thank you for your consideration!

4 Comments
2024/05/11
19:06 UTC

1

Why can't I invoke my custom intersection shader for AABB geometry (Ray Tracing)?

I've been struggling to render voxels using the ray tracing pipeline in Vulkan. Rendering static triangle meshes works flawlessly - my closest hit shader gets invoked correctly when the geometry type is set to triangles and marked as opaque. However, when I switch to procedural geometry using axis-aligned bounding boxes (AABBs), only the miss shader executes.

Despite inserting debug statements in my custom intersection shader, there's no output. I've even removed the closest hit shader from my shader binding table (SBT) to ensure there aren't any selection issues. Given that it's the only shader listed in my SBT besides general shaders, I suspect the issue lies elsewhere, perhaps not with the SBT setup itself. Fiddling with the opaqueness settings hasn't resolved the issue either.

I would greatly appreciate it if somebody can help me out. I've been watching and reading about SBTs and the RT pipeline, but I feel like I am missing something in what I thought should be a very simple endeavor.

I tried frame debugging with NVIDIA Nsight, and the frame debugger shows my axis-aligned bounding boxes in the acceleration structure renderer. I even see my intersection shader as part of the shader hit group. So it is definitely "registered" in the application; I just can't get it to invoke for some reason.

My intersection shader (the body is currently commented out; it just has a debug print to confirm it gets invoked):

#version 460
#extension GL_EXT_ray_tracing : require
#extension GL_EXT_debug_printf : enable

//layout(binding = 0, set = 0) uniform accelerationStructureEXT topLevelAS;
//layout(binding = 1, set = 0, r32f) uniform image3D densityField;

void main() {
    debugPrintfEXT("CUSTOM INTERSECTION SHADER!!!!!!!!!!!!!!!!!!!!!\n"); // comment out
    reportIntersectionEXT(0, 0); // comment out
    return; // comment out

    /**
    // Ray marching
    uint rayFlags = gl_RayFlagsNoneEXT;
    float tmin = 0.0;
    float tmax = 1000.0;  // Farthest distance we want to check
    vec3 rayOrigin = gl_WorldRayOriginEXT;
    vec3 rayDirection = gl_WorldRayDirectionEXT;

    float t = tmin;
    float stepSize = 1.0;  // Distance to step through the density field

    while (t < tmax) {
        vec3 currentPosition = rayOrigin + rayDirection * t;
        float density = imageLoad(densityField, ivec3(currentPosition)).x;

        if (density > 0.5) {  // Assuming 0.5 is the threshold for solid
            reportIntersectionEXT(t, 0);  // Report intersection at this distance
            return;
        }
        t += stepSize;
    }*/
}

Pipeline creation

void VulkanInitializer::createGraphicsPipeline() {
    auto raygenShaderCode = readFile("../shaders/raygen.spv");
    auto missShaderCode = readFile("../shaders/miss.spv");
    // auto closestHitShaderCode = readFile("../shaders/closesthit.spv");
    auto voxelIntersectionShaderCode = readFile("../shaders/intersection.spv");

    VkShaderModule raygenShaderModule = createShaderModule(raygenShaderCode);
    VkShaderModule missShaderModule = createShaderModule(missShaderCode);
    // VkShaderModule closestHitShaderModule = createShaderModule(closestHitShaderCode);
    VkShaderModule voxelIntersectionShaderModule = createShaderModule(voxelIntersectionShaderCode);

    VkPipelineShaderStageCreateInfo raygenShaderStageInfo{};
    // Set up shader stages for the ray tracing pipeline
    raygenShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    raygenShaderStageInfo.stage = VK_SHADER_STAGE_RAYGEN_BIT_KHR;
    raygenShaderStageInfo.module = raygenShaderModule;
    raygenShaderStageInfo.pName = "main";

    VkPipelineShaderStageCreateInfo missShaderStageInfo{};
    missShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    missShaderStageInfo.stage = VK_SHADER_STAGE_MISS_BIT_KHR;
    missShaderStageInfo.module = missShaderModule;
    missShaderStageInfo.pName = "main";

    // VkPipelineShaderStageCreateInfo closestHitShaderStageInfo{};
    // closestHitShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    // closestHitShaderStageInfo.stage = VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR;
    // closestHitShaderStageInfo.module = closestHitShaderModule;
    // closestHitShaderStageInfo.pName = "main";
    VkPipelineShaderStageCreateInfo voxelIntersectionShaderStageInfo{};
    voxelIntersectionShaderStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    voxelIntersectionShaderStageInfo.stage = VK_SHADER_STAGE_INTERSECTION_BIT_KHR;
    voxelIntersectionShaderStageInfo.module = voxelIntersectionShaderModule;
    voxelIntersectionShaderStageInfo.pName = "main";

    std::vector<VkPipelineShaderStageCreateInfo> shaderStages = {
            raygenShaderStageInfo, missShaderStageInfo, /*closestHitShaderStageInfo,*/ voxelIntersectionShaderStageInfo
    };

    // Set up ray tracing shader groups (raygen, miss, hit)
    VkRayTracingShaderGroupCreateInfoKHR raygenGeneralGroupInfo{};
    raygenGeneralGroupInfo.sType = VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_KHR;
    raygenGeneralGroupInfo.type = VK_RAY_TRACING_SHADER_GROUP_TYPE_GENERAL_KHR;
    raygenGeneralGroupInfo.generalShader = 0; // Index of the ray generation shader in the shaderStages array
    raygenGeneralGroupInfo.closestHitShader = VK_SHADER_UNUSED_KHR;
    raygenGeneralGroupInfo.anyHitShader = VK_SHADER_UNUSED_KHR;
    raygenGeneralGroupInfo.intersectionShader = VK_SHADER_UNUSED_KHR;

    VkRayTracingShaderGroupCreateInfoKHR missGeneralGroupInfo{};
    missGeneralGroupInfo.sType = VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_KHR;
    missGeneralGroupInfo.type = VK_RAY_TRACING_SHADER_GROUP_TYPE_GENERAL_KHR;
    missGeneralGroupInfo.generalShader = 1;
    missGeneralGroupInfo.closestHitShader = VK_SHADER_UNUSED_KHR;
    missGeneralGroupInfo.anyHitShader = VK_SHADER_UNUSED_KHR;
    missGeneralGroupInfo.intersectionShader = VK_SHADER_UNUSED_KHR;
/*
    VkRayTracingShaderGroupCreateInfoKHR triangleHitGroupInfo{};
    triangleHitGroupInfo.sType = VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_KHR;
    triangleHitGroupInfo.type = VK_RAY_TRACING_SHADER_GROUP_TYPE_TRIANGLES_HIT_GROUP_KHR;
    triangleHitGroupInfo.generalShader = VK_SHADER_UNUSED_KHR;
    triangleHitGroupInfo.closestHitShader = 2;
    triangleHitGroupInfo.anyHitShader = VK_SHADER_UNUSED_KHR;
    triangleHitGroupInfo.intersectionShader = VK_SHADER_UNUSED_KHR;
*/
    VkRayTracingShaderGroupCreateInfoKHR proceduralHitGroupInfo{};
    proceduralHitGroupInfo.sType = VK_STRUCTURE_TYPE_RAY_TRACING_SHADER_GROUP_CREATE_INFO_KHR;
    proceduralHitGroupInfo.type = VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_KHR;
    proceduralHitGroupInfo.generalShader = VK_SHADER_UNUSED_KHR;
    proceduralHitGroupInfo.closestHitShader = VK_SHADER_UNUSED_KHR;
    proceduralHitGroupInfo.anyHitShader = VK_SHADER_UNUSED_KHR;
    proceduralHitGroupInfo.intersectionShader = 2;

    std::array<VkRayTracingShaderGroupCreateInfoKHR, 3> groups = {raygenGeneralGroupInfo, missGeneralGroupInfo, /*triangleHitGroupInfo,*/ proceduralHitGroupInfo};

My SBT creation:

void VulkanInitializer::createShaderBindingTable() {
    /** VkPhysicalDeviceRayTracingPipelinePropertiesKHR
    * shaderGroupHandleSize: Size in bytes of a shader group handle.
    * shaderGroupBaseAlignment: Required alignment, in bytes, for shader group base addresses.
    * shaderGroupHandleAlignment: Required alignment, in bytes, for shader group handles.
    * maxRayRecursionDepth: Maximum depth of ray recursion allowed.
    * maxShaderGroupStride: Maximum stride in bytes between shader groups.
    */
    const uint32_t shaderGroupHandleSize = vulkanContext.rtPipelineProperties.shaderGroupHandleSize; // 32 for my gpu
    const uint32_t alignment = vulkanContext.rtPipelineProperties.shaderGroupBaseAlignment; // 64 for my gpu

    constexpr int numRaygenShaders = 1;
    constexpr int numMissShaders = 1;
    constexpr int numHitGroups = 1; // 1x shader per hit group (2x: triangle and procedural)-- disabled the triangle hit group for debugging purposes
    constexpr int numCallableShaders = 0;

    const uint32_t raygenRegionSize = numRaygenShaders * shaderGroupHandleSize;
    const uint32_t missRegionSize = numMissShaders * shaderGroupHandleSize;
    const uint32_t hitRegionSize = numHitGroups * shaderGroupHandleSize;
    const uint32_t callableRegionSize = numCallableShaders * shaderGroupHandleSize;

    constexpr uint32_t raygenRegionOffset = 0;
    const uint32_t missRegionOffset = alignUp(raygenRegionOffset + raygenRegionSize, alignment);
    const uint32_t hitRegionOffset = alignUp(missRegionOffset + missRegionSize, alignment);
    const uint32_t callableRegionOffset = alignUp(hitRegionOffset + hitRegionSize, alignment);

    assert(raygenRegionOffset % alignment == 0);
    assert(missRegionOffset % alignment == 0);
    assert(hitRegionOffset % alignment == 0);
    assert(callableRegionOffset % alignment == 0);

    constexpr uint32_t groupCount = 3; // raygen group + miss group + triangle hit group + procedural hit group -- disabled the triangle hit group for debugging purposes
    const uint32_t sbtSize = callableRegionOffset + callableRegionSize;

    std::vector<uint8_t> shaderGroupHandles(groupCount * shaderGroupHandleSize);
    auto pfn_GetRayTracingShaderGroupHandles = reinterpret_cast<PFN_vkGetRayTracingShaderGroupHandlesKHR>(vkGetDeviceProcAddr(
            vulkanContext.device, "vkGetRayTracingShaderGroupHandlesKHR"));
    if (!pfn_GetRayTracingShaderGroupHandles) {
        throw std::runtime_error("Could not load vkGetRayTracingShaderGroupHandlesKHR");
    }
    if (pfn_GetRayTracingShaderGroupHandles(vulkanContext.device, vulkanContext.graphicsPipeline, 0, groupCount,
                        groupCount * shaderGroupHandleSize, shaderGroupHandles.data()) != VK_SUCCESS) {
        throw std::runtime_error("failed to get ray tracing shader group handles!");
    }

    CreateVmaBuffer(sbtSize,
                    VK_BUFFER_USAGE_SHADER_BINDING_TABLE_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
                    VMA_MEMORY_USAGE_CPU_TO_GPU, shaderBindingTableBuffer, shaderBindingTableBufferAllocation,
                    vulkanContext.vmaAllocator, "SBT Buffer");

    uint8_t* sbtBuffer;
    vmaMapMemory(vulkanContext.vmaAllocator, shaderBindingTableBufferAllocation, reinterpret_cast<void**>(&sbtBuffer));

    const uint8_t* raygenShaderHandle = shaderGroupHandles.data() + 0 * shaderGroupHandleSize;
    const uint8_t* missShaderHandle = shaderGroupHandles.data() + 1 * shaderGroupHandleSize;
    const uint8_t* triangleHitShaderHandle = shaderGroupHandles.data() + 2 * shaderGroupHandleSize;
    const uint8_t* voxelIntersectionShaderHandle = shaderGroupHandles.data() + 2 * shaderGroupHandleSize; // only adding the voxel intersection shader for now

    // Copy shader handles to the mapped SBT buffer at the appropriate offsets
    memcpy(sbtBuffer + raygenRegionOffset, raygenShaderHandle, shaderGroupHandleSize); // An entry in the SBT is essentially a group configured during pipeline creation
    memcpy(sbtBuffer + missRegionOffset, missShaderHandle, shaderGroupHandleSize);
    memcpy(sbtBuffer + hitRegionOffset, voxelIntersectionShaderHandle, shaderGroupHandleSize);
    memcpy(sbtBuffer + hitRegionOffset + shaderGroupHandleSize, voxelIntersectionShaderHandle, shaderGroupHandleSize);
    memcpy(sbtBuffer + callableRegionOffset, voxelIntersectionShaderHandle, shaderGroupHandleSize);

    vmaUnmapMemory(vulkanContext.vmaAllocator, shaderBindingTableBufferAllocation);

    const VkDeviceAddress sbtBufferAddress = GetBufferDeviceAddress(vulkanContext.shaderBindingTableBuffer, vulkanContext.device); // : Refactor. Probably can just reference the previous addr

    // Define the SBT regions : Probably can refactor the size with previously defined variables.
    VkStridedDeviceAddressRegionKHR raygenShadersStartAddr{};
    raygenShadersStartAddr.deviceAddress = sbtBufferAddress;
    raygenShadersStartAddr.stride = alignment;
    raygenShadersStartAddr.size = alignUp(raygenRegionSize, alignment);

    VkStridedDeviceAddressRegionKHR missShadersStartAddr{};
    missShadersStartAddr.deviceAddress = sbtBufferAddress + missRegionOffset;
    missShadersStartAddr.stride = alignment;
    missShadersStartAddr.size = alignUp(missRegionSize, alignment);

    VkStridedDeviceAddressRegionKHR hitGroupsStartAddr{};
    hitGroupsStartAddr.deviceAddress = sbtBufferAddress + hitRegionOffset;
    hitGroupsStartAddr.stride = alignment;
    hitGroupsStartAddr.size = alignUp(hitRegionSize, alignment);

    VkStridedDeviceAddressRegionKHR callableShadersStartAddr{};
    callableShadersStartAddr.deviceAddress = sbtBufferAddress + callableRegionOffset;
    callableShadersStartAddr.stride = alignment;
    callableShadersStartAddr.size = alignUp(callableRegionSize, alignment);

    vulkanContext.shaderBindingTableShaderStartAddresses.raygenShadersStartAddr = raygenShadersStartAddr;
    vulkanContext.shaderBindingTableShaderStartAddresses.missShadersStartAddr = missShadersStartAddr;
    vulkanContext.shaderBindingTableShaderStartAddresses.hitGroupsStartAddr = hitGroupsStartAddr;
    vulkanContext.shaderBindingTableShaderStartAddresses.callableShadersStartAddr = callableShadersStartAddr;
}

// part of my AABB blas creation:

void VulkanInitializer::CreateVoxelBlasAABB() {
    // Create AABB blas voxels
    std::vector<VkAabbPositionsKHR> aabbs = {
        {{0.0f, 0.0f, 0.0f}, {1.0f, 1.0f, 1.0f}},
    };

    CreateVmaBuffer(sizeof(VkAabbPositionsKHR) * aabbs.size(),
                    VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_BUILD_INPUT_READ_ONLY_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT,
                    VMA_MEMORY_USAGE_CPU_TO_GPU, vulkanContext.aabbBuffer, vulkanContext.aabbBufferAllocation, vulkanContext.vmaAllocator, "AABB Buffer");

    void* data;
    vmaMapMemory(vulkanContext.vmaAllocator, vulkanContext.aabbBufferAllocation, &data);
    memcpy(data, aabbs.data(), sizeof(VkAabbPositionsKHR) * aabbs.size());
    vmaUnmapMemory(vulkanContext.vmaAllocator, vulkanContext.aabbBufferAllocation);

    // Set up the structure for the AABBs
    VkAccelerationStructureGeometryKHR geometry{};
    geometry.sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_KHR;
    geometry.geometryType = VK_GEOMETRY_TYPE_AABBS_KHR;
    geometry.geometry.aabbs.sType = VK_STRUCTURE_TYPE_ACCELERATION_STRUCTURE_GEOMETRY_AABBS_DATA_KHR;
    geometry.geometry.aabbs.data.deviceAddress = GetBufferDeviceAddress(vulkanContext.aabbBuffer, vulkanContext.device);
    geometry.geometry.aabbs.stride = sizeof(VkAabbPositionsKHR);

// Top level acceleration structure instances

        VkAccelerationStructureInstanceKHR instance{};
        instance.transform = transformMatrix;
        instance.instanceCustomIndex = instanceCustomIndex;
        instance.mask = 0xFF;
        instance.instanceShaderBindingTableRecordOffset = 0; // Since intersection shader is literally the only shader, 0 should work but idk why it doesn't
        instance.flags = 0;
        instance.accelerationStructureReference = vulkanContext.voxelBlasDeviceAddress;

        instanceUpdates.push_back(instance);

Any insights on why my intersection shader doesn't get executed when AABB intersects with it? Could there be an issue with how I'm setting up or using the geometry in my bottom-level acceleration structure or something else entirely?
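As a sanity check on the offset arithmetic above (handle size 32 and base alignment 64, per the comments in createShaderBindingTable), here is a standalone sketch of the same computation, assuming alignUp rounds up to the next multiple of a power-of-two alignment:

```cpp
#include <array>
#include <cstdint>

// Round x up to the next multiple of a power-of-two alignment -- the
// behavior the alignUp helper in the post is assumed to have.
constexpr uint32_t align_up(uint32_t x, uint32_t alignment) {
    return (x + alignment - 1) & ~(alignment - 1);
}

// Region offsets for 1 raygen, 1 miss, 1 hit group and 0 callables,
// in the order {raygen, miss, hit, callable}.
constexpr std::array<uint32_t, 4> sbt_offsets(uint32_t handleSize, uint32_t alignment) {
    const uint32_t raygen   = 0;
    const uint32_t miss     = align_up(raygen + handleSize, alignment);
    const uint32_t hit      = align_up(miss + handleSize, alignment);
    const uint32_t callable = align_up(hit + handleSize, alignment);
    return {raygen, miss, hit, callable};
}

// With handleSize 32 and alignment 64 the offsets are 0, 64, 128, 192,
// and sbtSize = 192 because the callable region is empty.
static_assert(sbt_offsets(32, 64)[1] == 64);
static_assert(sbt_offsets(32, 64)[2] == 128);
static_assert(sbt_offsets(32, 64)[3] == 192);
```

One thing worth double-checking against these numbers: sbtSize comes out to 192, so the memcpy at callableRegionOffset (and the second hit-region memcpy at hitRegionOffset + shaderGroupHandleSize) would write past the end of the allocation.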

2 Comments
2024/05/10
13:46 UTC

1

Primary and secondary cmd-bufs within same renderpass

Vulkan Validation Entry:

VUID-vkCmdExecuteCommands-contents-06018

If vkCmdExecuteCommands is being called within a render pass instance begun with vkCmdBeginRenderPass, its contents parameter must have been set to VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS or VK_SUBPASS_CONTENTS_INLINE_AND_SECONDARY_COMMAND_BUFFERS_EXT.

I want to verify my understanding is correct. Without the VK_SUBPASS_CONTENTS_INLINE_AND_SECONDARY_COMMAND_BUFFERS_EXT extension, this means that it's not possible to have a render pass with multiple subpasses where one subpass is invoked from the primary command buffer and another subpass has its contents executed via secondary command buffers.

Like:

vkCmdBeginRenderPass(..., VK_SUBPASS_CONTENTS_INLINE)
// .. commands in primary cmd-buf
vkCmdNextSubpass(VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS)
vkCmdExecuteCommands(...) // commands in secondary cmd-buf
vkCmdEndRenderPass()

This means that secondary command buffers must be executed within their own render passes. That is, it's not possible for a single render pass to mix primary and secondary command buffers.

1 Comment
2024/05/10
12:58 UTC

11

How should I handle defining pipelines in an efficient manner?

I have been reading Intel's API Without Secrets as well as following the Vulkan tutorial, but I am confused about how I should define pipelines. From my understanding, a graphics pipeline essentially defines the entire process of drawing. However, it seems that Vulkan requires developers to have different pipelines for different ways of rendering objects, such as wireframe rendering or rendering transparent objects. If my understanding is correct, wouldn't I need to manually define each pipeline in an application that might have hundreds of different materials? How would I go about this in an efficient manner?
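One common approach (though not the only one) is to key pipelines by the small set of state that actually varies - shader pair, polygon mode, blend state - and create each combination lazily on first use, so the pipeline count tracks distinct state combinations rather than material count. A sketch with a hypothetical PipelineKey and an int standing in for VkPipeline:

```cpp
#include <cstdint>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical key: materials that share this state share a pipeline.
struct PipelineKey {
    uint32_t shaderId;      // which shader pair the material uses
    bool     wireframe;     // VK_POLYGON_MODE_LINE vs FILL
    bool     blendEnabled;  // transparent vs opaque
    bool operator==(const PipelineKey& o) const {
        return shaderId == o.shaderId && wireframe == o.wireframe &&
               blendEnabled == o.blendEnabled;
    }
};

struct PipelineKeyHash {
    size_t operator()(const PipelineKey& k) const {
        return std::hash<uint32_t>()(k.shaderId) ^
               (static_cast<size_t>(k.wireframe) << 1) ^
               (static_cast<size_t>(k.blendEnabled) << 2);
    }
};

class PipelineCache {
    std::unordered_map<PipelineKey, int, PipelineKeyHash> cache_;
    int nextHandle_ = 0;
public:
    // Return an existing pipeline handle, or "create" one on first use
    // (a real engine would call vkCreateGraphicsPipelines here).
    int getOrCreate(const PipelineKey& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        return cache_.emplace(key, nextHandle_++).first->second;
    }
    size_t size() const { return cache_.size(); }
};
```

Swapping the int handle for a real VkPipeline carries the idea over; many engines also lean on dynamic state and VkPipelineCache to cut the combinatorial load further.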

9 Comments
2024/05/10
07:10 UTC

0

Problem with dynamically sized array for vkEnumeratePhysicalDevices

So I'm trying to follow the Vulkan tutorial, yet I'm coding in C instead of C++. Additionally, I'm coding in Visual Studio with the horrible MSVC compiler, which doesn't allow me to simply write `VkPhysicalDevice devices[deviceCount]`.

No problem, I thought, I'm just going to use malloc.

Yes problem. For whatever reason, this fails spectacularly (Access violation writing location ...), no clue why.

uint32_t deviceCount = 0;
vkEnumeratePhysicalDevices(*instance, &deviceCount, 0);
VkPhysicalDevice *devices = malloc(sizeof(VkPhysicalDevice) * deviceCount);
vkEnumeratePhysicalDevices(*instance, &deviceCount, devices);

Whereas this works fine.

uint32_t deviceCount = 0;
vkEnumeratePhysicalDevices(*instance, &deviceCount, 0);
VkPhysicalDevice devices[2]; // in my case deviceCount is 2
vkEnumeratePhysicalDevices(*instance, &deviceCount, devices);

From my understanding, there shouldn't be a difference as arrays are basically pointers to memory space anyways... Can anyone explain?
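For what it's worth, the count-then-fill pattern with malloc is valid C; isolated from Vulkan it behaves as in this sketch, where enumerate_devices is a stand-in for vkEnumeratePhysicalDevices. If the heap version crashes where the stack version doesn't, the usual suspects are malloc returning NULL (unchecked, e.g. for a zero count) or the surrounding pointers such as instance being invalid, rather than any difference between heap and stack arrays:

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for vkEnumeratePhysicalDevices, to show the count-then-fill
 * pattern in isolation: the first call (out == NULL) reports the count,
 * the second call fills the caller's array. */
typedef struct { int id; } FakeDevice;

static void enumerate_devices(uint32_t *count, FakeDevice *out)
{
    const uint32_t available = 2;
    if (out == NULL) {            /* first call: just report the count */
        *count = available;
        return;
    }
    for (uint32_t i = 0; i < available && i < *count; ++i)
        out[i].id = (int)i;       /* second call: fill the array */
}

int enumerate_with_malloc(uint32_t *out_count)
{
    uint32_t count = 0;
    enumerate_devices(&count, NULL);

    FakeDevice *devices = malloc(sizeof(FakeDevice) * count);
    if (devices == NULL)          /* zero count or OOM: never write through it */
        return -1;

    enumerate_devices(&count, devices);
    *out_count = count;
    free(devices);
    return 0;
}
```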

5 Comments
2024/05/09
20:27 UTC

0

I can't seem to find a proper way to setup vulkan on mac

Every tutorial is irrelevant to my specific situation.

I even copied a makefile from someone else using a Mac, but nothing works.

I get this error:

makefile:5: *** missing separator. Stop.

this is my makefile

CFLAGS = -std=c++17 -I.mak -I$(VK_SDK_PATH)\include

LDFLAGS = -L$(VK_SDK_PATH)\lib `pkg-config --static --libs=glfw3` -lvulkan
a.out: *.cpp *.hpp
g++ $(CFLAGS) -o a.out *.cpp $(LDFLAGS)
.PHONY: test clean
test: a.out
./a.out
clean: a.out
rm -f a.out

I set up according to Lu the coder in this video, up to the 26-minute mark.

Please help
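For reference, make's "missing separator" error almost always means a recipe line does not begin with a hard tab (spaces don't count), which is easy to lose when copy-pasting. A corrected sketch, assuming VK_SDK_PATH is set, with forward slashes (the backslashes above are a Windows convention) and pkg-config's usual `--libs glfw3` spelling; every indented line below starts with a tab:

```make
CFLAGS = -std=c++17 -I$(VK_SDK_PATH)/include
LDFLAGS = -L$(VK_SDK_PATH)/lib `pkg-config --static --libs glfw3` -lvulkan

a.out: *.cpp *.hpp
	g++ $(CFLAGS) -o a.out *.cpp $(LDFLAGS)

.PHONY: test clean
test: a.out
	./a.out
clean:
	rm -f a.out
```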

2 Comments
2024/05/09
13:43 UTC

5

I can't figure out how to write non-interleaved vertex buffers

Hi, as the title suggests, I've been trying to use non-interleaved vertex buffers in my current application (for various reasons). I've been trying to do a simple demo with just a Quad; the positions are read correctly in the shader, but the tex coords aren't (for debugging purposes, I am trying to draw the tex coords as the color). I just get a black quad.

Buffer Creation:

static std::array<float, 4 * 2> positions = {
    -0.5f, -0.5f,
     0.5f, -0.5f,
     0.5f,  0.5f,
    -0.5f,  0.5f,
};

static std::array<float, 4 * 2> texCoords = {
     0.0f, 0.0f,
     1.0f, 0.0f,
     1.0f, 1.0f,
     0.0f, 1.0f,
};

return {
    Buffer(
        positions,
        VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
        VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
    ),

    Buffer(
        texCoords,
        VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
        VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
    )
};

Binding Descriptions:

std::vector<VkVertexInputBindingDescription> bindingDescriptions(2);

// Positions
bindingDescriptions[0].binding = 0;
bindingDescriptions[0].stride = sizeof(glm::vec2);
bindingDescriptions[0].inputRate = VK_VERTEX_INPUT_RATE_VERTEX;

// Tex Coords
bindingDescriptions[1].binding = 1;
bindingDescriptions[1].stride = sizeof(glm::vec2);
bindingDescriptions[1].inputRate = VK_VERTEX_INPUT_RATE_VERTEX;

Attribute Descriptions:

static std::vector<VkVertexInputAttributeDescription> attributeDescriptions(2);

// Positions
attributeDescriptions[0].binding = 0;
attributeDescriptions[0].location = 0;
attributeDescriptions[0].format = VK_FORMAT_R32G32_SFLOAT;
attributeDescriptions[0].offset = 0;

// TexCoords
attributeDescriptions[1].binding = 1;
attributeDescriptions[1].location = 1;
attributeDescriptions[1].format = VK_FORMAT_R32G32_SFLOAT;
attributeDescriptions[1].offset = 0;

Yes, I'm actually setting them:

vertexStateInfo.vertexBindingDescriptionCount = (uint32_t)bindingDescriptions.size();
vertexStateInfo.pVertexBindingDescriptions = bindingDescriptions.data();

vertexStateInfo.vertexAttributeDescriptionCount = (uint32_t)attributeDescriptions.size();
vertexStateInfo.pVertexAttributeDescriptions = attributeDescriptions.data();

Record Function:

static std::array<VkDeviceSize, 2> offsets = { 0, 0 };
std::array<VkBuffer, 2> buffers = { s_Data.VertexBuffers[0], s_Data.VertexBuffers[1] };

vkCmdBindVertexBuffers(commandBuffer, 0, buffers.size(), buffers.data(), offsets.data());

Vertex Shader:

#version 450

layout(location = 0) in vec2 a_Position;
layout(location = 1) in vec2 a_TexCoords;

layout(location = 0) out vec2 v_TexCoords;

void main()
{
    gl_Position = vec4(a_Position, 0.0, 1.0);
}

Fragment Shader:

#version 450

layout(location = 0) in vec2 v_TexCoords;

layout(location = 0) out vec4 o_Color;

void main()
{
    o_Color = vec4(v_TexCoords, 0.0, 1.0);
}

Does anybody know what I'm doing wrong here?
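One detail worth checking in the vertex shader above: v_TexCoords is declared as an output but never written, so the fragment shader reads an undefined (often zero, hence black) value no matter how the buffers are bound. Writing the attribute through would look like:

```glsl
#version 450

layout(location = 0) in vec2 a_Position;
layout(location = 1) in vec2 a_TexCoords;

layout(location = 0) out vec2 v_TexCoords;

void main()
{
    v_TexCoords = a_TexCoords; // forward the attribute to the fragment stage
    gl_Position = vec4(a_Position, 0.0, 1.0);
}
```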

3 Comments
2024/05/09
10:42 UTC

0

New video tutorial: Command Buffers // Vulkan For Beginners #8

0 Comments
2024/05/08
19:49 UTC

1

[Beginner issue]: The application starts, but the window doesn't appear.

Hello, I just started learning Vulkan today. I'm using SDL3 as my windowing library since I have prior experience with it. I was following a tutorial, and essentially I'm trying to build this code: link. This is the first ever commit to the tutorial source code.

The application compiles fine, but the window doesn't show up. I tried reinstalling my Mesa drivers, but it didn't have any effect. I also tried to find alternative ways of using an SDL3 window, but most of them seem to use outdated features.

#include <SDL.h>
#include <SDL_vulkan.h>
#include <stdexcept>

#define VULKAN_HPP_ENABLE_DYNAMIC_LOADER_TOOL 0

#include <vulkan/vulkan_raii.hpp>

class Noncopyable
{
public:
  Noncopyable() = default;

  Noncopyable(const Noncopyable &) = delete;

  const Noncopyable &operator=(const Noncopyable &) = delete;
};

class SDLException : private std::runtime_error, Noncopyable
{
  const int code;

public:
  explicit SDLException(const char *message, const int code = 0) : runtime_error(message), code{code} {}

  [[nodiscard]] auto get_code() const noexcept
  {
    return code;
  }
};

class SDL : Noncopyable
{
public:
  explicit SDL(SDL_InitFlags init_flags)
  {
    if (const auto error_code = SDL_Init(init_flags))
      throw SDLException{SDL_GetError(), error_code};
  }

  ~SDL()
  {
    SDL_Quit();
  }
};

class VulkanLibrary : Noncopyable
{
public:
  explicit VulkanLibrary(const char *path = nullptr)
  {
    if (const auto error_code = SDL_Vulkan_LoadLibrary(path))
      throw SDLException{SDL_GetError(), error_code};
  }

  ~VulkanLibrary()
  {
    SDL_Vulkan_UnloadLibrary();
  }

#pragma clang diagnostic push
#pragma ide diagnostic ignored "readability-convert-member-functions-to-static"

  [[nodiscard]] auto get_instance_proc_addr() const
  {
    if (const auto get_instance_proc_addr = reinterpret_cast<PFN_vkGetInstanceProcAddr>(SDL_Vulkan_GetVkGetInstanceProcAddr()))
      return get_instance_proc_addr;
    else
      throw SDLException{"Couldn't load vkGetInstanceProcAddr function from the vulkan dynamic library"};
  }

  [[nodiscard]] auto get_instance_extensions() const
  {
    uint32_t count;
    if (!SDL_Vulkan_GetInstanceExtensions(&count, nullptr))
      throw SDLException{"Couldn't get vulkan instance extensions count"};
    std::vector<const char *> extensions(count);
    if (!SDL_Vulkan_GetInstanceExtensions(&count, extensions.data()))
      throw SDLException{"Couldn't get vulkan instance extensions"};
    return extensions;
  }

#pragma clang diagnostic pop
};

class Window : Noncopyable
{
  SDL_Window *handle;

public:
  Window(const char *title, const int width, const int height,
         const SDL_WindowFlags flags = static_cast<SDL_WindowFlags>(0)) : handle{
                                                                              SDL_CreateWindow(title, width, height, flags)}
  {
    if (!handle)
      throw SDLException{SDL_GetError()};
  }

  ~Window()
  {
    SDL_DestroyWindow(handle);
  }

  [[nodiscard]] auto create_surface(const vk::raii::Instance &instance) const
  {
    vk::SurfaceKHR::NativeType surface_handle;
    if (!SDL_Vulkan_CreateSurface(handle, *instance, &surface_handle))
      throw SDLException{SDL_GetError()};
    return vk::raii::SurfaceKHR{instance, surface_handle};
  }

  [[nodiscard]] auto get_handle() const noexcept
  {
    return handle;
  }
};

using QueueFamily = std::pair<vk::raii::PhysicalDevice, size_t>;

auto main(int argc, char **argv) -> int
{
  const SDL sdl{SDL_InitFlags::SDL_INIT_VIDEO};
  const VulkanLibrary vulkan_library{};

  vk::raii::Context context{vulkan_library.get_instance_proc_addr()};

  vk::ApplicationInfo application_info{};
  application_info.apiVersion = VK_API_VERSION_1_3;

  vk::InstanceCreateInfo create_info{};
  auto extensions = vulkan_library.get_instance_extensions();
  if (SDL_GetPlatform() == std::string_view{"macOS"})
  {
    create_info.flags |= vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR;
    extensions.emplace_back(VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME);
  }
  create_info.setPEnabledExtensionNames(extensions);
  create_info.pApplicationInfo = &application_info;
  const vk::raii::Instance instance{context, create_info};

  const Window window{"Salam", 800, 600, SDL_WindowFlags::SDL_WINDOW_VULKAN};
  const auto surface = window.create_surface(instance);

  std::optional<QueueFamily> queue_family{};
  for (const auto &physical_device : instance.enumeratePhysicalDevices())
  {
    const auto queue_families_properties = physical_device.getQueueFamilyProperties();
    for (std::size_t queue_family_index = 0;
         queue_family_index != queue_families_properties.size(); ++queue_family_index)
    {
      const auto queue_family_properties = queue_families_properties[queue_family_index];

      if (queue_family_properties.queueFlags & vk::QueueFlagBits::eGraphics)
        queue_family = {physical_device, queue_family_index};
    }
  }
  if (queue_family.has_value())
  {
    const auto &[physical_device, queue_family_index] = *queue_family;
    SDL_Log("Found queue family: %s %d", physical_device.getProperties().deviceName.data(), queue_family_index);
  }

  bool should_close{};
  while (!should_close)
  {
    for (SDL_Event event; SDL_PollEvent(&event);)
    {
      switch (event.type)
      {
      case SDL_EventType::SDL_EVENT_QUIT:
        should_close = true;
      }
    }
  }

  return 0;
};


2 Comments
2024/05/08
19:02 UTC

2

How can I learn Vulkan to be a graphics programmer

Hi! I am currently in the third year of my engineering degree. I have a good knowledge of C++, and thanks to an interest in machine learning I have a solid grasp of vectors and matrices. I now want to become a graphics engineer. There seem to be many APIs to get the job done, and Vulkan seems right to me. Do you know of any tutorials or books where I can learn Vulkan from the basics - and not just Vulkan but computer graphics as well - something like Frank Luna's book on DirectX but for Vulkan? I mostly prefer text content over videos.

4 Comments
2024/05/07
19:18 UTC

7

Need help with optimizing command buffer recording

Hi,

I have a problem with the recording time of my command buffers. I record about 10 secondary command buffers (5 ms each, on separate threads) and then execute them in the primary one. After that I end recording on the primary command buffer, and that call alone takes 10 ms. What can cause this behavior?

13 Comments
2024/05/07
13:48 UTC

3

Proper way to sample VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16 texture

I have set up a VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16 texture as best as I can based on the examples I've read; no validation errors/warnings.

The source I am copying from is little-endian; do I need to perform a << 6 shift as I copy into the GPU staging buffer (which in turn is copied to the image memory)?

Is this the proper sampling of a 10-bit 3-plane texture?

layout(binding = 3) uniform sampler2D samplerColor;
layout(location = 0) in vec2 inUV;
layout(location = 0) out vec4 outFragmentColor;
void main() {
    outFragmentColor = texture(samplerColor, inUV);
}
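On the << 6 question: the X6 formats keep each 10-bit sample in the top 10 bits of a 16-bit word, with the low 6 bits ignored. So if the source delivers its 10-bit values in the low bits of each word, a left shift by 6 is needed during the staging copy. A minimal sketch, with a hypothetical helper name (not from the post):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: repack 10-bit samples stored in the LOW bits of
// 16-bit little-endian words into the "X6" layout Vulkan expects,
// where the 10 significant bits occupy the TOP of each 16-bit word.
std::vector<uint16_t> packLowBits10ToX6(const std::vector<uint16_t>& src)
{
    std::vector<uint16_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = static_cast<uint16_t>((src[i] & 0x03FFu) << 6);
    return dst;
}
```

If the source already stores samples in the high bits (P010-style buffers), no shift is needed; uploading one known pixel value and sampling it back is a quick sanity check.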
1 Comment
2024/05/07
04:54 UTC

5

How to create BC6H encoded KTX2 texture from Vulkan R32G32B32A32_SFLOAT image?

I generated a cubemap and a prefiltered map in R32G32B32A32_SFLOAT using Vulkan, but it's too large, so I want to compress it as BC6H_UFLOAT. I succeeded in writing the image data into a KTX2 container, but I have no idea how to do the texture compression.

5 Comments
2024/05/06
14:07 UTC

2

Making sense of VkSamplerYcbcrConversionImageFormatProperties.combinedImageSamplerDescriptorCount

Working with YCbCr extension. VkImage has 3 disjoint pieces of memory and a format of VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16

According to this, "an implementation can use 1, 2, or 3 descriptors for each combined image sampler used". In my case the value returned is 1, and I cannot wrap my mind around that: shouldn't a 3-plane texture with disjoint memory use more than one, or am I missing something in the way I query?

(By the way, the call returns VK_SUCCESS)

VkSamplerYcbcrConversionImageFormatProperties samplerYcbcrConversionImageFormatProperties{};
samplerYcbcrConversionImageFormatProperties.sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_IMAGE_FORMAT_PROPERTIES;
samplerYcbcrConversionImageFormatProperties.pNext = nullptr;
samplerYcbcrConversionImageFormatProperties.combinedImageSamplerDescriptorCount = 0u;

VkImageFormatProperties imageFormatProperties{};

VkImageFormatProperties2 imageFormatProperties2{};
imageFormatProperties2.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2;
imageFormatProperties2.pNext = &samplerYcbcrConversionImageFormatProperties;
imageFormatProperties2.imageFormatProperties = imageFormatProperties;

VkPhysicalDeviceImageFormatInfo2 physicalDeviceImageFormatInfo2{};
physicalDeviceImageFormatInfo2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2;
physicalDeviceImageFormatInfo2.format = VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16;
physicalDeviceImageFormatInfo2.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
physicalDeviceImageFormatInfo2.flags = VK_IMAGE_CREATE_DISJOINT_BIT;
auto result = vkGetPhysicalDeviceImageFormatProperties2(mPhysicalDevice, &physicalDeviceImageFormatInfo2, &imageFormatProperties2);

std::wcout << "combinedImageSamplerDescriptorCount = " << samplerYcbcrConversionImageFormatProperties.combinedImageSamplerDescriptorCount << std::endl;
1 Comment
2024/05/06
14:02 UTC

5

Handling device limits when using dynamic uniform buffers

Hello, I have been reading up on dynamic uniform buffers and the example code by SaschaWillems - https://github.com/SaschaWillems/Vulkan/blob/master/examples/dynamicuniformbuffer

I understood the concept when it comes to making it work in an isolated example, however I can't wrap my head around how it would work in a larger system, for instance:

  1. Using it as per-object transform matrices and binding with a different offset between each draw call
  2. Using it as per-material instance properties for each material type, since each material instance has the same parameter types, just different values, so like in the per-object transforms, just bind a different offset per-material instance

The issue here is that different devices may have a different limit for UBO size, as well as maximum offset counts, as well as maximum descriptor counts.

So one way I imagine it would work (for transforms) is like this:

#define MAX_OBJECTS 1024  // compile-time size: uniform blocks cannot use runtime-sized arrays

layout(push_constant) uniform TransformIndex {
    int index;
} pc;

layout(set = 0, binding = 0) uniform TransformsBuffer {
    mat4 transform[MAX_OBJECTS];
} ubo;

Some combination of the TransformIndex push constant (selecting the right index into the 'transform' array, used as ubo.transform[pc.index]) and the dynamic offset into TransformsBuffer would then determine exactly which matrix ends up being used in the vertex shader.

With the materials it's the same, let's say I have a material that has the parameters SparkleIntensity and TextureDirection and material instances differ by just the values of these two parameters:

#define MAX_MATERIALS 256  // uniform-block arrays need a fixed size; std140 pads each float element to 16 bytes

layout(set = 0, binding = 0) uniform MaterialParameters {
    float SparkleIntensity[MAX_MATERIALS];
    vec4 TextureDirection[MAX_MATERIALS];  // vec4: GLSL has no HLSL-style float4
} uboMaterial;

Now, the problems are:

  1. I have to be sure I never exceed any device limits and if I do, if the maximum offset count and/or descriptor count and/or UBO sizes are not enough to fit my per-frame data, I would have to batch the data and update them from CPU to GPU mid-frame as many times as needed. This would kill my performance, as first I have to dispatch the command buffer to do work with the current data that is inside the UBOs and only after it finishes, I have to fill them with the next batch of data. All this during one frame. Wouldn't this just completely kill the performance advantages of using UBOs over just a big storage buffer?
  2. How do I even know if there is a performance benefit to using dynamic uniform buffers for material instances? I am assuming the benefit would come from the fact that a dynamic uniform buffer is contiguous in memory because it's just several buffers one after another. However, can't the descriptor pool also allocate sets in contiguous memory if I allocate multiple at once? Is it really such a performance penalty to use a separate uniform buffer for each material instance for parameters?

The complexity of the renderer skyrockets when switching to this from previously just using a big storage buffer for transforms and regular descriptors for the materials.

Does anyone have experience with this? I looked online, but there weren't any large examples of a robust system using dynamic uniform buffers, just small samples that show how to create and bind one.
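One concrete piece from the SaschaWillems example worth writing down: each element in a dynamic uniform buffer must start at a multiple of the device's minUniformBufferOffsetAlignment, so the per-element stride is the element size rounded up to that alignment (a power of two). A sketch of the arithmetic:

```cpp
#include <cstddef>

// Round a per-element size up to the device's
// minUniformBufferOffsetAlignment (always a power of two), giving the
// stride between consecutive elements in a dynamic uniform buffer.
// The dynamic offset for element i is then i * stride.
std::size_t dynamicUboStride(std::size_t elementSize, std::size_t minUboAlignment)
{
    if (minUboAlignment == 0)
        return elementSize;
    return (elementSize + minUboAlignment - 1) & ~(minUboAlignment - 1);
}
```

For example, a 64-byte mat4 with a 256-byte alignment yields a 256-byte stride, and maxUniformBufferRange / stride then bounds how many elements one buffer (and thus one descriptor) can address, which is exactly the device-limit question raised above.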

9 Comments
2024/05/06
02:09 UTC

1

Vulkan based libraries?

Are there any C++ libraries built on top of Vulkan? For example, raylib is a decent library (slightly opinionated as a result, but still low level) built on top of OpenGL. I'm going to dive into raylib for now to try to make some stuff, but in the future it would be cool to have a Vulkan option as well (in case the Vulkan overlords fully get rid of OpenGL in, say, 50 years).

4 Comments
2024/05/05
23:58 UTC

1

Why does this code make my program only render clockwise faces even when set to counter-clockwise mode? And what is the purpose of multiplying this element of this matrix?

This code closely resembles the Vulkan tutorial code:

void updateUniformBuffer(uint32_t currentImage)
{
    UniformBufferObject ubo{};

    ubo.model = glm::mat4(1.0f);
    ubo.view = glm::lookAt(camera.position, camera.position + camera.front, camera.top);
    ubo.proj = glm::perspective(glm::radians(camera.fov), (*swapChainExtentGlobalAccessCopy).width / (float)(*swapChainExtentGlobalAccessCopy).height, 0.1f, 100.0f);
    ubo.proj[1][1] *= -1;

    memcpy(uniformBuffersMapped[currentImage], &ubo, sizeof(ubo));
}

And I am trying to understand why my counter-clockwise faces do not render when rasterizer.frontFace is set to VK_FRONT_FACE_COUNTER_CLOCKWISE. Edit: you can ignore the second question in the post title; reddit won't let me remove it for some reason.
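For reference (assuming the standard GLM + Vulkan setup from the tutorial): negating proj[1][1] flips the framebuffer Y axis, which mirrors the geometry and therefore reverses its apparent winding, so triangles emitted counter-clockwise arrive clockwise at the rasterizer. An alternative that avoids the sign flip is a negative-height viewport (core behavior since Vulkan 1.1, originally VK_KHR_maintenance1). A minimal sketch using a local struct that mirrors VkViewport's fields rather than the real Vulkan type:

```cpp
// Local stand-in mirroring VkViewport's fields; in real code fill a
// VkViewport from <vulkan/vulkan.h> the same way.
struct Viewport { float x, y, width, height, minDepth, maxDepth; };

// Build a Y-flipped viewport: the origin moves to the bottom edge and
// the height is negated, so no proj[1][1] *= -1 is needed and
// front-face winding keeps its expected meaning.
Viewport makeFlippedViewport(float fbWidth, float fbHeight)
{
    return Viewport{0.0f, fbHeight, fbWidth, -fbHeight, 0.0f, 1.0f};
}
```

With this approach, either keep VK_FRONT_FACE_COUNTER_CLOCKWISE and drop the projection-matrix negation, or keep the negation and swap the front-face setting; doing both cancels out.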

3 Comments
2024/05/05
20:06 UTC

3

Weird vkDestroyInstance crash

I've found a very weird vkDestroyInstance crash.

I have a Windows laptop with an Intel integrated GPU and an Nvidia discrete one. If I disable Optimus and run NVidia GPU only - no crash. But if both are enabled vkDestroyInstance occasionally crashes. Maybe one out of ten times I run my hello triangle app. Prior to the crash everything works as expected. This happens in both Debug and Release configurations. Attaching the debugger in Debug (but not in Release) config seems to prevent the crash.

I tried running with Visual Studio ASAN and it says this at vkDestroyInstance:

==5408==Failed to commit shadow memory at '0x0466d33f7d58'. VirtualAlloc failed with 0x1e7

Though occasionally the app still crashes without any output. If I comment out vkDestroyInstance, I get no complaints from the ASAN.

What could that be? A driver bug?

Edit: crash disappears if I select the integrated GPU instead of the discrete one. ASAN still complains though.

Edit 2: crash disappears if I set the preference for the app to run on discrete graphics in Nvidia control panel. Doesn't matter which device I actually select.

17 Comments
2024/05/05
19:49 UTC

10

Any tutorial for designing a generic Vulkan-based rendering engine?

Hello everyone.

I have spent some time learning Vulkan and have developed some small projects with it (e.g. a PBR renderer and a parallel computing application), and I have started to wonder how generic rendering engines are built on top of Vulkan.

When I am developing small projects, I know how many descriptor sets there should be and how to bind them. I know whether my VkImages are used as sampled textures or storage images, and how to do layout transitions and queue ownership transfers on them accordingly. I also know how many graphics/compute command buffers I need and their synchronization order. So I hardcode all of this into my applications. But as the programs grow larger, the code becomes messy and difficult to read.

So I tried to design an "Engine" class for each of my projects, to decouple all the Vulkan-related code from the other modules. But each "Engine" class is specialized for its project: if I want to reuse one engine in another project, I have to heavily modify its API and underlying code.

I want to design a high-level but also generic engine based on Vulkan. "High-level" means users need only a few lines to load scenes, render PBR materials, apply IBL, shadow mapping, etc., without calling Vulkan's low-level API. "Generic" means users can extend its functionality, like adding compute kernels and synchronizing them with the engine's built-in graphics rendering shaders.

However, I found the second point very difficult to implement. The engine knows nothing about which descriptor sets, images, buffers, image layouts, command buffers, or semaphores are needed, so unavoidably the users end up calling Vulkan's API themselves, which conflicts with the idea of being "high-level".

Do you have any recommendations for tutorials or code repositories that I can learn from, or for how to design such an engine?

5 Comments
2024/05/05
16:20 UTC

3

Is it possible to skip a fragment shader?

Right now I have configured my pipeline to only take in a vertex buffer, and I am drawing into an image with only a color attachment, no depth/stencil attachments (the color attachment was mandatory, I think, otherwise my validation layers complained). But I am only seeing a black screen. I set the color of the vertices in the shader, so that's not the problem. Now I am kind of stuck; where could this be going wrong?

Thanks for the help in advance

17 Comments
2024/05/05
13:45 UTC

1

How to copy VkBuffer's to a VkImage with disjoint memory

I am using the YCbCr extension.

It is advantageous for me to keep the data in separate planes because of the processing to be done on Y data alone.

Incoming data is interleaved YCbCr in one buffer.

I memcpy() that data into three VkBuffers, but I am not sure how to transfer them to the 3-plane VkImage, which has 3 disjoint pieces of memory.

Is setting the imageSubresource.aspectMask = VK_IMAGE_ASPECT_PLANE_{0/1/2}_BIT; field of VkBufferImageCopy enough for this?
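Setting VK_IMAGE_ASPECT_PLANE_{0/1/2}_BIT per VkBufferImageCopy region is indeed the intended mechanism (one vkCmdCopyBufferToImage call per source buffer, since each call takes a single srcBuffer). The detail that is easy to get wrong is the per-plane extent: for a 4:2:2 three-plane format the chroma planes are half width but full height. A sketch of that arithmetic, with hypothetical names:

```cpp
#include <cstdint>

// Per-plane copy extent for a 3-plane 4:2:2 image such as
// VK_FORMAT_G10X6_B10X6_R10X6_3PLANE_422_UNORM_3PACK16:
// plane 0 (Y/G) is full size; planes 1 and 2 (Cb/Cr) are
// half width, full height.
struct PlaneExtent { uint32_t width, height; };

PlaneExtent planeExtent422(uint32_t imageWidth, uint32_t imageHeight, int plane)
{
    if (plane == 0)
        return PlaneExtent{imageWidth, imageHeight};
    return PlaneExtent{imageWidth / 2, imageHeight};
}
```

Each VkBufferImageCopy would then use planeExtent422(w, h, n) as its imageExtent alongside the matching VK_IMAGE_ASPECT_PLANE_n_BIT; the image itself must have been created with VK_IMAGE_CREATE_DISJOINT_BIT for the three separately bound memory pieces.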

1 Comment
2024/05/05
11:02 UTC

0

validation layers requested, but not available!

Recently I switched from C to C++ so I could follow along with this tutorial and create a game from scratch with Vulkan; it had all the low-level components that made me love C and was written in C. I would have written this whole thing in C if the tutorial didn't use a bunch of weird C++ syntax (every for loop gets turned into an unreadable "auto& randomBS : wtf" mess).

Currently I'm writing in VS Code and running Linux Mint XFCE, and I am at this point in the tutorial. I have a working makefile and all the proper libraries and the SDK installed, but when I run the program I get the error "validation layers requested, but not available!"

After using vkconfig, I think I've tracked down the source of the issue. It shows that a layer called VK_LAYER_KHRONOS_validation is missing, but when I check my lib files, the layer file is there. I thought maybe it was supposed to be part of my system environment and that was causing the issue, but checking my environment files, the file was still there.

If anyone smarter than me knows how to fix this, it would be greatly appreciated

5 Comments
2024/05/05
01:35 UTC

1

VK_ERROR_INITIALIZATION_FAILED with GLFW on Wayland

When I try to create a window surface with GLFW on Wayland I always get VK_ERROR_INITIALIZATION_FAILED. I noticed that GLFW does not include the `VK_KHR_wayland_surface` extension in its required extensions, but even when I add it myself the code fails.

Do I have to tell GLFW that I'm on Wayland somehow?

Here's my code: https://github.com/darman96/CardGame

Edit: The VulkanSDK is supposed to be in the Dependencies folder but I did not commit it because of file size.

6 Comments
2024/05/04
18:21 UTC

3

What is the maximum amount of threads you can use for a workgroup?

If I understand correctly, a workgroup is mapped to a warp and Nvidia GPUs have thousands of warps. So we can easily dispatch, for example, 16 by 16 workgroups for our compute shader. However, from what I've read, each warp only contains a maximum of 32 threads.

So, wouldn't the maximum amount of threads for each workgroup be something like (16,2,1) because 16x2x1 = 32 threads?
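A note on the premise: a workgroup is not limited to a single warp. The driver may execute one workgroup as several warps, and the relevant cap is VkPhysicalDeviceLimits::maxComputeWorkGroupInvocations (at least 128 by the spec's required minimum, commonly 1024), so a local size like (16, 16, 1) = 256 invocations is normal. A sketch of the usual arithmetic:

```cpp
#include <cstdint>

// Total invocations in one workgroup: the product of the shader's
// local_size dimensions. This must stay <= the device's
// maxComputeWorkGroupInvocations limit, not <= the 32-thread warp size.
uint32_t workgroupInvocations(uint32_t lx, uint32_t ly, uint32_t lz)
{
    return lx * ly * lz;
}

// Number of workgroups to dispatch along one axis so that
// groups * localSize covers `extent` items (ceiling division).
uint32_t groupCount(uint32_t extent, uint32_t localSize)
{
    return (extent + localSize - 1) / localSize;
}
```

So for an 800x600 image with local_size (16, 16, 1), one would call vkCmdDispatch(groupCount(800, 16), groupCount(600, 16), 1).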

8 Comments
2024/05/04
16:44 UTC

8

What does the Vulkan API do that OpenGL and CUDA can't?

I'm wondering because I'm about to choose which one I'm going to use for visualizations that involve a high number of objects being super zoomed in and out of.

41 Comments
2024/05/04
05:09 UTC

7

Vulkan Distinct Compute Queue Family

I have implemented an image processing pipeline that synchronizes a render target image between a fragment shader and a compute shader, then back to the graphics pipeline, such that I have accounted for both cases: graphicsQueue == computeQueue and graphicsQueue != computeQueue. The former case works fine. However, the GeForce RTX 3080 in my dev laptop doesn't seem to offer any compute queue families that aren't also graphics families, so I am unable to test the latter (it does report several distinct transfer queue families, which I am using for async resource loading). How is one supposed to know whether hardware will be capable of this prior to purchase? Is there a public list of reported queue family combinations for consumer GPUs in the wild?

I am about to do the same thing with DX12, and I suppose I will find out whether it is a limitation of the hardware or of the driver. Is it safe to claim that GeForce devices <= RTX 3080 will report the same results, or could this differ between the laptop version of the 3080 and the full-sized card?

Conclusion:

My host does offer a compute family that isn't shared with graphics. I was choosing the queue family for compute incorrectly. See comments below.
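For anyone hitting the same thing, the usual selection logic prefers a family with the compute bit set and the graphics bit clear. A sketch over raw flag words, with constants mirroring VkQueueFlagBits rather than the real enum (hypothetical names):

```cpp
#include <cstdint>
#include <vector>

// Bit values mirroring VK_QUEUE_GRAPHICS_BIT / VK_QUEUE_COMPUTE_BIT.
constexpr uint32_t kGraphicsBit = 0x1;
constexpr uint32_t kComputeBit  = 0x2;

// Pick a compute-capable queue family index, preferring a dedicated
// one (compute without graphics, the "async compute" family); falls
// back to any compute-capable family, or -1 if none exists.
int pickComputeFamily(const std::vector<uint32_t>& familyFlags)
{
    int fallback = -1;
    for (int i = 0; i < static_cast<int>(familyFlags.size()); ++i)
    {
        if (!(familyFlags[i] & kComputeBit))
            continue;
        if (!(familyFlags[i] & kGraphicsBit))
            return i;                 // dedicated compute family
        if (fallback < 0)
            fallback = i;             // graphics+compute fallback
    }
    return fallback;
}
```

In real code the flag words come from vkGetPhysicalDeviceQueueFamilyProperties; the common mistake (as in the conclusion above) is returning the first family that merely has the compute bit, which on most GPUs is the combined graphics+compute family at index 0.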

5 Comments
2024/05/03
16:56 UTC

Back To Top