/r/GraphicsProgramming

A subreddit for everything related to the design and implementation of graphics rendering code.

Rule 1: Posts should be about Graphics Programming.
Rule 2: Be Civil, Professional, and Kind


Suggested Posting Material:
- Graphics API Tutorials
- Academic Papers
- Blog Posts
- Source Code Repositories
- Self Posts (Ask Questions, Present Work)
- Books
- Renders (Please xpost to /r/ComputerGraphics)
- Career Advice
- Job Postings (Graphics Programming only)


Related Subreddits:

/r/ComputerGraphics

/r/Raytracing

/r/Programming

/r/LearnProgramming

/r/ProgrammingTools

/r/Coding

/r/GameDev

/r/CPP

/r/OpenGL

/r/Vulkan

/r/DirectX


Related Websites:
ACM: SIGGRAPH
Journal of Computer Graphics Techniques

Ke-Sen Huang's Blog of Graphics Papers and Resources
Self Shadow's Blog of Graphics Resources

47,226 Subscribers

4

How to antialias the SDF edge?

https://preview.redd.it/g03j2tp7lo0e1.jpg?width=529&format=pjpg&auto=webp&s=172406cfc2ff1ce3a73bec67a2aef9dfb5cbaf4b

https://discourse.threejs.org/t/how-to-antialias-the-sdf-edge/73976 - three.js forum
https://jsfiddle.net/m6oe7c9f/26/ - demo using three.js/GLSL
I was trying to smooth the edge using fwidth and smoothstep for antialiasing. This obviously works for particles, but I just don't know how to get the distance to the edge of the SDF shape, in this case a sphere of radius 1. I found some blog posts about it, and I think it comes down to storing a variable for the distance to the edge, which we can then smooth or clamp.
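
A minimal CPU-side sketch of the smoothstep idea described above, assuming a signed distance `d` to the edge is already available (negative inside the shape) and that the pixel footprint `pixelWidth` is supplied by the caller; in a GLSL shader this would typically come from `fwidth(d)`. The function names `sdCircle` and `edgeCoverage` are hypothetical, and the sketch does not answer how to obtain that distance for a raymarched sphere.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hermite smoothstep, matching the GLSL built-in for edge0 < edge1.
double smoothstep(double edge0, double edge1, double x) {
    double t = (x - edge0) / (edge1 - edge0);
    t = std::max(0.0, std::min(1.0, t));
    return t * t * (3.0 - 2.0 * t);
}

// Signed distance from (x, y) to a circle of the given radius centred at
// the origin: negative inside, zero on the edge, positive outside.
double sdCircle(double x, double y, double radius) {
    return std::sqrt(x * x + y * y) - radius;
}

// Approximate pixel coverage in [0, 1] for signed distance d.
// pixelWidth is the assumed screen-space footprint of one pixel expressed
// in the same units as d (what fwidth(d) estimates in a fragment shader).
double edgeCoverage(double d, double pixelWidth) {
    assert(pixelWidth > 0.0);
    return 1.0 - smoothstep(-pixelWidth, pixelWidth, d);
}
```

Under these assumptions the coverage value would be used as the alpha (or blend factor) of the fragment: fully opaque well inside the shape, fully transparent well outside, and a smooth ramp across roughly two pixels around the edge.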

4 Comments
2024/11/13
14:46 UTC

12

Graphics library

I like graphics programming, but to be honest I'm more interested in the math part, and I'm working on building a math library for game development. I am looking for a graphics library (C language) to test my math and build demonstrations. I was going to use graphics.h, but apparently I need to use C++ for that.

Thanks in advance for your suggestions.

6 Comments
2024/11/13
12:55 UTC

8

Libraries for 3D triangle mesh boolean operations.

Has anyone tried any libraries for 3D triangle mesh boolean operations? I'm more interested in robust, accurate results than performance.

2 Comments
2024/11/13
10:09 UTC

4

Learning dx11 or dx12

So I'm a CS student in my 3rd year and wish to learn a graphics API (I already know a bit of the math and general graphics). Without being familiar with any of the APIs, is DX11 a good place to start, given that it might be easier than DX12? Also, I don't care about portability since I'm using Windows as my primary OS.

5 Comments
2024/11/12
21:47 UTC

0

Direct3D 12 Device Removed Error while trying to use NVIDIA's Falcor Framework

I have tried downloading and using NVIDIA's Falcor framework, but after building the Visual Studio solution and running the Mogwai project, an exception is thrown while creating the swapchain in Direct3D 12 and I get DXGI_ERROR_DEVICE_REMOVED. The error happens when trying to load igc1464.dll, which it loads, unloads, and tries to load again causing the error.

I have tried updating my drivers, adding graphics driver registry keys, running a Windows memory diagnostic check, reading the documentation, and checking forums for similar issues; none of which have helped.

I am running on a laptop with 32GB memory, RTX 4060 laptop GPU, and Windows 11.

Any help on how to fix this error and get Falcor to run would be appreciated, as I'd like to start using it for projects. Thank you!

1 Comment
2024/11/12
20:59 UTC

17

Can't understand how to use Halton sequences

It's very clear to me how Halton / Sobol and other low-discrepancy sequences (LDSs) can be used to generate camera samples, and why pure random numbers have the drawback of clumping.

However, the part that I'm failing to understand is how to use LDSs everywhere in a path tracer, including hemisphere sampling. Here's the thought that makes it confusing for me:

Imagine that on each iteration of a path tracer (using the word "iteration" instead of "sample" to avoid confusion) we have available inside our shader 100 "random" numbers, each generated from a 100-dimensional Halton sequence (thus using 100 prime numbers).

On the next iteration, I update the random numbers to use the next index of the Halton sequence, for each of the 100 dimensions.

After we get our camera samples and ray direction using the numbers from the Halton array, we'll always land on a different point of the scene, sometimes even on totally different objects / materials. In that case, how does it make sense to keep on using the other Halton samples of the array? Aren't we supposed to "use" them to estimate the integral at a specific point? If the point always changes and, even worse, if at each light bounce we can get to a totally different mesh compared to the previous path-tracing iteration, how can I keep on using the "next" sample from the sequence? Doesn't that lead to a result that is potentially biased or that doesn't converge where it should?
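
A small sketch of the setup described above, under the common interpretation that one path-tracing iteration consumes one point of a high-dimensional Halton sequence and that dimension k of that point feeds the k-th random decision along the path (pixel offset, lens sample, first bounce direction, and so on). In that view the integral being estimated is over whole paths rather than at one fixed surface point. The names `radicalInverse`, `kPrimes`, and `PathSampler` are hypothetical, only eight dimensions are listed for brevity, and per-pixel scrambling is left out.

```cpp
#include <cassert>
#include <cstdint>

// Radical inverse of `index` in the given base: the digits of `index`
// are mirrored around the radix point, giving a value in [0, 1).
double radicalInverse(std::uint64_t index, unsigned base) {
    assert(base >= 2);
    const double invBase = 1.0 / base;
    double scale = invBase;
    double result = 0.0;
    while (index > 0) {
        result += static_cast<double>(index % base) * scale;
        index /= base;
        scale *= invBase;
    }
    return result;
}

// One prime base per dimension; a 100-dimensional sequence would list
// the first 100 primes here.
static const unsigned kPrimes[] = {2, 3, 5, 7, 11, 13, 17, 19};
static const unsigned kNumDims = sizeof(kPrimes) / sizeof(kPrimes[0]);

// Hands out successive coordinates of a single Halton point.
// pathIndex is the iteration number; every call to next() moves to the
// next dimension, so each random decision along the path gets its own base.
struct PathSampler {
    std::uint64_t pathIndex;
    unsigned nextDim;

    explicit PathSampler(std::uint64_t index) : pathIndex(index), nextDim(0) {}

    double next() {
        assert(nextDim < kNumDims);
        return radicalInverse(pathIndex, kPrimes[nextDim++]);
    }
};
```

A path would create `PathSampler sampler(iteration)` once and call `sampler.next()` for every random number it needs, in a fixed order, regardless of which object each bounce happens to hit.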

13 Comments
2024/11/12
13:24 UTC

8

Stitching Graph joints

I have spent a bit thinking about the problem of meshing topological skeletons and I came up with a solution I kinda like.

So I am sharing here in case other people are interested: https://gitlab.com/dryad1/documentation/-/blob/master/src/math_blog/Parametric%20Polytopology/parametric_polytopology.pdf?ref_type=heads

0 Comments
2024/11/12
01:32 UTC

3

Intro to CS topics important for computer graphics/ CG programming tailored CS curriculum?

As I've been studying basic DSA and discrete mathematics, I have felt a bit listless despite trying to recognize the overall importance of these concepts. I wanted to pursue computer graphics programming since teaching a computer to process space, vertices, form, light, movement etc. felt more interesting and comprehensible than systems of search engines and user data in websites and apps. It's hard to understand why all these algorithms exist and to relate the topics to computer graphics. For programming/computer science beginners, what are important topics to know for computer graphics?

1 Comment
2024/11/11
23:58 UTC

0

Architecture question: What HW stage would be the most efficient for per-vertex position alteration? | Possibly geomorphing related.

So I've had this idea regarding a heatmap that records the size of triangles in a mesh's single vertex channel.
I've been looking into the VRAM cost of LODs (higher density), but I'm not a fan of recent cluster implementations (I might look into a very conservative streaming plan). So, in order to take advantage of faster hardware quad rendering, I want to stop the view samples from sampling small triangles.

Basically, the distance of the camera multiplies a sinking effect on small triangles (vertices under a threshold) and the closure intensity of neighboring vertices (larger triangles end up occluding the smaller tris).

Up to 12M tris could be processed, but I'm aware that some stages in the HW pipeline, such as GS, are slow, and whatever HW stage Unreal's WPO uses also has large documented overhead (I haven't done serious performance measurements).

Target hardware would be 20 series+, RDNA2+, and Arc GPUs (in terms of HW support, these are all pretty much in sync outside of MSAA support, I've heard).

A point in the right direction would be helpful; I'm just asking in all the GP spaces I can reference 👍

Thanks.

8 Comments
2024/11/11
13:34 UTC

66

tiny_bvh.h version 0.4.2 now available

Last week I released on github a new single-header file library for constructing and traversing BVHs on the CPU, with (short-term) plans to take traversal (not construction) to the GPU as well. Link:

https://github.com/jbikker/tinybvh

Features (version 0.4.2):

  • No dependencies at all. Just add the header to your project and it works.
  • Simple interface. Build using a flat array of triangle vertices.
  • Builds state-of-the-art SAH binned BVH using AVX - 34ms for 260k triangles.
  • Or, ~250ms for the same data without AVX, for cross-platform purposes.
  • Builds wide BVHs: 4-wide and 8-wide, also in GPU-friendly format.
  • BVH refitting.
  • BVH optimizing: post-process your static scene BVH for ~15% more speed.
  • Comparison against Embree is included (not winning yet but closing in).

Coming up:

  • CWBVH for billions of rays/second on the GPU.
  • OpenCL examples.
  • TLAS/BLAS traversal.

The code is an implementation and continuation of my articles on BVH construction:

https://jacco.ompf2.com/2022/04/13/how-to-build-a-bvh-part-1-basics

Support / questions:

Greets

Jacco.

8 Comments
2024/11/11
08:42 UTC

7

Rendering a big .OBJ file

Hi everyone,

I am part of a university project where I need to develop an app. My team has chosen Python as the programming language. The app will feature a 3D map, and when you click on an institutional building, the app will display details about that building.

I want the app to look very polished, and I’m particularly focused on rendering the 3D map, which I have exported as an .OBJ file from Blender. The file represents a real-life neighborhood.

However, the file is quite large, and libraries like PyOpenGL, Kivy, or PyGame don’t seem to handle the rendering effectively.

Can anyone suggest a way to render this large .OBJ file in Python?

18 Comments
2024/11/10
18:30 UTC

16

Best colleges in the US to get a masters in? (With the intention of pursuing graphics)

I've been told colleges like UPenn (due to their DMD program) and Carnegie Mellon are great for graphics because they have designated programs geared towards CS students seeking to pursue graphics. Are there any particular colleges that stand out to employers, or should one just apply to the top 20 and hope for the best?

18 Comments
2024/11/10
16:12 UTC

3

Is it worth it to learn dx11?

So I am new to graphics programming. I have worked with OpenGL and made renderers and such before, and I wanted to jump into more recent graphics APIs. I thought of starting with DX12 but have seen lots of posts saying to start with DX11. Any thoughts?

37 Comments
2024/11/10
14:45 UTC

1

Best Way to Render Multiple Objects with Different Transformations in One Render Pass?

4 Comments
2024/11/10
14:38 UTC

11

Spectral rendering - how do you resolve scales of CIE curves?

For spectral rendering, we rely on CIE curves, which contain measured spectral power distribution (SPD) functions, in order to accurately model color and eventually convert spectral information back into sRGB for our displays.

Examples of these curves from CIE's official dataset are linked below:

CIE_XYZ_1931_2deg

CIE_Standard_Illuminant_A

CIE_Standard_Illuminant_D65

The part I'm having a hard time wrapping my head around is the scale of the values. The standard illuminants are scaled such that they take on a value of 100.0 at 560nm. The XYZ color matching curves seem to be scaled with respect to Y(555), which is itself relative to the spectral response curve.

If I were to use the curve for the standard illuminant and convert it into XYZ colors (for example), then wouldn't the scales of the inner product all be screwed up? Do raytracing engines do something special to rescale these curves from the official datasets or does it not matter?
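
A hedged sketch of one common convention for reflective colors: the XYZ integrals are divided by the illuminant's own Y integral, so that a perfect white reflector gets Y = 1 and any constant scale on the illuminant table cancels out. The types `XYZ` and `Cmf` and the function `reflectanceToXYZ` are hypothetical names, all spectra are assumed to be sampled at the same uniformly spaced wavelengths, and whether a given raytracing engine does exactly this is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct XYZ { double X, Y, Z; };

// Color matching functions sampled at the same wavelengths as the spectra.
struct Cmf { std::vector<double> xBar, yBar, zBar; };

// Tristimulus values of a reflectance spectrum lit by `illuminant`,
// normalised so that a reflectance of 1 everywhere yields Y = 1.
XYZ reflectanceToXYZ(const std::vector<double>& illuminant,
                     const std::vector<double>& reflectance,
                     const Cmf& cmf) {
    const std::size_t n = illuminant.size();
    assert(n > 0 && reflectance.size() == n);
    assert(cmf.xBar.size() == n && cmf.yBar.size() == n && cmf.zBar.size() == n);

    double X = 0.0, Y = 0.0, Z = 0.0, norm = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double s = illuminant[i];
        X += s * reflectance[i] * cmf.xBar[i];
        Y += s * reflectance[i] * cmf.yBar[i];
        Z += s * reflectance[i] * cmf.zBar[i];
        norm += s * cmf.yBar[i];  // Y integral of the bare illuminant
    }
    assert(norm > 0.0);
    // The uniform wavelength step appears in every sum and cancels here,
    // as does any constant factor applied to the illuminant table.
    return XYZ{X / norm, Y / norm, Z / norm};
}
```

With this normalisation, feeding the illuminant as tabulated (100.0 at 560nm) or rescaled by any constant gives the same XYZ result, which is why the absolute scale of the table does not matter in this particular setup. Emissive sources with absolute radiometric units would need a different treatment.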

4 Comments
2024/11/10
07:33 UTC

4

Is Shader Model a DirectX-only concept?

One thing that kind of confuses me: Shader Model is a DirectX-only thing, correct?

In other words, requiring SM5 or SM6 support means nothing to programs using Vulkan, OpenGL, GCN or Metal, correct?

When googling or using ChatGPT this seems to be mixed up constantly....

4 Comments
2024/11/10
01:25 UTC

29

Visual improvements for a relativistic renderer ?

Hey !

Two months ago I asked for advice on porting a program from VEX (Python-like) to C++. Well, time has passed, as it tends to do, and I have results to show.

https://preview.redd.it/fiesk0g1jyzd1.png?width=2000&format=png&auto=webp&s=ee6967f329caa4527374c9d858f68249c245e69d

There is obviously a lot going on, and to cover it all we would need something like a 50-page paper. We managed to port the entire VEX code to C++, and also improved certain aspects massively. Here is a quick and non-exhaustive rundown of the changes and improvements:

  • The program is now called VMEC instead of untitled.hip (true story)
  • The Astrophysical jet got a complete makeover and is now skirting dangerously close to GRMHDs
  • We added accretion wind, which causes the glow around the BH, Disk and Jet. It's just a bunch of really hot but diffuse plasma moving out
  • Everything is written using VS Code, none of this AI bs (I am halfway joking, Visual Studio drove me crazy)

Perhaps the most important change is not in the code, but philosophical. The VEX code had no real objective. Mr. Norway and I just kind of stumbled along.

VMEC has an objective. We want to make a free Black Hole rendering and education software that could, in principle, be used for Movie grade effects.

The Education bit is not important for this post, it basically boils down to a few options (such as replacing the Volumetric disk with a 2D one, visualizing Geodesics in the scene etc). Those are not hard to do.

What is hard to do is the "Movie grade" bit. Sure, the render above looks very nice, but it is a lot more impressive technically than visually. So the question becomes what we can do to improve the look. We have three big-ticket items on our to-do list right now.

  • Axis Misaligned Jet and Disk (Precession)
  • Built-in Lens Flare system (I know Flares are almost always added in post, but they would still be useful to guide artists. I have worked in VFX for a few years after all)
  • Multiple Scattering

That last point carries a lot of hope on our end. Right now VMEC is a "0th Scattering" renderer. The only light a ray sees is that along its direct path. There are no secondary rays because there are no light sources to do Single Scattering with.
We hope Multiple Scattering will improve the volumetrics to the point where they become useful in a production environment. The reason we have avoided Multiple Scattering thus far is the performance cost. But trial GPU ports have given us reasonable confidence in the render time feasibility of a "Multiple Scattering" option for VMEC.

Ofc, there are non-visual features we want to implement as well

  • Animation graph editor
  • 360 Degree rendering

among others. We will probably not add .obj support or anything similar because that would conflict with some very fundamental assumptions we have made. VMEC is built in natural units where c=G=M=1. So the Black Hole is actually just 1.4 units across. The disk is 120 units in radius and the jet is 512 units long.

Anyways, the whole point of this post is to ask for advice.

Right now, while VMEC's renders look nice, they are very clearly CGI. Judging by other volumetric renderers, we think the main reason is the lack of Multiple Scattering. But we might be missing something. So any advice on how to improve the look would be highly appreciated!

1 Comment
2024/11/09
23:34 UTC

1

Why are the HIPRTC and CUDARTC APIs for compiling kernels at runtime single-threaded?

CUDA/HIP kernels can be compiled at runtime with the CUDARTC and HIPRTC APIs (NVIDIA and AMD respectively).

In my experience, starting multiple std::thread to compile multiple kernels in parallel just doesn't seem to work: launching 2 std::thread in parallel doesn't take less time than compiling two kernels in a row on the main thread.

The 'lock' seems to be deep in the API DLLs, as that's where the thread is stuck when breaking into the debugger.

Why is it like that? If a compiler "simply" parses the kernel code to "translate" it to bitcode/PTX/..., then why does it have to be synchronized like that?

2 Comments
2024/11/09
22:36 UTC

25

I want to learn graphics programming. What API should I learn?

I work as a full-time Flutter developer, and have intermediate programming skills. I’m interested in trying my hand at low-level game programming and writing everything from scratch. Recently, I started implementing a ray-caster based on a tutorial, choosing to use raylib with C++ (while the tutorial uses pure C with OpenGL).

Given that I’m on macOS (but could switch to Windows in the future if needed), what API would you recommend I use? I’d like something that aligns with modern trends, so if I really enjoy this and decide to pursue a career in the field, I’ll have relevant experience that could help me land a job.

23 Comments
2024/11/09
20:50 UTC

1

Why is my Vulkan TLAS build causing device lost

Hi everyone,

I'm working on a Vulkan-based TLAS (Top-Level Acceleration Structure) build, and after adding copy commands to the instance buffer, my application crashes with VkResult -4 (device lost) once the command vkCmdBuildAccelerationStructuresKHR is recorded and submitted with the validation error:

validation layer: Validation Error: [ VUID-vkDestroyFence-fence-01120 ] Object 0: handle = 0xb8de340000002988, type = VK_OBJECT_TYPE_FENCE; | MessageID = 0x5d296248 | vkDestroyFence(): fence (VkFence 0xb8de340000002988[]) is in use. The Vulkan spec states: All queue submission commands that refer to fence must have completed execution (https://vulkan.lunarg.com/doc/view/1.3.275.0/windows/1.3-extensions/vkspec.html#VUID-vkDestroyFence-fence-01120)

The fence error is a result of the program hanging there due to something in the TLAS that is not correct, though I am struggling to understand what exactly. I followed the Vulkan basic example on their GitHub closely and can't find enough of a difference between theirs and mine to cause a crash like this.

Here’s the part of the code where I do the copy to the instance buffer. It seems correct to me: Full code

auto instancesBuffer = new Buffer(V::CreateBuffer(sizeof(VkAccelerationStructureInstanceKHR) * instances.size(), VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_STORAGE_BIT_KHR | VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT | VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_BUILD_INPUT_READ_ONLY_BIT_KHR | VK_BUFFER_USAGE_TRANSFER_DST_BIT, VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT, VMA_MEMORY_USAGE_AUTO_PREFER_DEVICE));

std::vector<VkAccelerationStructureInstanceKHR> instances;
for (size_t i = 0; i < 1; ++i) {
    AS& blas = allBlas[i];  

    VkAccelerationStructureInstanceKHR instance = {};
        ...
    instance.accelerationStructureReference = blas.deviceAddress;
    instances.push_back(instance);
}

auto stagingBuffer = new Buffer(V::CreateBuffer(context.allocator, sizeof(VkAccelerationStructureInstanceKHR) * instances.size(),VK_BUFFER_USAGE_TRANSFER_SRC_BIT,VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT,VMA_MEMORY_USAGE_AUTO_PREFER_HOST));

void* mappedData;
vmaMapMemory(context.allocator.allocator, stagingBuffer->allocation, &mappedData);
memcpy(mappedData, instances.data(), sizeof(VkAccelerationStructureInstanceKHR) * instances.size());
vmaUnmapMemory(context.allocator.allocator, stagingBuffer->allocation);

VkBufferCopy copyRegion = {};
copyRegion.size = sizeof(VkAccelerationStructureInstanceKHR) * instances.size();
vkCmdCopyBuffer(cmdBuff, stagingBuffer->buffer, instancesBuffer->buffer, 1, &copyRegion);

VkBufferMemoryBarrier bufferBarrier{ VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER };
bufferBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
bufferBarrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR | VK_ACCESS_SHADER_READ_BIT;
bufferBarrier.buffer = instancesBuffer->buffer;
bufferBarrier.size = VK_WHOLE_SIZE;
bufferBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
bufferBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;

// Copy data from CPU staging buffer to GPU
vkCmdPipelineBarrier(cmdBuff, VK_PIPELINE_STAGE_TRANSFER_BIT | VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR, VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR, 0, 0, nullptr, 1, &bufferBarrier, 0, nullptr);

EndAndSubmitCommandBuffer(context, cmdBuff);

The error occurs at this line where I end and submit the command buffer

VkCommandBuffer buildCmd = AllocateCommandBuffer(context, m_renderCommandPools[V::currentFrame].handle);
BeginCommandBuffer(buildCmd);
vkCmdBuildAccelerationStructuresKHR(
    buildCmd,
    1,
    &accelerationBuildGeometryInfo,
    accelerationBuildStructureRangeInfos.data());
 
EndAndSubmitCommandBuffer(context, buildCmd);

Aftermath report which I do not understand

https://preview.redd.it/exa5z57exxzd1.png?width=1048&format=png&auto=webp&s=8987fbd5e7b336b203e28db9ec9018b55c9ddad7

6 Comments
2024/11/09
20:40 UTC

26

Why is wavefront path tracing 5x faster than megakernel in a fully closed room, no russian roulette, no ray sorting/reordering?

u/BoyBaykiller experimented a bit on the Sponza scene (can be found here) with the wavefront approach vs. the megakernel approach:

| Method     | Ray early-exit | Time    |
|------------|---------------:|--------:|
| Wavefront  | Yes            | 8.74ms  |
| Megakernel | Yes            | 14.0ms  |
| Wavefront  | No             | 19.54ms |
| Megakernel | No             | 102.9ms |

Ray early-exit "No" means that there is a ceiling on top of Sponza and no russian roulette: all rays bounce exactly 7 times, wavefront or not.

With 7 bounces, the wavefront approach is 5x faster, but:

  • No russian roulette means no "compaction". Dead rays are not removed from the computation and still occupy "wavefront slots" on the GPU.
  • No ray sorting/reordering means that there should be as much BVH traversal divergence/material divergence with or without wavefront.
  • This was implemented with one megakernel launch per bounce, nothing more: this should mean that the wavefront approach doesn't have a register pressure benefit over megakernel.

Where does the speedup come from?

12 Comments
2024/11/09
17:51 UTC

11

Why am I getting energy gains with a sheen lobe on top of a glass lobe in my layered BSDF?

I'm having some issues combining the lobes of my layered BSDF in an energy-preserving way.

The sheen lobe alone (with white Lambertian diffuse below instead of the glass lobe) passes the furnace test. The glass lobe alone passes the furnace test.

But sheen on top of glass doesn't pass it at all: there are quite a lot of energy gains. So if the lobes are fine on their own, it must be a combination issue.

How I currently do things:

For sampling a lobe:

  • 50/50 between sheen or glass.
  • If currently inside the object, only the glass lobe is sampled.

PDF:

  • 0.5f * sheenPDF + 0.5f * glassPDF (comes from the 50/50 probability in the sampling routine)
  • If refracting in or out of the object from sampling the glass lobe, the PDF is just 1.0f * glassPDF, because the sheen BRDF does not deal with directions below the normal hemisphere, so the sheen BRDF has zero probability of sampling such a direction.

Evaluating the layered BSDF: sheen_eval() + (1.0f - sheen_reflectance) * glass_eval().

  • If refracting in or out, then only the glass lobe is evaluated: glass_eval() (because we would be evaluating the sheen lobe with an incident light direction that is below the normal hemisphere so sheen BRDF would be 0.0f)

And with a glass sphere 0.0f roughness and IOR 1, coming from air IOR 1, this gives this screenshot.

Any ideas what I might be doing wrong?

11 Comments
2024/11/09
10:14 UTC

8

Unit testing gpu code

Hi, let's say I have a project with shaders, calls to a graphics API, or GPGPU functions. Are there any cons to writing unit tests for that part of the code?
For example, I want to test how a CUDA kernel behaves. Do you think it's a good idea to create a unit test with the whole buffer allocation, memcpy, kernel execution, memcpy, checking the result, and destroying the buffer?
Or I want to test the output of a shader, etc.

It does slow down the tests a bit, but I don't see that as an issue... What do you guys think?

4 Comments
2024/11/08
18:58 UTC
