Real-Time Lighting with Gaussian Splats
Applying traditional dynamic lighting techniques to Gaussian splats.
There's been a lot of interest lately in Gaussian splats.
Recently I saw a few-months-old tweet about lighting splats in real time. I thought it was cool how simple the approach was, and since I couldn't find the source code, I re-created it in WebGL. Try it out below or click here for fullscreen. You can use your mouse to move the camera or the lights, and press M to toggle between lighting mode and no lighting.
The idea is to use rasterization to render splats just like in the original paper, but to recover surface normals from depth, then light the scene using traditional shading and shadow mapping.
Depth is computed the same way we compute color for each pixel, using the "over" operator to do alpha compositing. Given \( N \) splats with depths \( d_i \) and alphas \( \alpha_i \) ordered front-to-back: $$ D = \displaystyle \sum_{i=1}^N d_i \alpha_i \prod_{j=1}^{i-1} (1-\alpha_j) $$
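Here's a minimal sketch of that accumulation in GLSL (illustrative names, not the exact code from my demo). In a typical splat rasterizer the product term falls out of blending the sorted splats rather than an explicit per-pixel loop, but the loop mirrors the formula directly:

```glsl
#version 300 es
// Sketch of front-to-back "over" compositing applied to depth. The same
// accumulation, with color in place of depth, produces the usual splat image.
precision highp float;
const int MAX_SPLATS = 64;
uniform int uCount;                  // number of splats covering this pixel
uniform float uDepths[MAX_SPLATS];   // per-splat center depths, sorted front-to-back
uniform float uAlphas[MAX_SPLATS];   // per-splat opacities after the Gaussian falloff
out vec4 outDepth;

void main() {
    float depth = 0.0;
    float transmittance = 1.0;                  // prod_{j<i} (1 - alpha_j)
    for (int i = 0; i < MAX_SPLATS; i++) {
        if (i >= uCount) break;
        depth += uDepths[i] * uAlphas[i] * transmittance;
        transmittance *= 1.0 - uAlphas[i];
        if (transmittance < 1e-3) break;        // nearly opaque: stop early
    }
    outDepth = vec4(depth, 0.0, 0.0, 1.0);
}
```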
The formula feels nonphysical because it can theoretically place a surface in the empty space between two Gaussians. In practice, though, it captures the rough shapes of hard objects fairly well, even with the simplest possible implementation, which uses the depth of each Gaussian's center rather than doing something more complicated to estimate the average depth of the section of the Gaussian overlapping a pixel.
Pseudo-normals are computed for each pixel by reconstructing the world-space positions of the pixel and two of its neighbors, \( \mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2 \), then crossing the resulting pseudo-tangent and bitangent vectors: \( (\mathbf{p}_1 - \mathbf{p}_0) \times (\mathbf{p}_2 - \mathbf{p}_0) \).
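As a rough sketch (not the exact code from my demo), a normal-recovery pass could look like the fragment shader below. It assumes the composited depth is stored as a [0,1] depth texture that can be unprojected with the inverse view-projection matrix:

```glsl
#version 300 es
precision highp float;
uniform sampler2D uDepth;       // composited (and ideally filtered) depth
uniform mat4 uInvViewProj;      // inverse of the camera's view-projection matrix
uniform vec2 uResolution;
out vec4 outNormal;

// Unproject a pixel back to a world-space position from its depth.
vec3 worldPos(vec2 uv) {
    float d = texture(uDepth, uv).r;                   // depth in [0,1] (assumed)
    vec4 ndc = vec4(uv * 2.0 - 1.0, d * 2.0 - 1.0, 1.0);
    vec4 p = uInvViewProj * ndc;
    return p.xyz / p.w;
}

void main() {
    vec2 uv = gl_FragCoord.xy / uResolution;
    vec2 px = 1.0 / uResolution;
    vec3 p0 = worldPos(uv);                            // this pixel
    vec3 p1 = worldPos(uv + vec2(px.x, 0.0));          // right neighbor
    vec3 p2 = worldPos(uv + vec2(0.0, px.y));          // top neighbor
    vec3 n = normalize(cross(p1 - p0, p2 - p0));       // pseudo-normal
    outNormal = vec4(n * 0.5 + 0.5, 1.0);              // pack into [0,1]
}
```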
Because Gaussian splat reconstructions are fuzzy representations optimized to look right on screen rather than to form coherent surfaces, the recovered depth is quite noisy even when the overall shape is captured decently, which makes the lit surfaces look bumpy. As the Twitter thread suggested, we can mitigate the bumpiness somewhat by running a bilateral filter over the depth buffer before recovering normals. A bilateral filter is a noise-reducing filter similar to a Gaussian blur, but one that also preserves edges in the image. The idea is to get a noise-free depth image which still contains accurate object shapes.
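A depth-space bilateral filter can be as simple as the pass below; the window size and the two sigmas are placeholder values I chose for the sketch, not tuned numbers:

```glsl
#version 300 es
// Bilateral filter over the depth buffer: a Gaussian blur whose weights also
// fall off with depth difference, so depth edges between objects are preserved.
precision highp float;
uniform sampler2D uDepth;
uniform vec2 uResolution;
out vec4 outDepth;

void main() {
    vec2 uv = gl_FragCoord.xy / uResolution;
    vec2 px = 1.0 / uResolution;
    float center = texture(uDepth, uv).r;
    const float sigmaSpace = 2.0;    // spatial falloff, in pixels (assumed)
    const float sigmaDepth = 0.05;   // range falloff, in depth units (assumed)
    float sum = 0.0;
    float wsum = 0.0;
    for (int x = -3; x <= 3; x++) {
        for (int y = -3; y <= 3; y++) {
            vec2 offset = vec2(float(x), float(y));
            float d = texture(uDepth, uv + offset * px).r;
            float diff = d - center;
            float w = exp(-dot(offset, offset) / (2.0 * sigmaSpace * sigmaSpace))
                    * exp(-(diff * diff) / (2.0 * sigmaDepth * sigmaDepth));
            sum += d * w;
            wsum += w;
        }
    }
    outDepth = vec4(sum / wsum, 0.0, 0.0, 1.0);
}
```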
In some splat reconstructions, such as the Mip-NeRF360 scenes, the underlying geometry is still too bumpy even after filtering.
Can we improve the geometry reconstructed during training? A few papers do this. Relightable 3D Gaussians (R3DG), for example, optimizes explicit per-Gaussian normals and material parameters during training.
The explicit normals make a huge difference. Compare the same scene reconstructed via Relightable Gaussians but rendered with pseudo-normals vs. explicit normals:
The R3DG paper's reconstructed materials can be used for physically-based re-rendering of objects.
Ideas for improvement
Following are my scattered thoughts as someone still pretty new to graphics programming.
The surface-based shading technique I used works surprisingly well. Since most of what we see in Gaussian splat scenes is hard objects with well-defined boundaries, we're able to recover a fairly accurate surface, which is enough for us to perform traditional lighting. What I've built so far only has basic shading and shadow mapping, but it could easily be extended with more advanced lighting techniques to provide global illumination and PBR.
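As a rough sketch (again, not the exact code from my demo), the shading pass is essentially a deferred-style combine: Lambertian diffuse plus a shadow-map test, with illustrative texture and uniform names:

```glsl
#version 300 es
precision highp float;
uniform sampler2D uAlbedo;      // composited splat color
uniform sampler2D uNormal;      // recovered pseudo-normals, packed into [0,1]
uniform sampler2D uWorldPos;    // reconstructed world-space positions
uniform sampler2D uShadowMap;   // scene depth rendered from the light
uniform mat4 uLightViewProj;    // light's view-projection matrix
uniform vec3 uLightDir;         // direction the light travels (normalized)
uniform vec2 uResolution;
out vec4 outColor;

void main() {
    vec2 uv = gl_FragCoord.xy / uResolution;
    vec3 albedo = texture(uAlbedo, uv).rgb;
    vec3 n = normalize(texture(uNormal, uv).xyz * 2.0 - 1.0);
    vec3 p = texture(uWorldPos, uv).xyz;

    // Shadow test: project the point into light space and compare depths.
    vec4 lightClip = uLightViewProj * vec4(p, 1.0);
    vec3 lightNdc = lightClip.xyz / lightClip.w * 0.5 + 0.5;
    float closest = texture(uShadowMap, lightNdc.xy).r;
    float shadow = (lightNdc.z - 0.005 > closest) ? 0.0 : 1.0;  // small depth bias

    // Simple Lambertian diffuse plus a constant ambient term.
    float diffuse = max(dot(n, -uLightDir), 0.0);
    outColor = vec4(albedo * (0.2 + 0.8 * diffuse * shadow), 1.0);
}
```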
One thing this approach doesn't handle well is fuzzy surfaces and objects like grass. In today's splat reconstructions these are usually represented with many scattered, mixed-size, lower-density Gaussians, which look terrible with surface-based shading because the recovered surface is so bumpy. How could we improve how these look?
- A common technique to render volumes like smoke or fog in real time is ray-marching: we intentionally march multiple steps inside a volume and sample the light at each step to approximate the color reflected back along the ray (see the sketch after this list). Lighting can be made fast with techniques like caching lighting in a voxel grid or deep shadow maps.
- Gaussian splats could be rendered using ray-marching. But ray-marching within each Gaussian probably wouldn't help, because most of these Gaussians are so small that light doesn't vary much within the volume, so it's enough to take a single sample, which is effectively what rasterization does. Ray-marching is also a lot slower than rasterization, so we'd be trading off performance just so that the few fuzzy parts of a scene look better.
- What we really want here is more accurate light reflection: firstly because there is no "surface" within these fuzzy volumes, and secondly because we want to capture the fact that light continues to travel into the volume. The first problem seems trickier to solve because we'd need some way of deciding whether an individual Gaussian is part of a surface or not.
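Here's roughly what the ray-marching idea from the first bullet looks like in shader form. The density and lighting lookups are toy placeholders (a single analytic Gaussian blob and a constant light) standing in for whatever volume representation and lighting cache would actually be used:

```glsl
#version 300 es
// Toy volume ray-marcher: step along the view ray, accumulate scattered light
// weighted by the remaining transmittance, and stop once the ray is opaque.
precision highp float;
uniform vec3 uCameraPos;
uniform vec2 uResolution;
out vec4 outColor;

float sampleDensity(vec3 p) { return exp(-dot(p, p)); }  // placeholder Gaussian blob
vec3  sampleLight(vec3 p)   { return vec3(1.0); }        // placeholder (e.g. voxel cache)

void main() {
    vec2 uv = gl_FragCoord.xy / uResolution * 2.0 - 1.0;
    vec3 rayDir = normalize(vec3(uv, -1.0));             // simple pinhole camera
    vec3 color = vec3(0.0);
    float transmittance = 1.0;
    float stepSize = 0.05;
    for (int i = 0; i < 128; i++) {
        vec3 p = uCameraPos + rayDir * (float(i) * stepSize);
        float sigma = sampleDensity(p);
        float alpha = 1.0 - exp(-sigma * stepSize);      // opacity of this step
        color += sampleLight(p) * alpha * transmittance;
        transmittance *= 1.0 - alpha;
        if (transmittance < 1e-3) break;                 // effectively opaque
    }
    outColor = vec4(color, 1.0 - transmittance);
}
```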
There are applications that successfully render objects as fuzzy volumes or point clouds, with lighting, animation, and more, at massive scale.
One example is Dreams, which came out in 2020 for the PS4 and which used a combination of signed distance fields and point clouds to represent user-generated content. The use of these alternate graphics primitives gave the game a unique "painterly" look and feel.
Dreams appears to use something much more like surface-based shading.
Alex Evans's SIGGRAPH 2015 talk covers the evolution of the Dreams renderer, and he later described where things ended up:
> things evolved quite a bit since the siggraph preso. the E3 trailer from that era was rendered using the atomic splat renderer mentioned at the end of the talk, and it's still my favourite 'funky tech' wise, however it suffered because everything had microscopic holes! ... why is this a big problem? because this broke occlusion culling, which we rely on heavily for performance ...
>
> so to sum up: dreams has hulls and fluff; hulls are what you see when you see something that looks 'tight' or solid; and the fluff is the painterly stuff. the hulls are rendered using 'raymarching' of a signed distance field (stored in bricks), (rather than splatted as they were before). then we splat fluffy stuff on top, using lots of layers of stochastic alpha to make it ... fluffy :) this all generates a 'classic' g-buffer, and the lighting is all deferred in a fairly vanilla way (GGX specular model).
Why?
What's the point of all of this? If we want dynamic lighting in scenes captured by Gaussian splats, why don't we just convert the splats to a mesh representation and relight that?
If we convert to meshes, we can also incorporate splats into existing mesh-based workflows and renderers rather than building out specialized paths. This would be awesome for industry adoption of splats. Existing mesh-based renderers are also optimized to the hilt.
Geometry systems like Unreal Engine 5's Nanite already make it possible to use extremely detailed, high-poly real-world captures via photogrammetry scans in videogames today. Gaussian splats make it even easier to get these real-world captures, so there is lots of interest here. Papers like SuGaR propose ways to extract meshes from splat reconstructions.
That said, working with point cloud and splat representations directly has some advantages. While meshing a splat representation is not too difficult, it takes extra time and resources, and the result is not guaranteed to be accurate. It's also sometimes easier to work with splats, for example to give scenes artistic flair like in Dreams. For similar reasons, SuGaR and Gaussian Frosting propose a hybrid rendering solution where objects are rendered as meshes with a thin layer of Gaussian splats attached to the surface to capture fuzzy details like fur.
Overall, while Gaussian splatting was a breakthrough technology for fast and accurate 3D reconstruction, we're still discovering what the splat format will be useful for beyond just “3D photos”. Time will tell if splats will be useful mainly as an intermediate format that we get from real-world captures, or if they'll be everywhere in future games and 3D design software. In the meantime research into rendering splats continues. Some interesting papers:
- Animation: Animatable Gaussians, GAvatar
- Lighting: Relightable 3D Gaussian, GaussianShader
- Modeling fuzzy things: Gaussian Frosting
- Rendering larger scenes faster: CityGaussian, RadSplat