Among all the new features of luminance-0.40, there is one that I think is worth explaining in details, because it’s… pretty cool. If you like type states, refinement typing and on a general level, strong typing, you should enjoy this blog article.
While working on luminance-0.40, I spent two weeks working on a feature that is going to have a huge beneficial impact on how people use luminance. I’ve been wanting to see what people think about it for a long time, because I think it’s both a very powerful feature and allows to do low-level memory stuff in a safe way via the type system. Without further ado, let’s dig in.
- The Tess type
- The magic of type systems
- Deinterleaving compile-time dispatching
- Wrap it up
The Tess type
In pre luminance-0.40, it is possible to gather vertices inside a type called
Tess (it stands for tessellation). That type is responsible for holding vertices to render them later. They might represent point clouds, triangle-based meshes, lines, etc. It contains several properties, but we are going to focus on three:
- The actual vertex data (i.e. the held vertices).
- Indices, that can be used to prevent duplicating vertices and index them to reduce the size of the mesh.
- Vertex instances, which are special data that are instance-dependent. In a graphics library or a graphics engine, an instance often refers to an instance of a template, a pattern, that is to be shared among instances: the instances simply add some specific values, like positions, colors, etc. Imagine in a simplistic video game a garden with trees: it’s likely that the trees (all of them) are rendered via a single draw call, which contains instance data that is going to customize each instance.
In luminance-0.39, vertices, indices and instances can be retrieved, both read-only on read-write, via a mechanism of GPU slicing: you ask luminance to give you a type with which you are able to use
However… we have several problems. First, the type is
Tess and is completely monomorphized. What it means is that if you create a
Tess that contains vertices of type
SimpleVertex with indices of type
u8, that information is not encoded in the type system — it is actually encoded as a value that is read and checked at runtime! When you ask for a slice, for instance to mutate the content of the
Tess, you have to use a function like
Tess::as_slice<V> where V: Vertex, which expects you to pass the type of stored vertices — in our case, it would be
SimpleVertex. What happens if someone passed the wrong type? Well, currently, luminance checks at runtime that the stored type is the right one, but this is both wasted checks and not a very elegant API.
The same applies to indices and instance data: you don’t see them in the type. What happens if you slice the indices with
u32? Runtime checks.
Now, there’s also the problem of building. When you build a
Tess, you have to pass the vertices to the
TessBuilder type, using functions like
TessBuilder::add_vertices. It works, but it’s not practical. More importantly: if you call several times that function, you will create something that is called a deinterleaved memory data structure. Let’s digress a bit about that.
Digression on interleaved and deinterleaved data structures
Interleaving and deinterleaving makes sense when we talk about several objects / items of a same type, laid out in memory — typically in arrays. Imagine a type,
Vertex, such as:
If you take a vector of
Vec<Vertex>), you get a continuous array of properties in memory. If you represent that in memory, you have something similar to this for two vertices (padding omitted):
[x0, y0, z0, r0, g0, b0, a0, x1, y1, z1, r1, g1, b1, a1]
We say that the memory is interleaved, because you’re going to alternate between fields when iterating in memory. Everything is interleaved. This kind of memory is what happens when you put a
struct in an array, slice, etc. and is perfectly suited for most situations (even on the GPU). However, there is a small (yet important) detail: when you iterate (for instance in a
for loop) on that array, you’re going to ask your CPU / GPU to load a bunch of values at once. That is going to fill your cache lines. If at each iteration you need to use every fields of the vertex, then that situation is pretty convenient, because you’re going to have a bunch of fields cached ahead (cache hits).
However… imagine a loop that only needs to access the
position field. What’s going to happen is that your CPU / GPU will still load the same data in cache lines: now you get colors in the cache that you don’t need and your loop will make more cache misses. What could have been better would have been to fill the cache lines only with positions. If we had, instead:
[x0, y0, z0, x1, y1, z1] [r0, g0, b0, a0, r1, g1, b1, a1]
Those two vectors can then be used independently for each need. Because we only need positions, we can simply use the first position vector. Now, when the CPU / GPU is going to load something in the cache, it’s going to cache much more values that we are going to actually use: we get more cache hits and it’s playa party.
That kind of memory layout is called deinterleaved memory. The way we typically do that is by simply moving the fields out of the
struct and make several arrays of each field.
People tend to use two terms to describe both layouts: AoS and SoA.
months years ago, I realized that and decided a needed I better plan. Especially, on luminance-0.39, the way you handle slicing deinterleaved data is… well, inexistent. You cannot slice such data because it was never supported.
The Tess type… revisited
The new type is the following:
As you can see, there are a lot of new things there:
B: the new backend type variable that most types now have; used to forward backend implementations down the API.
V: the vertex type. It’s the type of
Vertexyou have typically defined yourself using luminance-derive.
I: the index type. Most of the time, you’ll use either
u32, depending on the kind of geometry you index.
W: the vertex instance data. Because people might not want that,
()is a good default, but if you really need it, you can basically use any
Vertextype here, since vertex instance data must be part of a
S: the interleaving mode. Either
You will find the same type variables with the
The magic of type systems
The cool thing about that change is how it enabled me to yield much, much better APIs. Consider the previous API to create a deinterleaved
Notice the two
add_vertices. There is no type information checking and ensuring that:
- You are passing vertex data that are compatible with the
Vertextype those fields correspond to.
- You can basically call
add_verticesas many times as you want and get a big buffer in GPU memory that will make no sense.
Now, the new API looks like this:
If you try to call
set_vertices — the name got changed from
set_vertices — on the builder you get from
new_deinterleaved_tess, you will get a compilation error, because you cannot set vertices on deinterleaved tessellations: you need to set attribute vectors. The
set_attributes has the information that you are doing that for a
Vertex, so it can check the input data you pass and ensure it contains values which type is a field type used in
Vertex. If not, you get a compilation error.
Most importantly: because of how vertices work in luminance, a field type is unique to a vertex: it doesn’t make sense to use twice the
VertexPosition type. If you end up in such a situation, it means that your
Semantics type lacks another variant — remember: vertex fields are basically semantics-based attributes. That leads to the possibility to automatically find out where exactly the data you provide needs to go inside the GPU tessellation.
The super cool part is that you can now slice deinterleaved tessellations by simply asking for:
- The slice of vertices.
- A slice of attributes.
- The slice of indices.
In our case, we have a deinterleaved tessellation, which means we cannot slice whole vertices. If you try to get a slice of
Vertex, you will get a compilation error. However, we can retrieve slices of vertex fields. The way we do this is super simple: we simply call the
tess.vertices_mut() methods. It will infer the type of slices you are asking to automatically slice the right GPU buffer. This is all possible because our types are unique as the vertex fields.
Remark: you need to have the types of
colorsinferred by setting / reading / passing them around, or you will have to put type ascriptions so that
vertices()know what fields you are referring to.
Deinterleaving compile-time dispatching
So let’s dig a bit into how all this works. The first thing you need to know is that deinterleaving — the raw concept — is really simple. If you have a type such as:
We say that
Vec<Foo> is the interleaved representation of the collection. The deinterleaved representation needs to have three vectors of fields:
In order for a
Vertex type to be valid in deinterleaving contexts, we need to have that tuple of vectors representation. First, we need a mapping between
(Vec<A>, Vec<B>, Vec<C>). This is the role of two traits:
Deinterleave<T> gives, for
T, the field rank for
T. For instance:
<Foo as Deinterleave<A>>::RANK == 0
<Foo as Deinterleave<B>>::RANK == 1
<Foo as Deinterleave<C>>::RANK == 2
This is mandatory so that we know exactly which GPU buffers will need to be read / written to when creating tessellations and slicing them. You don’t have to implement
Deinterleave<T> by yourself: luminance-derive does that automatically for you when you create a
Vertex type. Also, you might be tempted to think that this rank will be used inside the
Vertex type to retrieve data, but since you cannot pass whole vertices… nah. Also, you shouldn’t assume ranks based on fields declarations in struct (rustc can re-order that).
TessVertexData<S>. It associates an input type — the type of data a tessellation will receive at creation type — for
S, for the implementor type. The easy one is
TessVertexData<Interleaved> for V where V: Vertex. The associated type is simply
Vec<V>, because interleaved geometry simply stores the vertices as a vector of the whole vertex type. Simple.
It gets more complicated when we talk about deinterleaved geometry.
TessVertexData<Deinterleaved> for V where V: Vertex has its associated type set to…
Vec<DeinterleavedData>. Indeed: there is no simple way with the current stable Rust (and even nightly) to know the full type of
Vertex fields at compile-time here. However, don’t get it wrong: that
Vec is not a vector of vertices. It’s a typically small set of attributes (vectors too). If you look at the definition of
DeinterleavedData, you get this:
Yep. Type erasure at its finest. When you pass deinterleaved data, the data is type-erased and passed as a big blob of bytes, glued with its original size (so that we don’t have to store typing information – this will be needed when slicing vertex attributes).
Implementing slicing with these traits and data types is now possible: we can add another trait that we will use to slice vertices, for instance (the
backend::VertexSlice trait in luminance) and based on the type of
Deinterleave<T>, we can go and grab the GPU buffer we want. For instance, in both the OpenGL and WebGL luminance backends, buffers are stored in a
Vec<_>, so in order to know which one we need to lookup, we simple use the
<Vertex as Deinterleave<T>>::RANK constant value (a
usize) and we’re good to go. You need two other traits for vertex instance data (
InstanceSlice) and vertex indices (
IndexSlice), and you’re good to go.
Wrap it up
So, checking at compile-time deinterleaving, in luminance, is done by:
- Using a trait (
Deinterleave<T>) to associate a rank to a field
Tin a data structure.
- Using a trait (
TessVertexData<S>) to get the input type of a tessellation: either
Vec<T>for the whole vertices, or
Vec<DeinterleavedData>for a set of attributes).
- When asking for a typed slice, such as
vertices_mut::<T>(), we basically simply require
Deinterleave<T> for Vso that we can get a
RANK: usize. We can then, in the backend implementation, find the right GPU buffer and slice it. The reconstruction to a typed slice (
&[T]) is statically assured by the length stored in the
DeinterleavedDatatype and by the fact
DeinterleavedData<T>is implemented for
A small note though. luminance — actually, I do that with all my code, whatever the language — considers a lot of operations
unsafe. In this case, implementing
unsafe. The reason is pretty obvious: you can implement
Deinterleave<Whatever>, even if your vertex type doesn’t have a
Whatever field. Doing so would yield a hazardous / undefined behavior when slicing the GPU buffer. luminance-derive takes care of implementing the
unsafe interface for you.
I hope you have learned something new or gotten some ideas. Keep the vibes!