Rust has this cool feature called
impl block. An
impl block is just a scope that introduces a way to augment a type with methods – do not confuse
impl blocks with trait
impls, used to implement a given trait.
The syntax is the following:
This syntax gives you the dot notation and the static method call syntax:
Most of the time, you’ll want
impl blocks for:
- The dot notation.
- Sharing type variables.
Type variables in impl blocks
A type variable is a variable that holds a type. People coming from languages like C++, Java, C#, Python, D, etc. are used to name those template parameters or type parameters.
T here is a type variable, and you can see that you can have an
impl block over
List<T>. What that means is that any declared method will have a hidden type
Self = List<T>. If you declare a static method,
Self will be set as
List<T> and if you declare a regular method,
&mut self will have types – respectively –
That sharing is even more pronounced when you start to deal with more complex data types and want to restrain their type variables with trait bounds. For instance, imagine that you want an ordered list:
You could implement
OrderedList<T> for any
T, but it’s very likely that for a lot of methods, you’ll need
T to be comparable. So you can do this, for example:
All the methods and static methods will have
Self an alias to
List<T> where T: PartialOrd. This will drastically help you when writing methods that all require the trait bound.
Note that you can have several
impl blocks for any kind of refactoring. For instance, if you also want to have two methods that don’t need the trait bound, you can go and add another
impl block that looks like:
Now let’s try something a bit more complex. Let’s try this:
This won’t work and rustc will complain that
E is not contrained enough (it doesn’t appear in
Self, it’s only used in a trait bound on
T). So what’s the problem?
You can test here
Enter surjection and injection
The problem with that last block is that
E is not constrained on
OrderedList<T>. If you don’t know what it means, don’t freak out: I’m going to introduce every concepts you need to know. Hang on, it’s gonna be a bit math–ish, but that is worth it.
In math, a surjection is a property about set morphisms. For a pair of sets
D and a morphism
f : C -> D, we say that this morphism is surjective if all elements from
D appear in a
f application. This is weirdly said, so let’s say it another way: if for all the elements
D, you can find at least one element
C so that
f(x) = y. Another way to swallow that down if you’re still choking is by drawing the
D sets as big bags with a few elements in there. The morphism
f applications are just lines from objects from
D. If all elements from
D have at least one arriving arrow, the morphism is surjective.
We write surjection this way:
Ɐy ∈ D, ∃x ∈ C, f : C -> D / f(x) = y
And we read it this way:
Ɐymeans for all
∈ Dmeans is an element of
Ɐy ∈ Dreads for all y that is an element of D.
,often reads as juxtaposition.
∃xmeans there exists
f : C -> Dis just the type definition of the
/is often used instead of
,to delimit the equation and the hypothesis; read it as
So this whole math stuff reads:
D, there exists
f(x) = y.
It’s quite a non-intuitive notation when you don’t know about it but you can see the concepts are pretty simple to understand.
You might ask, what is the point of knowing that in our case? Here,
Foo is a type-level function. Rust has decided to go with the mainstream way to note that (which is another topic and should be the topic of another blog post, to be honest), so yeah, it’s a bit uglier than in Haskell, but it’s a type-level function. You can picture it as
Foo : E -> Foo<E>. You can easily see that if you are writing the body of a function and that you have a bound like
T: Foo<E> available, you know that you have type
T that implements
Foo<E>, so there’s definitely one type to substitute
E with (otherwise your function cannot be called), but there could be several ones. This means that traits are surjective.
Another way to see why it’s surjective: if I give you
Foo<u32>, you know that there’s at least one type implementing it, but you don’t know which one, since it’s ambiguous.
If I give you
Foo<u32> and tell you “It’s in a bound position of a function”, you know that if the function gets called, there’s at least one type implementing
Foo<u32>… But which one is it?
Injection is the reversed version of surjection. It tells you that all the elements from
C have a departing arrow (i.e. morphism application) ending in
D and the elements in
D have zero or one arriving arrow. Another way to say it is that for all pairs of
f(a) = f(b) implies
a = b, which means that you cannot have two inputs mapping to the same result since if two applications of the morphism yield the same result, it means that the inputs have to be the same.
Ɐ(x, y) ∈ C, f : C -> D / f(x) = f(y) => x = y
=>is the math notation to state implication.
Two interesting properties of injections:
- Some elements from
D(which is called the codomain of the morphism while
Cis its domain) might not have any arriving arrow.
- If you have an arriving arrow in the codomain, then you can easily reverse its direction and move backwards to the element in the domain because it’s not possible that two elements map to the same element.
The contrapositive is also interesting:
Ɐ(x, y) ∈ C, f : C -> D / x ≠ y => f(x) ≠ f(y)
It’s easy to imagine an injection in Rust. For instance, the function transforming a boolean into an integer is an injection. Every value in the domain (
bool) have an arrow into the codomain (
u8 for instance):
f(false) = 0.
f(true) = 1.
You can also see
0 has a single arriving arrow, and if you take it backwards, you end up on
false. Same thing happens for
true. Finally, you can see that
42, which are definitely in the codomain, have no arriving arrow. We are dealing with an injection.
What’s interesting with our specific typesystem problem is that if we had injective traits, that would mean that given
Foo<T>, there’s only one type that can implement this trait. You would then be able to take the bound backwards and recover the type, removing the ambiguity.
This is not correctly possible in Rust. If you’re interested, Haskell can do it via injective type families or functional dependencies, for instance.
A small digression on bijection
For curiosity only, bijection is just the superposition of both the properties. If you have a morphism that is both surjective and injective, it is bijective, and any element in the domain gives you one element in the codomain and any element in the codomain gives you one element in the domain an element in the codomain must have exactly one arriving arrow.
Back to our impl block problem
This is our initial problem. As you can see here, the
impl block has two variables:
T here means that we’re implementing methods for all the
Ts – i.e.
Ɐ. However, we’re not implementing it for all the
E. We would like to state that we need a
E to exist, which means that there exists an
E – i.e.
∃. The current syntax doesn’t support this.
However, what happens if we do this:
Here, you can see that since we can provide
E by the caller of the
invoke_foo function, we don’t require an injective bound: we narrow all the possible types to one provided by the caller of the function.
You can test it by yourself here. The
Clonetrait was added so that we can actually loop with the input argument.
All of this made me think about why do we even use
impl blocks. I mean, Haskell has been around for roughly 30 years and we never seen the need for it arise. Rust has a very special way to express typeclasses (Rust’s traits have an implicit type parameter,
Self, which is a sort of limitation that might get fixed in the future). If we forget about the dot notation, I truly think that I would drop
impl blocks for several reasons:
- No solution currently exists to the problem this blog entry discusses with
implblocks so far.
- Documentation might become very hard to read because it makes every methods dependent on an
implblock in the documentation. This refactoring might be interesting for code readers and writers but is, to me, truly a nightmare to people reading the documentation – it kind of makes reading the documentation stateful, which is ridiculous. However, this point might be a very interesting
rustdocfeature request! ;)
- On a general note, I’m not really adding any constraint to the
implblocks because those depend mostly on the functions / methods and I find it utterly stupid to create
nfunctions. So I basically just spawn the most general form of
implblock and put the constraints next to the methods.
- I often confuse them with
impltrait implementors when reading the code from someone else.
What would be really great would be to allow people to create function with a
self argument without
impl block. You could still have them around for compatibility purposes and for people who actually enjoy them, but to me those are a bit useless and boilerplate.
Again, I’m only talking about
implfor trait are actually quite nice since you can see them and spot them easily (as the
instancekeyword in Haskell).
That’s all for today. Thank you for having read me. Keep the vibes, and write RFCs!