Safely Sharing Data: Reference Capabilities in Pony06 Mar 2016
So you’ve got three actors capable of processing data in parallel. You’d like to be able to share data between these actors. What do you do?
If you simply pass around mutable state, then you’re sure to run into situations where your actors start stepping on each other. Actor A is trying to update an entry in a map at the same time Actor B is trying to read that entry. What will Actor B see? Who knows!
So you know what to do: just use locks. Actor B simply has to wait its turn. Nothing could possibly go wrong. But of course locks mean coordination, and coordination is slow. Maybe B didn’t even really care about seeing the latest, greatest version of that entry. Too bad. It’s still going to have to wait.
But we don’t need locks, you say. After all, we have immutability! We’ll just use a persistent hash map. That way B can read its entry no problem and A can simultaneously “write” the update–by creating an altered copy of the immutable map, of course. But if A knows that after sending the map, B will be the only remaining actor who knows about it, too bad! B is still stuck copying on writes. Of course, B could always copy the map into a mutable one, but that doesn’t sound like a recipe for pure, unadulterated speed.
There’s got to be a better way! Well, now there is! And for only 10 monthly payments of $0.00 each, you too can share data faster, safer, and without the headaches. And it’s all because of the magic of Pony reference capabilities.
What is a Reference Capability?
Continuing our earlier example, imagine that we have a hash map with Strings as both keys and values. We’re probably going to store this map as the value of some variable, like so:
let m = Map[String,String]
What we’ve done here is create a reference called
m to the map.
This is a mutable map, so we are free to read and write from it as
Zoom out to our multi-actor system, however. If I try to send this data structure to actor B, I encounter a problem. B now has a different reference to the same map. I’m no longer free to read and write as I please. For all I know, B is writing to it at this very moment.
This brings us to two related laws of shared references, laws that you should always keep in mind when thinking about reference capabilities:
1) Write Law: Write only when you know no one else can read or write.
2) Read Law: Read only when you know no one else can write.
The Write Law exists because, first, simultaneous writes to the same data structure can lead to unexpected write results and, second, writing while someone else is reading is likely to produce some weird read results. Whether by locks or reference capabilities, the Write Law must be enforced.
The Read Law, similarly, exists because you can only trust the results of a read if you know no one was writing to the data structure at the same time you were trying to read it. And in a perfect world, you could always trust the results of your reads.
In light of these laws, we see the error in sharing a mutable map with no further thought. Given that the two actors are independent of each other, there’s no way for one to know whether the other is at the moment reading or writing. So without locks, we’re bound for trouble.
What we’d like is a situation where an actor can know for certain whether or not other actors can read from or write to a given data structure. And that’s where reference capabilities come in. A reference capability denies some combination of read and write behaviors, either locally, globally, or both.
The idea should be familiar if you’ve had any experience with immutable data. References to immutable data deny write permissions for everyone. If I’ve shared an immutable data structure, I know I can safely conform to the Read Law since there’s no way anyone else can write to it.
In Pony we get something like this but with a much finer grain of control.
Consider, for example, the idea of sharing mutable data. Is this always a bad
idea? Not necessarily. In the example above, we created a named reference
to our mutable map. In our local scope,
m is the only way we can refer
to the map, and hence the only way that we can hope to write to it. What if
we wanted to send this mutable map to actor B with no risk of violating our
two laws? Do we have to first convert it to an immutable data structure and
In fact we do not. Instead, when sending the map to B, we need only destroy
our sole reference to it. If
m no longer exists in our local scope,
then there is no way we can read from or write to the map. And so it doesn’t
concern us in the least whether B is reading from or writing to it.
In Pony, if we know that we have the only reference to a mutable data structure,
then we have what is called an
iso reference, where “iso” stands for
iso is mutable, but it’s safe to send to other
actors as long as we give up our reference to it in the process.
But what if we want to be able to continue reading from our map? It’s just
that we want to afford other actors the same privilege. In that case, an
immutable data structure is called for. Immutable data is the only way
to ensure that multiple actors can conform to the Read Law with respect
to the same data structure at the same time. In Pony, a reference to an
immutable data structure that can be shared among actors is called a
val ensures that no one can write to the data
structure, though anyone can read from it.
In an actor system, actors send messages to other actors. In order to
send a message to actor C, you have know that C exists. What can C send
me to allow me to send to it later? It’s not an
iso, for at least
two reasons. First, an actor is not the kind of thing you can either
read from or write to. Second, C wants many actors to know it exists,
so it’s not interested in sending an isolated reference. So what about a
val? Well, we’ve already said that you can’t read from an actor, and
val only denies write permissions.
What we need is something that can be shared by many actors but which
denies both read and write permissions. In Pony, that’s called a
It’s essentially just a reference to the identity of the data in question.
If the data is an actor with publicly accessible behaviors, then all
we need is a
tag to call those behaviors. But we can’t directly
read from or write to the object.
The Backbone of Reference Capabilities
As you try to get a handle on how reference capabilities work, it’s helpful
to keep in mind the core reason for their existence: safely sharing data across
actors. We’re talking about safety we can prove at compile time. And this
means that the most important reference capabilities are the
sendables. These are references we know we can pass safely to one or more
tag, the very three we’ve discussed
so far. Think of these as the backbone of Pony reference capabilities.
The sendables stand in the following subtyping relation:
First, notice that if we expect a
tag, we can accept any reference
capability. That’s because a
tag denies read and write permissions
to everyone. So our code that expects a
tag will never attempt to
read from or write to it, which means there is no chance we’ll violate
the Laws of Sharing. So our code can treat both
as subtypes of
Our relation above says that
iso is also a subtype of
Why should this be? Well, whenever we share an
iso, we give up our
reference to it. And this means that no one but the actor we’re sending to
will have a reference. If you’re the only one with a reference, you’re free
to decide which reference capability to pick for it. Any code that expects
val can simply treat the
iso sent to it as a
That’s because we can prove at compile time that no one else can write to
it, which is exactly what we want for a
The Local Toolbox
The sendables are not the only reference capabilities available to us.
Pony also provides a toolbox of three reference capabilities that
can be useful within the scope of a single actor. These three, called
box, are made possible because of an important property
of Pony actors. Within an actor, all execution is serial and synchronous.
And this means that within an actor, we do not face the same threats to the
Read and Write Laws that we face when concurrent execution is possible. Without
concurrency, we can’t have simultaneous writes or a read that occurs during a
So in this sense, a single actor need not worry about the Laws of Sharing.
The first consequence of this fact is that a single actor is free to hold
as many references to a mutable data structure as it needs. A reference to
mutable data that makes no guarantees about how many local aliases to that
data exist is called a
ref. Think of a
ref as the old-fashioned
reference familiar from programming languages that don’t enforce immutability.
The following is valid Pony code:
let m: Map[String, String] ref = Map[String, String] let n = m let o = m
Notice that if
m were a
Map iso the second line would fail
to compile. That’s because we can only have one alias for an
ref, on the other hand, we can have as many local aliases as we
What if we have a data structure that is currently mutable but which we
know will at some future point become immutable. For example, perhaps we
are keeping track of votes cast by a fixed number of voters. Once all the
votes are in, we will no longer need to mutate our record of them. At that
point it will be convenient to be able to freely share the results among
actors. In other words, we’ll eventually want a
If we used a
ref while tallying votes, we’d face a problem. Once all
the votes are in, we’d have to find all the existing
ref aliases and
destroy them. After all, the only way we can prove that it’s safe to share
data as a
val is if we can show that no one can write to it. Pony
provides a cleaner solution here, in the form of the transition reference
trn reference is writeable, but allows no other writeable aliases.
iso, however, it allows other readable aliases. Again,
the reason we can allow more than one readable alias to our writeable
reference is because we’ve restricted them to the local actor. Hence, we
know that we’ll never be reading the data at the same time that we’re
writing to it.
The locally readable aliases possess the
box reference capability.
box means that you can read it
locally but not write to it. We may in the course of our vote tallying
refer to our
box aliases within other data structures (for example,
as the field of an object with methods that makes decisions based on
the current tally). Once the time comes to “freeze” the vote tally and
convert to immutable data, all of these
box aliases remain safe to
use. We are free to convert our
trn to a
Converting Between Capabilities
How can we convert a
trn to a
val? And more generally, how can
we convert between capabilities? There are two axes along which these
conversions can take place. The first is according to a set of substitution
rules that can be modeled in terms of subtyping. The second is by “lifting”
reference capabilities, a process called recovery. We’ll begin with
Take a look at the following subtyping diagram:
--> ref -- / \ iso --> trn -- --> box --> tag \ / --> val --
You can read the arrows in this diagram as “is a subtype of” or “can
be substituted for”. For example,
iso --> trn can be read as
iso can be substituted for
trn”. This subtyping relation is
transitive, which means that
iso can be substituted for any of
the reference capabilities.
Why is this? Recall that if we give up our alias for an
we are free to treat it as any reference capability. How do we give up
an alias in Pony? We simply use
consume. For example:
let a: Map[String, String] iso = recover Map[String, String] end let b: Map[String, String] trn = consume a let c: Map[String, String] iso = recover Map[String, String] end let d: Map[String, String] val = consume c
We’ll explain what
recover means later, but for now just think of it
as a way to show that our
Map is an
We can achieve the same effect by consuming our alias when sending an
iso. The same reasoning shows us that we can substitute a
for any reference capability except
iso as long as we give up our
alias. That’s because
trn ensures that there is only one writeable
alias in existence and when we consume it, we destroy that alias.
Why can’t we substitute a
trn for an
iso? Remember that
we can have many locally readable
box aliases for our
This means that when we give up our
trn alias, we know that other
readable aliases can still exist. And if other aliases exist, then we
can’t have an
We can substitute either a
ref or a
val for a
val are locally readable.
box doesn’t say
that there aren’t writeable aliases anywhere. It just says that the
alias itself is not writeable. So it’s consistent with either locally mutable
or globally immutable data. This allows us to write methods that don’t care
whether we get a
ref or a
val as long as the argument is readable.
Finally, for reasons discussed above, any reference capability can be
substituted for a
tag. That’s because a
tag alias is neither
readable nor writeable. So no matter what guarantees we want to enforce regarding
our subtype, any code that expects a
tag will automatically respect
Subtyping is one way to conceptualize the relationship between reference capabilities. Another is in terms of three distinct categories: mutable, immutable, and opaque.
1) A mutable reference capability denies neither read nor write permissions.
This category includes
2) An immutable reference capability denies write permissions but not
read permissions. This category includes
3) An opaque reference capability denies both read and write
permissions. The only example is
It’s worth noting that there is exactly one sendable per category.
Recovery allows us to convert a reference capability to another
by doing work within a protected
recover block. I call it “protected”
because the only aliases external to the block that you can refer to from
within the block are sendable ones (i.e.
You can create any new aliases you like from within the block (following
the normal reference capability rules). The value of a
is the value returned at the end of that block. All the other aliases
created within the block will be destroyed upon leaving it.
From within a
recover block, you can recover the value of the block
according to the following three rules:
1) Mutable reference capabilities can be recovered to any mutable, immutable,
or opaque reference capabilities. So anything, in other words. For
trn, you must consume the alias before you can recover it to
2) Immutable reference capabilities can be recovered to either immutable or opaque reference capabilities.
3) Opaque reference capabilities can be recovered only to opaque. That is,
tag can only be recovered to
tag, which probably isn’t that
useful in practice.
Practically speaking, you can simplify these rules by remembering that immutable reference capabilities can be recovered to immutable or opaque and mutable reference capabilities can be recovered to anything.
What is going on here? Ignoring aliases borrowed from the enclosing scope
for the moment, we know that anything that happens within a recover block
is invisible to the outside world. So if you create a bunch of
and return one of those aliases at the end, it’s safe to recover it to an
iso. That’s because all the other aliases are destroyed when you
leave the block. So we can prove that there is only one reference left.
And of course, if there’s only one reference left, then you are free to
convert that to any reference capability (as discussed in the substitution
section). This explains why a mutable reference created within the block
can be recovered to anything.
But you are also allowed to refer to aliases from the enclosing
scope, so long as they are sendables. Obviously, if we return a
iso, we’re going to have to consume that alias before
we can recover it. That’s because otherwise we’ll have more than one
reference to an
iso, which is forbidden. So just consume the
iso alias when you return it at the end of the block:
let lst: List[String] iso = recover List[String] end let vlst: List[String] val = recover val lst.push("hi") consume lst end
In this example, we start with an
List. Within the
recover block, we write to it and then return it. But since
lst before returning it, we know that there are
no aliases left. We are free to recover it to a
val as in the
example, or anything else for that matter.
So why can’t we recover immutable references to mutable ones? Well,
say we borrow a
val from the enclosing scope, as in the following
(invalid) Pony code:
let vlst: List[String] val = recover val List[String] end ... // we do some stuff with it, which might include sending it ... // to someone let rlst: List[String] ref = recover ref consume vlst end
vlst in the
recover block doesn’t do anything
useful. That’s because we don’t know how many other aliases exist.
After all, we could have sent
vlst to other actors. Since it’s
val, we know we can safely share it while holding on to a
local reference. So recovering it to a
ref violates both of
the Laws of Sharing. Everyone else thinks that no one has write
permissions, and holding a
ref implies that we think no one
else has read permissions.
If you contrast those two examples, you’ll notice that when we create
iso, we write:
recover List[String] end
On the other hand, when we create a
val, we write:
recover val List[String] end
This difference is explained by the default behavior of
blocks. Remember that there was one sendable per reference capability
iso for mutable,
val for immutable, and
for opaque. The default behavior of a
recover block is to recover
a reference capability to the sendable capability in its category. So any
mutable reference capability default recovers to
iso and any immutable
We haven’t covered everything there is to know about Pony reference capabilities, but we’ve hit the basics. This should be enough to get you started. In a future post, I hope to cover some more advanced topics, including viewpoint adaptation (i.e. the way reference capabilities of fields look to the class or actor whose fields they are) and reference capabilities for polymorphic types. In the meantime, stay safe out there!