Rust – memory safety without a garbage collector

I’ve spent time with Rust at various points in the past, and since it was a language still in development it was no surprise that every time I looked there were breaking changes, and even the documentation looked very different at every turn!

Fast forward to May 2015 and it has now hit the 1.0 milestone, so things are stable and it’s a good time to start looking into the language in earnest.

The web site is looking good, and there is an interactive playground where you can try Rust out without installing it. The documentation has been beefed up and is readily accessible through the web site. I personally find Rust by Example useful for getting started quickly.

 

Ownership

The big idea that came out of Rust was the notion of “borrowed pointers”, though the documentation doesn’t use that particular term anymore. Instead, it talks more broadly about an ownership system and having “zero-cost abstractions”.

Zero-cost what?

The abstractions we’re talking about here are much lower level than what I’m used to: pointers, polymorphic functions, traits, type inference, etc.

Its pointer system, for example, gives you memory safety without needing a garbage collector, and Rust pointers compile to standard C pointers without additional tagging or runtime checks.

Rust guarantees memory safety for your application through the ownership system, which we’ll be diving into shortly. All the analysis is performed at compile time, hence incurring “zero cost” at runtime.

Basics

Let’s get a couple of basics out of the way first.

image

Note that in Rust, println is implemented as a macro, hence the bang (!).
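As a minimal sketch of these basics (the function and variable names are my own, picked purely for illustration): bindings are immutable unless marked mut, types are inferred, and the last expression of a function is its return value.

```rust
// Adds two 32-bit integers. The last expression (no trailing semicolon)
// is the function's return value.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let x = 5; // immutable binding, type inferred as i32
    let mut y = 10; // `mut` opts in to mutability
    y += x;

    // println! is a macro, hence the bang
    println!("x = {}, y = {}, x + y = {}", x, y, add(x, y));
    // prints: x = 5, y = 15, x + y = 20
}
```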

Ownership

When you bind a variable to something in Rust, the binding claims ownership of the thing it’s bound to. E.g.

image

When v goes out of scope at the end of foo(), Rust will reclaim the memory allocated for the vector. This happens deterministically, at the end of the scope.
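In code, that looks roughly like the sketch below. Having foo() return the vector's length is my own addition, only there so the example has something observable to print.

```rust
// `foo` owns the vector for the duration of its body.
fn foo() -> usize {
    let v = vec![1, 2, 3]; // `v` claims ownership of the heap-allocated vector
    v.len()
} // `v` goes out of scope here, and the vector's memory is reclaimed

fn main() {
    println!("foo() saw {} elements", foo());
}
```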

When you pass v to a function or assign it to another binding, you have effectively moved ownership of the vector to the new binding. If you try to use v again after this point, you’ll get a compile-time error.

image

image
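A small sketch of both kinds of move, assuming a hypothetical take() function that consumes a vector (its return value is only there for illustration). The offending lines are commented out so the example still compiles; uncommenting either one makes the compiler reject the program.

```rust
// Takes ownership of the vector; it is freed when `take` returns.
fn take(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let v = vec![1, 2, 3];

    let v2 = v; // ownership moves from `v` to `v2`
    // println!("{:?}", v); // would not compile: `v` has been moved

    let n = take(v2); // ownership moves into `take`
    // println!("{:?}", v2); // would not compile: `v2` has been moved

    println!("take() saw {} elements", n);
}
```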

This ensures there’s only one active binding to any heap-allocated memory at a time, which helps eliminate data races.

There is a ‘data race’ when two or more pointers access the same memory location at the same time, where at least one of them is writing, and the operations are not synchronized.

Copy trait

Primitive types such as i32 (i.e. int32) are stack allocated and exempt from this restriction. They’re passed by value, so a copy is made when you pass one to a function or assign it to another binding.

image

The compiler knows to make a copy of n because i32 implements the Copy trait (a trait is roughly the equivalent of an interface in .NET/Java).
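For instance, in the sketch below (double() is a made-up helper), n stays usable after being assigned to another binding and after being passed to a function, because each of those makes a copy rather than moving ownership.

```rust
// Takes its argument by value; for i32 that means a copy.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let n = 21;
    let m = n; // copies `n`, no move
    let d = double(n); // passes another copy

    // `n` is still usable here
    println!("n = {}, m = {}, d = {}", n, m, d);
}
```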

You can extend this behaviour to your own types by implementing the Copy trait:

image

Don’t worry about the syntax for now; the point here is to illustrate the difference in behaviour when dealing with a type that implements the Copy trait.
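One common way to do this is to derive the trait, which works when every field is itself Copy. The Point type and shift_right() function below are hypothetical names of mine; note that Copy requires Clone, so both are derived.

```rust
// Deriving Copy is allowed because both fields are Copy.
// Copy requires Clone, so both traits are derived together.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Takes a Point by value; the caller keeps its own copy.
fn shift_right(p: Point) -> Point {
    Point { x: p.x + 1, y: p.y }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = shift_right(p); // `p` is copied, not moved

    // `p` is still usable after the call
    println!("p = {:?}, q = {:?}", p, q);
}
```

Without the derive, the call to shift_right(p) would move p and the final println! would fail to compile.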

The general rule of thumb is: if your type can implement the Copy trait then it should.

But cloning can be expensive, and it is not always possible.

Borrowing

In the earlier example:

image

  • ownership of the vector has been moved to the binding v in the scope of take();
  • at the end of take() Rust will reclaim the memory allocated for the vector;
  • but it can’t, because we tried to use v in the outer scope afterwards, hence the error.

What if we borrow the resource instead of moving its ownership?

A real-world analogy: if I bought a book from you then it’s mine to shred or burn after I’m done with it; but if I borrowed it from you then I have to make sure I return it to you in pristine condition.

rust_ownership_4

rust_ownership_5

In Rust, we do this by passing a reference as an argument.

image
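A minimal sketch of borrowing, using a hypothetical sum() function: it receives a reference (&v) rather than the vector itself, so the caller keeps ownership and can carry on using v afterwards.

```rust
// Borrows the vector instead of taking ownership of it.
// (A slice, `&[i32]`, would be the more idiomatic parameter type;
// `&Vec<i32>` is used here to mirror the discussion.)
fn sum(v: &Vec<i32>) -> i32 {
    v.iter().sum()
}

fn main() {
    let v = vec![1, 2, 3];
    let total = sum(&v); // lend `v` out for the duration of the call

    // `v` is still owned by, and usable in, this scope
    println!("sum of {:?} is {}", v, total);
}
```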

References are also immutable by default.

image

But just as you can create mutable bindings, you can create mutable references with &mut.

image
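The difference can be sketched like so (push_zero() and peek() are names I made up): a function taking &Vec<i32> cannot modify the vector, whereas one taking &mut Vec<i32> can, provided the caller's binding is itself mutable.

```rust
// An immutable reference only allows reading.
fn peek(v: &Vec<i32>) -> Option<i32> {
    // v.push(0); // would not compile: `v` is behind a `&` reference
    v.first().copied()
}

// A mutable reference allows the borrower to modify the vector.
fn push_zero(v: &mut Vec<i32>) {
    v.push(0);
}

fn main() {
    let mut v = vec![1, 2, 3]; // the binding must be `mut` to lend it out as `&mut`
    push_zero(&mut v);
    println!("first = {:?}, v = {:?}", peek(&v), v);
}
```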

There are a couple of rules for borrowing:

1. the borrower’s scope must not outlast the owner’s

2. you can have one of the following, but not both:

2.1. zero or more immutable references to a resource; or

2.2. exactly one mutable reference

Rule 1 makes sense since the owner cleans up the resource when it goes out of scope; a borrower that outlived it would be left holding a dangling reference.

For a data race to exist we need to have:

a. two or more pointers to the same resource

b. at least one is writing

c. the operations are not synchronized

Since the ownership system aims to eliminate data races at compile time, there’s no need for runtime synchronization, so condition c always holds.

When you have only readers (immutable references) then you can have as many as you want (rule 2.1) since condition b does not hold.

If you have writers then you need to ensure that condition a does not hold, i.e. there is only one mutable reference (rule 2.2).

Therefore, rule 2 ensures data races cannot exist.
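Rule 2 can be seen in a small sketch (demo() is a hypothetical name): any number of readers may coexist, and a single writer is fine on its own, but the commented-out line mixing the two is rejected by the compiler.

```rust
fn demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    {
        // Rule 2.1: any number of immutable references at once
        let a = &v;
        let b = &v;
        println!("readers see {} elements, first is {}", a.len(), b[0]);
    }

    {
        // Rule 2.2: exactly one mutable reference
        let w = &mut v;
        w.push(4);
        // let r = &v; // would not compile: `v` is still mutably borrowed by `w`
        w.push(5);
    }

    v
}

fn main() {
    println!("{:?}", demo());
}
```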

rust_ownership_2

Here are some issues that borrowing prevents.
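Two classic examples, as far as I can tell, are iterator invalidation and use after free. The sketch below (append_doubles() is my own illustrative function) keeps the rejected versions as comments and shows a variant of the first that the compiler accepts.

```rust
// Appends the double of every existing element to the vector.
fn append_doubles(mut v: Vec<i32>) -> Vec<i32> {
    // Iterator invalidation: mutating a collection while iterating over it.
    // for x in &v { v.push(x * 2); } // would not compile: `v` is borrowed by the loop

    // Accepted version: finish reading first, then mutate.
    let doubles: Vec<i32> = v.iter().map(|x| x * 2).collect();
    v.extend(doubles);
    v
}

fn main() {
    // Use after free: a reference outliving the value it points to (rule 1).
    // let r;
    // {
    //     let x = 5;
    //     r = &x; // would not compile: `x` does not live long enough
    // }
    // println!("{}", r);

    println!("{:?}", append_doubles(vec![1, 2, 3]));
}
```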

Beyond Ownership

There are lots of other things to like about Rust: immutability by default, pattern matching, macros, etc.

Pattern Matching

image
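A minimal sketch of match, using a made-up describe() function: arms can match single values, alternatives, ranges, and guarded patterns, and the compiler insists that the arms cover every possible value.

```rust
// Classifies an integer using a few kinds of pattern.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",                 // a single value
        1 | 2 => "one or two",       // alternatives
        3..=9 => "three to nine",    // an inclusive range
        _ if n < 0 => "negative",    // a catch-all with a guard
        _ => "ten or more",          // everything else
    }
}

fn main() {
    for n in [0, 2, 7, -4, 42] {
        println!("{} is {}", n, describe(n));
    }
}
```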

Structs

image
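A struct groups named fields, and an impl block attaches functions and methods to it. The Point type below is a hypothetical example of mine; note how distance_to() borrows both points rather than taking ownership of them.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // An associated function (no `self`), called as Point::origin()
    fn origin() -> Point {
        Point { x: 0.0, y: 0.0 }
    }

    // A method that borrows both points immutably
    fn distance_to(&self, other: &Point) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }
}

fn main() {
    let p = Point { x: 3.0, y: 4.0 };
    println!("distance from origin: {}", Point::origin().distance_to(&p));
}
```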

Enums

image
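Enums in Rust can carry data in each variant, much like discriminated unions in functional languages, and they pair naturally with match. A sketch with a hypothetical Shape type:

```rust
// Each variant carries its own data.
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

// Borrows the shape and destructures it with `match`.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
    }
}

fn main() {
    let shapes = vec![
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
    ];
    for s in &shapes {
        println!("area = {}", area(s));
    }
}
```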

Even from these basic examples, you can see the influence of functional programming, especially with immutability by default, which bodes well for Rust’s goal of combining safety with speed.

Rust also has a good concurrency story (pretty much mandatory for any modern language), which has been discussed in detail in this post.

Overall I enjoy coding in Rust, and the ownership system is pretty mind-opening too. With both Go and Rust coming of age and targeting a similar space around systems programming, it’ll be very interesting to watch this space develop.

 

Links