Lifetimes in Rust


Update 2015-05-16: the code in the original question applied to an old version of Rust, but the concepts remain the same. This answer has been updated to use modern Rust syntax/libraries. (Essentially changing ~[] to Vec and ~str to String and adjusting the code example at the end.)

Is my understanding vaguely accurate?
[...]
What is the difference between a parameter of type &str and a parameter of type &'a str in the example above?

Yes, a lifetime like that says essentially "no restrictions", sort of. Lifetimes are a way to connect output values with inputs, i.e. fn foo<'a, T>(t: &'a T) -> &'a T says that foo returns a pointer that has the same lifetime as t, that is, the data it points to is valid for the same length of time as t (well, strictly, at least as long as). This basically implies that the return value points to some subsection of the memory that t points to.
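To make the "connect output values with inputs" idea concrete, here is a minimal sketch. The function name first_word is made up for illustration and is not part of the original question; it returns a slice that points into its argument, so the two share the lifetime 'a.

```rust
// Hypothetical helper (not from the original question): returns the first
// whitespace-separated word of `s`, or "" if there is none.
// The result borrows from `s`, so it is tied to the same lifetime 'a.
fn first_word<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("hello world");
    let word = first_word(&text);
    // `word` points into `text`, so `text` must stay alive for as long as
    // `word` is used.
    println!("{}", word);
}
```

Because the return type mentions 'a, the compiler refuses any use of the result after the borrowed String has gone out of scope.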

So, in a function like fn<'a>(path: &'a str) -> Vec<String>, the lifetime 'a is very similar to x in { let x = 1; return 2; }: it's an unused variable.

Rust assigns default lifetimes when you write &str, and this is exactly equivalent to writing the unused-variable lifetime, i.e. fn(path: &str) -> Vec<String> is no different from the version with 'a. The only time leaving off a lifetime is different from including it is if you need to enforce a global pointer (i.e. the special 'static lifetime), or if you want to return a reference (e.g. -> &str), which is only possible if the return value has a lifetime (and this must be either the lifetime of one or more of the inputs, or 'static).
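A small sketch of these cases, with function names invented for illustration: the first two functions are equivalent because 'a is never connected to the output, while the last two return references and so need a lifetime that comes either from an input or from 'static.

```rust
// Explicit but unused lifetime: 'a appears only on the input.
fn components_explicit<'a>(path: &'a str) -> Vec<String> {
    path.split('/').map(|part| part.to_string()).collect()
}

// Elided lifetime: the compiler fills in an anonymous one; same meaning.
fn components_elided(path: &str) -> Vec<String> {
    path.split('/').map(|part| part.to_string()).collect()
}

// Returning a reference: the result borrows from `path`, so the lifetime
// of the output is the lifetime of the input.
fn extension<'a>(path: &'a str) -> &'a str {
    match path.rfind('.') {
        Some(i) => &path[i + 1..],
        None => "",
    }
}

// Returning a reference with no reference inputs: only 'static is possible.
fn default_name() -> &'static str {
    "data.txt"
}

fn main() {
    println!("{:?}", components_explicit("foo/bar.txt"));
    println!("{:?}", components_elided("foo/bar.txt"));
    println!("{}", extension("foo/bar.txt"));
    println!("{}", default_name());
}
```

In the extension case the explicit 'a could also be elided, since with a single reference input the compiler infers the same connection.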

What is a lifetime? Where can I learn more about them?

A lifetime is how long the data a pointer points to is guaranteed to exist, e.g. a global variable is guaranteed to last "forever" (so it's got the special lifetime 'static). One neat way to look at them is: lifetimes connect data to the stack frame on which their owner is placed; once that stack frame exits, the owner goes out of scope and any pointers to/into that value/data-structure are no longer valid, and the lifetime is a way for the compiler to reason about this. (With the stack frame view, it is as if @, the managed pointer type in old versions of Rust, has a special stack frame associated with the current task, and statics have a "global" stack frame).
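The stack-frame picture can be sketched roughly as follows. The names GREETING and global_ref are assumptions made for this example: a static lives in the "global" frame and can be referenced from anywhere, while a reference into a local owner cannot escape the block that owns it.

```rust
// A global: its data lives for the whole program, so references to it
// have the special lifetime 'static.
static GREETING: &str = "hello";

fn global_ref() -> &'static str {
    GREETING
}

fn main() {
    let r: &str;
    {
        let owner = String::from("short-lived");
        println!("{}", owner);
        // r = &owner;
        // ^ If this line were used instead of the one below, the compiler
        //   would reject the program: `owner` is dropped at the end of this
        //   block, but `r` is still used afterwards.
        r = global_ref();
    } // `owner` goes out of scope here.
    println!("{}", r);
}
```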

There's also a lifetimes chapter of the book, and this gist (NB. the code is now outdated but the concepts are still true) is a neat little demonstration of how one can use lifetimes to avoid having to copy/allocate (with a strong safety guarantee: no possibility of dangling pointers).

And while I'm at it, what is 'self?

Literally nothing special, just certain places require types to have lifetimes (e.g. in struct/enum definitions and in impls), and currently (as of Rust 0.7) 'self and 'static are the only accepted names. 'static for global always-valid pointers, 'self for something that can have any lifetime. It's a bug that calling that (non-static) lifetime anything other than self is an error.
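For comparison, here is how the same places look in current Rust, where (as far as I know) 'self no longer exists and the lifetime parameter can have any name. This is only a sketch: the trait Shout and the struct Line are hypothetical names chosen for the example.

```rust
// Hypothetical extension trait, in the spirit of adding methods to &str.
trait Shout {
    fn shout(&self) -> String;
}

// An impl for a reference type declares its lifetime; 'a is just the
// conventional name, nothing special.
impl<'a> Shout for &'a str {
    fn shout(&self) -> String {
        self.to_uppercase()
    }
}

// A struct that stores a reference must declare a lifetime parameter.
// Any name works; 'text is used here to show that.
struct Line<'text> {
    content: &'text str,
}

impl<'text> Line<'text> {
    fn content(&self) -> &'text str {
        self.content
    }
}

fn main() {
    println!("{}", "hello".shout());

    let owned = String::from("first line");
    let line = Line { content: &owned };
    println!("{}", line.content());
}
```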


All in all, I'd write that function like:

use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
use std::path::Path;

fn read_file_lines(path: &Path) -> Vec<String> {
    match File::open(path) {
        Ok(file) => {
            let read = BufReader::new(file);
            read.lines().map(|x| x.unwrap()).collect()
        }
        Err(e) => panic!("Error reading file: {}", e)
    }
}

fn main() {
    let lines = read_file_lines(Path::new("foo/bar.txt"));
    // do things with lines
}
Author: Daniel

Updated on June 06, 2022

Comments

  • Daniel
    Daniel almost 2 years

    Occasionally I've found myself wanting to write functions that can be called in either of two ways:

    // With a string literal:
    let lines = read_file_lines("data.txt");
    
    // With a string pointer:
    let file_name = ~"data.txt";
    let lines = read_file_lines(file_name);
    

    My first guess was to use a borrowed pointer (&str) for the parameter type, but when that didn't work (it only allowed me to use @str and ~str), I tried the following (by copying the Rust libraries), which did work.

    fn read_file_lines<'a>(path: &'a str) -> ~[~str] {
        let read_result = file_reader(~Path(path));
        match read_result {
            Ok(file) => file.read_lines(),
            Err(e) => fail!(fmt!("Error reading file: %?", e))
        }
    }
    

    The problem is that I don't understand what I'm doing. From what I can gather (mostly from compiler errors), I'm declaring a lifetime on which there is no restriction, and using it to describe the path parameter (meaning that any lifetime can be passed as the parameter).

    So:

    • Is my understanding vaguely accurate?
    • What is a lifetime? Where can I learn more about them?
    • What is the difference between a parameter of type &str and a parameter of type &'a str in the example above?
    • And while I'm at it, what is 'self?

    (I'm using Rust 0.7, if it makes a difference to the answer)

  • Daniel
    Daniel almost 11 years
    Thanks! Looks like you're right about specifying the lifetime being redundant in my example. I think I needed it for 0.6 to use the method with 'static strs, but didn't re-test after upgrading to 0.7.
  • Daniel
    Daniel almost 11 years
    Just read the tutorial: it (along with your explanation) was helpful. Thanks again. I've also got a bunch of impl<'self> MyExtraFunction for &'self str (to do "hello".my_extra_function()), but it looks like I'll still need those.
  • huon
    huon almost 11 years
    @Daniel, no problem :) (yes impls are one of the places where borrowed pointers are required to have lifetimes, i.e. you can't write impl MyExtraFunction for &str at the moment.)
  • huon
    huon almost 11 years
    (Also, join us on IRC if you haven't already; there's normally several people more than willing to help.)
  • Mihir Luthra
    Mihir Luthra over 3 years
    There is nothing like 'self now, right? When googling I found people suggesting to use 'self for indicating struct instance lifetime but I didn't find any sort of tracking issue or RFC. There are no plans for 'self, right?
  • nalzok
    nalzok almost 3 years
    "that is, the data it points to is valid for the same length of time as t (well, strictly, at least as long as)." Shouldn't it be at most as long as? As soon as t falls out of its scope, the return value would become invalid.
  • huon
    huon almost 3 years
    @nalzok: You're right in a sense, but this is fairly subtle. The actual value pointed to by the returned reference has to last at least as long as t: if the value lasted for a shorter time, the reference may become invalid before t, and this would be bad. However, the compiler has to assume the shortest/worst case (the return value being valid exactly as long as t), so you're right that when t goes out of scope, the compiler disallows using the return value.