Reviewing closures in Rust

When I went through the exercises in The Rust Programming Language book, I found the example about closures particularly interesting. The provided code implements a naive cache that avoids repeating an expensive calculation once its result is known. It looked like this:

language-rust

// Struct generic over T, which has to be
// a function or closure that takes a u32
// and returns a u32
pub struct Cacher<T: Fn(u32) -> u32> {
  pub calculation: T,
  pub value: Option<u32>,
}

// Implementation for Cacher, where we add our behaviour
impl<T: Fn(u32) -> u32> Cacher<T> {
  // Instantiate the struct with a function or closure,
  // which is stored in the `calculation` field
  pub fn new(calculation: T) -> Cacher<T> {
    Cacher {
      calculation,
      value: None,
    }
  }

  // Return the cached value, computing and storing it
  // on the first call
  pub fn value(&mut self, input: u32) -> u32 {
    match self.value {
      Some(value) => value,
      None => {
        let value = (self.calculation)(input);
        self.value = Some(value);

        value
      }
    }
  }
}

Cacher implementation with a closure and generics

Easy enough. If value is empty, the cacher calculates the result, stores it and returns it; otherwise, it returns the stored value. But there is a catch: if we call Cacher::value() twice with different inputs, the second call ignores its input and returns the result stored by the first call, which we can demonstrate with this test:

language-rust

#[cfg(test)]
mod tests {
  use super::*;

  #[test]
  fn call_with_different_values() {
    // Identity function: the expected result equals the input
    fn my_function(num: u32) -> u32 {
      num
    }

    let mut cacher = Cacher::new(my_function);

    let value_1 = cacher.value(1);
    let value_2 = cacher.value(2);

    assert_eq!(value_1, 1);
    assert_eq!(value_2, 2);
  }
}

Failing test for our cacher

The issue is that we store the result directly in Cacher.value, so once it has been calculated, every later call returns that same value, whatever the input. What we need is a structure that stores each result alongside the input that produced it, with fast lookups and inserts. We need a HashMap.
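As a quick refresher before changing the Cacher, here is a minimal sketch of the two HashMap operations we will rely on, insert and get. The build_cache function and the numbers in it are made up for illustration; they are not part of the book's example.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds a small map from inputs to results.
fn build_cache() -> HashMap<u32, u32> {
    let mut cache = HashMap::new();
    // Each input (key) gets its own stored result (value).
    cache.insert(1, 10);
    cache.insert(2, 20);
    cache
}

// Lookups take the key by reference and return an Option:
// Some(&value) when the key is present, None otherwise.
fn lookup(cache: &HashMap<u32, u32>, input: u32) -> Option<&u32> {
    cache.get(&input)
}
```

With this map, looking up 1 gives Some(&10), while looking up a key that was never inserted, such as 3, gives None.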

So instead of using value as an Option<u32> to store a single result, we can use a values field of type std::collections::HashMap, as follows:

language-rust

use std::collections::HashMap;

struct Cacher<T: Fn(u32) -> u32> {
  calculation: T,
  values: HashMap<u32, u32>,
}

Our HashMap will have u32 keys and u32 values. We can now instantiate it in our constructor:

language-rust

fn new(calculation: T) -> Cacher<T> {
  Cacher {
    calculation,
    values: HashMap::new(),
  }
}

And finally, we create our getter/setter method, with a match expression that decides what to do:

language-rust

fn value(&mut self, input: u32) -> u32 {
  match self.values.get(&input) {
    Some(value) => *value,
    None => {
      let value = (self.calculation)(input);
      self.values.insert(input, value);

      value
    }
  }
}

So if the input already exists in self.values (self.values.get(&input) returns Some), we retrieve the stored result; otherwise, we calculate it and insert the input and the result into self.values as a key/value pair. Notice that when a given value exists (Some(value)), we return it as *value: that's because HashMap::get does not hand back the stored u32 itself but a reference to it, an Option<&u32>, so we need to dereference it to get the actual value. The &input we pass in is only a borrowed key for the lookup. If we remove the dereference, the compiler reports a mismatched types error: expected `u32`, found `&u32`.

language-none

Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/lib.rs:18:28
   |
16 |     pub fn value(&mut self, input: u32) -> u32 {
   |                    --- expected `u32` because of return type
17 |         match self.values.get(&input) {
18 |             Some(value) => value,
   |                            ^^^^^ expected `u32`, found `&u32`
   |
help: consider dereferencing the borrow
   |
18 |             Some(value) => *value,
   |                            +
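The dereference is not the only way to get from Option<&u32> to a plain u32. The sketch below shows three equivalent spellings; the helper names are hypothetical, and it assumes the stored values are Copy, as u32 is.

```rust
use std::collections::HashMap;

// Hypothetical helpers: all three return the cached u32 by value.

fn with_deref(values: &HashMap<u32, u32>, input: u32) -> Option<u32> {
    match values.get(&input) {
        Some(value) => Some(*value), // explicit dereference, as in the article
        None => None,
    }
}

fn with_pattern(values: &HashMap<u32, u32>, input: u32) -> Option<u32> {
    match values.get(&input) {
        Some(&value) => Some(value), // the `&value` pattern destructures the reference
        None => None,
    }
}

fn with_copied(values: &HashMap<u32, u32>, input: u32) -> Option<u32> {
    values.get(&input).copied() // turns Option<&u32> into Option<u32>
}
```

Which one to use is mostly a matter of taste; the explicit *value keeps the reference visible, which is what makes it useful in this exercise.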

Finally, note the parentheses in (self.calculation)(input). They seem unnecessary, but they are actually required. Let's break things first and remove them:

language-none

Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `calculation` found for mutable reference `&mut Cacher<T>` in the current scope
  --> src/lib.rs:20:34
   |
20 |                 let value = self.calculation(input);
   |                                  ^^^^^^^^^^^ field, not a method
   |
help: to call the function stored in `calculation`, surround the field access with parentheses
   |
20 |                 let value = (self.calculation)(input);
   |                             +                +
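The parentheses matter because fields and methods live in separate namespaces, so a type can even have both under the same name. The sketch below uses a hypothetical Greeter type (not part of the Cacher) to show which one each syntax picks.

```rust
// Hypothetical type for illustration: it has a field and a method
// that share the name `greet`.
struct Greeter<F: Fn(&str) -> String> {
    greet: F,
}

impl<F: Fn(&str) -> String> Greeter<F> {
    fn greet(&self, name: &str) -> String {
        format!("method: {}", name)
    }
}

// Calls both and returns (result of the method, result of the field).
fn call_both(name: &str) -> (String, String) {
    let greeter = Greeter {
        greet: |n: &str| format!("field: {}", n),
    };
    // Method-call syntax only looks for a method named `greet`.
    let from_method = greeter.greet(name);
    // Parenthesized: a field access first, then a call on its value.
    let from_field = (greeter.greet)(name);
    (from_method, from_field)
}
```

Cacher has a calculation field but no calculation method, which is why the unparenthesized call fails there instead of silently picking the field.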

As with the previous error, Rust tells us both what went wrong and how to fix it. Without the parentheses, self.calculation(input) is parsed as a method call, and Cacher has no method named calculation; the compiler is smart enough to notice that we are trying to call the function stored in the field, hence the need for the parentheses. If we restore them and run our test again, this is what we get:

language-none

running 1 test
test tests::call_with_different_values ... ok

Fantastic, it is passing \(^O^)/.
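To put the pieces together, here is a sketch of the whole HashMap-based Cacher, this time fed with an actual closure rather than a plain function. The run_demo function, the offset variable and the Cell-based call counter are my own additions for illustration; the counter is only there to show that the closure runs once per distinct input.

```rust
use std::cell::Cell;
use std::collections::HashMap;

struct Cacher<T: Fn(u32) -> u32> {
    calculation: T,
    values: HashMap<u32, u32>,
}

impl<T: Fn(u32) -> u32> Cacher<T> {
    fn new(calculation: T) -> Cacher<T> {
        Cacher {
            calculation,
            values: HashMap::new(),
        }
    }

    // Return the cached result for `input`, computing it on a miss.
    fn value(&mut self, input: u32) -> u32 {
        match self.values.get(&input) {
            Some(value) => *value,
            None => {
                let value = (self.calculation)(input);
                self.values.insert(input, value);
                value
            }
        }
    }
}

// Hypothetical demo: returns the three results plus the number of
// times the closure actually ran.
fn run_demo() -> (u32, u32, u32, u32) {
    let calls = Cell::new(0u32); // interior mutability keeps the closure `Fn`
    let offset: u32 = 10; // captured from the environment by the closure
    let mut cacher = Cacher::new(|num| {
        calls.set(calls.get() + 1);
        num + offset
    });

    let first = cacher.value(1);
    let second = cacher.value(2);
    let repeat = cacher.value(1); // served from the map, closure not called

    (first, second, repeat, calls.get())
}
```

Calling run_demo() returns (11, 12, 11, 2): three lookups, but only two runs of the closure, because the second request for input 1 is answered from the map.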
⁠
Of course this is something really simple, but I like the example because it has a reference and a dereference, a structure from std, a closure, a generic with a trait bound and a match expression. You can't ask for more!
⁠


April 28, 2022