Build HashSet from a vector in Rust

24,857

Solution 1

Because the operation does not need to consume the vector¹, I think it should not consume it. Consuming it only forces extra copying somewhere else in the program:

use std::collections::HashSet;
use std::iter::FromIterator;

fn hashset(data: &[u8]) -> HashSet<u8> {
    HashSet::from_iter(data.iter().cloned())
}

Call it like hashset(&v), where v is a Vec<u8> or anything else that coerces to a slice.

There are of course other ways to write this, such as making it generic over the element type, but this answer sticks to the one point I wanted to make.

¹This relies on the element type u8 being Copy, i.e. it does not have ownership semantics.
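
As a rough sketch of the generic variant mentioned above (the name hashset_generic and its trait bounds are my own choices, not part of the original answer), the same borrowing approach could be written for any element type that is Clone, Eq and Hash:

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical generic variant of `hashset`: accepts any element type that
// can be cloned, compared for equality, and hashed, not only u8.
fn hashset_generic<T>(data: &[T]) -> HashSet<T>
where
    T: Clone + Eq + Hash,
{
    // `cloned()` turns the `&T` items of the slice iterator into owned `T`s;
    // `collect()` then builds the set through `FromIterator`.
    data.iter().cloned().collect()
}
```

Calling hashset_generic(&v) leaves v usable afterwards. For element types that are not Copy, each element is cloned, so the footnote's reasoning about cheap copies no longer applies automatically.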

Solution 2

The following should work nicely; it fulfills your requirements:

use std::collections::HashSet;
use std::iter::FromIterator;

fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
    HashSet::from_iter(vec)
}

from_iter() works on types implementing IntoIterator, so a Vec argument is sufficient.

Additional remarks:

  • you don't need an explicit return; omitting the semicolon after the last expression in the function body returns its value

  • I'm not sure which version of Rust you are using, but on current stable (1.12) to_iter() doesn't exist
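
To illustrate why the item type of the iterator matters here, the following sketch (function names owned_set and borrowed_set are made up for this example) contrasts passing the Vec itself with passing vec.iter(). The second form produces a set of references, which appears to be what the "expected u8, found &u8" error in the question is pointing at:

```rust
use std::collections::HashSet;
use std::iter::FromIterator;

// Iterating the Vec by value yields `u8` items, so the set owns its elements.
fn owned_set(vec: Vec<u8>) -> HashSet<u8> {
    HashSet::from_iter(vec)
}

// Iterating through `iter()` yields `&u8` items, so the resulting set holds
// references that borrow from the slice.
fn borrowed_set(data: &[u8]) -> HashSet<&u8> {
    HashSet::from_iter(data.iter())
}
```

Both functions compile because their return types match the item types of the iterators they are given. Declaring HashSet<u8> as the return type of the second one would reproduce the mismatch.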

Solution 3

Moving data ownership

use std::collections::HashSet;

let vec: Vec<usize> = vec![1, 2, 3, 4];
let hash_set: HashSet<usize> = vec.into_iter().collect();

Cloning data

use std::collections::HashSet;

let vec: Vec<usize> = vec![1, 2, 3, 4];
let hash_set: HashSet<usize> = vec.iter().cloned().collect();
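
The difference between the two variants is easier to see with an element type that is not Copy. This is a minimal sketch using String elements; the function names into_set and cloned_set are hypothetical and chosen only for this illustration:

```rust
use std::collections::HashSet;

// Moves every String out of the Vec: no string data is cloned, and the
// caller gives up the Vec.
fn into_set(words: Vec<String>) -> HashSet<String> {
    words.into_iter().collect()
}

// Clones every String: the caller keeps the original data, at the cost of
// one clone per element.
fn cloned_set(words: &[String]) -> HashSet<String> {
    words.iter().cloned().collect()
}
```

With usize or u8 elements, as in the snippets above, cloning an element is a plain copy, so the main practical difference is whether the original vector stays available afterwards.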
Author: Jared Beck

Updated on September 25, 2020

Comments

  • Jared Beck
    Jared Beck over 3 years

    I want to build a HashSet<u8> from a Vec<u8>. I'd like to do this

    1. in one line of code,
    2. copying the data only once,
    3. using only 2n memory,

    but the only thing I can get to compile is this piece of .. junk, which I think copies the data twice and uses 3n memory.

    fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
        let mut victim = vec.clone();
        let x: HashSet<u8> = victim.drain(..).collect();
        return x;
    }
    

    I was hoping to write something simple, like this:

    fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
        return HashSet::from_iter(vec.iter());
    }
    

    but that won't compile:

    error[E0308]: mismatched types
     --> <anon>:5:12
      |
    5 |     return HashSet::from_iter(vec.iter());
      |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected u8, found &u8
      |
      = note: expected type `std::collections::HashSet<u8>`
      = note:    found type `std::collections::HashSet<&u8, _>`
    

    .. and I don't really understand the error message, probably because I need to RTFM.

  • Shepmaster
    Shepmaster about 5 years
    collect uses FromIterator — this is the same as the above two answers except more verbose.
  • Fuji
    Fuji about 5 years
    Fixed the type error. This code does not require repeating HashSet in the conversion code.
  • Shepmaster
    Shepmaster about 5 years
    There's no repeating of HashSet in the original answers either.
    Their code: let hash_set = HashSet::<_>::from_iter(vec);
    vs yours: let hash_set: HashSet<usize> = vec.into_iter().collect()
  • Fuji
    Fuji about 5 years
    You removed the type in your first example to hide the repeat of HashSet.
    code 1: let hash_set: HashSet<usize> = HashSet::<_>::from_iter(vec);
    code 2: let hash_set: HashSet<usize> = vec.into_iter().collect()
    Example restored.
  • User
    User almost 4 years
    This doesn't compile anymore.