Extracting a file extension from a given path in Rust idiomatically

Solution 1

In idiomatic Rust the return type of a function that can fail should be an Option or a Result. In general, functions should also accept slices instead of Strings and only create a new String where necessary. This reduces excessive copying and heap allocations.

You can use the provided extension() method and then convert the resulting OsStr to a &str:

use std::path::Path;
use std::ffi::OsStr;

fn get_extension_from_filename(filename: &str) -> Option<&str> {
    Path::new(filename)
        .extension()
        .and_then(OsStr::to_str)
}

fn main() {
    assert_eq!(get_extension_from_filename("abc.gz"), Some("gz"));
}

Using and_then is convenient here because you don't have to unwrap the Option<&OsStr> returned by extension() and handle the None case yourself before calling to_str. I could also have used a closure, |s| s.to_str(), instead of OsStr::to_str; which of the two is more idiomatic is largely a matter of preference.
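
To make the comparison concrete, here is a minimal sketch of the same lookup written two other ways: once with a closure, and once with an explicit match that spells out what and_then does. The function names extension_with_closure and extension_with_match are made up for illustration.

```rust
use std::path::Path;

// Closure form: equivalent to passing OsStr::to_str as a function path.
fn extension_with_closure(filename: &str) -> Option<&str> {
    Path::new(filename).extension().and_then(|s| s.to_str())
}

// Explicit match: the None case is forwarded by hand, and the Some case
// calls to_str, which itself returns an Option<&str>.
fn extension_with_match(filename: &str) -> Option<&str> {
    match Path::new(filename).extension() {
        Some(os_str) => os_str.to_str(),
        None => None,
    }
}
```

Both functions should return Some("gz") for "abc.gz" and None for a name without a dot, such as "abc".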

Notice that both the argument &str and the return value are references to the original string slice created for the assertion. The returned slice cannot outlive the original slice that it is referencing, so you may need to create an owned String from this result if you need it to last longer.
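
If the extension does need to outlive the input, one option is to return an owned String. The following is a sketch under that assumption; get_owned_extension and extension_outliving_input are hypothetical names, not part of the standard library.

```rust
use std::ffi::OsStr;
use std::path::Path;

// Hypothetical variant that copies the extension into an owned String,
// so the result no longer borrows from `filename`.
fn get_owned_extension(filename: &str) -> Option<String> {
    Path::new(filename)
        .extension()
        .and_then(OsStr::to_str)
        .map(str::to_owned)
}

// The input String is dropped at the end of this function,
// but the returned extension is independent of it.
fn extension_outliving_input() -> Option<String> {
    let temporary = String::from("abc.gz");
    get_owned_extension(&temporary)
}
```

The cost is one heap allocation per call, so the borrowing version is still preferable when the caller only needs to inspect the extension briefly.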

Solution 2

What's more idiomatic than using Rust's built-in method for it?

Path::new(&filename).extension()
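
Note that extension() returns an Option<&OsStr>, not a string slice. If the goal is only to check for a known extension, the result can be compared against an OsStr directly, with no UTF-8 conversion. A minimal sketch, where has_extension is a hypothetical helper:

```rust
use std::ffi::OsStr;
use std::path::Path;

// Hypothetical helper: compares the extension as an OsStr,
// so no conversion to &str is needed.
fn has_extension(filename: &str, expected: &str) -> bool {
    Path::new(filename).extension() == Some(OsStr::new(expected))
}
```

Only the part after the final dot counts, so "abc.tar.gz" has the extension "gz", and a name such as ".bashrc" that merely starts with a dot has no extension at all.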

Author: ansrivas

Updated on September 15, 2022

Comments

  • ansrivas over 1 year

    I am trying to extract the extension of a file from a given String path.

    The following piece of code works, but I was wondering if there is a cleaner and more idiomatic Rust way to achieve this:

    use std::path::Path;
    
    fn main() {
    
        fn get_extension_from_filename(filename: String) -> String {
    
            //Change it to a canonical file path.
            let path = Path::new(&filename).canonicalize().expect(
                "Expecting an existing filename",
            );
    
            let filepath = path.to_str();
            let name = filepath.unwrap().split('/');
            let names: Vec<&str> = name.collect();
            let extension = names.last().expect("File extension can not be read.");
            let extens: Vec<&str> = extension.split(".").collect();
    
            extens[1..(extens.len())].join(".").to_string()
        }
    
        assert_eq!(get_extension_from_filename("abc.tar.gz".to_string()) ,"tar.gz" );
        assert_eq!(get_extension_from_filename("abc..gz".to_string()) ,".gz" );
        assert_eq!(get_extension_from_filename("abc.gz".to_string()) , "gz");
    
    }
    
    • interjay over 6 years
      So you want to get everything after the leftmost dot? That would give the wrong result for e.g. "Version 1.2.txt"
    • ansrivas over 6 years
      @interjay, yes, and that is why I maintain a hashmap of allowed extensions, so 2.txt would panic. My intention is to extract a possible extension in a generic way and compare it against the allowed-extensions hashmap.
    • loganfsmyth over 6 years
      .tar.gz isn't a standalone extension, it's a .gz file, and when uncompressed you get a .tar file. You should follow the same process. Extract the extension and the non-extension parts, and recursively process the non-extension part pulling extensions off of it.
  • ansrivas over 6 years
    Thanks for your quick response, but I think this example would fail: assert_eq!(get_extension_from_filename("abc.tar.gz".to_string()), "tar.gz");
  • Alexander over 6 years
    @Sokio Oh, I missed that.
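
The suggestion in the comments to pull extensions off one at a time can be sketched as a loop over extension() and file_stem(). This is only one possible reading of that advice; all_extensions is a hypothetical helper, and it returns the extensions from right to left.

```rust
use std::path::Path;

// Hypothetical helper: repeatedly takes the last extension, then continues
// with the file stem, until no extension is left.
fn all_extensions(filename: &str) -> Vec<&str> {
    let mut extensions = Vec::new();
    let mut rest = Path::new(filename);
    while let Some(ext) = rest.extension().and_then(|e| e.to_str()) {
        extensions.push(ext);
        match rest.file_stem() {
            // The stem is the file name without its last extension.
            Some(stem) => rest = Path::new(stem),
            None => break,
        }
    }
    extensions
}
```

For "abc.tar.gz" this should yield "gz" followed by "tar". A name like "Version 1.2.txt" yields "txt" and then "2", so a caller with a set of allowed extensions would likely want to stop at the first entry that is not in that set.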