Java Regex file extension

Solution 1

You can use the following idiom to match both the path plus file name and the gzip extension in one go:

String[] inputs = {
        "/path/to/foo.txt.tar.gz", 
        "/path/to/bar.txt.gz",
        "/path/to/nope.txt"
 };
//                           ┌ group 1: any character reluctantly quantified
//                           |    ┌ group 2
//                           |    | ┌ optional ".tar"
//                           |    | |       ┌ compulsory ".gz"
//                           |    | |       |     ┌ end of input
Pattern p = Pattern.compile("(.+?)((\\.tar)?\\.gz)$");
for (String s: inputs) {
    Matcher m = p.matcher(s);
    if (m.find()) {
        System.out.printf("Found: %s --> %s %n", m.group(1), m.group(2));
    }
}

Output

Found: /path/to/foo.txt --> .tar.gz 
Found: /path/to/bar.txt --> .gz 

Solution 2

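You need to make the part that matches the file name reluctant, i.e. change (.+) to (.+?), and replace the character class [\\.tar] (which matches a single character out of ".", "t", "a", "r") with a group, (\\.tar):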

String rgx = "^(.+?)(\\.tar)?\\.gz";
//              ^^^

Now you get:

Matcher m = Pattern.compile(rgx).matcher(path);           
if(m.find()){
   System.out.println(m.group(1));   //   /path/to/file.txt
}
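To make the effect of each change visible, here is a minimal sketch that runs three variants of the pattern against the path from the question. The class name GzipBaseName and the helper firstGroup are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GzipBaseName {
    // Hypothetical helper: returns group 1 of the first match, or null if there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "/path/to/file.txt.tar.gz";

        // Greedy (.+) with a character class: ".tar" is swallowed by group 1.
        System.out.println(firstGroup("^(.+)[\\.tar]?\\.gz$", path));   // /path/to/file.txt.tar

        // Reluctant (.+?) but still a character class: the class eats only the final "r".
        System.out.println(firstGroup("^(.+?)[\\.tar]?\\.gz$", path));  // /path/to/file.txt.ta

        // Reluctant (.+?) with an optional group: ".tar" is excluded as a whole.
        System.out.println(firstGroup("^(.+?)(\\.tar)?\\.gz$", path));  // /path/to/file.txt
    }
}
```

The second variant shows why the reluctant quantifier alone is not enough: [\\.tar]? can consume at most one character, so the rest of ".tar" has to end up in group 1.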

Solution 3

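Use a regex based on capturing groups. The file-name group must be reluctant, (.+?), so that an optional .tar is not absorbed into it:

^(.+)/(.+?)(?:\\.tar)?\\.gz$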

Then get the path from group 1 and the file name from group 2.
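As a rough sketch of how the two groups might be read in Java: the class name SplitGzipPath and the method split are hypothetical, and the pattern below uses a reluctant (.+?) for the file name so that an optional .tar stays out of group 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitGzipPath {
    // Group 1: directory (greedy, so it runs up to the last "/").
    // Group 2: file name without ".gz" or ".tar.gz" (reluctant).
    private static final Pattern GZIP_PATH =
            Pattern.compile("^(.+)/(.+?)(?:\\.tar)?\\.gz$");

    // Hypothetical helper: returns {directory, fileName}, or null if the input
    // does not end in ".gz" / ".tar.gz" or contains no "/" after a leading part.
    static String[] split(String input) {
        Matcher m = GZIP_PATH.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/path/to/foo.txt.tar.gz",
            "/path/to/bar.txt.gz",
            "/path/to/nope.txt"
        };
        for (String s : inputs) {
            String[] parts = split(s);
            if (parts != null) {
                System.out.printf("path=%s file=%s%n", parts[0], parts[1]);
            }
        }
    }
}
```

For the first two inputs this yields the path /path/to with the file names foo.txt and bar.txt; the third input does not match and is skipped.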

Author: Giovanni

Updated on June 04, 2022

Comments

  • Giovanni almost 2 years

    I have to check if a file name ends with a gzip extension. In particular I'm looking for two extensions: ".tar.gz" and ".gz". I would like to capture the file name (and path) as a group using a single regular expression excluding the gzip extension if any. I tested the following regular expressions on this example path

    String path = "/path/to/file.txt.tar.gz";
    
    1. Expression 1:

      String rgx = "(.+)(?=([\\.tar]?\\.gz)$)";
      
    2. Expression 2:

      String rgx = "^(.+)[\\.tar]?\\.gz$";
      

    Extracting group 1 in this way:

    Matcher m = Pattern.compile(rgx).matcher(path);           
    if(m.find()){
       System.out.println(m.group(1));
    }
    

    Both regular expressions give me the same result: /path/to/file.txt.tar and not /path/to/file.txt. Any help will be appreciated.

    Thanks in advance