Java Regex file extension
Solution 1
You can use the following idiom to match both your path + file name and the gzip extension in one go:
String[] inputs = {
    "/path/to/foo.txt.tar.gz",
    "/path/to/bar.txt.gz",
    "/path/to/nope.txt"
};
//                           ┌ group 1: any character reluctantly quantified
//                           |    ┌ group 2
//                           |    |┌ optional ".tar"
//                           |    ||        ┌ compulsory ".gz"
//                           |    ||        |     ┌ end of input
Pattern p = Pattern.compile("(.+?)((\\.tar)?\\.gz)$");
for (String s : inputs) {
    Matcher m = p.matcher(s);
    if (m.find()) {
        System.out.printf("Found: %s --> %s %n", m.group(1), m.group(2));
    }
}
Output
Found: /path/to/foo.txt --> .tar.gz
Found: /path/to/bar.txt --> .gz
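If the goal is only to strip the extension, the same pattern can be wrapped in a small helper. The following is a minimal sketch; the class name GzipNames and the method stripGzipExtension are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
class GzipNames {
    // Same pattern as in the answer, compiled once and reused.
    private static final Pattern GZIP =
            Pattern.compile("(.+?)((\\.tar)?\\.gz)$");

    /** Returns the path without ".gz" / ".tar.gz", or empty if it has neither. */
    static Optional<String> stripGzipExtension(String path) {
        Matcher m = GZIP.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(stripGzipExtension("/path/to/foo.txt.tar.gz")); // Optional[/path/to/foo.txt]
        System.out.println(stripGzipExtension("/path/to/bar.txt.gz"));     // Optional[/path/to/bar.txt]
        System.out.println(stripGzipExtension("/path/to/nope.txt"));       // Optional.empty
    }
}
```

Returning an Optional makes the "no gzip extension" case explicit for the caller instead of relying on a null check.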
Solution 2
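You need to make the part that matches the file name reluctant, i.e. change (.+) to (.+?). Also note that [\\.tar] is a character class that matches a single character ('.', 't', 'a' or 'r'), so the optional ".tar" has to be a group instead; the trailing $ keeps the match anchored to the end of the name:
String rgx = "^(.+?)(\\.tar)?\\.gz$";
//              ^^^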
Now you get:
Matcher m = Pattern.compile(rgx).matcher(path);
if (m.find()) {
    System.out.println(m.group(1)); // /path/to/file.txt
}
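To see why both the quantifier and the way ".tar" is written matter, the variants can be compared side by side. This is a small sketch; the class QuantifierDemo and the helper firstGroup are hypothetical names introduced only for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class QuantifierDemo {
    /** Returns group 1 of the first match of regex in input, or null if there is no match. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "/path/to/file.txt.tar.gz";

        // Greedy group + character class (Expression 2 from the question):
        // (.+) swallows ".tar" before the optional part gets a chance.
        System.out.println(firstGroup("^(.+)[\\.tar]?\\.gz$", path));    // /path/to/file.txt.tar

        // Reluctant group, but still a character class:
        // [\\.tar]? consumes just one character, here the final 'r'.
        System.out.println(firstGroup("^(.+?)[\\.tar]?\\.gz$", path));   // /path/to/file.txt.ta

        // Reluctant group + real optional group: the intended result.
        System.out.println(firstGroup("^(.+?)(?:\\.tar)?\\.gz$", path)); // /path/to/file.txt
    }
}
```

The middle case shows that switching to a reluctant quantifier alone is not enough while ".tar" is still written as a character class.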
Solution 3
Use a regex with capturing groups. The file-name group must be reluctant, otherwise it swallows the optional ".tar":
^(.+)/(.+?)(?:\\.tar)?\\.gz$
And,
Get the path from group 1.
Get the file name from group 2.
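A minimal sketch of reading the two groups follows. It assumes the file-name group is written reluctantly, (.+?), so that an optional ".tar" stays out of group 2; the class PathSplit and the method split are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for illustration only.
class PathSplit {
    // Group 1: directory (greedy, so it runs up to the last '/').
    // Group 2: file name without the gzip extension (reluctant).
    private static final Pattern P =
            Pattern.compile("^(.+)/(.+?)(?:\\.tar)?\\.gz$");

    /** Returns {directory, fileName}, or null if the path has no gzip extension. */
    static String[] split(String path) {
        Matcher m = P.matcher(path);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("/path/to/file.txt.tar.gz");
        System.out.println(parts[0]); // /path/to
        System.out.println(parts[1]); // file.txt
    }
}
```

Note that this pattern requires at least one '/' in the input, so a bare file name such as "file.txt.gz" does not match.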
Giovanni
Updated on June 04, 2022
I have to check if a file name ends with a gzip extension. In particular I'm looking for two extensions: ".tar.gz" and ".gz". I would like to capture the file name (and path) as a group using a single regular expression, excluding the gzip extension if any. I tested the following regular expressions on this example path:
String path = "/path/to/file.txt.tar.gz";
Expression 1:
String rgx = "(.+)(?=([\\.tar]?\\.gz)$)";
Expression 2:
String rgx = "^(.+)[\\.tar]?\\.gz$";
Extracting group 1 in this way:
Matcher m = Pattern.compile(rgx).matcher(path);
if (m.find()) {
    System.out.println(m.group(1));
}
Both regular expressions give me the same result:
/path/to/file.txt.tar
and not /path/to/file.txt. Any help will be appreciated. Thanks in advance.