Why does Java automatically decode %2F in URI encoded filenames?
Solution 1
The new File(URI) constructor builds the file from the path returned by URI#getPath() rather than, as you expected, URI#getRawPath(). This looks like a feature "by design".
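To make the difference concrete, here is a minimal sketch that compares the two accessors on a file URI with an encoded slash. The class name RawPathDemo and the /tmp path are only illustrative assumptions, not part of the original question.

```java
import java.net.URI;

public class RawPathDemo {
    public static void main(String[] args) {
        // Hypothetical file URI whose last segment contains an encoded slash (%2F).
        URI uri = URI.create("file:/tmp/Top+1%2F2.pdf");

        // getPath() decodes percent-escapes, so %2F turns into a literal '/'.
        System.out.println(uri.getPath());    // /tmp/Top+1/2.pdf

        // getRawPath() returns the path exactly as it is written in the URI.
        System.out.println(uri.getRawPath()); // /tmp/Top+1%2F2.pdf
    }
}
```

Since the decoded form is what reaches the File, the encoded slash is indistinguishable from a real path separator by that point.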
You have 2 options:
- Run URLEncoder#encode() on fn twice (note: encode(), not encoder()).
- Use new File(String) instead.
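Both options can be sketched roughly as follows. The helper names encodeTwice and encodedFile, the class name, and the /tmp prefix are hypothetical and chosen only for illustration. The sketch uses the File(File, String) constructor, which, like File(String), takes the name literally.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

public class EncodedFileNameDemo {
    // Option 1 (hypothetical helper): encode twice, so that the single decoding
    // pass done by URI#getPath() leaves the once-encoded name behind.
    static String encodeTwice(String fn) throws UnsupportedEncodingException {
        return URLEncoder.encode(URLEncoder.encode(fn, "UTF-8"), "UTF-8");
    }

    // Option 2 (hypothetical helper): skip the URI entirely. File(File, String)
    // does no percent-decoding, so the encoded name is used as-is.
    static File encodedFile(File dir, String fn) throws UnsupportedEncodingException {
        return new File(dir, URLEncoder.encode(fn, "UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        String fn = "Top 1/2.pdf";

        String twice = encodeTwice(fn);
        System.out.println(twice);                                   // Top%2B1%252F2.pdf
        System.out.println(URI.create("file:/tmp/" + twice).getPath()); // /tmp/Top+1%2F2.pdf

        File dir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(encodedFile(dir, fn).getName());          // Top+1%2F2.pdf
    }
}
```

The second option is probably the simpler one when the file URI is not needed for anything else, because the name never passes through a decoding step.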
Solution 2
I think that @BalusC has nailed the direct problem in your code. I'd just like to point out some other issues.
The dir.toURI().toASCIIString() and URLEncoder.encode(fn, "UTF-8").toString() expressions actually do rather different things.
The first one encodes the URI as a string, applying the URI encoding rules according to the URI grammar. So, for example, a '/' in the path component will not be encoded, but a '/' in the query or fragment components will be encoded as %2F.
The second one encodes the fn String, applying the encoding rules without reference to the content of the string.
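A small, platform-independent sketch of the contrast follows. It builds the URI with the multi-argument URI constructor instead of File#toURI(), and the path /dir/a b/café and the class name are made-up examples.

```java
import java.net.URI;
import java.net.URLEncoder;

public class EncodingRulesDemo {
    public static void main(String[] args) throws Exception {
        // URI-aware encoding: the path is quoted according to the URI grammar,
        // so the '/' separators survive and the space becomes %20.
        URI uri = new URI("file", null, "/dir/a b/caf\u00e9", null);
        System.out.println(uri.toASCIIString());
        // file:/dir/a%20b/caf%C3%A9

        // Form encoding: every reserved character in the string is escaped,
        // including '/', and the space becomes '+'.
        System.out.println(URLEncoder.encode("a b/caf\u00e9", "UTF-8"));
        // a+b%2Fcaf%C3%A9
    }
}
```

Note that the same space comes out as %20 in one case and as + in the other, which is one reason the two results do not combine cleanly.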
The File(URI) constructor's mapping from a file URI to a File is system dependent and undocumented. I'm a bit surprised that it decodes the %2F, but it does what it does, and @BalusC explains why. The take-away is that it is potentially problematic to use a mechanism ("file:" URIs) that is explicitly system dependent.
Finally, it is wrong to combine those URI component strings like that. It should be either
URI uri = new URI(
    dir.toURI().toString() +
    URLEncoder.encode(fn, "UTF-8").toString());
or
URI uri = new URI(
    dir.toURI().toASCIIString() +
    URLEncoder.encode(fn, "ASCII").toString());
Lucas
Updated on June 04, 2022
I have a servlet that needs to write out files that have a user-configurable name. I am trying to use URI encoding to properly escape special characters, but the JRE appears to automatically convert encoded forward slashes (%2F) into path separators.
Example:
File dir = new File("C:\\Documents and Settings\\username\\temp");
String fn = "Top 1/2.pdf";
URI uri = new URI( dir.toURI().toASCIIString() + URLEncoder.encoder( fn, "ASCII" ).toString() );
File out = new File( uri );
System.out.println( dir.toURI().toASCIIString() );
System.out.println( URLEncoder.encode( fn, "ASCII" ).toString() );
System.out.println( uri.toASCIIString() );
System.out.println( out.toURI().toASCIIString() );
The output is:
file:/C:/Documents%20and%20Settings/username/temp/
Top+1%2F2.pdf
file:/C:/Documents%20and%20Settings/username/temp/Top+1%2F2.pdf
file:/C:/Documents%20and%20Settings/username/temp/Top+1/2.pdf
After the new File object is instantiated, the %2F sequence is automatically converted to a forward slash and I end up with an incorrect path. Does anybody know the proper way to approach this issue?
The core of the problem seems to be that uri.equals( new File(uri).toURI() ) == FALSE when there is a %2F in the URI.
I'm planning to just use the URLEncoded string verbatim rather than trying to use the File(uri) constructor.