Java file equals


Solution 1

No, that's not the case. equals compares the pathnames without resolving "." segments, so the two stay different even as absolute paths. In your case above they are something like:

some-project\.\hello.txt
some-project\hello.txt

So they are naturally different.

It seems like I have to write my own equals method to support it, right?

Probably yes. But first you have to decide what you want to compare. Only the pathnames? If so, compare their canonical paths like this:

f1.getCanonicalPath().equals(f2.getCanonicalPath())

But if you want to compare the contents of two different files, then yes, you should write your own method, or simply copy one from somewhere on the internet.
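As a minimal sketch of the pathname comparison described above: samePath is a hypothetical helper name (it is not part of java.io.File), and the hello.txt paths are the ones from the question. The files do not need to exist for the canonical path to be computed.

```java
import java.io.File;
import java.io.IOException;

class CanonicalPathEquals {

    // Hypothetical helper (not part of java.io.File): true when both
    // pathnames resolve to the same canonical path.
    static boolean samePath(File a, File b) throws IOException {
        return a.getCanonicalPath().equals(b.getCanonicalPath());
    }

    public static void main(String[] args) throws IOException {
        File f1 = new File("./hello.txt");
        File f2 = new File("hello.txt");

        // Plain equals compares the pathnames as written.
        System.out.println("f1.equals(f2): " + f1.equals(f2));
        // Canonical paths are absolute and have "." and ".." resolved.
        System.out.println("samePath(f1, f2): " + samePath(f1, f2));
    }
}
```

Keeping the comparison in a helper leaves File.equals untouched, so the checked IOException stays visible at the call site.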

Solution 2

To compare with equals reliably, call getCanonicalFile() on both files first, e.g.:

import java.io.File;
import java.io.IOException;

public class FileEquals
{
    public static void main(String[] args) throws IOException
    {
        // getCanonicalFile() resolves "." so both files get the same path
        File f1 = new File("./hello.txt").getCanonicalFile();
        File f2 = new File("hello.txt").getCanonicalFile();
        System.out.println("f1: " + f1.getAbsolutePath());
        System.out.println("f2: " + f2.getAbsolutePath());
        System.out.println("f1.equals(f2) returns " + f1.equals(f2));
        System.out.println("f1.compareTo(f2) returns " + f1.compareTo(f2));
    }
}

This prints true for equals (and 0 for compareTo). Note that getCanonicalFile() may throw an IOException, so I added that to the method signature.
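The asker mentions in the comments wanting something like fileList.contains(file), which calls equals internally. One possible approach, sketched here under the assumption that every entry is canonicalized before it goes into the list, is shown below. The canonicalize helper and the file names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class CanonicalFileList {

    // Hypothetical helper: canonicalize every entry once, up front.
    static List<File> canonicalize(List<File> files) throws IOException {
        List<File> result = new ArrayList<>();
        for (File f : files) {
            result.add(f.getCanonicalFile());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<File> raw = new ArrayList<>();
        raw.add(new File("hello.txt"));
        raw.add(new File("other.txt"));
        List<File> fileList = canonicalize(raw);

        File query = new File("./hello.txt");
        // contains() relies on File.equals, so the raw list misses the query.
        System.out.println("raw.contains(query): " + raw.contains(query));
        // With canonical files on both sides, the lookup can succeed.
        System.out.println("fileList.contains(canonical query): "
                + fileList.contains(query.getCanonicalFile()));
    }
}
```

The query has to be canonicalized as well; mixing canonical and non-canonical File objects in one collection brings the original problem back.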

Solution 3

If you only want to compare the CONTENTS of each file, you could read the contents into a byte array like this:

byte[] f1 = Files.readAllBytes(file1);
byte[] f2 = Files.readAllBytes(file2);

And then compare exactly what you want from there.

Note that this method has only existed since Java 7. For older versions, Guava and Apache Commons have similar methods, with different names and details.

Edit: OR a better option (especially if you're comparing large files) might be to simply compare byte by byte rather than loading the entire file into memory, like this:

FileInputStream f1 = new FileInputStream(file1);
DataInputStream d1 = new DataInputStream(f1);
FileInputStream f2 = new FileInputStream(file2);
DataInputStream d2 = new DataInputStream(f2);

byte b1 = d1.readByte();
byte b2 = d2.readByte();

And then compare from there.
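A minimal sketch of what "compare from there" could look like, assuming file1 and file2 are java.io.File objects. The sameContent helper is hypothetical. It uses buffered streams and read() rather than DataInputStream.readByte(), because read() signals end of file with -1 instead of throwing EOFException. It also checks the lengths first, as suggested in the comments.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class StreamCompare {

    // Hypothetical helper: true when both files hold identical bytes.
    static boolean sameContent(File file1, File file2) throws IOException {
        // Different lengths can never match, so skip the read entirely.
        if (file1.length() != file2.length()) {
            return false;
        }
        try (InputStream in1 = new BufferedInputStream(new FileInputStream(file1));
             InputStream in2 = new BufferedInputStream(new FileInputStream(file2))) {
            int b1;
            // read() returns -1 at end of stream.
            while ((b1 = in1.read()) != -1) {
                if (b1 != in2.read()) {
                    return false; // stop at the first differing byte
                }
            }
            // in1 is exhausted; equal only if in2 is exhausted too.
            return in2.read() == -1;
        }
    }

    public static void main(String[] args) throws IOException {
        File a = File.createTempFile("cmp", ".txt");
        File b = File.createTempFile("cmp", ".txt");
        Files.write(a.toPath(), "hello".getBytes(StandardCharsets.UTF_8));
        Files.write(b.toPath(), "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("same content? " + sameContent(a, b));
        a.delete();
        b.delete();
    }
}
```

Only two small buffers are held in memory at any time, and try-with-resources closes both streams even when the comparison returns early.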

Solution 4

A quick way I found to compare two files is below.

This is just a suggested workaround.

I'm not sure about the performance (what if the files are 10 GB each?), since both approaches load the whole file into memory.

    File file = new File("/tmp/file.txt");
    File secondFile = new File("/tmp/secondFile.txt");

    // Bytes diff
    byte[] b1 = Files.readAllBytes(file.toPath());
    byte[] b2 = Files.readAllBytes(secondFile.toPath());

    boolean equals = Arrays.equals(b1, b2);

    System.out.println("the same? " + equals);

    // List diff: List.equals checks the same lines in the same order
    // (containsAll would also accept reordered or missing lines)
    List<String> c1 = Files.readAllLines(file.toPath());
    List<String> c2 = Files.readAllLines(secondFile.toPath());

    boolean sameLines = c1.equals(c2);
    System.out.println("the same? " + sameLines);

EDIT

But still, the diff utility on a Unix system would be much quicker and more verbose. It depends on what you need to compare.

Solution 5

Here is the JDK implementation of both methods, equals and compareTo:

/**
 * Tests this abstract pathname for equality with the given object.
 * Returns <code>true</code> if and only if the argument is not
 * <code>null</code> and is an abstract pathname that denotes the same file
 * or directory as this abstract pathname.  Whether or not two abstract
 * pathnames are equal depends upon the underlying system.  On UNIX
 * systems, alphabetic case is significant in comparing pathnames; on Microsoft Windows
 * systems it is not.
 *
 * @param   obj   The object to be compared with this abstract pathname
 *
 * @return  <code>true</code> if and only if the objects are the same;
 *          <code>false</code> otherwise
 */
public boolean equals(Object obj) {
    if ((obj != null) && (obj instanceof File)) {
        return compareTo((File)obj) == 0;
    }
    return false;
}
/**
 * Compares two abstract pathnames lexicographically.  The ordering
 * defined by this method depends upon the underlying system.  On UNIX
 * systems, alphabetic case is significant in comparing pathnames; on Microsoft Windows
 * systems it is not.
 *
 * @param   pathname  The abstract pathname to be compared to this abstract
 *                    pathname
 *
 * @return  Zero if the argument is equal to this abstract pathname, a
 *          value less than zero if this abstract pathname is
 *          lexicographically less than the argument, or a value greater
 *          than zero if this abstract pathname is lexicographically
 *          greater than the argument
 *
 * @since   1.2
 */
public int compareTo(File pathname) {
    return fs.compare(this, pathname);
}
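A small check of how the two methods relate, using the hello.txt pathnames from the question. The agree helper is hypothetical; it only restates the rule visible in the source above, that equals is compareTo(...) == 0.

```java
import java.io.File;

class EqualsVsCompareTo {

    // Hypothetical helper: checks that equals and compareTo agree for a pair.
    static boolean agree(File a, File b) {
        return a.equals(b) == (a.compareTo(b) == 0);
    }

    public static void main(String[] args) {
        File f1 = new File("./hello.txt");
        File f2 = new File("hello.txt");

        // getPath() returns the pathname string the File holds
        // (after the platform's separator normalization).
        System.out.println("f1 path: " + f1.getPath());
        System.out.println("f2 path: " + f2.getPath());

        // A nonzero compareTo result means equals returns false.
        System.out.println("compareTo: " + f1.compareTo(f2));
        System.out.println("equals:    " + f1.equals(f2));
        System.out.println("agree:     " + agree(f1, f2));
    }
}
```

Neither method touches the file system here; both work purely on the pathname strings.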

Author: aandeers

Updated on September 17, 2020

Comments

  • aandeers
    aandeers over 3 years

    I don't know about you guys, but I at least expected that f1 would be equal to f2 in the code below, and apparently that's not the case! What are your thoughts on this? It seems like I have to write my own equals method to support it, right?

    import java.io.*;
    
    public class FileEquals
    {
        public static void main(String[] args)
        {
            File f1 = new File("./hello.txt");
            File f2 = new File("hello.txt");
            System.out.println("f1: " + f1.getName());
            System.out.println("f2: " + f2.getName());
            System.out.println("f1.equals(f2) returns " + f1.equals(f2));
            System.out.println("f1.compareTo(f2) returns " + f1.compareTo(f2));
        }
    }
    
    • Luciano
      Luciano over 12 years
      The same happens with Java 7's Path class. But there exist methods like Path.normalize() or Files.isSameFile()
    • bluenote10
      bluenote10 almost 8 years
      You could save all viewers of this question some time by showing the actual output. I was expecting that equals and compareTo had contradicting results. This is not the case: equals returns false and compareTo returns -58, meaning lexicographically "less than". @Luciano: Note that Files.isSameFile would in this case try to open the files since the paths are not equal and could fail with NoSuchFileException.
  • aandeers
    aandeers over 12 years
    I actually want to do something like "fileList.contains(file)" and this method calls the equals method.
  • user949300
    user949300 over 12 years
    +1 Nice post - I learned something today (I haven't used Java 7 yet, glad to see they added a Files utility)
  • Luciano
    Luciano over 12 years
    I would compare the files' size first, if it's available.
  • unbeli
    unbeli over 12 years
    it is an incredibly bad idea to compare files like that
  • user949300
    user949300 over 12 years
    @Luciano yes, testing file size first is a good idea. I don't know why size would not be available, but, if it weren't, then test (f1.length == f2.length)
  • user949300
    user949300 over 12 years
    @unbeli Please elaborate. I've used similar code in a lot of unit tests, where one file contains correct results and one file contains the results generated by the program/algorithm. That isn't what OP wants to do (as he has since elaborated) but Brian said CONTENTS and he even capitalized it.
  • Brian Snow
    Brian Snow over 12 years
    @user949300 Thanks for the addition to my post
  • Brian Snow
    Brian Snow over 12 years
    @unbeli I'm also hoping you can elaborate on your comment.
  • unbeli
    unbeli over 12 years
    @Brian Snow, think about this: if the first byte of these two files is different, why reading all of it? What if the files are large? Do you really need both files in memory?
  • user949300
    user949300 over 12 years
    @unbeli Now that Wikipedia is back up: typical HDD throughputs are, as I understand the article, 1000 Mbit per second, or ~100 MB per second. So, unless you have a performance requirement that this comparison be done in less than a couple of seconds, it is just fine for files up to 100 MB.
  • user949300
    user949300 over 12 years
    @unbeli Also, I was looking at files in unit tests where you expect them to be equal. If the files are unlikely to be equal, and they are large, then you are absolutely right that this is a bad idea.
  • unbeli
    unbeli over 12 years
    @user949300 it does not matter if files are expected to be equal or not. It also does not matter what the HDD throughput is (and no, you got it wrong).
  • unbeli
    unbeli over 12 years
    @Brian Snow, what you wrote is not my idea. Please remove that claim, thank you.
  • user949300
    user949300 over 12 years
    @unbeli If files are expected to be equal 99% of the time, 99% of the time you have to read every byte.
  • unbeli
    unbeli over 12 years
    @user949300 possibly, but you never have to keep both files in memory.
  • linjiejun
    linjiejun almost 5 years
    This answer confuses me. See the source of UnixFileSystem.java in the JDK: public int compare(File f1, File f2) { return f1.getPath().compareTo(f2.getPath()); } @G.Demecki I do not agree that equals compares absolute paths.