How to identify a file's type even when the file extension has been changed?

Solution 1

File-type identification tools rely on some combination of file structure, magic numbers, metadata, strings and regular expressions, heuristics, and statistical analysis. Whichever techniques a tool uses, it will only be as good as the database of rules behind it.
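To make the magic-number idea concrete, here is a minimal sketch in plain Java. The class name MagicNumberSniffer and its five-entry rule table are assumptions made up for illustration; real tools ship far larger rule databases. It compares the first bytes of a file against a few well-known signatures and ignores the file name entirely.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library: a tiny magic-number matcher.
class MagicNumberSniffer {

    /** One illustrative rule: expected bytes at a fixed offset map to a MIME type. */
    private static final class Rule {
        final int offset;
        final byte[] magic;
        final String mimeType;

        Rule(int offset, int[] magic, String mimeType) {
            this.offset = offset;
            this.magic = new byte[magic.length];
            for (int i = 0; i < magic.length; i++) {
                this.magic[i] = (byte) magic[i];
            }
            this.mimeType = mimeType;
        }

        boolean matches(byte[] header, int length) {
            if (length < offset + magic.length) {
                return false; // not enough bytes were read to compare
            }
            for (int i = 0; i < magic.length; i++) {
                if (header[offset + i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    // Deliberately small table of well-known signatures (an assumption of this sketch).
    private static final List<Rule> RULES = Arrays.asList(
        new Rule(0, new int[] {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}, "image/png"),
        new Rule(0, new int[] {0xFF, 0xD8, 0xFF}, "image/jpeg"),
        new Rule(0, new int[] {'%', 'P', 'D', 'F', '-'}, "application/pdf"),
        new Rule(0, new int[] {'P', 'K', 0x03, 0x04}, "application/zip"),
        // "ftyp" at offset 4 marks ISO base media files (MP4 and relatives);
        // reported as video/mp4 here for simplicity.
        new Rule(4, new int[] {'f', 't', 'y', 'p'}, "video/mp4")
    );

    /** Returns the MIME type of the first matching rule, or a generic fallback. */
    static String detect(byte[] header, int length) {
        for (Rule rule : RULES) {
            if (rule.matches(header, length)) {
                return rule.mimeType;
            }
        }
        return "application/octet-stream";
    }

    /** Reads up to 16 leading bytes of the file and matches them against the rules. */
    static String detect(Path file) throws IOException {
        byte[] header = new byte[16];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) > 0) {
                total += n;
            }
        }
        return detect(header, total);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MagicNumberSniffer <file>");
            return;
        }
        System.out.println(detect(Paths.get(args[0])));
    }
}
```

Because only the leading bytes are inspected, an MP4 file renamed to myVideo.txt should still be reported as video/mp4. Plain text has no signature, so this sketch falls back to application/octet-stream for it. Filling that gap is where the heuristics and statistical analysis mentioned above come in.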

Try DROID (Digital Record Object IDentification), a tool for identifying file types. It is written in Java and BSD-licensed. It is a free project of The National Archives (UK) and is unrelated to Android. The source is available on GitHub and SourceForge, and the DROID documentation is good.

See also the Darwinsys file command and its underlying library, libmagic.

Solution 2

One of the best libraries for this is Apache Tika. It does not just read the file's header; it can also analyze the content to detect the file type. Using Tika is very simple. Here's an example of detecting a file's type:

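import java.io.IOException;
import java.net.URL;
import org.apache.tika.Tika; // Tika facade class

public class TestTika {

    // new URL(...) and detect(URL) can both throw IOException
    public static void main(String[] args) throws IOException {
        Tika tika = new Tika();
        // detect() returns the detected media (MIME) type as a string
        String fileType = tika.detect(new URL("http://example.com/someFile.jpg"));
        System.out.println(fileType);
    }

}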
Author by

Maximin

Updated on June 05, 2022

Comments

  • Maximin, almost 2 years

    Files are categorized by file extension. So my question is: how can I identify the file type even if the file extension has been changed?

    For example, I have a video file named myVideo.mp4, and I have renamed it to myVideo.txt. If I double-click it, the default text editor opens the file but does not show the actual content. But if I open myVideo.txt in a video player, the video plays without any problem.

    I was thinking of developing an application that determines the type of a file without checking the file extension and suggests software for opening it. I would like to develop the application in Java.
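    The "suggest the software" half of that idea could be sketched as a lookup from the detected media type to a kind of application. Everything below is an assumption for illustration: the class name OpenerSuggester, the table entries, and the premise that some detector (for example Tika, as in Solution 2) has already produced a MIME type string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps an already-detected MIME type to a kind of application.
class OpenerSuggester {

    // Illustrative table; keys are MIME type prefixes, checked in insertion order.
    private static final Map<String, String> SUGGESTIONS = new LinkedHashMap<>();
    static {
        SUGGESTIONS.put("video/", "a video player");
        SUGGESTIONS.put("audio/", "an audio player");
        SUGGESTIONS.put("image/", "an image viewer");
        SUGGESTIONS.put("text/", "a text editor");
        SUGGESTIONS.put("application/pdf", "a PDF reader");
        SUGGESTIONS.put("application/zip", "an archive manager");
    }

    /** Returns a suggestion for the given MIME type, or a fallback message. */
    static String suggest(String mimeType) {
        for (Map.Entry<String, String> entry : SUGGESTIONS.entrySet()) {
            if (mimeType.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "no suggestion (unknown type)";
    }

    public static void main(String[] args) {
        // The detected type is hard-coded here; a real application would
        // obtain it from a content-based detector rather than the file name.
        String fileName = "myVideo.txt";
        String detectedType = "video/mp4";
        System.out.println(fileName + " looks like " + detectedType
                + "; try opening it with " + suggest(detectedType) + ".");
    }
}
```

    Keeping detection and suggestion separate means the lookup table can grow, or be replaced by the operating system's own file associations, without touching the detection code.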