HowTo extract MimeType from a byte[]
Solution 1
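import net.sf.jmimemagic.Magic;
import net.sf.jmimemagic.MagicMatch;
byte[] data = ...
// getMagicMatch throws checked exceptions (e.g. MagicMatchNotFoundException) when nothing matches
MagicMatch match = Magic.getMagicMatch(data);
String mimeType = match.getMimeType();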
Solution 2
I'm sure the library posted by @sfussenegger is the best solution, but I do it by hand with the following snippet, which I hope helps you.
// Enum constants only: each pairs a display name with one or more magic-number prefixes.
DESCONOCIDO("desconocido", new byte[][] {}),
PDF("PDF", new byte[][] { { 0x25, 0x50, 0x44, 0x46 } }),
// Matches JFIF files only; JPEGs with another marker (e.g. Exif, ff d8 ff e1) will not match.
JPG("JPG", new byte[][] { { (byte) 0xff, (byte) 0xd8, (byte) 0xff, (byte) 0xe0 } }),
RAR("RAR", new byte[][] { { 0x52, 0x61, 0x72, 0x21 } }),
GIF("GIF", new byte[][] { { 0x47, 0x49, 0x46, 0x38 } }),
PNG("PNG", new byte[][] { { (byte) 0x89, 0x50, 0x4e, 0x47 } }),
ZIP("ZIP", new byte[][] { { 0x50, 0x4b } }),
TIFF("TIFF", new byte[][] { { 0x49, 0x49 }, { 0x4D, 0x4D } }),
BMP("BMP", new byte[][] { { 0x42, 0x4d } });
Regards.
PS: The best part is that it doesn't have any dependencies.
PS2: No warranty about its correctness!
PS3: "desconocido" means "unknown" in Spanish.
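For anyone who wants to try the hand-rolled approach, here is a minimal sketch of how those constants might sit inside a complete enum with a lookup method. The enum name FileSignature, its fields, and the detect method are assumptions made for illustration; the original snippet only shows the constants, so the author's real class may well look different.

```java
// Hypothetical enum name; the constants and byte signatures mirror the snippet above.
enum FileSignature {
    DESCONOCIDO("desconocido", new byte[][] {}),
    PDF("PDF", new byte[][] { { 0x25, 0x50, 0x44, 0x46 } }),
    JPG("JPG", new byte[][] { { (byte) 0xff, (byte) 0xd8, (byte) 0xff, (byte) 0xe0 } }),
    RAR("RAR", new byte[][] { { 0x52, 0x61, 0x72, 0x21 } }),
    GIF("GIF", new byte[][] { { 0x47, 0x49, 0x46, 0x38 } }),
    PNG("PNG", new byte[][] { { (byte) 0x89, 0x50, 0x4e, 0x47 } }),
    ZIP("ZIP", new byte[][] { { 0x50, 0x4b } }),
    TIFF("TIFF", new byte[][] { { 0x49, 0x49 }, { 0x4D, 0x4D } }),
    BMP("BMP", new byte[][] { { 0x42, 0x4d } });

    private final String label;          // assumed field: display name
    private final byte[][] magicNumbers; // assumed field: accepted prefixes

    FileSignature(String label, byte[][] magicNumbers) {
        this.label = label;
        this.magicNumbers = magicNumbers;
    }

    String label() {
        return label;
    }

    // Returns the first constant that has a signature which is a prefix of data,
    // or DESCONOCIDO when nothing matches (including null or too-short input).
    static FileSignature detect(byte[] data) {
        if (data == null) {
            return DESCONOCIDO;
        }
        for (FileSignature type : values()) {
            for (byte[] magic : type.magicNumbers) {
                if (startsWith(data, magic)) {
                    return type;
                }
            }
        }
        return DESCONOCIDO;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, FileSignature.detect(uploadedBytes) could be compared against the type implied by the file extension, and the upload rejected when the two disagree. Short prefixes such as the two-byte ZIP, TIFF, and BMP signatures can produce false positives, so treat a match as a hint rather than proof.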
mickthompson
Updated on July 09, 2022
Comments
-
mickthompson almost 2 years
I have a web page that can be used to upload files.
Now I need to check if the file type is correct (zip, jpg, pdf,...).
I could use the MIME type that comes with the request, but I don't trust the user: I want to be sure that nobody can upload, say, a .gif file that was renamed to .jpg.
I think that in this case I should inspect the magic number.
This is a Java library I've found that seems to do what I need: "extract the mimetype from the magic number".
Is this a correct solution, or what do you suggest?
UPDATE: I've found the mime-util project and it seems very good and up to date! (Maybe better than the Java Mime Magic Library?)
Here is a list of utility projects that can help you extract MIME types.
mickthompson over 14 years
I tried the activation framework's getContentType() on some .pdf and .xls files, but unfortunately the method always returns 'application/octet-stream'. Only for .txt does it give something like 'text/plain'.
-
mickthompson over 14 years
Actually, getContentType() only maps the file based on its extension and a map of MIME types that you provide... this is not what I'm looking for.
-
James B over 14 years
I agree, that's not what you're looking for!
-
Oscar Pérez over 11 years
It does not detect docx files correctly; it keeps giving application/zip as the MIME type.
-
sfussenegger about 11 years
@OscarPérez A docx is indeed a zip archive containing a bunch of XML files, so it's technically correct. You could inspect the archive yourself to see if it is a docx or similar. This would probably be out of scope for this small library.
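As a rough illustration of "inspect the archive yourself", the sketch below walks the zip entries with the JDK's ZipInputStream and looks for word/document.xml, the conventional location of the main part in a docx package. The class name DocxSniffer and the method looksLikeDocx are hypothetical, and relying on that single entry name is an assumption: it is a heuristic, not a full validation of the file format.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library mentioned in this thread.
final class DocxSniffer {

    private DocxSniffer() {
    }

    // Returns true when the bytes can be read as a zip archive that contains
    // an entry named word/document.xml; returns false for anything else.
    static boolean looksLikeDocx(byte[] data) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return true;
                }
            }
        } catch (IOException e) {
            // Corrupt or non-zip content: treat it as "not a docx".
            return false;
        }
        return false;
    }
}
```

A check like this would run only after the magic number has already identified the upload as a zip, so ordinary archives still come back as application/zip.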
-
catch23 about 11 years
@sfussenegger What can you say about this SO question, "check file of MIME-type with JMimeMagic"?
-
blong over 10 years
Linking to an IP address is weird.