Saving one file format with a different file extension. JPG - PNG; MOV - MP4

Solution 1

Most programs don't look at the extension AT ALL. They look at the file-header content to determine what it really is and act accordingly.
Almost every well-known standard file format has a recognizable identifier in the first bytes of the file. (E.g. every GIF image starts with the six characters "GIF87a" or "GIF89a".)
If the software knows how to handle the format, it just does (some programs warn that the extension is wrong); if it doesn't, it gives you an error message (or just crashes if it is badly programmed).

The extension mainly serves as a visual hint to you about what the file most likely is.
And it allows your OS to quickly determine what application is best suited to handle it, without having to actually read the content of the file.
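
As a rough illustration of the header check described above, here is a minimal Python sketch that tests whether data starts with one of the two GIF version strings. The function names are made up for this example, and a real decoder would validate far more than the first six bytes.

```python
# The two GIF version strings that can open a GIF file.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def looks_like_gif(data: bytes) -> bool:
    """Return True if the data begins with a GIF signature."""
    return data[:6] in GIF_SIGNATURES


def file_looks_like_gif(path: str) -> bool:
    """Read only the first six bytes of the file and check them."""
    with open(path, "rb") as f:
        return looks_like_gif(f.read(6))
```

Note that the file name never enters into the check, which is why a GIF renamed to .jpg still passes it.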

Solution 2

Changing the name of the file does exactly that: change the name of the file. Nothing more. In particular, changing the name of the file does not change the content of the file, only the name and nothing but the name.

(In fact, changing the name of the file will actually not touch the file at all, since the "name" is actually just an entry in the directory. It isn't associated with the file.)

Since nothing about the content of the file itself changed, it should not be surprising that a program that was able to correctly decode the content of the file when it was named Fred will also be able to correctly decode the content of the file when it is named Wilma, for the simple reason that the content is exactly the same.
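
To make this concrete, here is a small Python sketch that writes some sample bytes to a file named Fred.jpg, renames it to Wilma.png, and compares a hash of the contents before and after. The file names and the helper names are just for illustration, and the inode comparison is mainly meaningful on POSIX-style file systems.

```python
import hashlib
import os
import tempfile
from typing import Tuple


def sha256_of(path: str) -> str:
    """Hash the full contents of a file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def rename_demo() -> Tuple[bool, bool]:
    """Rename a file and report (same_content, same_inode)."""
    with tempfile.TemporaryDirectory() as tmp:
        fred = os.path.join(tmp, "Fred.jpg")
        wilma = os.path.join(tmp, "Wilma.png")
        with open(fred, "wb") as f:
            # Arbitrary sample bytes, not a real image.
            f.write(b"\xff\xd8\xff\xe0 sample bytes for the demo")

        digest_before = sha256_of(fred)
        inode_before = os.stat(fred).st_ino

        # Only the directory entry changes here.
        os.rename(fred, wilma)

        same_content = sha256_of(wilma) == digest_before
        same_inode = os.stat(wilma).st_ino == inode_before
        return same_content, same_inode
```

Both values should come back true: the bytes are identical, and the name now simply points at the same underlying file.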

Solution 3

Almost all file formats embed information about the file type right near the beginning of the file. For example, a real PNG file always starts with the eight bytes 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A (bytes 2 through 4 are the ASCII characters 'PNG'; the rest of the signature is binary data designed to detect the file being handled in ways that would corrupt it), and an ELF object file (used for executables on most systems other than Windows and macOS) starts with 0x7F 0x45 0x4C 0x46 (with bytes 2-4 being 'ELF' in ASCII). These are known as file signatures, and while they are not the only way to determine a file type based on contents, they're generally the first step. Wikipedia has a list of them for many common file types that may be of interest.
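
The two signatures quoted above can be checked with a few lines of Python. This is only a sketch: the SIGNATURES table and the sniff function are names invented for this example, and real tools carry far larger tables plus deeper checks.

```python
from typing import Optional

# Signatures taken from the text above; the table itself is illustrative.
SIGNATURES = {
    "png": bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]),
    "elf": bytes([0x7F, 0x45, 0x4C, 0x46]),
}


def sniff(data: bytes) -> Optional[str]:
    """Return the name of the first signature the data starts with, if any."""
    for name, signature in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read just enough bytes to cover the longest signature in the table."""
    longest = max(len(sig) for sig in SIGNATURES.values())
    with open(path, "rb") as f:
        return sniff(f.read(longest))
```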

The near ubiquitous use of file signatures means that you can look at the contents of the file itself to figure out what type of file it is, and almost all software does exactly that for two reasons:

  • It's significantly more reliable than matching on file extensions (or on the MIME type reported by the server you're downloading the file from): you can't modify the signature and still have a valid file of that type, but you can change the extension (or MIME type) to whatever you want without changing what type the file actually is.
  • Validating the file type is an important layer of protection against crashing the application or exploiting bugs in it. If you blindly trust other sources of information about the file type, you run the risk of trying to parse something as one type of data when it is in fact a different type, which can cause all kinds of problems. Insufficient validation of the file structure given the expected file type has historically been a very common attack vector for malware.

Windows is largely the anomalous case here in that it predominantly favors file extensions over actual file contents for deciding what to tell users the file type is, while most other systems and most applications only fall back to the file extension if they can't figure out the type by looking at the file contents. The sole practical purpose of a file extension these days is to act as a generic indicator of what the file type might be, making it easier to determine what type of file you're dealing with or find files of a specific type without having to inspect the file contents, though in some cases people just choose to inspect the contents anyway (see for example the file command from UNIX systems).

Solution 4

In general, a file extension is a way of providing a clue to some piece of software about what format the file's contents are in. The other clue that's usually available is the contents of the file itself, which often include an explicit header at the beginning of the file for this purpose.

Every piece of software is free to use either or both of these pieces of information. Some common approaches are:

  • Ignore the file extension completely, and just examine the file contents.
  • Ignore the file contents completely, and just look at the file extension. Error if the file can't be processed as that type.
  • Look at both, and warn the user if they don't match, then attempt to process according to the file contents.
  • Look at the file extension initially, and attempt to process as that type. If processing fails, warn the user and try to guess the type from the contents, or even ask the user.

Changing the file extension will therefore affect different programs differently:

  • It will change the icon and double-click action in Windows Explorer, and change the download behaviour on an Apache web server, because those programs look at the file extension only.
  • If the file is not valid for the file extension you select, it may cause some programs to refuse to open the file.
  • If the file can be validly interpreted as more than one format, changing the file extension may cause some programs to change how they process it.
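
For a feel of what extension-only behaviour looks like in code, Python's standard mimetypes module is a convenient stand-in: guess_type works from the file name alone and never opens the file. This is only an analogy for name-based lookups in general, not a description of how Windows Explorer or Apache are implemented, and guess_by_name is a name made up for this example.

```python
import mimetypes
from typing import Optional


def guess_by_name(filename: str) -> Optional[str]:
    """Return the MIME type suggested by the file name alone."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# The file does not even need to exist: only the name is inspected.
print(guess_by_name("holiday.jpg"))
print(guess_by_name("holiday.png"))
```

Renaming the same file from holiday.jpg to holiday.png changes the answer, even though not a single byte of the contents has changed.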

For videos in particular, most file formats are "containers" anyway, so have a lot of metadata at the start of the file to indicate exactly how they have been encoded and assembled. It's therefore likely that software for working with them will take a content-first approach, and changing the extension will either make no difference, or give a warning and then proceed as normal.
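
As a hedged sketch of that content-first approach, the Python below looks at the first 12 bytes of a file for two common container layouts: RIFF (used by AVI), where a four-character form type follows the "RIFF" tag and a size field, and the ISO base media layout (used by MP4 and most modern MOV files), where an "ftyp" box carries a four-character major brand. The function name and the returned labels are invented for this example, and some older QuickTime files have no "ftyp" box at all, so a miss here does not prove anything.

```python
from typing import Optional


def sniff_container(header: bytes) -> Optional[str]:
    """Classify a video container from the first 12 bytes, if recognizable."""
    if len(header) < 12:
        return None
    # RIFF: b"RIFF", 4-byte little-endian size, then a form type such as b"AVI ".
    if header[:4] == b"RIFF":
        return "riff:" + header[8:12].decode("ascii", "replace")
    # ISO base media: 4-byte big-endian box size, b"ftyp", then the major brand.
    if header[4:8] == b"ftyp":
        return "ftyp:" + header[8:12].decode("ascii", "replace")
    return None


def sniff_container_file(path: str) -> Optional[str]:
    """Read only the first 12 bytes of the file and classify them."""
    with open(path, "rb") as f:
        return sniff_container(f.read(12))
```

A player built along these lines would reach the same answer for clip.mov and clip.mp4 if the bytes are the same, which matches the behaviour described above.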

Solution 5

I'm fairly sure I have tried the same video renamed with 2 or 3 different extensions on one of those media player boxes or a DVD player, and some files played OK and others did not: exactly the same file, differing only in extension.

And IrfanView will tell me if a JPG is really a GIF or vice versa, and offer to rename it for me.

Author: Radostin Cholakov

Updated on September 18, 2022

Comments

  • Radostin Cholakov
    Radostin Cholakov over 1 year

    From experience I know that if I save a .jpg file with a .png extension (or vice versa), most programs will open it normally. I am wondering why that is the case, and I'd like to ask people with experience in video codecs: what will happen if I try to save MOV or AVI files as MP4? (To be totally clear: by saving I mean renaming the file with the non-corresponding extension.)

    Will video players that support AVI and MOV still be able to play the file if its extension is MP4?

    • If so, what are all the possible issues that could pop up during such playback?
    • If not, why? And why is this possible with images? Describe all the technical details you feel are related to this topic!

    Thanks :)

    • Frank Thomas
      Frank Thomas almost 4 years
      Note that in the case of an image viewer, there are parsers and renderers written into the program to handle all the appropriate types, and their code does not determine the type based on the extension alone, as Tonny explained. You could not, for instance, display a Word document in an image viewer, since that program has no logic for dealing with the document format.
    • chrylis -cautiouslyoptimistic-
      chrylis -cautiouslyoptimistic- almost 4 years
      You may be interested in the file program.
    • Booga Roo
      Booga Roo almost 4 years
      A side note: Some file types, like AVI or MKV, are container formats that have to be parsed anyway and can be encoded in a variety of ways. Even a program that can handle AVI may not be able to open all AVI files, correct extensions or not. Results can be frustratingly inconsistent when testing any of these mix/match scenarios.
    • OrangeDog
      OrangeDog almost 4 years
      These days mov and mp4 are very nearly the same format (MP4 was derived from the QuickTime MOV format). The others mentioned are actually all different formats.
    • user13267
      user13267 almost 4 years
      Have a look at Magic number
  • DrMoishe Pippik
    DrMoishe Pippik almost 4 years
    JFYI, en.wikipedia.org/wiki/List_of_file_signatures lists some common file format signatures. Some programs, such as IrfanView, offer the user, "[filename.ext] is a [file type] with incorrect extension. Rename?"
  • zero298
    zero298 almost 4 years
    In my experience, this usually only applies to binary format files. Would you say this is your experience as well? I mean, the only real indication that a text file is a JS file or just a file full of cat names is the extension. Sometimes you have to try parsing and failing before knowing.
  • Tonny
    Tonny almost 4 years
    @zero298 True. Especially text and the various xml, html, json variants are always tricky. XML may have a DocType header, but for anything else.... Simple example: A plain text file with 1 word per line. Is that just a list or a single-column CSV? Obviously that can lead to the wrong content being fed to a program that expects something else. And the solution is just as you say: Try to parse it and (hopefully) the program is robust enough to not crash and to give a reasonable error message.
  • htmlcoderexe
    htmlcoderexe almost 4 years
    Throwing this in as an example when a text file gets confused with... a text file: there used to be a bug in Windows Notepad that caused certain plain text to be recognised as Unicode, turning it into gibberish: en.wikipedia.org/wiki/Bush_hid_the_facts Funny thing, it's actually caused by a Windows API function, so the "fix" for Notepad was to use a different algorithm; as far as I can tell the original function is still in Windows and still just as faulty.
  • Jasen
    Jasen almost 4 years
    some gifs start GIF89a
  • Tonny
    Tonny almost 4 years
    @Jasen You are right. 89a is the 2nd edition of the GIF standard. It added a few minor additions, mainly to improve handling of animated GIFs. Most animated GIFs should be 89a. GIF-producing software often gets it wrong and simply hard-codes 87a when saving. (E.g. old versions of Photoshop saved everything as 87a, even if it was technically an 89a. This was fixed in Ps v7 or 8 iirc.) Most readers ignore the version and read the file as if it's 89a, which is the safe thing to do, but results in a wrong playback speed for an animated GIF that is really in 87a format.
  • yyny
    yyny almost 4 years
    The OS typically doesn't care about the extension of files either. On both Linux and Windows, you can execute any file regardless of its extension. Typically only the file manager/desktop manager needs to care about the extension so it knows what to do with a file when you click its icon.
  • Tonny
    Tonny almost 4 years
    @yyny Technically that is correct, but for most people there really isn't a difference between "the OS" and "the File/Desktop Manager". In fact: Most people don't even know they have a desktop manager. I didn't want to complicate the answer too much :-)
  • OrangeDog
    OrangeDog almost 4 years
    It definitely is associated with the file, it's just the file contents are stored separately to the file metadata.
  • OrangeDog
    OrangeDog almost 4 years
    It's also quicker to check the file extension, than to open and read the file looking for a signature that may not even be there.
  • IMSoP
    IMSoP almost 4 years
    Upvoted, but beware of repeating the common mantra that "Windows uses file extensions, other OSes don't". As I tried to explain here, both file extension and file contents are generally being looked at by particular programs, and a lot of programs (on any OS) will use the file extension because it's much more efficient, and in some cases actually more informative, than the available alternatives.
  • Austin Hemmelgarn
    Austin Hemmelgarn almost 4 years
    @IMSoP Windows itself (not any applications, but the actual OS and user shell) does indeed look at file extensions to the exclusion of contents in most contexts where it's presenting file type information to the user, while macOS and all major UNIX environments preferentially look at contents (or the resource fork in some cases on macOS and similar systems like BeOS). I'm not referring to cases where you're actually opening the file here, just presentational behavior.
  • IMSoP
    IMSoP almost 4 years
    @AustinHemmelgarn There is no agreed definition of "actual OS", and plenty of examples on all OSes where both names and contents are used to make decisions about the file. The statement "Windows Explorer (the default file manager in Windows) uses the file extension to choose icons and default actions" is true; the statement "Windows programs have a stronger tradition of relying on the file extension" may also be true; but the common characterisation of "Windows does one thing, other OSes do the other" is misleading.
  • Austin Hemmelgarn
    Austin Hemmelgarn almost 4 years
    @IMSoP My point that the presentational aspect on Windows prior to the user actually opening the file almost exclusively cares about the file extension is still the case though. I will agree that I could have worded my statement more clearly to express this (and have now done so).
  • IMSoP
    IMSoP almost 4 years
    @AustinHemmelgarn Better, but still a bit strong IMHO. Windows Explorer (which isn't synonymous with "Windows") isn't that "anomalous" in this - the Apache web server also uses file names as its primary source of type information, for instance, because file names are a lot more efficient to process than file contents.
  • muru
    muru almost 4 years
    @OrangeDog I suppose what OP is saying is that the filename is associated with a directory entry, which in turn is associated with the file. The filename itself isn't directly associated with the file, compared to, for example, permissions, timestamps, ownership, etc. in ext4 on Linux, where two hardlinks might have different names, but still share those other attributes.
  • Hashim Aziz
    Hashim Aziz almost 4 years
    Demarcating file managers and OS for Windows is pedantry - for probably 99% of people using Windows, their file manager is Windows Explorer, and for all those people the extension definitely does affect how a file is treated.
  • DocRoot
    DocRoot almost 4 years
    Media players, smart TVs etc. can be particularly finicky about the media they will play... requiring it to be in a specific format and named just so.
  • Darrel Hoffman
    Darrel Hoffman almost 4 years
    A typical Windows File Open dialog will usually have the feature to filter by file type - and this generally only looks at the extension, rather than the file's signature. Actually opening the file, of course, is another matter, and that depends on whoever wrote the program in question.
  • Gerald
    Gerald almost 4 years
    Eye of Gnome, the default image viewer on Ubuntu, will open image files fine if they have the correct extension or no extension, but will refuse if they have an incorrect extension. I'm not sure why. I assume this was a deliberate decision on the part of the developers.
  • ARNAB
    ARNAB over 3 years
    Many do. A plethora of text files are only associated by their extension. What is this nonsense? Yes, we know some files can be figured out with headers but many can't. C# vs C vs Cpp? Yaml vs. Conf? Etc etc. Please. Use some common sense.
  • ARNAB
    ARNAB over 3 years
    Obviously it can decode it, but many systems still use extensions to determine app association.
  • IMSoP
    IMSoP over 3 years
    I sort of get what you're saying, in a very abstract way, but ultimately file names are always metadata - if I name a file "Sales Report for Mark", I'm adding metadata that will help me remember what it is, rather than having to identify it as "inode#35491057" or "{a8ce2701-44ad-4d14-befc-06239ef1b506}" or whatever. Some file systems (e.g. Apple HFS) have a separate field for file type; but then arguably so did FAT: the three-letter "extension" was intended for storing type information. There have been attempts to make more "database-like" file systems; none have become mainstream AFAIK.
  • Johannes Pille
    Johannes Pille over 3 years
    @muru (1) You might be right in that supposition in that it might indeed be what the OP had intended to express, but it factually isn't. If the file metadata is associated with anything but itself, then it is with what it exists to describe, then it is that very region of persistent storage. (2) Whether physically a (slightly) different location or not, not only users are usually presented both the file and its metadata in conjunction, so are OS APIs and by proxy userland applications.
  • Johannes Pille
    Johannes Pille over 3 years
    (3) What are you even saying? "The filename itself isn't directly associated with the file, compared to, for example, permissions, timestamps, ownership" Come again? "The file" in this context surely is its content itself, everything else is just meta, no? All the data you mention, beginning with "For example permissions", are stored on an inode, which is distinctly and deliberately not the file itself. Bunch of half-truths and misinformation you're spreading here.
  • Johannes Pille
    Johannes Pille over 3 years
    @Jörg W Mittag It's more so semantics than to be "disagreeing with the point you're making". I get the notion you're attempting to get across. However, may I ask what you'd consider changing "the file" then? You're saying "In fact, changing the name of the file will actually not touch the file at all". Funny you should use that term. Actually running touch(1) file will not touch the file at all either (none of the sectors on disk containing any of its content) anyway. It will simply set two timestamps on the corresponding inode.