File extensions and association with programs in Linux

Solution 1

No, it doesn't mean that. If you have a text file that has its execute permission set (e.g. chmod a+x somefile), and the first line of the file is something like

"#!/path/to/executable"

it just tells Unix what program to use to execute the script. If a text file is marked as executable (i.e. is a script), Unix will start whatever program is specified in this way and pass it the path of the text file (the script) as an argument; the program then reads and runs the script. Typically the program specified will be a shell (/bin/sh, /bin/csh or /bin/bash), the interpreter for some programming language (Perl, Python or Ruby) or some other program that executes scripts (like the text manipulators Awk or Sed).

In many languages "#" starts a comment; only when the first line begins with "#!" does it mean something special. If a file is marked as executable but doesn't start with "#!", Unix will assume it's some kind of binary (e.g. an ELF executable made by the C compiler and linker).

In general Unix doesn't rely on the suffix of files. Many programs neither need nor automatically add their typical suffixes. One exception is the compression programs (like gzip and bzip2), which usually replace the original file with a compressed one and add a suffix to mark the compression type (these are among the few programs that complain about an incorrect suffix).

Instead the file is identified by its content through a series of tests, looking for "magic numbers" and other identifiers (you can try the command file on some files to test this). This is also used by the file browsers under GNOME and KDE to select icons and the list of programs to open/edit the file. Here the MIME type of the file is identified by such tests, then the suitable programs for viewing and editing are found from a list associated with the MIME type - not the suffix as in Windows.
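
As a rough illustration of content-based identification, the sketch below checks the first bytes of a file against a tiny table of well-known signatures. The names guess_type and guess_file and the table itself are made up for this example; the real file command uses a far larger database and more elaborate tests.

```python
# Hypothetical, heavily simplified stand-in for the tests that `file` performs.
# Each entry pairs a leading byte signature with a human-readable description.
MAGIC = [
    (b"\x7fELF", "ELF binary"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"BZh", "bzip2 compressed data"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"#!", "script with a shebang line"),
]


def guess_type(head):
    """Guess a file type from its first bytes, ignoring the file name."""
    for signature, description in MAGIC:
        if head.startswith(signature):
            return description
    return "unknown"


def guess_file(path):
    """Read the first bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return guess_type(handle.read(16))
```

Note that the file name never enters the decision: renaming a gzip archive to something.txt would not change the result.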

Since one of the tests would be to check whether the first line of a text file is "#!/something", and then look at what "something" is, you could say for example that #!/usr/bin/perl identifies the file as a Perl script - but that is more of a side effect. Even if the file doesn't start with "#!", the tests should be able to identify the file correctly. In any case, it's the content of the file that is used to identify it, not an arbitrary suffix. As such, endings like .pl (Perl) and .awk (Awk) are purely there to help a human user identify the type of file; they are not used by Unix to determine the type (unlike the suffixes in Windows).

You can actually make a "script" without the "#!/something", but Unix wouldn't be able to automatically run it as an executable (it wouldn't know what program to run the script in). Instead you'd have to "manually" start it with something like perl myscript or python myscript. Many scripts in larger Python and Perl applications will actually not start with "#!/something", as they're scripts for "internal use" and not intended to be invoked by the user directly.

Instead you'll start the main script (which does start with a "#!/something"), and then it will pass these other scripts to the interpreter as this script runs.

Solution 2

In Windows we can associate a file's extension with programs. E.g. a file test.pl can be run by the installed Perl interpreter due to the pl extension.

Right, but the extension is merely a hint about the type of file it is. A script written in Perl is still a Perl script whether it is called ".pl", ".exe" or ".doc". You can still execute it by calling the interpreter directly; for example, perl.exe thisfile.doc will work if the file contains valid Perl. You could also associate '.blah' with Perl if you wanted.

In Linux though it needs #!/usr/bin/perl as the first line.

Again, this is merely a hint. You can still run a Perl file that has no shebang by passing the script as an argument to the perl command.

Is this because there is no association between file extensions and programs in Linux?

Technically, there is such an association.

Linux actually has a number of ways to identify the executable format. The file command offers an illustration of this. It contains a database of "magic" (strings of certain lengths at certain offsets) to determine what type of file something is. It does this by inspecting the content of the file to work out what it is. Another way of doing this is with the file extension. You can actually register what behaviour you want using a system called 'binfmt_misc'. Wikipedia explains how this works (using 'magic' and file extensions).
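
To make the binfmt_misc idea concrete, here is a minimal sketch that builds a registration line in the kernel's documented ':name:type:offset:magic:mask:interpreter:flags' layout. The helper binfmt_rule and the rule name perl-ext are hypothetical, chosen only for illustration; the type is M for magic matching or E for extension matching, and for E the magic field holds the extension without the dot.

```python
def binfmt_rule(name, kind, magic, interpreter, offset="", mask="", flags=""):
    """Build a binfmt_misc registration line.

    Layout: ':name:type:offset:magic:mask:interpreter:flags'
    `kind` is 'M' (match on magic bytes) or 'E' (match on file extension).
    """
    if kind not in ("M", "E"):
        raise ValueError("type must be 'M' (magic) or 'E' (extension)")
    fields = [name, kind, offset, magic, mask, interpreter, flags]
    return ":" + ":".join(fields)


# Hypothetical rule: hand any executable file ending in .pl to /usr/bin/perl.
rule = binfmt_rule("perl-ext", "E", "pl", "/usr/bin/perl")
print(rule)
```

On a system with binfmt_misc mounted, such a line would be written as root to /proc/sys/fs/binfmt_misc/register; the sketch only builds the string and does not touch the system.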

A file is considered 'executable' in Linux simply when the 'executable' bit is set in its file permissions. When you attempt to run an application that has the executable bit, the kernel will read the first few bytes of the file to determine what to do.

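  • If the file starts with #! then invoke the command at the path provided and pass the path of the script file to it as an argument (after any argument given on the #! line and before the arguments given on the command line).
  • If the file starts with the ELF magic (the byte 0x7f followed by ELF) and is dynamically linked, run the dynamic loader, typically /lib{64}/ld-linux.so (the path is read from the PT_INTERP program header of the ELF binary, it is not a hard-coded path), followed by the command (the loader then loads the shared libraries the program needs).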
  • See what rules apply for binfmt_misc.
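
The #! case in the list above can be sketched as follows. This is a simplified model of what Linux does, not the kernel's code: the function name shebang_argv is invented for the example, and it assumes the Linux behaviour of treating everything after the interpreter path as a single optional argument.

```python
def shebang_argv(first_line, script_path, args):
    """Return the argument vector the interpreter would be started with,
    or None if `first_line` (bytes) is not a usable shebang line."""
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].decode("utf-8", "replace").strip()
    if not rest:
        return None
    # Split off the interpreter path; whatever follows stays ONE argument.
    parts = rest.split(None, 1)
    interpreter = parts[0]
    optional = [parts[1]] if len(parts) > 1 else []
    # The script path comes next, then the arguments the user typed.
    return [interpreter] + optional + [script_path] + list(args)


# Running `./test.pl a b` where test.pl starts with "#!/usr/bin/perl -w":
print(shebang_argv(b"#!/usr/bin/perl -w\n", "./test.pl", ["a", "b"]))
```

The interpreter receives the script's path rather than its contents, which is why the interpreter must treat the #! line as a comment when it reads the file itself.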

I'm not too familiar with how a.out binaries are invoked -- I assume the ld-linux.so loader still handles them.

Solution 3

No, it's not. Shebangs (#!) and associations between file types and applications serve different purposes, and you can find both in desktop Linux distributions.

It is customary to use the former in executable scripts that you call from the command line or from other scripts, and it does not depend on file extensions. Similar functionality is obtained in Windows through the PATHEXT environment variable, which does depend on file extensions.

The latter serves users who click on icons or file names in a desktop interface. In the Linux world, this comes from associating MIME types with applications, which, of course, is handled by specific tools and does not require typical desktop users to read the specification.

Solution 4

You just can't make such comparisons. It's not that GNU/Linux lacks an association between file extensions and programs; GNU/Linux simply does not care about extensions.

You have file permissions, and if your file is executable, the system will ALWAYS look for the shebang on the first line of a script to find out how to execute it.

Linux's ELF binaries, on the other hand, do not have exe extensions as Windows binaries do, and they are always executable, even without ANY extension.

Solution 5

The shebang line (#!) indicates which Perl interpreter (and therefore which version) to use, and it also lets you invoke the script directly from the shell:

./myscript.pl 

You can do this as well without the shebang line:

perl myscript.pl
Author by

Cratylus

Updated on September 18, 2022

Comments

  • Cratylus
    Cratylus almost 2 years

    In Windows we can associate a file's extension with programs.
    E.g. a file test.pl can be run by the installed Perl interpreter due to the pl extension.
    In Linux though it needs #!/usr/bin/perl as the first line.
    Is this because there is no association between file extensions and programs in Linux?

  • Admin
    Admin about 11 years
    But if I just have 1 version would I still need it? All tutorials and examples have this line.
  • Admin
    Admin about 11 years
    "if your file is executable, it will ALWAYS search for the shebang at the first line of any script to look on how to execute it" So without this line, Linux does not know how to execute the perl script?
  • Admin
    Admin about 11 years
    Yes, and then you should run perl script.pl. Strictly speaking, it is Bash that needs to be told, not Linux per se.
  • Admin
    Admin about 11 years
    So unlike Windows I either do: perl script.pl or ./script.pl with #!/usr/bin/perl as first line? So there is no real association between extensions and programs in Linux?
  • Matthew Ife
    Matthew Ife about 11 years
    "Linux's ELF binaries, on the other hand, does not have exe extensions as windows binaries does, and they are always executables, even without ANY extension." This is not quite accurate. ELF binaries are always ELF binaries, perl files are always perl files. But an ELF file is not necessarily directly executable (libraries and kernel modules are ELF binaries but not executable).
  • Admin
    Admin about 11 years
    I'm able to execute libraries on Linux. Try this: /lib/libc.so.6 and watch the execution output ;)
  • Matthew Ife
    Matthew Ife about 11 years
    That's because it has an init routine. There are only 2 libraries I know of that do this (libc and ld-linux). Feel free to go 'execute libraries' that are not one of these two.
  • Stéphane Chazelas
    Stéphane Chazelas about 10 years
    A script without a she-bang is meant to be interpreted by the system's shell (when execve() returns ENOEXEC and the file is not detected as being binary).