Choose interpreter after script start e.g. if/else inside hashbang

Solution 1

No, that won't work. The two characters #! absolutely need to be the first two characters in the file (how would you specify what should interpret the if-statement anyway?). They constitute the "magic number" that the exec() family of functions looks for when determining whether a file about to be executed is a script (which needs an interpreter) or a binary file (which doesn't).
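The same test can be imitated from the shell. This is only a sketch: has_shebang is a hypothetical helper name, and it inspects the first line of the file as text, whereas the kernel looks at the raw bytes.

```shell
# has_shebang FILE: succeed if FILE starts with the two characters "#!".
# Hypothetical helper for illustration; the kernel does the real check.
has_shebang() {
    first_line=
    # read fails on a last line without a newline, but still fills the variable.
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

A file whose #! appears anywhere other than at the very start fails this test, which is why an if/else placed above the hashbang line cannot work.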

The format of the shebang line is quite strict. It needs to have an absolute path to an interpreter and at most one argument to it.

What you can do is to use env:

#!/usr/bin/env interpreter

Now, the path to env is usually /usr/bin/env, but technically that's no guarantee.

This allows you to adjust the PATH environment variable on each system so that interpreter (be it bash, python or perl or whatever you have) is found.

A downside with this approach is that it will be impossible to portably pass an argument to the interpreter.

This means that

#!/usr/bin/env awk -f

and

#!/usr/bin/env sed -f

may not work on some systems.
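To see why, it helps to model how the line is split. The sketch below imitates the Linux behaviour described in the comments on this answer, where everything after the interpreter path is passed as one single argument. split_shebang is a hypothetical name, only spaces are treated as separators, and other kernels may split the line differently.

```shell
# split_shebang LINE: print the interpreter and the single optional argument
# that a Linux-style split of LINE would produce. Simplified model only.
split_shebang() {
    rest=${1#'#!'}
    # Trim leading spaces before the interpreter path.
    rest=${rest#"${rest%%[! ]*}"}
    # The interpreter is everything up to the first space.
    interp=${rest%%[ ]*}
    # Whatever follows, minus leading spaces, is ONE argument.
    arg=${rest#"$interp"}
    arg=${arg#"${arg%%[! ]*}"}
    printf 'interpreter=[%s]\n' "$interp"
    [ -n "$arg" ] && printf 'argument=[%s]\n' "$arg"
    return 0
}

split_shebang '#!/usr/bin/env awk -f'
# interpreter=[/usr/bin/env]
# argument=[awk -f]
```

Under this model env is asked to find a program literally named "awk -f", which does not exist.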

Another obvious approach is to use GNU autotools (or some simpler templating system) to find the interpreter and place the correct path into the file in a ./configure step, which would be run upon installing the script on each system.
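A much simpler stand-in for such a configure step might look like the sketch below. It is not autotools: install_with_shebang is a hypothetical helper that looks up the interpreter on the installing machine's PATH and rewrites the first line of the script with the absolute path it finds.

```shell
# install_with_shebang SRC DEST NAME
# Copy SRC to DEST, replacing its first line with "#!" plus the absolute
# path of NAME as found on this machine's PATH. Hypothetical helper.
install_with_shebang() {
    src=$1 dest=$2 name=$3
    path=$(command -v "$name") || {
        echo "cannot find $name on PATH" >&2
        return 1
    }
    case $path in
        /*) ;;   # absolute path: usable in a hashbang line
        *)  echo "$name is not an external command" >&2; return 1 ;;
    esac
    # New first line, then the rest of the original script unchanged.
    { printf '#!%s\n' "$path"; tail -n +2 "$src"; } > "$dest" &&
        chmod +x "$dest"
}
```

It would be run once per machine at install time, for example as install_with_shebang myscript.py.in myscript.py python3, where the file names are placeholders.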

One could also resort to running the script with an explicit interpreter, but that's obviously what you're trying to avoid:

$ sed -f script.sed

Solution 2

You can always make a wrapper script to find the correct interpreter for the actual program:

#!/bin/bash
if something ; then
    interpreter=this
    script=/some/path/to/program.real
    flags=()
else
    interpreter=that
    script=/other/path/to/program.real
    flags=(-x -y)
fi
exec "$interpreter" "${flags[@]}" "$script" "$@"

Save the wrapper as program in a directory on the user's PATH, and keep the actual program elsewhere or under another name.

I used #!/bin/bash in the hashbang because of the flags array. If you don't need to store a variable number of flags or such and can do without it, the script should work portably with #!/bin/sh.
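If a variable number of flags is needed and bash cannot be relied on, the positional parameters can stand in for the array. The following is a minimal sketch that keeps the placeholder interpreter names this and that from the wrapper above. The function name run_program and the USE_THAT variable are hypothetical and stand in for the real selection logic.

```shell
#!/bin/sh
# run_program SCRIPT [ARGS...]
# POSIX sketch: "set --" rebuilds the argument list, so no array is needed.
run_program() {
    script=$1
    shift
    if [ -n "${USE_THAT:-}" ]; then
        interpreter=that
        # Flags first, then the script, then the caller's own arguments.
        set -- -x -y "$script" "$@"
    else
        interpreter=this
        set -- "$script" "$@"
    fi
    # exec replaces the shell, so the interpreter's exit status is what the
    # caller sees.
    exec "$interpreter" "$@"
}
```

The wrapper would end with a call such as run_program /some/path/to/program.real "$@".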

Solution 3

You can also write a polyglot (combine two languages). /bin/sh is guaranteed to exist.

The downsides are ugly code and the possibility that some implementations of /bin/sh get confused. But it can be used when env does not exist or lives somewhere other than /usr/bin/env. It can also be used if you want to do some pretty fancy selection.

The first part of the script determines which interpreter to use when run with /bin/sh as interpreter, but is ignored when run by the correct interpreter. Use exec to prevent the shell from running more than the first part.

Python example:

#!/bin/sh
'''
' 2>/dev/null
# Python thinks this is a string, docstring unfortunately.
# The shell has just tried running the <newline> program.
find_best_python ()
{
    for candidate in pypy3 pypy python3 python; do
        # command -v is the portable replacement for which.
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return
        fi
    done
    echo "Can't find any Python" >&2
    return 1
}
# The function runs in a subshell here, so check its status in this shell.
interpreter="$(find_best_python)" || exit 1   # Replace with something fancier.
# Run the rest of the script
exec "$interpreter" "$0" "$@"
'''

Solution 4

While this doesn't select the interpreter within the shell script (it selects it per machine) it is an easier alternative if you have administrative access to all the machines you are trying to run the script on.

Create a symlink (or a hardlink if desired) to point to the desired interpreter path. For example, on my system perl and python are in /usr/bin:

cd /bin
ln -s /usr/bin/perl perl
ln -s /usr/bin/python python

These commands create symlinks so that hashbang lines such as #!/bin/perl resolve. This also preserves the ability to pass parameters to the scripts.

Solution 5

I prefer Kusalananda's and ilkkachu's answers, but here is an alternative answer that more directly does what the question was asking for, simply because it was asked.

#!/usr/bin/ruby -e exec "non-existing-interpreter", ARGV[0] rescue exec "python", ARGV[0]

if True:
  print("hello world!")

Note that you can only do this when the interpreter permits writing code in the first argument. Here, -e and everything after it is taken verbatim as one argument to ruby. As far as I can tell, you can't use bash for the shebang code, because bash -c requires the code to be in a separate argument.

I tried doing the same with python for shebang code:

#!/usr/bin/python -cexec("import sys,os\ntry: os.execlp('non-existing-interpreter', 'non-existing-interpreter', sys.argv[1])\nexcept: os.execlp('ruby', 'ruby', sys.argv[1])")

if true
  puts "hello world!"
end

but the line turned out to be too long: Linux (at least on my machine) truncates the shebang to 127 characters. Please excuse the use of exec to insert newlines; Python doesn't permit try/except blocks or imports without newlines.
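A quick way to check whether a hashbang line risks being cut off is to measure it. In this sketch shebang_fits is a hypothetical helper, and the default of 127 is simply the limit observed above. Other kernels and versions use different limits, and the count is in characters, not bytes.

```shell
# shebang_fits FILE [LIMIT]: succeed if FILE's first line is at most LIMIT
# characters long. LIMIT defaults to 127; adjust it for the target system.
shebang_fits() {
    limit=${2:-127}
    first_line=
    # read fails on a last line without a newline, but still fills the variable.
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    [ "${#first_line}" -le "$limit" ]
}
```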

I'm not sure how portable this is, and I wouldn't do it on code meant to be distributed. Nevertheless, it's doable. Maybe someone will find it useful for quick-and-dirty debugging or something.

Author by

dkv

Updated on September 18, 2022

Comments

  • dkv
    dkv over 1 year

    Is there any way to dynamically choose the interpreter that's executing a script? I have a script that I'm running on two different systems, and the interpreter I want to use is in a different location on each system. What I end up having to do is change the hashbang line every time I switch over. I would like to do something that is the logical equivalent of this (I realize that this exact construct is impossible):

    if running on system A:
        #!/path/to/python/on/systemA
    elif running on system B:
        #!/path/on/systemB
    
    #Rest of script goes here
    

    Or even better would be this, so that it tries to use the first interpreter, and if it doesn't find it uses the second:

    try:
        #!/path/to/python/on/systemA
    except: 
        #!/path/on/systemB
    
    #Rest of script goes here
    

    Obviously, I can instead execute it as /path/to/python/on/systemA myscript.py or /path/on/systemB myscript.py depending on where I am, but I actually have a wrapper script that launches myscript.py, so I would like to specify the path to the python interpreter programmatically rather than by hand.

    • magor
      magor almost 7 years
      Is passing the 'rest of script' as a file to the interpreter, without a shebang, and using the if condition not an option for you? Like: if something; then /bin/sh restofscript.sh elif...
    • dkv
      dkv almost 7 years
      It's an option, I also considered it, but somewhat messier than I would like. Since logic in the hashbang line is impossible, I think I will indeed go that route.
    • Oskar Skog
      Oskar Skog over 6 years
      I like the wide range of different answers this question has generated.
  • dkv
    dkv almost 7 years
    Right, I realize that #! needs to come at the beginning, since it's not the shell that processes that line. I was wondering if there's a way to put logic inside the hashbang line that would be equivalent to the if/else. I was also hoping to avoid messing around with my PATH but I guess those are my only options.
  • Kusalananda
    Kusalananda almost 7 years
    @dkv There may only be an absolute path to an interpreter and a single argument to it. It's detected and used by the exec() family of functions.
  • dkv
    dkv almost 7 years
    Just to be clear (and to try to build my own vocabulary since I'm relatively new at this), in your example, does awk -f count as a single argument? i.e. are flags part of the argument?
  • Kusalananda
    Kusalananda almost 7 years
    @dkv awk -f would count as two separate arguments to /usr/bin/env in that example.
  • Pankaj Goyal
    Pankaj Goyal almost 7 years
    When you use #!/usr/bin/awk, you may provide exactly one argument, as #!/usr/bin/awk -f. If the binary you're pointing to is env, the argument is the binary you're asking env to look for, as in #!/usr/bin/env awk.
  • dkv
    dkv almost 7 years
    But then, if there is only a single argument, why would #!/usr/bin/env awk -f be valid?
  • Kusalananda
    Kusalananda almost 7 years
    @dkv It's not. It uses an interpreter with two arguments, and it may work on some systems, but definitely not on all.
  • ilkkachu
    ilkkachu almost 7 years
    @dkv on Linux it runs /usr/bin/env with the single argument awk -f.
  • ilkkachu
    ilkkachu almost 7 years
    I think I've seen one of these before, but the idea is still equally awful... But, you probably want exec "$interpreter" "$0" "$@" to get the name of the script itself to the actual interpreter too. (And then hope nobody lied when setting up $0.)
  • RANJAN YADAV
    RANJAN YADAV almost 7 years
    I've seen exec "$interpreter" "${flags[@]}" "$script" "$@" also used to keep the process tree cleaner. It also propagates the exit code.
  • Jörg W Mittag
    Jörg W Mittag almost 7 years
    Scala actually has support for polyglot scripts in its syntax: if a Scala script starts with #!, Scala ignores everything up to a matching !#; this allows you to put arbitrarily complex script code in an arbitrary language in there, and then exec the Scala execution engine with the script.
  • Sergiy Kolodyazhnyy
    Sergiy Kolodyazhnyy almost 7 years
    Wouldn't #!/bin/sh be better than #!/bin/bash? Even if /bin/sh is a symlink to a different shell, it should exist on most (if not all) *nix systems, plus it would force the script author to write a portable script rather than fall into bashisms.
  • ilkkachu
    ilkkachu almost 7 years
    @SergiyKolodyazhnyy, heh, I thought about mentioning that earlier, but didn't, then. The array used for flags is a non-standard feature, but it's useful enough for storing a variable number of flags so I decided to keep it.
  • Kusalananda
    Kusalananda almost 7 years
    @ilkkachu awk and -f are two arguments to env.
  • ilkkachu
    ilkkachu almost 7 years
    @Kusalananda, no, that was the point. If you have a script called foo.awk with the hashbang line #!/usr/bin/env awk -f and call it with ./foo.awk then, on Linux, what env sees is the two parameters awk -f and ./foo.awk. It actually goes looking for /usr/bin/awk -f (etc.) with a space.
  • Kusalananda
    Kusalananda almost 7 years
    @ilkkachu Now I see what you mean.
  • vijay
    vijay almost 7 years
    Or use /bin/sh and just call the interpreter directly in each branch: script=/what/ever; something && exec this "$script" "$@"; exec that "$script" -x -y "$@". You could also add error checking for exec failures.
  • vijay
    vijay almost 7 years
    @Jörg W Mittag: +1 for Scala
  • gokhan acar
    gokhan acar almost 7 years
    +1 This is so simple. As you note, it doesn't quite answer the question, but it seems to do exactly what the OP wants. Though I guess using env gets around the root access on each machine issue.
  • Tim
    Tim about 4 years
    @JörgWMittag Is #! still used nowadays in Scala? First I don't see it in scala language specification. Second, how is this approach different from just writing scala in shebang? See unix.stackexchange.com/questions/573560/…
  • David Ljung Madison Stellar
    David Ljung Madison Stellar almost 4 years
    Here's a ruby example: davesource.com/Solutions/…