bash how to remove options from parameters after processing


Solution 1

POSIXly, option parsing should stop at -- or at the first non-option (or non-option-argument) argument, whichever comes first. So in

cp -R file1 -t /mybackup file2 -f

that's at file1, so cp should recursively copy all of file1, -t, /mybackup and file2 into the -f directory.

GNU getopt(3), however, accepts options after arguments unless the $POSIXLY_CORRECT environment variable is set. GNU cp uses it to parse options, and you are using GNU cp here, since -t is a GNU-specific option. So the command is actually equivalent to this one written in POSIX option style:

cp -R -t /mybackup -f -- file1 file2

The getopts shell built-in, even in the GNU shell (bash), only handles the POSIX style. It also doesn't support long options or options with optional arguments.
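To see that behaviour concretely, here is a minimal sketch. The function name show_getopts_stop and the variables seen and remaining are made up for illustration. It runs getopts over the example command line and records which options were parsed and what was left over.

```shell
# Illustrative only: run getopts over a cp-like argument list and record
# which options it parsed and which arguments it left untouched.
show_getopts_stop() {
  seen=
  OPTIND=1                        # start a fresh getopts run
  while getopts :Rt:f opt; do
    seen="$seen$opt"              # collect the option letters that were parsed
  done
  shift "$((OPTIND - 1))"         # drop only what getopts consumed
  remaining=$*                    # everything from the first non-option on
}

show_getopts_stop -R file1 -t /mybackup file2 -f
printf 'options seen: %s\nleft unparsed: %s\n' "$seen" "$remaining"
```

Only -R is seen as an option here. The -t and -f that come after file1 are left in the list as if they were file names.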

If you want to parse the options the same way as GNU cp does, you'll need to use the GNU getopt(3) API. For that, if on Linux, you can use the enhanced getopt utility from util-linux (that enhanced version of the getopt command has also been ported to some other Unices like FreeBSD).

That getopt rearranges the options into a canonical order, which lets you parse them with a simple while/case loop.

$ getopt -n "$0" -o t:Rf -- -Rf file1 -t/mybackup file2
 -R -f -t '/mybackup' -- 'file1' 'file2'

You'd typically use it as:

parsed_options=$(
  getopt -n "$0" -o t:Rf -- "$@"
) || exit
eval "set -- $parsed_options"
while [ "$#" -gt 0 ]; do
  case $1 in
    (-[Rf]) shift;;
    (-t) shift 2;;
    (--) shift; break;;
    (*) exit 1 # should never be reached.
  esac
done
echo "Now, the arguments are $*"
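The loop above only discards the options. A variant that also remembers them could look like the sketch below. It assumes the enhanced getopt from util-linux is the getopt found in $PATH. The function name parse_cp_like and the variables Rflag, fflag, targ and rest are illustrative, not part of any tool.

```shell
# Sketch, assuming util-linux's enhanced getopt(1).
# parse_cp_like and the variable names are hypothetical.
parse_cp_like() {
  Rflag= fflag= targ=
  parsed_options=$(getopt -n parse_cp_like -o t:Rf -- "$@") || return
  eval "set -- $parsed_options"   # options first, then --, then the operands
  while [ "$#" -gt 0 ]; do
    case $1 in
      (-R) Rflag=1; shift;;
      (-f) fflag=1; shift;;
      (-t) targ=$2; shift 2;;     # -t takes an argument
      (--) shift; break;;         # end of options
      (*) return 1;;              # should never be reached
    esac
  done
  rest=$*                         # the bare arguments, joined for display
}

parse_cp_like -R file1 -t /mybackup file2 -f &&
  printf 'R=%s f=%s t=%s rest=%s\n' "$Rflag" "$fflag" "$targ" "$rest"
```

In a real script you would run the loop at top level rather than in a function, so that "$@" itself ends up holding only the bare arguments.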

Also note that this getopt parses options the same way GNU cp does. In particular, it supports long options (including abbreviated ones) and honours the $POSIXLY_CORRECT environment variable (which, when set, disables support for options after arguments), just as GNU cp does.
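Both points can be checked with a short experiment. This sketch assumes the enhanced getopt from util-linux. The long option names passed to -l are a small made-up subset chosen for the demonstration, not cp's full list.

```shell
# Assumes util-linux's enhanced getopt(1); the -l list is a made-up subset.

# Options after file1 are still recognised, and abbreviated long options
# are accepted:
permuted=$(getopt -n demo -o t:Rf -l recursive,force,target-directory: -- \
  file1 --rec --target=/mybackup file2)

# With POSIXLY_CORRECT set, parsing stops at the first non-option argument:
posix=$(POSIXLY_CORRECT=1 getopt -n demo -o t:Rf \
  -l recursive,force,target-directory: -- -R file1 -t /mybackup)

printf '%s\n' "$permuted" "$posix"
```

In the second call, -t and /mybackup should come out after the -- marker, that is, as plain arguments.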

Note that using gdb and printing the arguments that getopt_long() receives can help you build the parameters to pass to getopt(1):

(gdb) bt
#0  getopt_long (argc=2, argv=0x7fffffffdae8, options=0x4171a6 "abdfHilLnprst:uvxPRS:T", long_options=0x417c20, opt_index=0x0) at getopt1.c:64
(gdb) set print pretty on
(gdb) p *long_options@40
$10 = {{
    name = 0x4171fb "archive",
    has_arg = 0,
    flag = 0x0,
    val = 97
  }, {
    name = 0x417203 "attributes-only",
[...]

Then you can use getopt as:

getopt -n cp -o abdfHilLnprst:uvxPRS:T -l archive... -- "$@"

Remember that GNU cp's list of supported options may change from one version to the next, and that getopt cannot check whether you pass a legal value to the --sparse option, for instance.
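Such value checks have to be done by hand once getopt has split the command line. A minimal sketch, assuming the three values documented for GNU cp's --sparse (auto, always, never) and ignoring any abbreviations cp itself may accept; check_sparse is a made-up helper name:

```shell
# Hypothetical helper: validate the argument of --sparse by hand.
# Only the full words are accepted in this sketch.
check_sparse() {
  case $1 in
    (auto|always|never) return 0;;
    (*) printf >&2 'invalid argument for --sparse: %s\n' "$1"
        return 1;;
  esac
}

# Typical use inside the while/case loop that walks getopt's output:
#   (--sparse) check_sparse "$2" || exit; sparse=$2; shift 2;;
check_sparse always && echo "always is accepted"
```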

Solution 2

Every time getopts meets an argument it doesn't expect, it sets the shell variable $OPTIND to the index of the next argument it should process and returns non-zero. When $OPTIND is reset to 1, getopts is POSIX-specified to accept a new argument list. So this function just watches getopts' return value, adds $OPTIND to a running counter on every failed return and saves the result, shifts away the failed arguments, and resets $OPTIND on every failed try. You can call it as opts "$@", though you'd want to customize the case statement, or else save it into a variable and change that section to eval $case.

opts() while getopts :Rt:f opt || {
             until command shift "$OPTIND" || return 0
                   args=$args' "${'"$((a=${a:-0}+$OPTIND))"'}"'
                   [ -n "${1#"${1#-}"}" ]
             do OPTIND=1; done 2>/dev/null; continue
};     do case "$opt"  in
             R) Rflag=1      ;;
             t) tflag=1      ;
                targ=$OPTARG ;;
             f) fflag=1      ;;
       esac; done

As it runs, it sets $args to references to every argument that getopts did not handle, so:

set -- -R file1 -t /mybackup file2 -f
args= a=0
opts "$@"; eval "set -- $args"
printf %s\\n "$args"
printf %s\\n "$@"         
printf %s\\n "$Rflag" "$tflag" "$fflag" "$targ"

OUTPUT

 "${2}" "${5}"
file1
file2
1
1
1
/mybackup

This works in bash, dash, zsh, ksh93, mksh... well, I quit trying at that point. In every shell it also set $Rflag, $tflag, $fflag and $targ. The point is that the indexes of all the arguments that getopts didn't want to process were retained.

Changing the option style made no difference either. It worked with -Rf -t/mybackup as well as with -R -f -t /mybackup. It worked with the options in the middle of the list, at the end of the list, or at the head of the list...
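The same restart-getopts idea can also be spelled out more verbosely. The sketch below is not the opts function above but a separate illustration with made-up names (parse_mixed, rest, idx). It restarts getopts after every non-option argument and records the original index of each one. It does not treat -- specially beyond what each individual getopts pass does.

```shell
# Sketch with hypothetical names: restart getopts after each non-option
# argument and remember the original position of every non-option.
parse_mixed() {
  Rflag= tflag= fflag= targ= rest=
  idx=0                               # original arguments consumed so far
  while [ "$#" -gt 0 ]; do
    OPTIND=1                          # start a new getopts run on what is left
    while getopts :Rt:f opt; do
      case $opt in
        (R) Rflag=1;;
        (t) tflag=1; targ=$OPTARG;;
        (f) fflag=1;;
      esac                            # unknown options are silently skipped
    done
    shift "$((OPTIND - 1))"           # drop the options parsed in this run
    idx=$((idx + OPTIND - 1))
    [ "$#" -gt 0 ] || break
    idx=$((idx + 1))                  # $1 is a non-option: note its index
    rest="$rest \"\${$idx}\""
    shift
  done
}

set -- -R file1 -t /mybackup file2 -f
parse_mixed "$@"
eval "set -- $rest"                   # keep only the non-option arguments
printf '%s\n' "$@"
```

Because the function is called without a variable assignment in front of it, $rest is still set when it returns.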

Still, the very best way is just to stick a -- for end of options on your arg list and then do shift "$(($OPTIND-1))" at the end of a getopts processing run. That way you remove all processed parameters and keep the tail end of the argument list.
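A minimal sketch of that conventional pattern, with options first and -- before the file names. The function name parse_opts_first and the variable operands are made up for illustration.

```shell
# Conventional getopts loop: options first, then --, then the operands.
# parse_opts_first and operands are hypothetical names.
parse_opts_first() {
  Rflag= fflag= targ=
  OPTIND=1
  while getopts :Rt:f opt; do
    case $opt in
      (R) Rflag=1;;
      (f) fflag=1;;
      (t) targ=$OPTARG;;
      (*) printf >&2 'invalid option or missing argument: -%s\n' "$OPTARG"
          return 2;;
    esac
  done
  shift "$((OPTIND - 1))"     # remove the parsed options and the -- marker
  operands=$*                 # what is left: the tail of the argument list
}

parse_opts_first -R -t /mybackup -f -- file1 -file2
printf 'R=%s f=%s t=%s operands=%s\n' "$Rflag" "$fflag" "$targ" "$operands"
```

The -- also protects operands that start with a dash, such as -file2 here.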

One thing I like to do is translate long options to short ones before I run getopts. I do that in a very similar way, which is why this answer came easily.

i=0
until [ "$((i=$i+1))" -gt "$#" ]
do case "$1"                   in
--Recursive) set -- "$@" "-R"  ;;
--file)      set -- "$@" "-f"  ;;
--target)    set -- "$@" "-t"  ;;
*)           set -- "$@" "$1"  ;;
esac; shift; done
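A slightly more defensive variant of that loop is sketched below. It stops translating after -- and accepts the --target=VALUE form, then hands the result to getopts. The function name parse_with_long and the variable operands are made up. It still does not cope with abbreviated long options, nor with a long-option-looking word used as the argument of -t.

```shell
# Sketch with hypothetical names: translate long options, then run getopts.
parse_with_long() {
  Rflag= fflag= targ=
  n=$# seen_dashdash=
  while [ "$n" -gt 0 ]; do              # rotate through the original arguments
    if [ -n "$seen_dashdash" ]; then
      set -- "$@" "$1"                  # after --: copy everything unchanged
    else
      case $1 in
        (--)          seen_dashdash=1; set -- "$@" "$1";;
        (--Recursive) set -- "$@" -R;;
        (--file)      set -- "$@" -f;;
        (--target)    set -- "$@" -t;;
        (--target=*)  set -- "$@" -t "${1#--target=}";;
        (*)           set -- "$@" "$1";;
      esac
    fi
    shift
    n=$((n - 1))
  done
  OPTIND=1
  while getopts :Rt:f opt; do
    case $opt in
      (R) Rflag=1;;
      (f) fflag=1;;
      (t) targ=$OPTARG;;
    esac
  done
  shift "$((OPTIND - 1))"
  operands=$*
}

parse_with_long --Recursive --target=/mybackup -- --file file1
printf 'R=%s f=%s t=%s operands=%s\n' "$Rflag" "$fflag" "$targ" "$operands"
```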

jamadagni
Author by

jamadagni

Updated on September 18, 2022

Comments

  • jamadagni
    jamadagni almost 2 years

    I remember having seen somewhere a bash script that uses case and shift to walk through the list of positional parameters, parses flags and options with arguments as it encounters them, and removes them after parsing to leave only the bare arguments, which are later processed by the rest of the script.

    For example, in parsing the command line of cp -R file1 -t /mybackup file2 -f, it would first walk through the parameters, recognize that the user has requested to descend into directories by -R, specified the target by -t /mybackup and to force copying by -f, and remove those from the list of parameters, leaving the program to process file1 file2 as the remaining arguments.

    But I don't seem to be able to remember/find out whatever script I saw whenever. I'd just like to be able to do that. I have been googling around various sites and append a list of relevant pages I examined.

    A question on this website specifically asked about "order-independent options", but neither its single answer nor the answer to the question it was marked a duplicate of considers cases like the above, where the options are mixed with normal arguments, which I presume was the reason for the person to specifically mention order-independent options.

    Since bash's built-in getopts seems to stop at the first non-option argument, it does not seem to be sufficient as a solution. This is why the Wooledge BashFAQ's page (see below) explains how to rearrange the arguments. But I'd like to avoid creating multiple arrays in case the argument list is quite long.

    Since shift does not support popping individual arguments off the middle of the parameter list, I am not sure what is a straightforward way to implement what I am asking.

    I'd like to hear if anyone has any solutions to removing arguments from the middle of the parameter list without creating a whole new array.

    Pages that I've already seen:

  • jamadagni
    jamadagni almost 10 years
    @all: thanks for your replies. @Stephane: is there a reason you are using while [ "$#" -gt 0 ] instead of just while (($#))? Is it just to avoid a bashism?
  • Stéphane Chazelas
    Stéphane Chazelas almost 10 years
    Yes, though (($#)) is more a kshism.
  • Stéphane Chazelas
    Stéphane Chazelas almost 10 years
    Your long option converting would also convert non-options (as cp -t --file foo or cp -- --file foo) and would not cope with options entered abbreviated (--fi...), or with the --target=/dest syntax.
  • mikeserv
    mikeserv almost 10 years
    @StéphaneChazelas - the long converting thing is only an example - it obviously hasn't been built up very well. I replaced the getopts thing with a much simpler function.
  • mikeserv
    mikeserv almost 10 years
    @StéphaneChazelas - please look again. While the long option comment remains justified, the first - and its accompanying downvote - no longer is, as I think. Also, I think opts() has some bearing on your own answer as opts() works in a single loop - touching each argument but once - and does so reliably (as near as I can tell), portably, and without a single subshell.
  • Stéphane Chazelas
    Stéphane Chazelas almost 10 years
    resetting OPTIND is what I call starting another getopts loop. In effect, that's parsing several command lines/sets of options (-R, -t /mybackup, -f). I'll keep my down vote for now as it's still obfuscated, you're using $a uninitialised, and args= opts... is likely to leave args unset (or set to what it was before) after opts returns in many shells (that includes bash).
  • mikeserv
    mikeserv almost 10 years
    @StéphaneChazelas - that does include bash - I hate that. In my opinion, if the function is a current shell function that should be retained. I'm addressing these things - as the current function, with more tests, could robustly be made to handle all cases, and even to flag argument with its preceding option, but I disagree that it is obfuscated. As written, it is the most simple and most direct means of accomplishing its task that I can think of. testing then shifting makes little sense when in every failed shift case you should return. It is not intended to do otherwise.