Single dashes `-` for single-character options, but double dashes `--` for words?

Solution 1

In The Art of Unix Programming, Eric Steven Raymond describes how this practice evolved:

In the original Unix tradition, command-line options are single letters preceded by a single hyphen... The original Unix style evolved on slow ASR-33 teletypes that made terseness a virtue; thus the single-letter options. Holding down the shift key required actual effort; thus the preference for lower case, and the use of “-” (rather than the perhaps more logical “+”) to enable options.

The GNU style uses option keywords (rather than keyword letters) preceded by two hyphens. It evolved years later when some of the rather elaborate GNU utilities began to run out of single-letter option keys (this constituted a patch for the symptom, not a cure for the underlying disease). It remains popular because GNU options are easier to read than the alphabet soup of older styles. [1]

[1] http://www.catb.org/esr/writings/taoup/html/ch10s05.html

Solution 2

One reason for continuing to use single-letter options is that they can be strung together: ls -ltr is a lot easier to type than ls --sort=time --reverse --format=long. There are plenty of times when both forms are useful. As for searching for this topic, try "unix command line options convention".
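To make the "strung together" point concrete, here is a minimal sketch using Python's standard getopt module, which is modeled on the C getopt()/getopt_long() interface. The option letters and long names are simply the ones from the ls example above; this is an illustration, not how ls itself is implemented.

```python
import getopt

# Short form: a single argument carries three single-letter options.
short_opts, _ = getopt.getopt(["-ltr"], "ltr")

# Long form: each option needs its own argument (a trailing "=" in the
# option list means the option takes a value).
long_opts, _ = getopt.getopt(
    ["--sort=time", "--reverse", "--format=long"],
    "",
    ["sort=", "reverse", "format="],
)

print(short_opts)
print(long_opts)
```

The parser splits -ltr into three separate options, whereas every long option has to be spelled out as its own argument.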

Solution 3

The quote from Raymond by @jasonwryan has some useful information, but starts in the middle of the story:

  • Keep in mind that Unix started as a reduced-scope version of Multics, and that throughout its history, features in Unix were often imitations or adaptations of features seen and used on other systems.
  • The '-' option character was used in Multics. Bitsavers has a manual for its user commands.
  • Other systems used different characters, some with a better claim to keystroke efficiency (such as '/' used for TOPS and VMS) and some with a weaker one (such as '(' used in VM/SP CMS).
  • Multics options were multi-character, e.g., keywords separated by underscore.
  • Longer Multics options frequently had a shorter, abbreviated form, such as -print vs -pr (page 3-8).
  • Unix options were single-character, and after several years, getopt was introduced. Because it was not part of the original Unix, there are utilities which did not use getopt and were left as-is. But having getopt helped with making programs consistent.

On the other hand, Unix options using getopt were single-character. Other systems, in particular all larger ones, used keywords. Some (not all) allowed those keywords to be abbreviated, i.e., trailing characters could be omitted as long as the option remained unambiguous. There are pitfalls in that test for ambiguity. For example:

  • Early in 1985, I was working on a program which had to be ported to PrimOS. Prime's developers competed with several other companies by offering a command language that (tried to) imitate each of those others, providing the most commonly used commands from each. Of course, they supported abbreviations (as did VMS). After reading the online help, I typed sta, thinking to get status. That was the abbreviation for start, and having given nothing to start, the command interpreter logged me off.
  • The X Toolkit (used by xterm) allows abbreviated options. To use this effectively in xterm, it has to preprocess the command parameters to prefer -v (for version) over -vb (visual bell). The X Toolkit has no direct way to specify a preferred option when there is an ambiguity.

Because of this potential for ambiguity, some developers prefer to not allow abbreviations. Lynx, for example, uses multi-character options without allowing abbreviations.
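The abbreviation pitfall can be sketched in a few lines of Python. Both function names below are hypothetical, and neither is claimed to be what PrimOS or the X Toolkit actually did; they only contrast an order-dependent "first match" rule with a "unique prefix" rule in which an exact match wins.

```python
def resolve_first_match(abbrev, keywords):
    """Return the first keyword that starts with abbrev (order-dependent)."""
    for keyword in keywords:
        if keyword.startswith(abbrev):
            return keyword
    raise ValueError(f"unknown keyword: {abbrev}")


def resolve_unique(abbrev, keywords):
    """Return the keyword for abbrev only if the choice is unambiguous."""
    if abbrev in keywords:
        # An exact match is preferred over longer keywords sharing the prefix.
        return abbrev
    matches = [k for k in keywords if k.startswith(abbrev)]
    if not matches:
        raise ValueError(f"unknown keyword: {abbrev}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous keyword {abbrev}: {matches}")
    return matches[0]


commands = ["start", "status"]
print(resolve_first_match("sta", commands))  # silently picks one candidate
print(resolve_unique("stat", commands))      # only one candidate remains
print(resolve_unique("v", ["v", "vb"]))      # exact match beats the prefix
```

With the first rule, sta quietly resolves to whichever keyword happens to be listed first; with the second, resolve_unique("sta", commands) raises an error instead of guessing.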

Not all programs used getopt: tar and ps did not. Nor did rcs (or sccs), as you can see by noting where the dash was optional, and option values were optional.

Taking all of this into account, GNU developers adapted the keyword options used in other systems by extending getopt to provide a long version of each short option. For instance, the textutils 1.0 changelog says:

Tue May  8 03:41:42 1990  David J. MacKenzie  (djm at abyss)

        * tac.c: Use regular expressions as the record boundaries.
        Give better error messages.
        Reformat code and make it more readable.
        (main): Use getopt_long to parse options.

The change in fileutils was earlier:

Tue Oct 31 02:03:32 1989  David J. MacKenzie  (djm at spiff)

        * ls.c (decode_switches): Add long options, using getopt_long
        instead of getopt.

and someone may find one still earlier, but it seems that the file-header shows the earliest date:

/* Getopt for GNU.
   Copyright (C) 1987, 1989 Free Software Foundation, Inc.

which is (for instance) concurrent with the X Toolkit (1987). Most of the Unix utilities with which you are familiar (such as ls, ps) used the existing single-character options that require periodic visits to the manual. When introducing getopt_long, the GNU developers did not do this by first adding new options; they began by tabulating the existing options and providing a matching long option.

Because they were adding to an existing repertoire, there was (again) the problem of conflict with existing options. To avoid this, they changed the syntax, using two dashes before long options.
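The "tabulate the existing options and provide a matching long option" approach might look roughly like this. The sketch uses Python's standard getopt module as a stand-in for the C getopt_long(); the table LONG_TO_SHORT and the function parse are hypothetical names, and the option pairs are taken from the ls --help output quoted in the question.

```python
import getopt

# Existing short options paired with a matching long name.
LONG_TO_SHORT = {"--all": "-a", "--almost-all": "-A", "--escape": "-b"}


def parse(argv):
    """Parse argv and report every option under its short name."""
    long_names = [name[2:] for name in LONG_TO_SHORT]  # strip leading "--"
    opts, operands = getopt.getopt(argv, "aAb", long_names)
    flags = {LONG_TO_SHORT.get(opt, opt) for opt, _ in opts}
    return flags, operands


print(parse(["-a", "--escape", "dir"]))
print(parse(["--all", "-b"]))
```

Both spellings end up as the same flag, so the rest of the program does not need to know which one the user typed.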

These programs continue to use getopt_long in this manner for the usual reasons:

  • scripts depend upon the options; developers are not anxious to break scripts
  • there's a written coding standard (which may be effective)
  • no one has come up with a competing set of tools which is markedly incompatible (both BSDs and GNU developers copy option names from each other)

Solution 4

The Wikipedia article on the command-line interface reports:

In Unix-like systems, the ASCII hyphen-minus is commonly used to specify options. The character is usually followed by one or more letters. An argument that is a single hyphen-minus by itself without any letters usually specifies that a program should handle data coming from the standard input or send data to the standard output. Two hyphen-minus characters ( -- ) are used on some programs to specify "long options" where more descriptive option names are used. This is a common feature of GNU software.

Solution 5

My guess is that more descriptive options were desired, and with longer options you no longer have to worry about running out of single-character options.

Once you decide you want long options, you have an issue, at least if you plan to support both long and short options. I'm not positive, but I believe arcege's answer holds the key to why both - and -- exist. A generic processing routine, e.g. getopt_long(), needs to know whether a single command-line argument can contain multiple options, e.g. -ltr, so it must be able to differentiate between the two forms. If I read a single dash, -, then the rest of the command-line argument can match multiple options. If I read a double dash, --, then the rest of the command-line argument must match a single option.
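That rule fits in a handful of lines. The function below is a hypothetical, simplified sketch in Python: it ignores option arguments such as -f15 and is not how getopt_long() is actually written, but it shows why the parser needs the second dash.

```python
def split_argument(arg):
    """Expand one command-line argument into the option names it denotes."""
    if arg.startswith("--") and len(arg) > 2:
        # Double dash: the whole remainder names exactly one long option.
        return [arg]
    if arg.startswith("-") and len(arg) > 1 and arg != "--":
        # Single dash: every following character is its own option.
        return ["-" + ch for ch in arg[1:]]
    # A plain operand, a lone "-" (stdin/stdout), or the "--" terminator.
    return []


print(split_argument("-ltr"))
print(split_argument("--help"))
print(split_argument("-help"))
```

Without the second dash, -help would be read as the four options -h, -e, -l and -p.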

I just recently made use of getopt_long() and I'm starting to like long options, as they're easier to remember and self-documenting. If I have the following two commands:

./aggregator -f 15

./aggregator --flush-time 15

I would say the second one, using the long option, is more self-explanatory.
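For comparison, here is a minimal sketch of how such a pair of options could be declared with Python's standard argparse module (the original program used getopt_long() in C). The program name and option names come from the commands above; the default value and the help text are assumptions made for illustration only.

```python
import argparse


def parse_args(argv):
    """Accept -f and --flush-time as two spellings of the same option."""
    parser = argparse.ArgumentParser(prog="aggregator")
    parser.add_argument(
        "-f", "--flush-time",
        type=int,
        default=60,  # assumed default, not taken from the original program
        help="flush interval (unit assumed for this example)",
    )
    return parser.parse_args(argv)


print(parse_args(["-f", "15"]).flush_time)
print(parse_args(["--flush-time", "15"]).flush_time)
```

Both invocations store the same value in flush_time; only the long spelling tells a reader of the command line what the 15 means.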


Larry
Author by

Larry

Updated on September 18, 2022

Comments

  • Larry
    Larry almost 2 years

    Where did the convention of using single dashes for letters and double dashes for words come from, and why does it continue to be used?

    For example if I type in ls --help, you see:

      -a, --all                  do not ignore entries starting with .
      -A, --almost-all           do not list implied . and ..
          --author               with -l, print the author of each file
      -b, --escape               print octal escapes for nongraphic characters
          --block-size=SIZE      use SIZE-byte blocks
      -B, --ignore-backups       do not list implied entries ending with ~
    ...
    

    I tried googling "- and -- convention", even with quotes, with little success.

    • chharvey
      chharvey about 9 years
      Just being nit-picky here, but the character - is technically called a hyphen. We use the word "dash" to refer to the em dash (—) in most cases, and sometimes the en dash (–), but neither of which is a hyphen (-).
    • The Unknown Dev
      The Unknown Dev about 8 years
      It really annoys me when well known programs don't follow the convention, though: java -version
    • Krzysztof Wende
      Krzysztof Wende about 8 years
      @Jamil Yeah. I ended up here wondering why is it find . -delete
    • Aaron Franke
      Aaron Franke about 5 years
      The idea of this is so that you can write things like -ab which activates both a and b. Without the double dash, -help would activate the h, e, l, and p options.
  • chharvey
    chharvey about 9 years
    This doesn't answer the question of where the convention came from and why it continues to be used.
  • schily
    schily about 6 years
    Since UNIX ls does not understand ls --sort=time --reverse --format=long, it is not a good idea to mention this nonstandard method either.
  • JdeBP
    JdeBP over 4 years
    Interestingly, BSD ps was switched to getopt() in 1990. unix.stackexchange.com/a/511530/5132
  • Admin
    Admin about 2 years
    @schily Though getopt_long(3) also exists.
  • Admin
    Admin about 2 years
    Just a note (though I'm sure you, Thomas, know it): just because a tool does not use getopt(3) does not mean it doesn't support -. I've written several tools without using it. Actually almost always I don't use it. It's an interesting answer. Well written!
  • Admin
    Admin about 2 years
    Interesting theory and quite possibly true. It makes sense. I'm guilty of all three: - for single letter and also long options as well as -- for long options. It really depends on what I'm doing. But most of these aren't for anyone but myself.