In particular, why is that sometimes the options to some commands are preceded by a +
sign and sometimes by a -
sign?
for example:
These days, the POSIX standard using getopt() (aka getopt(3)) is widely used as a standard notation, but in the early days, people were experimenting. On some machines, the sort
command no longer supports the +
notation. However, various commands (notably ar
and tar
) accept controls without any prefix character - and dd
(alluded to by Alok in a comment) uses another convention altogether.
The GNU convention of using '--
' for long options (supported by getopt_long(3)) was changed from using '+
'. Of course, the X11 software uses a single dash before multi-character options. So, the whole thing is a collection of historic relics as people experimented with how best to handle it.
POSIX documents the Utility Conventions that it works to, except where historical precedent is stronger.
[At one time, SO 367309 contained the following material as my answer. It was originally asked 2008-12-15 02:02 by FerranB, but was subsequently closed and deleted.]
How many different types of options do you recognize? I can think of many, including:
For options taking an optional argument, sometimes the argument must be attached (co -p1.3 rcsfile.c
),
sometimes it must follow an '=' sign. POSIX doesn't support optional
arguments meaningfully (the POSIX getopt() only allows them for the last
option on the command line).
All sensible option systems use an option consisting of double-dash
('--
') alone to mean "end of options" — the following arguments are
"non-option arguments" (usually file names; POSIX calls them 'operands')
even if they start with a
dash. (I regard supporting this notation as an imperative. Be aware that if the --
is preceded by an option requiring an argument, the --
will be treated as the argument to the option, not as the 'end of options' marker.)
Many but not all programs accept single dash as a file name to mean
standard input (usually) or standard output (occasionally). Sometimes,
as with GNU 'tar
', both can be used in a single command line:
... | tar -cf - -F - | ...
The first solo dash means 'write to stdout'; the second means 'read file names from stdin'.
Some programs use other conventions — that is, options not preceded by a dash. Many of these are from the oldest days of Unix. For example, 'tar' and 'ar' both accept options without a dash, so:
tar cvzf /tmp/somefile.tgz some/directory
The dd
command uses opt=value
exclusively:
dd if=/some/file of=/another/file bs=16k count=200
Some programs allow you to interleave options and other arguments completely; the C compiler, make and the GNU utilities run without POSIXLY_CORRECT in the environment are examples. Many programs expect the options to precede the other arguments.
Note that git
and other VCS commands often use a hybrid system:
git commit -m 'This is why it was committed'
There is a sub-command as one of the arguments. Often, there will be optional 'global' options that can be specified between the command and the sub-command. There are examples of this in POSIX; the sccs command is in this category; you can argue that some of the other commands that run other commands are also in this category: nice
and xargs
spring to mind from POSIX; sudo
is a non-POSIX example, as are svn
and cvs
.
I don't have strong preferences between the different systems. When there are few enough options, then single letters with mnemonic value are convenient. GNU supports this, but recommends backing it up with multi-letter options preceded by a double-dash.
There are some things I do object to. One of the worst is the same option letter being used with different meanings depending on what other option letters have preceded it. In my book, that's a no-no, but I know of software where it is done.
Another objectionable behaviour is inconsistency in style of handling
arguments (especially for a single program, but also within a suite of
programs). Either require attached arguments or require detached
arguments (or allow either), but do not have some options requiring an
attached argument and others requiring a detached argument. And be
consistent about whether '=
' may be used to separate the option and
the argument.
As with many, many (software-related) things — consistency is more important than the individual decisions. Using tools that automate and standardize the argument processing helps with consistency.
Whatever you do, please, read the TAOUP's Command-Line Options and consider Standards for Command Line Interfaces. (Added by J F Sebastian — thanks; I agree.)
It's left to apps to parse options hence the inconsistency. Expanding on your sort example these are all equivalent for coreutils:
sort -k3
sort --k 3
sort --key 3
sort --key=3
_POSIX2_VERSION=199209 sort +2
A shell command is just a program, and it is free to interpret its command line any way it likes. Unix never had anything like Apple's interface police to make sure that the command-line interface was consistent across applications. As a result, there is inconsistency, especially in older commands.
Peering into my crystal ball, I think command-line tools will slowly migrate toward GNU standards, double dashes and all. (I grew up with single dashes and still find the double dash very awkward, but it is consistent.)
It's completely arbitrary; the command may implement all of the option handling in its own special way or it might call out to some other convenience functions. The getopt()
family of functions is pretty popular, so most software written even remotely recently follows the conventions set by those routines. There are always exceptions, of course!