Why are UNIX/POSIX system call namings so illegible?
Solution 1
It's due to the technical constraints of the time. The POSIX standard was created in the 1980s and was based on UNIX, which was born in the 1970s. Several C compilers at that time were limited to identifiers that were 6 or 8 characters long, and that set the convention for the length of variable and function names.
Related questions:
- Why is 'umount' not spelled 'unmount'?
- What did Ken Thompson mean when he said, "I'd spell creat with an 'e'."?
- What, if any, naming convention was used for the standard Unix commands?
- https://stackoverflow.com/questions/682719/what-does-the-9th-commandment-mean
Solution 2
dr_ is right, but there's also another reason - usability. Back in the day, you didn't have something as comfortable as a keyboard to type on. If you were lucky, you had something akin to an old-school typewriter. If you were unlucky, you had to deal with systems that required actual physical work to operate (as in, it took a lot of force to press the "key"), or you manually punched holes in a card.
This meant that even within the 6-8 character limit, you tried to keep your commands as short as possible. That's why you have ls instead of list, and creat instead of create. Code from that era is full of variables like a, x and i - and of course, x2 and friends. Typing was a lot of work - today, typing listIndex takes less effort than "typing" i used to - and it isn't even all that much slower anymore (especially with additional technologies like auto-completion).
The real question is - why do so many Unix idioms persist even though they're no longer desirable?
Solution 3
In addition to the other answers, I would like to point out that Unix was developed as a reaction to Multics, CTSS, and other contemporary operating systems, which were significantly more verbose about their naming conventions. You can get a feel for these OSes at http://www.multicians.org/devdoc.html. For example, http://www.multicians.org/mspm-bx-1-00.html gives change_name as the command for renaming a file; compare Unix mv.
Also, the principal reason why the very short system call names persist is backward compatibility. You will notice that newer APIs tend to be more explicit; e.g. gettimeofday and clock_gettime instead of just time.
(Even today, using whateverIndex instead of i for a loop index is an automatic code-review failure in my book ;-)
Solution 4
Dennis Ritchie set himself a constraint with C that it wouldn't rely on any linker features that weren't also required by Fortran. Hence the 6 character limit on external names.
Benjoyo
Updated on September 18, 2022
Comments
-
Benjoyo almost 2 years
What is the reason to use such untelling system call names like time and creat instead of getCurrentTimeSecs and createFile or, maybe more suitable on Unix, get_current_time_secs and create_file? Which brings me to the next point: why should someone want something like cfsetospeed without camel case or at least underscores to make it readable? Of course the calls would have more characters, but we all know that readability of code is more important, right?
-
Admin almost 9 years
Because they were invented decades before Hungarian notation, camel case, snake case, and the like became fashionable. Also because back then compilers had very few resources, and identifier names were limited to (IIRC) 8 characters.
-
Admin almost 9 years
@lcd047: First result on my Google search to falsify your comment: unix.com/unix-for-dummies-questions-and-answers/… . Also, do you have any backup to your statement about notations? In the Bundeswehr, e.g. CamelCase was always in use.
-
Admin almost 9 years
@phresnel: Take note that your link talks about the limitations of the first Unix filesystem. When Thompson, Ritchie et al. were designing Unix, they had to bootstrap Unix on machines that did not run Unix yet, i.e. in probably even more constrained environments.
-
Admin almost 9 years
With "Always" I meant always literally, for as long as the Bundeswehr exists (1955 a.d.). Looks like I misinterpreted "fashionable", pardon.
-
Admin almost 9 years
@DevSolar: Yeah, right. My apologies.
-
Admin almost 9 years
You might as well ask why they're not written in German, because that's about the closest natural-language approximation I can think of for the over-long monstrosities Java has taught programmers to accept...
-
Admin almost 9 years
I'm so happy it is like this. Imagine what ls -la | grep would look like: listAllHiddenAndNormalFiles() | globallySearchARegularExpressionAndPrint().
-
Admin almost 9 years
This is why.
-
Admin almost 9 years
@Pouya no need to hyperbolize; also, I didn't mention the shell but sys calls.
-
Admin almost 9 years
I read somewhere that Unix was originally used with a 110 baud teletype, i.e. printing around 10 characters a second. This is why Unix is so terse.
-
Admin almost 9 years
And "untelling"? To who?
-
Admin almost 9 years
@ThorbjørnRavnAndersen to a dev who doesn't know the Linux/Unix kernel yet?
-
Admin almost 9 years
@Benjoyo they did not write the kernel to be easy to read for somebody else - they wrote it to be easy to read for themselves. You must remember that C was designed to be portable assembly for implementing Unix on bare metal. Seeing it in any other light does not do the designers justice.
-
Admin almost 9 years
@ThorbjørnRavnAndersen I criticize neither C nor the kernel source code. The sys calls are also the API for external devs, and even if they weren't, there is no need to design their names badly. Unless there are technical reasons, which seems to be the case.
-
Admin almost 9 years
Well, if you ask why they designed their names "badly" you are actually criticizing.
-
Admin almost 9 years
Ken Thompson was asked what he would do differently if he were to create UNIX today. His answer: "I'd spell creat with an e."
-
-
phresnel almost 9 years
I don't think "technical constraints" applies. You could have files larger than 6 bytes, you could have programs spanning thousands of lines of code, and abstract syntax trees way deeper than just six levels, with way more nodes per level. By that reasoning, the 6 character limit can't be a technical one, but rather a designed one. And it does not explain why "creat" is better than "create". Also, can you please name those several C compilers you talk about? Your answer really reads like "heard somewhere somewhen".
-
Petr almost 9 years
OK, I still would like to know why "creat" is better than "create" :P I know it's not easy to answer, but maybe someone knows.
-
Vicky almost 9 years
@phresnel: He's not talking about file size, number of lines, or syntax tree depth. Old versions of the C language did not require compilers and linkers to keep more than the first 31 characters of identifiers with internal linkage, or more than 6 for external identifiers. Thus, get_current_date() and get_current_time() couldn't be told apart by some of these early toolchains. The reason was that these systems were working on tiny footprints of a few kilobytes.
-
Vicky almost 9 years
But you're right on creat(). Ken Thompson was once asked what he would do differently if he were redesigning the UNIX system. His reply: "I'd spell creat with an e."
-
dr_ almost 9 years
About Ken Thompson and creat(): see unix.stackexchange.com/questions/10893/…
-
Vicky almost 9 years
And besides, Unix actually predates the first C compiler -- it was written in assembly first, then rewritten in C... ;)
-
phresnel almost 9 years
@dr01: "Apples and Oranges": I don't think so. When you can create stuff that deep, then surely implementing identifiers with more than 6 or 8 characters should pose no problem. Thx for the links; maybe use them in your answer.
-
Random832 almost 9 years
@Petr C functions would have their name prefixed with _, so _creat. He could still have named the function "create", but the extra characters might have been ignored (though IIRC the actual limit on PDP-11 Unix was 8 characters, and 6 comes from some other system).
-
Vicky almost 9 years
@phresnel: Having only limited memory is not a hard limit. Having only a limited guaranteed supported length of identifiers is. If you're only guaranteed 6 significant characters, that's what you're working with if you're worth your salt.
-
yorkshiredev almost 9 years
The day my kernel changes from time to getCurrentTimeSecs or something like that, I'll just stop upgrading it. Even with my comfortable keyboard and recent hardware, these names remain extremely convenient and simple (simplicity being one of UNIX's fundamentals). I really don't feel the need to bring that kind of Java/C#-style naming into the C language, let alone in a Linux kernel. IMO, from the perspective of a kernel developer, or UNIX developer in general, these idioms are nothing close to undesirable.
-
Benjoyo almost 9 years
@John WH Smith but they are really undesirable for anyone who is not used to this environment. For me it is just ugly compared to the Java method naming style where you most probably know what it is doing without looking into any doc.
-
muru almost 9 years
@Benjoyo unRootlyLongNamed.Packaged.nonsensicalFunction is ugly to me, and I'd rather be sure what it does by doing man 2 time than guess at what it seems to do.
-
yorkshiredev almost 9 years
@Benjoyo Well, anyone who is not used to this environment isn't supposed to use system calls in the first place, since they are specifically meant for those who are. (Standard) libraries are here for the others. UNIX does not follow these fashionable design "rules" that would make it easily understandable at first glance. Those who use it without looking into any doc are in for a lot of trouble, and few people in the UNIX community would consider that a problem to be solved.
-
Benjoyo almost 9 years
@JohnWHSmith sure, kernel mode developers will know their system and its docs, but that is no reason for not having concise and telling names.
-
IhtkaS almost 9 years
@JohnWHSmith As a lowly dev who only ever wrote kernel drivers and prefers meaningful names that are not full of abbreviations I have to disagree. But that's alright, because if you look for example at the original git source code, you'll find at least one kernel dev agreeing with me. Although if you tell Linus that his get_X or remove_file_from_cache (might I propose rmfc?) are undesirable to kernel developers, please do it publicly - I'd love to see his reaction.
-
Luaan almost 9 years
Yeah. It's funny hearing the technological argument about hardware capabilities when LISP Machines predate UNIX. Sure, UNIX machines were cheaper to buy, but that's about it. Adding maintenance costs (which of course don't count in the *nix land) turned the tables even then, but it wasn't a persuasive enough argument anyway. (And yup, i is fine for an index when you're, say, iterating an array. Coordinates? Use x and y. Traversing some ordinal? Be descriptive.)
-
tpg2114 almost 9 years
FWIW, the old Fortran standards limited identifiers to 6 characters. From this book: "The Fortran rule of six characters in one identifier stems from the fact that six characters could be represented in one IBM 704 word." I can't speak for C, but I imagine the limitation has a very similar origin (or perhaps an identical one).
-
phresnel almost 9 years
@PeterCordes: Yes. A decision. That was all I wanted to say :)
-
Peter Cordes almost 9 years
The reason everyone was arguing with you is that those decisions had already been made BEFORE Unix was designed (because it was initially written on other platforms, not with its own tools). Thus, it was a lot easier for Unix to follow those rules, instead of writing new tools before starting on Unix. Also, limiting identifier length to save RAM and complexity still made sense at that point, and probably didn't feel like a big limitation.
-
Dolda2000 almost 9 years
@Benjoyo: The relatively short time spent learning what things do is not worth sacrificing usability during the much longer time that one keeps using them.
-
celtschk almost 9 years
@JohnWHSmith: Don't worry, your kernel will never switch from time to anything. That's because the kernel doesn't take identifiers, it takes function call numbers (in the case of time, it's 13). The names are in the corresponding header files/userland libraries (and also in the kernel source, I guess), but not in the actual kernel calling code; to call a kernel function you load a specific register (on x86: eax) with the function number, set up the other arguments (often also in registers) and do a syscall (on systems without syscall instruction, a software interrupt is usually used).
-
ninjalj almost 9 years
@Voo: Well, you could make the point that system functions should have shorter names than functions specific to a single program. But maybe I have been braindamaged forever by Perl (not that I care much, I started coding in BASIC, so I was already braindamaged, according to some people).
-
IhtkaS almost 9 years
@ninjalj Because functions that are used by every program out there should be particularly badly descriptive? If anything it should be the other way around. Again this is pretty non-controversial - to cite the kernel coding guide:
descriptive names for global variables are a must. [...] If you have a function that counts the number of active users, you should call that "count_active_users()" or similar, you should _not_ call it "cntusr()".
Yes we're stuck with the old names from the 70s for backcomp, but for new APIs go with descriptive long ones.
-
user207421 over 7 years
@downvoter You may not agree with Dennis Ritchie about this, but that's what he did. Taking it out on this answer is futile.
-
pizdelect about 5 years
@Luaan LISP machines do NOT predate Unix. You fail.
-
zwol about 5 years
@pizdelect That's technically true, but technical truth is not sufficient reason to snap at someone.
-
Luaan about 5 years
@pizdelect Sorry, I meant Lisp predates UNIX (and C). But LISP machines were essentially contemporary, if you compare their commercial impact (UNIX had a head start of about three years, but by the time LISP machines came, the commercial UNIX machines were still few and far between; most of UNIX was in academia or with no support). In any case, it's a response to the common technological arguments at the time, which was the 80s, when people were actually deciding between UNIX machines and LISPMs, and they were wrong. That changed with micro-computers which could run LISP faster anyway.
-
Admin about 2 years
First rule of downvotes: don't talk about downvotes.