Why are UNIX/POSIX system call namings so illegible?

history posix system-calls

Solution 1

It's due to the technical constraints of the time. The POSIX standard was created in the 1980s and was based on UNIX, which was born in the 1970s. Several C compilers and linkers of that era treated only the first 6 or 8 characters of an identifier as significant, and that set the convention for the length of variable and function names.

Solution 2

dr_ is right, but there's also another reason - usability. Back in the day, you didn't have something as comfortable as a keyboard to type on. If you were lucky, you had something akin to an old-school typewriter. If you were unlucky, you had to deal with systems that required actual physical work to operate (as in, it took a lot of force to press the "key"), or you manually punched holes in a card.

This meant that even within the 6-8 character limit, you tried to keep your commands as short as possible. That's why you have ls instead of list, and creat instead of create. Code from that era is full of variables like a, x and i - and of course, x2 and friends. Typing was a lot of work. Today, typing listIndex takes less effort than "typing" i did back then, and it isn't even much slower anymore (especially with aids like auto-completion).

The real question is - why do so many Unix idioms persist even though they're no longer desirable?

Solution 3

In addition to the other answers, I would like to point out that Unix was developed as a reaction to Multics, CTSS, and other contemporary operating systems, which were significantly more verbose about their naming conventions. You can get a feel for these OSes at http://www.multicians.org/devdoc.html. For example, http://www.multicians.org/mspm-bx-1-00.html gives change_name as the command for renaming a file; compare Unix mv.

Also, the principal reason why the very short system call names persist is backward compatibility. You will notice that newer APIs tend to be more explicit; e.g. gettimeofday and clock_gettime instead of just time.
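
To make the naming contrast concrete, here is a minimal C sketch that calls the old, terse interface next to the newer, more explicit one. The helper names read_seconds and read_wall_clock are made up for this illustration; only time, clock_gettime and CLOCK_REALTIME come from POSIX.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declarations of clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* The original, terse interface: whole seconds since the Epoch. */
static time_t read_seconds(void)
{
    return time(NULL);
}

/* Hypothetical helper (not part of POSIX): reads the wall clock through the
   newer, more explicitly named interface. Returns 0 on success, -1 on error. */
static int read_wall_clock(struct timespec *out)
{
    assert(out != NULL);                      /* caller must supply storage */
    return clock_gettime(CLOCK_REALTIME, out);
}
```

Both report the same clock; the newer call simply says which clock it reads and adds nanosecond resolution in the tv_nsec field.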

(Even today, using whateverIndex instead of i for a loop index is an automatic code-review failure in my book ;-)

Solution 4

Dennis Ritchie set himself a constraint with C that it wouldn't rely on any linker features that weren't also required by Fortran. Hence the 6 character limit on external names.

Benjoyo

Updated on September 18, 2022

Comments

Benjoyo almost 2 years

What is the reason to use such untelling system call names like time and creat instead of getCurrentTimeSecs and createFile or, maybe more suitable on Unix get_current_time_secs and create_file. Which brings me to the next point: why should someone want something like cfsetospeed without camel case or at least underscores to make it readable? Of course the calls would have more characters but we all know that readability of code is more important right?
- Admin almost 9 years
  
  Because they were invented decades before Hungarian notation, camel case, snake case, and the like became fashionable. Also because back then compilers had very few resources, and identifier names were limited to (IIRC) 8 characters.
- Admin almost 9 years
  
  Related: unix.stackexchange.com/questions/10893/…, unix.stackexchange.com/questions/9832/…
- Admin almost 9 years
  
  @lcd047: First result on my google search to falsify your comment: unix.com/unix-for-dummies-questions-and-answers/… . Also, do you have any backup to your statement about notations? In the Bundeswehr, e.g. CamelCase was always in use.
- Admin almost 9 years
  
  @phresnel: Take note that your link talks about the limitations of the first Unix filesystem. When Thompson, Ritchie et al. were designing Unix, they had to bootstrap Unix on machines that did not run Unix yet, i.e. in probably even more constrained environments.
- Admin almost 9 years
  
  With "Always" I meant always literally, for as long as the Bundeswehr exists (1955 a.d.). Looks like I misinterpreted "fashionable", pardon.
- Admin almost 9 years
  
  @DevSolar: Yeah, right. My apologies.
- Admin almost 9 years
  
  You might as well ask why they're not written in German, because that's about the closest natural-language approximation I can think of for the over-long monstrosities Java has taught programmers to accept...
- Admin almost 9 years
  
  I'm so happy it is like this. Imagine what ls -la | grep would look like: listAllHiddenAndNormalFiles() | globallySearchARegularExpressionAndPrint().
- Admin almost 9 years
  
  This is why.
- Admin almost 9 years
  
  @Pouya no need to hyperbolize, also didn't mention the shell but sys calls.
- Admin almost 9 years
  
  I read somewhere that Unix was originally used with a 110 baud teletype, i.e. print around 10 characters a second. This is why unix is so terse.
- Admin almost 9 years
  
  And "untelling"? To who?
- Admin almost 9 years
  
  @ThorbjørnRavnAndersen to a dev who doesn't know the linux/unix kernel yet?
- Admin almost 9 years
  
  @Benjoyo they did not write the kernel for being easy to read for somebody else - they wrote it to be easy to read for themselves. You must remember that C was designed to be portable assembly for implementing Unix on bare metal. Seeing it in any other light does not do the designers justice.
- Admin almost 9 years
  
  @ThorbjørnRavnAndersen I neither criticize C nor the kernel source code. The sys calls are also the API for external devs and even if they weren't, there is no need to design their names badly. Unless there are technical reasons which seems to be the case.
- Admin almost 9 years
  
  Well, if you ask why they designed their names "badly" you are actually criticizing.
- Admin almost 9 years
  
  Ken Thompson was once asked what he would do differently if he were redesigning UNIX. His answer: "I'd spell creat with an e."
phresnel almost 9 years

I don't think "technical constraints" applies. You could have files larger than 6 bytes, programs spanning thousands of lines of code, and abstract syntax trees far deeper than six levels, with far more nodes per level. By that reasoning, the 6-character limit can't be a technical one; it must have been a design choice. And it does not explain why "creat" is better than "create". Also, can you please name the several C compilers you are talking about? Your answer really reads like "heard somewhere somewhen".
Petr almost 9 years

Ok I still would like to know why "creat" is better than "create" :P I know it's not easy to answer but if someone knows
Vicky almost 9 years

@phresnel: He's not talking about file size, number of lines, or syntax tree depth. Old versions of the C language did not require compilers and linkers to keep more than the first 31 characters of identifiers with internal linkage, or more than 6 for external identifiers. Thus, get_current_date() and get_current_time() couldn't be told apart by some of these early toolchains. The reason was that these systems were working on tiny footprints of a few kilobytes.
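
A minimal C sketch of that clash, assuming a toolchain that compares only the first N characters of each name. The helper names_collide is hypothetical and only models the comparison; it is not taken from any real compiler or linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when two identifiers are indistinguishable
   to a toolchain that keeps only the first `significant` characters. */
static bool names_collide(const char *a, const char *b, size_t significant)
{
    assert(a != NULL && b != NULL);
    /* strncmp stops at `significant` characters or at the first NUL,
       whichever comes first, which is exactly the truncated comparison. */
    return strncmp(a, b, significant) == 0;
}
```

With 6 significant characters, get_current_date and get_current_time both reduce to get_cu and collide; with 31 they stay distinct. creat and create are still told apart at 6, since the sixth character differs.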
Vicky almost 9 years

But you're right on creat(). Ken Thompson was once asked what he would do differently if he were redesigning the UNIX system. His reply: "I'd spell creat with an e."
dr_ almost 9 years

About Ken Thompson and creat(): see unix.stackexchange.com/questions/10893/…
Vicky almost 9 years

And besides, Unix actually predates the first C compiler -- it was written in assembly first, then rewritten in C... ;)
phresnel almost 9 years

@dr01: "Apples and Oranges": I don't think so. If you can build structures that deep, then surely implementing identifiers with more than 6 or 8 characters should pose no problem. Thx for the links; maybe use them in your answer.
Random832 almost 9 years

@Petr C functions would have their name prefixed with _, so _creat. He could still have named the function "create", but the extra characters might have been ignored (though IIRC the actual limit on PDP-11 Unix was 8 characters, and 6 comes from some other system).
Vicky almost 9 years

@phresnel: Having only limited memory is not a hard limit. Having only a limited guaranteed supported length of identifiers is. If you're only guaranteed 6 significant characters, that's what you're working with if you're worth your salt.
yorkshiredev almost 9 years

The day my kernel changes from time to getCurrentTimeSecs or something like that, I'll just stop upgrading it. Even with my comfortable keyboard and recent hardware, these names remain extremely convenient and simple (simplicity being one of UNIX's fundamentals). I really don't feel the need to bring that kind of Java/C#-style naming into the C language, let alone in a Linux kernel. IMO, from the perspective of a kernel developer, or UNIX developer in general, these idioms are nothing close to undesirable.
Benjoyo almost 9 years

@John WH Smith but they are really undesirable for anyone who is not used to this environment. For me it is just ugly compared to the Java method naming style where you most probably know what it is doing without looking into any doc.
muru almost 9 years

@Benjoyo unRootlyLongNamed.Packaged.nonsensicalFunction is ugly to me, and I'd rather be sure what it does by doing man 2 time than guess at what it seems to do.
yorkshiredev almost 9 years

@Benjoyo Well, anyone who is not used to this environment isn't supposed to use system calls in the first place, since they are specifically meant for those who are. (Standard) libraries are here for the others. UNIX does not follow these fashionable design "rules" that would make it easily understandable at first glance. Those who use it without looking into any doc are in for a lot of trouble, and few people in the UNIX community would consider that a problem to be solved.
Benjoyo almost 9 years

@JohnWHSmith sure kernel mode developers will know their system and its docs but that is no reason for not having concise and telling names.
IhtkaS almost 9 years

@JohnWHSmith As a lowly dev who only ever wrote kernel drivers and prefers meaningful names that are not full of abbreviations I have to disagree. But that's alright, because if you look for example at the original git source code, you'll find at least one kernel dev agreeing with me. Although if you tell Linus that his get_X or remove_file_from_cache (might I propose rmfc?) are undesirable to kernel developers, please do it publicly - I'd love to see his reaction.
Luaan almost 9 years

Yeah. It's funny hearing the technological argument about hardware capabilities when LISP Machines predate UNIX. Sure, UNIX machines were cheaper to buy, but that's about it. Adding maintenance costs (which of course don't count in the *nix land) turned the tables even then, but it wasn't a persuasive enough argument anyway. (And yup, i is fine for an index when you're, say, iterating an array. Coordinates? Use x and y. Traversing some ordinal? Be descriptive.)
tpg2114 almost 9 years

FWIW, the old Fortran standards limited identifiers to 6 characters. From this book: "The Fortran rule of six characters in one identifier stems from the fact that six characters could be represented in one IBM 704 word." I can't speak for C, but I imagine the limitation has a very similar origin (or perhaps an identical one).
phresnel almost 9 years

@PeterCordes: Yes. A decision. That was all I wanted to say :)
Peter Cordes almost 9 years

The reason everyone was arguing with you is that those decisions had already been made BEFORE Unix was designed (because it was initially written on other platforms, not with its own tools). Thus, it was a lot easier for Unix to follow those rules, instead of writing new tools before starting on Unix. Also, limiting identifier length to save RAM and complexity still made sense at that point, and probably didn't feel like a big limitation.
Dolda2000 almost 9 years

@Benjoyo: The relatively short time spent learning what things do is not worth sacrificing usability during the much longer time that one keeps using them for.
celtschk almost 9 years

@JohnWHSmith: Don't worry, your kernel will never switch from time to anything. That's because the kernel doesn't take identifiers, it takes function call numbers (in the case of time, it's 13). The names are in the corresponding header files/userland libraries (and also in the kernel source, I guess), but not in the actual kernel calling code; to call a kernel function you load a specific register (on x86: eax) with the function number, set up the other arguments (often also in registers) and do a syscall (on systems without syscall instruction, a software interrupt is usually used).
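
A minimal sketch of calling by number, assuming Linux with glibc, where the syscall() wrapper takes the call number as its first argument. It uses getpid rather than time because SYS_time is not defined on every architecture; pid_by_number is a made-up helper name.

```c
#define _GNU_SOURCE            /* exposes syscall() in glibc; Linux-specific assumption */
#include <assert.h>
#include <sys/syscall.h>       /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>            /* syscall(), getpid() */

/* syscall() returns long, so make sure a pid_t fits before casting. */
static_assert(sizeof(pid_t) <= sizeof(long), "pid_t must fit in long");

/* Hypothetical helper: fetch the process ID by system call number alone.
   The name "getpid" appears only in the SYS_getpid constant from the
   headers; what reaches the kernel is the number. */
static pid_t pid_by_number(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Calling pid_by_number() should give the same value as the named getpid() wrapper, which is the point: the readable name lives in userland headers and libraries.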
ninjalj almost 9 years

@Voo: Well, you could make the point that system functions should have shorter names than functions specific to a single program. But maybe I have been braindamaged forever by Perl (not that I care much, I started coding in BASIC, so I was already braindamaged, according to some people).
IhtkaS almost 9 years

@ninjalj Because functions that are used by every program out there should be particularly badly descriptive? If anything it should be the other way around. Again this is pretty non-controversial - to cite the kernel coding guide: descriptive names for global variables are a must. [...] If you have a function that counts the number of active users, you should call that "count_active_users()" or similar, you should _not_ call it "cntusr()". Yes we're stuck with the old names from the 70s for backcomp, but for new APIs go with descriptive long ones.
user207421 over 7 years

@downvoter You may not agree with Dennis Ritchie about this, but that's what he did. Taking it out on this answer is futile.
pizdelect about 5 years

@Luaan LISP machines do NOT predate Unix. You fail.
zwol about 5 years

@pizdelect That's technically true, but technical truth is not sufficient reason to snap at someone.
Luaan about 5 years

@pizdelect Sorry, I meant Lisp predates UNIX (and C). But LISP machines were essentially contemporary, if you compare their commercial impact (UNIX had a head start of about three years, but by the time LISP machines came, the commercial UNIX machines were still few and far between; most of UNIX was in academia or with no support). In any case, it's a response to the common technological arguments at the time, which was the 80s, when people were actually deciding between UNIX machines and LISPMs, and they were wrong. That changed with micro-computers which could run LISP faster anyway.
Admin about 2 years

First rule of downvotes: don’t talk about downvotes.