Why is cp's option not to overwrite files called --no-clobber?

Solution 1

“Clobber” in the context of data manipulation means destroying data by overwriting it. In the context of files in a Unix environment, the word was used at least as far back as the early 1980s, possibly earlier. Csh had set noclobber to make > refuse to overwrite an existing file (later set -o noclobber in ksh93 and other sh-style shells). When GNU coreutils added --no-clobber (in 2009), it used the same vocabulary that shells were already using.
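
To make the shell side concrete, here is a minimal sketch of noclobber in an sh-style shell. It assumes a POSIX-like sh with mktemp available; the scratch directory and the file name notes.txt are made up for illustration.

```sh
#!/bin/sh
# Sketch: how the shell's noclobber option guards the > redirection.
workdir=$(mktemp -d)          # scratch directory (illustrative)
target="$workdir/notes.txt"   # hypothetical file name

echo "first" > "$target"      # creates the file

set -o noclobber              # equivalent to 'set -C' in POSIX sh

# With noclobber set, > refuses to overwrite an existing file,
# so the redirection fails and the file keeps its old contents.
if { echo "second" > "$target"; } 2>/dev/null; then
    redirect_result="overwritten"
else
    redirect_result="refused"
fi
after_refusal=$(cat "$target")

# The >| operator deliberately overrides noclobber.
echo "third" >| "$target"
after_override=$(cat "$target")

set +o noclobber              # restore the default behaviour
rm -r "$workdir"

echo "$redirect_result / $after_refusal / $after_override"
```

The >| form is the escape hatch: it lets a script clobber on purpose while leaving noclobber on as a safety net for everything else.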

Solution 2

Because this is actually a standard term. As explained in Wikipedia:

In software engineering, clobbering a file or computer memory is overwriting its contents. The Jargon File defines clobbering as

To overwrite, usually unintentionally: "I walked off the end of the array and clobbered the stack." Compare mung, scribble, trash, and smash the stack.

As mentioned on the same page, bash and other shells also use the term in their set -o noclobber or equivalent. This is just the standard term for this sort of thing, so it was a natural choice for the developers of cp.
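
For comparison with the shell option, here is a small sketch of what -n (--no-clobber) changes for cp. The file names are hypothetical, and the exit status and any warning printed when cp skips a file may differ between cp versions, so the example ignores both and only looks at the resulting file contents.

```sh
#!/bin/sh
# Sketch: cp overwrites by default; -n (--no-clobber) leaves the target alone.
workdir=$(mktemp -d)              # scratch directory (illustrative)
src="$workdir/new.txt"            # hypothetical file names
dst="$workdir/existing.txt"

echo "new data" > "$src"
echo "old data" > "$dst"

# With -n, the existing destination is not overwritten.
# Exit status and warnings for a skipped file may vary between
# cp versions, so both are ignored here.
cp -n "$src" "$dst" 2>/dev/null || true
after_no_clobber=$(cat "$dst")

# Default behaviour: the destination is clobbered.
cp "$src" "$dst"
after_plain_cp=$(cat "$dst")

rm -r "$workdir"
echo "$after_no_clobber -> $after_plain_cp"
```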

Solution 3

The term "clobber" is well-known in computing in general.

The --no-clobber/-n option for cp was only added on 2009-01-14 by Kamil Dudka (commit on GitHub).

Specifically within the GNU project, it's also used in GCC to describe when a CPU instruction or inline asm statement destroys the contents of a register. So it's not a randomly chosen English word, and it's not unlikely that people working on GNU projects written in C would be at least familiar in passing with the term from the GCC docs, or from other GNU project developers using it:

  • (clobber x) in GCC-internals machine description files that teach GCC what each instruction in an ISA does. (Similar constraints to inline-asm)
  • GNU C Extended Asm inline asm() statements have a "clobber" section to tell the compiler which registers the inline asm template steps on, like this useless, nonsensical x86 example:
    asm("xor %%eax,%%eax; mfence" ::: "eax", "memory", "cc");
    See e.g. an SO Q&A asking about a function-calling convention in those terms.
  • GCC docs for -fcall-used-reg describe it as telling the compiler that a given register is "clobbered" by function calls (i.e. tweaks the calling convention). As opposed to -fcall-saved-reg or -ffixed-reg.
  • GCC -Wclobbered warning - "Warn for variables that might be changed by longjmp or vfork." (IDK if this existed in 2009, but it demonstrates that this word gets used to describe this sort of thing in various contexts including option names in other programs).

The author of the coreutils commit that added --no-clobber, Kamil Dudka, is definitely familiar with GCC internals: he (later?) wrote a GCC plugin for formal verification of C programs.

I don't know whether GCC internals influenced his choice of name, or if that came from existing shell options like set noclobber, or both.

Fun fact: the original authors of GNU cp include Torbjörn Granlund, principal author of the gmplib project (GNU Multi-Precision), who also helped invent/implement GCC's multiplicative-inverse optimization for division by a compile-time constant (1994 paper, Stack Overflow Q&A).

TZubiri
Author by

TZubiri

Updated on September 18, 2022

Comments

  • TZubiri
    TZubiri almost 2 years

    cp is a massively popular Linux tool maintained by the coreutils team of the GNU foundation.

    By default, files with the same name will be overwritten. If the user wants to change this behaviour, they can add --no-clobber to their copy command:

       -n, --no-clobber
              do not overwrite an existing file (overrides a previous -i option)
    

    Why not something like --no-overwrite?

    • Paulo Tomé
      Paulo Tomé over 4 years
      The --no-clobber option is not specified in POSIX. It is specific to the GNU implementation.
    • TZubiri
      TZubiri over 4 years
      Are you implying that it is therefore not considered a relevant question?
    • Paulo Tomé
      Paulo Tomé over 4 years
      On the contrary, my opinion is that this is a relevant question. The purpose of the comment is to provide context to the question with relevant information.
    • ilkkachu
      ilkkachu over 4 years
      Probably from the same source as the name noclobber in the shell's set builtin. But I don't know the timeline.
    • Paulo Tomé
      Paulo Tomé over 4 years
      The -n, --no-clobber option was introduced in coreutils version 7.1 at the beginning of 2009 (2009-01-14, Kamil Dudka).
    • Paulo Tomé
      Paulo Tomé over 4 years
      The mv command also has a -n, --no-clobber option, introduced at the same time.
    • Stefan Skoglund
      Stefan Skoglund over 4 years
      Nitpick: the coreutils team maintains the GNU version of cp. The different proprietary Unixes all have their own version of cp (though it can be licensed), so they do their own maintenance.
    • Simman
      Simman over 4 years
      As a native British English speaker, particularly from the British Midlands, to "clobber" usually means to hit or break. So there is a usage similarity in the sense clobbering something is undesirable. That said, I hadn't appreciated the technical definition... but I'll be using it in future.
    • mcalex
      mcalex over 4 years
      Strictly speaking, it doesn't overwrite (ie, open the existing file and replace each of the characters with new characters, deleting any remaining chars at end of replace process). The original file is gone and the new one takes its place.
    • leinaD_natipaC
      leinaD_natipaC over 4 years
      @TomasZubiri Welcome to the magical world of computer science etymology. I hope you come to love it as much as I do!
    • mustaccio
      mustaccio over 4 years
      +1 for "massively popular"
    • TZubiri
      TZubiri over 4 years
      When I initially asked the question, I suspected that cp was bigger than Linux and GNU. Now I know that even OpenBSD has cp, and probably lots of other systems as well. If someone with a clearer understanding of what cp is could edit the first sentence, that would be appreciated.
    • Thorbjørn Ravn Andersen
      Thorbjørn Ravn Andersen over 4 years
      @TomasZubiri cp was a command introduced very early in Unix. Hence it has undergone the same forking, cloning and aging as Unix itself.
    • David42
      David42 over 4 years
      I live in New England, and here "clobber" is sometimes used to describe the damage from storms, as in "Wow, we got really clobbered by that storm!" To me, clobbering a file summons up a picture of sweeping through it and smashing it to bits.
  • JdeBP
    JdeBP over 4 years
    The C shell had it in the 1980s, and at least one contemporary source bemoans it not being in the Korn shell of the time. (-: It is documented in the Andersons' The UNIX C shell field guide which was published by Prentice Hall in 1986 (ISBN 9780139374685) so probably had existed for a while before that.
  • JdeBP
    JdeBP over 4 years
    A quick check of the source in Diomidis Spinellis's archive reveals that noclobber was in the C shell from 2BSD in 1979, so one would have to find out where Bill Joy got the word from.
  • siliconrockstar
    siliconrockstar over 4 years
    Always wondered if 'clobber' was inspired by The Thing from The Fantastic Four, a la 'It's clobberin' time!'
  • Ross Presser
    Ross Presser over 4 years
    clobber (in this sense) goes back at least to 1941. Marvel coopted an existing word. I suppose it's possible that Bill Joy was a fan, or something, but it doesn't seem likely to be the sole reason for using the word.