Is there a difference between read, head -1, and sed 1q?

bash shell sed head read

11,976

Solution 1

Neither efficiency nor builtinness is the biggest difference. All of them will return different output for certain input.

head -n1 will provide a trailing newline only if the input has one.
sed 1q will provide a trailing newline even if the input lacks one in some implementations (GNU sed leaves an unterminated last line as it is), but otherwise preserves the input.
read will never provide a trailing newline, and will interpret backslash sequences unless -r is given. It also strips leading and trailing IFS whitespace.
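The backslash handling is easy to check. The following is a minimal sketch, assuming a POSIX shell; the sample string and the variable names are made up for illustration.

```shell
input='a\tb'   # a literal backslash, and no trailing newline

# read without -r treats the backslash as an escape character and drops it;
# with -r the line is stored as-is.
plain=$(printf '%s' "$input" | { read line; printf '%s' "$line"; })
raw=$(printf '%s' "$input" | { read -r line; printf '%s' "$line"; })

# head passes the first line through unchanged.
viahead=$(printf '%s' "$input" | head -n 1)

printf 'read:    %s\nread -r: %s\nhead:    %s\n' "$plain" "$raw" "$viahead"
```

Only the plain read loses the backslash. Note that command substitution strips trailing newlines itself, so the newline differences only show up when the output is used directly.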

read also has further options, such as field splitting, timeouts, and input history. Some of these are standard and others vary between shells.

Solution 2

Builtins exist so that the shell can run a command without starting a separate process, which makes them faster. So, I believe the read command is a builtin in order to be more efficient.

Quoting from here,

These builtin commands are part of the shell, and are implemented as part of the shell's source code. The shell recognizes that the command that it was asked to execute was one of its builtins, and it performs that action on its own, without calling out to a separate executable. Different shells have different builtins, though there will be a whole lot of overlap in the basic set.
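To see which of the commands your own shell treats as a builtin, command -v is a portable check. This is a small sketch; the variable names are only for illustration, and the path reported for head depends on the system.

```shell
# command -v prints just the name for a builtin
# and a path for an external utility.
read_kind=$(command -v read)
head_kind=$(command -v head)

printf 'read -> %s\nhead -> %s\n' "$read_kind" "$head_kind"
```

Efficiency aside, read has to assign a variable in the current shell, which a separate process could not do.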

Now, I suggest you try this yourself, so that you can see why read is present as a shell builtin.

Normally, you can't run strace on a shell builtin directly. However, there is a workaround, which is explained pretty neatly in this answer.

In the first shell, run stty -echo.
Open another shell and run cat | strace bash > /dev/null.
The shell now waits for you to type commands, and as you type them you can see what happens at the system-call level.
When you run the three commands from the question, you can see that read makes fewer system calls than the other two. I am not pasting the strace output because it is quite long.

Solution 3

For one thing, you can parse text with read, not just take a whole line:

echo "foo:bar:baz" | {
  # IFS=: makes read split the line on colons
  IFS=: read -r one two three
  echo "$two"
}
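If the variables are needed after the read, a here-document can replace the pipe, so that read runs in the current shell (in bash, by default, every part of a pipeline runs in a subshell). This is a sketch using the same sample string.

```shell
# The IFS=: prefix applies only to this one read command.
IFS=: read -r one two three <<EOF
foo:bar:baz
EOF

# The variables are still set here, outside any braces.
printf '%s\n' "$two"
```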


Mr. DROP TABLE


Updated on September 18, 2022

Comments

Mr. DROP TABLE over 1 year
The following commands seem to be roughly equivalent:
```
read varname
varname=$(head -1)
varname=$(sed 1q)
```
One difference is that read is a shell builtin while head and sed aren't.

Besides that, is there any difference in behavior between the three?

My motivation is to better understand the nuances of the shell and key utilities like head and sed. For example, if using head is an easy replacement for read, then why does read exist as a builtin?
Caleb Eklund over 9 years

Of course you can use read varname outside a shell script. Try it! There are very few things you can do in a script that you can't do at a prompt in the "sh" family of shells (whether they are very useful is a different matter).
Caleb Eklund over 9 years

Also, the read builtin in /bin/sh predates head and probably sed.
mikeserv over 9 years

You don't parse it with read, exactly, you parse it with $IFS. IFS=:; set -- ${0+foo:bar:baz}; echo "$2" accomplishes the same - and it requires no |pipes or cloned subshells. Or: IFS=:; printf %.0s%s%.0s\\n ${0+foo:bar:baz}. Else: echo foo:bar:baz | { IFS=:; set -- $(cat); echo "$2"; } if you must have the pipe. In any case, the splitting is a result of your setting $IFS; not necessarily that you read it.
Angel Todorov over 9 years

True. read assigns the split parts to variables.
Henk Langeveld over 9 years

tl;dr: read is a builtin and skips the overhead and bookkeeping involved with starting a new process.
Mr. DROP TABLE over 9 years

Thanks. These behavior differences are what I was looking for. I also like knowing, as @glenn-jackman pointed out, that read can parse parameters separated by the internal field separator $IFS. @ramesh's strace trick is also an awesome way to analyze differences.