How can the standard input of one program be passed as an arg to another?


Solution 1

If the program can write to any file descriptor, including one that cannot seek, you can use /dev/stdout as the output file. On my system this is a symlink to /proc/self/fd/1, and file descriptor 1 is stdout.
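As a rough sketch of this approach: `two_arg_prog` below is a hypothetical stand-in (not a real utility) for any program that insists on an input file and an output file. The sketch assumes the system provides /dev/stdout and that stdout is a pipe.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a program that takes an input file and an output file.
two_arg_prog() { cat -- "$1" > "$2"; }

input=$(mktemp)
printf 'b\nc\na\n' > "$input"

# Name /dev/stdout as the "output file"; the data travels down the pipe to sort.
sorted=$(two_arg_prog "$input" /dev/stdout | sort)
printf '%s\n' "$sorted"

rm -f "$input"
```

Nothing is written to disk apart from the temporary input file created for the demonstration.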

Solution 2

From the pdftotext man page:

If text-file is '-', the text is sent to stdout.

So in this case all you need is:

pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" -

Or if you want to pipe this to STDIN of another program:

pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" - | another_prog

Using - as a substitute for a filename is a convention that many utilities (including pdftotext) follow to mean "read input from STDIN" or "write output to STDOUT". However, not all utilities follow this convention. In that case, the idiomatic way to do this in bash is to use a process substitution:

my_utility "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" >( cat )

Here the >( ) behaves largely like a file passed to my_utility, but instead of being a real file, the stream is piped into the stdin of the contained process, i.e. cat. So here, the text should ultimately output as required.
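One way to see why >( ) "behaves like a file" is to print what the shell actually hands to the command. This is only an illustrative sketch; the exact path depends on the system (on Linux it is typically something under /dev/fd).

```shell
#!/usr/bin/env bash
# bash replaces >( ... ) with a path that names a pipe to the inner command,
# so echo simply prints that path instead of writing to it.
path=$(echo >(cat > /dev/null))
printf '%s\n' "$path"
```

Because the argument is an ordinary path, any program that opens its output file by name can write to it without knowing that a pipe is behind it.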

Use of cat almost always sets off UUOC alarm bells on forums like this. I contend that if the utility does not support -, then this is a useful use of cat, though if there are any ways to do this process substitution without the cat, then I'm all ears ;-).

However, if (as the question states) the ultimate destination of the stream is STDIN of another program, then the cat can be eliminated:

my_utility "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" >( another_prog )
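A minimal runnable sketch of the cat-less form follows. Here `my_utility` is defined as a hypothetical shell function that only copies its input file to its output file, and `tr` plays the role of another_prog; both are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in: writes the contents of $1 to the file named by $2.
my_utility() { cat -- "$1" > "$2"; }

input=$(mktemp)
printf 'some text\n' > "$input"

# The "output file" is a process substitution, so tr receives the data on stdin.
result=$(my_utility "$input" >(tr 'a-z' 'A-Z'))
printf '%s\n' "$result"

rm -f "$input"
```

The command substitution is used here only to collect the output of tr in a variable.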

Solution 3

If your shell supports them, the simplest way of doing such manipulations would be to use process substitution: <(…) and >(…). This works in bash, zsh and ksh and possibly other shells. For example:

$ sort <(printf "b\nc\na\n")
a
b
c
$ ls
foo
$ cp <(find . -name foo) bar
$ ls
bar  foo

However, this won't help in the example you state, since pdftotext writes its output to a text file instead of stdout. While your best choice (apart from the obvious one of using -) is to use /dev/stdout as suggested by @TiCPU, you could also use another shell feature. In an interactive bash session, the history expansion !:N refers to the Nth argument of the previous command. Therefore, you could do:

$ pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf"  out.txt
$ cat !:2

Author by

Dziugas

Updated on September 18, 2022

Comments

  • Dziugas
    Dziugas almost 2 years

    Let's say a program exists, which takes two arguments; input file and output file.

    What if I don't wish to save this output file to disk, but rather pass it straight to stdin of another program. Is there a way to achieve this?

    A lot of commands I come across on Linux provide an option to pass '-' as the output file argument, which does what I've specified above. Is this because passing the stdin of a program as an argument is not possible? If it is, how do we do it?

    An example of how I would imagine using this is:

    pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" stdin(echo)
    

    The shell I'm using is bash.

    • mikeserv
      mikeserv almost 9 years
      cat <file | cmd /dev/fd/0 works on most unices.
    • Dziugas
      Dziugas almost 9 years
      Not working for me. Tried it with: cat < README.txt | cp /dev/fd/0. It said cp: missing destination file operand after ‘/dev/fd/0’ Try 'cp --help' for more information.
    • yaegashi
      yaegashi almost 9 years
      program input-file /dev/stdout | another-program? Also note that echo reads nothing from stdin.
    • mikeserv
      mikeserv almost 9 years
      @Dziugas - of course not - you can't cp a file nowhere. echo 1 2 3| cp /dev/fd/0 /dev/tty will print 1 2 3. And by the way, /dev/fd/[num] is more likely to work than /dev/std(in|out|err) in most cases. See Portability of File-Descriptor Links about what you can expect to work where.
    • Jorge Bucaran
      Jorge Bucaran over 8 years
      A good UNIX program would write to standard output leaving it up to the user to decide whether they wish to redirect to a file or pipe to another command.
  • Dziugas
    Dziugas almost 9 years
    This solved my query. So is there no way to do it when the program needs to seek?
  • TiCPU
    TiCPU almost 9 years
    If you're trying to prevent disk access, you can write the file in /dev/shm/, however, if you don't want any file on the filesystem, then as far as I know, there is no way to seek on a pipe. Seeking forward means it would have to buffer everything in memory until it reached that point forward, and seeking backward implies having buffered everything in memory.
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    When is cat <( command ) ever useful?  That looks like a UUOC.  I think TiCPU's answer is correct (although not spelled out clearly): pdftotext "C BY K&R.pdf" /dev/stdout.  (I guess Digital Trauma's answer would work, although it's also a UUOC.)
  • dhag
    dhag almost 9 years
    I'm not sure how this answers the question, which is about combining commands; perhaps you could expand it with an example of how you would achieve that.
  • terdon
    terdon almost 9 years
    @Scott yes, it is a UUOC but neither of my other two examples are. I very often use <() for things like diff <(sort foo) <(sort bar). As for cat <(command) specifically, I can't think of a case at the moment that couldn't be replaced by other tools but there may well be one. In any case, cat was just the example chosen by the OP.
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    I don't see where the OP chose cat.  Somebody posted a semi-answer featuring a UUOC in a comment, and the OP (who didn't understand quite how to apply it) replied that it didn't work for him.  (And, of course, I realize that commands that don't even include cat cannot be UUOCs.)
  • terdon
    terdon almost 9 years
    @Scott whoops, true, it was the echo(stdin) which I translated to cat. That's just the only way I could think of to twist the OP's example into something workable.
  • jimmij
    jimmij almost 9 years
    While I agree that cat <() can be useful in some situations, in this scenario however it is not working at all. The problem (very poorly described by OP, I must admit) is that pdftotext takes two arguments: input file and output file. If second argument is missing then it produces nothing, so cat <(pdftotext "file.pdf") would also return nothing. One can cheat pdftotext command by giving >(cat) as a second argument like Digital Trauma answered, but cat <() is pointless here. Obviously in pdftotext case it is best just to use - as the output file name.
  • terdon
    terdon almost 9 years
    @jimmij ah, I see. In that case, TiCPU's answer is probably the way to go.
  • jimmij
    jimmij almost 9 years
    I guess you are saying to check with tty the name of the terminal, and then use that file as an output, for example pdftotext file.pdf /dev/pts/2. In that case, I agree.
  • Digital Trauma
    Digital Trauma almost 9 years
    @Scott How is my answer a UUOC? How would you do this process substitution without cat? >( ) will effectively pipe the stream to whatever process is inside - so we actually do need a cat here to output that stream. Normally we should be able to do something like pdftotext input.pdf -, but apparently pdftotext doesn't support the - parameter to output directly to stdout instead of a file - try it.
  • jimmij
    jimmij almost 9 years
    @DigitalTrauma it is not a UUOC. I believe cat is the fastest you can get if you just want to print, but you can also use another command, such as >(grep something), to do something more useful. BTW, my pdftotext 3.04 does support - as an output file, so I'm a little surprised by the whole discussion.
  • Digital Trauma
    Digital Trauma almost 9 years
    @jimmij yes - I did not notice that pdftotext actually supports -. In this case I think it is a UUOC ;-) But I think the construct is still useful for utilities that don't support -.
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    @DigitalTrauma: Yeah, sorry; I was just in the process of typing a retraction.  But first: The question seems (to me) to be asking how to handle some (hypothetical) program (nominally, pdftotext) that insists on doing open(argv[1], O_RDONLY) and open(argv[2], O_CREAT|O_WRONLY), and doesn’t default to reading stdin or writing stdout (not even if given an arg of -).  And TiCPU and Digital Trauma both wrote decent answers to that question.  … (Cont’d)
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    (Cont’d) …  And I want to retract what I said in my first comment: if TiCPU’s answer doesn’t work (e.g., because /dev/stdout doesn’t exist), Digital Trauma’s answer may be the only (or at least the best) answer, and calling it a UUOC, while arguably, technically true, was a little harsh, because, while prog1 input_file >(  cat  ) | prog2 can be abbreviated to prog1 input_file >( prog2 ), the cat form is (again, arguably) clearer.
  • Stéphane Chazelas
    Stéphane Chazelas almost 9 years
    pdftotext, like many (but not all) other utilities, supports - for that as well (which would work even on systems that don't support /dev/stdout, or where /dev/stdout doesn't work as expected, like on Linux where stdout is not a pipe). pdftotext file.pdf - | wc -c
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    Of course, if you just want to display the output of prog1, you can use prog1 input_file /dev/tty, or jas's idea, prog1 input_file $(tty).
  • Digital Trauma
    Digital Trauma almost 9 years
    @Scott Thanks - These comments were useful - I've significantly revised my answer - I hope it's much more complete now.
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    And let me backpedal once more: if prog2 writes to stdout, prog1 input_file >( cat ) | prog2 is better than prog1 input_file >( prog2 ), because the cat form waits for prog2 to complete (i.e., before the shell issues the next prompt or goes on to the next command (e.g., after ; or &&)), while the cat-less form waits only for prog1 to complete.  Also, after the cat form, $? is the exit status from prog2, whereas, in the other, $? is the exit status from prog1.  (You pays your money and you takes your choice.)
  • Scott - Слава Україні
    Scott - Слава Україні almost 9 years
    That can be abbreviated/automated to prog1  input_file $(tty); which is generally going to be equivalent to prog1  input_file /dev/tty.  But this approach assumes that the goal is to display the output of prog1 (i.e., in the terminal), and that is not what the question is asking (see the comments on terdon's answer for some clarification on the meaning of the question).
  • terdon
    terdon almost 9 years
    @DigitalTrauma that's not being a stickler! That's me being an idiot. Thanks for pointing it out and please never apologize when pointing out mistakes. I would much rather have my mistake pointed out to me and so learn something than leave it there in all its dubious glory.
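
The exit-status difference Scott describes in the comments above can be checked with a small sketch. `prog1` and `prog2` are hypothetical shell functions standing in for the two programs; `prog2` is made to fail with status 3 so the two cases are easy to tell apart.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the two programs in the discussion.
prog1() { cat -- "$1" > "$2"; }           # copies input file to output file, succeeds
prog2() { cat > /dev/null; return 3; }    # consumes stdin, then fails with status 3

input=$(mktemp)
printf 'data\n' > "$input"

# cat form: prog2 is the last command of the pipeline, so $? is its status.
with_cat=0
prog1 "$input" >(cat) | prog2 || with_cat=$?

# cat-less form: prog2 runs inside the process substitution, so $? is prog1's status.
without_cat=0
prog1 "$input" >(prog2) || without_cat=$?

printf 'with cat: %s, without cat: %s\n' "$with_cat" "$without_cat"

rm -f "$input"
```

In the cat-less form the failure of prog2 goes unnoticed by the calling shell, which is worth keeping in mind when the second program's exit status matters.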