Is there a way around broken pipe?

sort: write failed: standard output: Broken pipe

The problem is not between find and sort. It is sort that fails to write its output, which means the shell is not willing to read such a long list into a variable.

You'll have to process the input with while read…, storing it in a temporary file if you need it more than once. This has the added advantage that it splits on newlines only, so it correctly handles filenames with spaces, which the backtick approach does not.
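A minimal sketch of that approach, under some assumptions: the demo tree built with mktemp and mkdir only stands in for ./I_am_a_dir_with_many_subdirs/, and process_dir is a hypothetical placeholder for whatever the script does with each directory. Note that mktemp and the -mindepth/-maxdepth options are widely available but not required by POSIX.

```shell
#!/bin/sh
# Hypothetical demo tree standing in for ./I_am_a_dir_with_many_subdirs/
top=$(mktemp -d)
mkdir -p "$top/b/two" "$top/a/one with space" "$top/a/three"

# Write the sorted directory list to a temporary file instead of a variable.
list=$(mktemp)
find "$top" -mindepth 2 -maxdepth 2 -type d | sort > "$list"

# Placeholder for the real per-directory work (an assumption, not from the question).
process_dir() {
    printf 'processing: %s\n' "${1#"$top"/}"
}

count=0
# IFS= and -r keep leading blanks and backslashes intact, so each
# line of the file arrives in $subdir unchanged, spaces included.
while IFS= read -r subdir; do
    process_dir "$subdir"
    count=$((count + 1))
done < "$list"

echo "handled $count directories"

# Clean up the list file and the demo tree.
rm -f "$list"
rm -rf "$top"
```

Because the loop reads from a redirected file rather than from a pipe, it runs in the current shell, so count is still set after the loop. The file can also be read again later if the list is needed more than once.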

Unfortunately, since you don't say how you want to use the result, I can't tell you exactly how to rewrite it.

Note that arrays are not part of the POSIX shell specification, and there are shells that are noticeably faster than bash but don't have them. That's why many people, including me, often avoid using them in scripts.
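If the script only ever runs under bash and an array really is wanted, one possible sketch uses the mapfile builtin (bash 4 or newer) instead of backticks. This is an illustration rather than a tested fix for the error in the question; the demo tree is made up. Like while read, it splits on newlines only, so names with spaces stay in one element.

```shell
#!/usr/bin/env bash
# Requires bash 4 or newer for the mapfile builtin.
# Hypothetical demo tree standing in for ./I_am_a_dir_with_many_subdirs/
top=$(mktemp -d)
mkdir -p "$top/b/two" "$top/a/one with space" "$top/a/three"

# mapfile stores one array element per input line; -t strips the trailing newline.
# The process substitution keeps mapfile in the current shell, so the array survives.
mapfile -t SubdirsArray < <(find "$top" -mindepth 2 -maxdepth 2 -type d | sort)

echo "found ${#SubdirsArray[@]} directories"

# Quoting "${SubdirsArray[@]}" keeps each element whole, spaces included.
for subdir in "${SubdirsArray[@]}"; do
    printf '%s\n' "${subdir#"$top"/}"
done

rm -rf "$top"
```

This is not portable to ash/dash, so the while read form remains the safer choice for scripts that may run under another shell.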


Author: user1541776

Updated on September 18, 2022

Comments

  • user1541776
    user1541776 over 1 year

    I have a directory with a large number of files.

    ./I_am_a_dir_with_many_subdirs/
    

    Within a script, I'd like to find all subdirs in it, sort them, and store the result in a bash array. So I do:

    SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort`)
    

    Executing the script, I get the following error messages:

        sort: write failed: standard output: Broken pipe
        sort: write error
    

    As explained in this post: probably sort executes and closes the pipe before find finishes writing to it. The write() call issued by find then fails with EPIPE ("Broken pipe"), and the OS sends find a SIGPIPE. Before the SIGPIPE reaches find, it prints the error message, then receives SIGPIPE and dies.

    Questions:

    1. So, what does my SubdirsArray contain? The subdirs that find found, but sort left unsorted?

    2. If so, then what would be the way around this issue with broken pipes? Make find write its results to a temporary file and then make sort read it?

      I don't understand why "it's also nothing to be concerned about" if it happens within a non-interactive shell. My SubdirsArray contains something unsorted, and further on in the script I assume that its elements are sorted!

    3. I get two error messages:

      sort: write failed: standard output: Broken pipe
      sort: write error
      

    In this thread it is suggested that sort doesn't have enough space in a temporary directory to sort all the input. But doesn't that mean that sort got something from find? I'm confused... Anyway, I tried to use

    SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort -T /home/temp_dir`)
    

    but it didn't help.

    P.S.

    I'm not sure whether it's important, but I use find|sort in a multi-processor script: several processors execute the same command at once in the subshells.

    • Jan Hudec
      Jan Hudec about 10 years
      sort can't do anything before it has read the input in full, and besides, if it were sort ending prematurely, it would be find reporting the broken pipe, not sort. The error in the other thread you mention looks very different and is indeed different.
    • user1541776
      user1541776 about 10 years
      @JanHudec thank you for pointing that out, I didn't pay attention to which command reported the problem.
  • user1541776
    user1541776 about 10 years
    Jan, thank you for the answer and the comment. I'd like to use SubdirsArray in a for loop. So, I will implement your solution like this: find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort > temp.txt; while read Subdir; do myFunction $Subdir; done; rm temp.txt In the end, I'd like to apply myFunction to all Subdirs. To do it faster, I try to parallelise my code and use N subshells with wait. Each subshell should take only its part of the Subdirs. I didn't want to send each subshell a long array of the Subdirs it should handle, but only the first/last index in SubdirsArray.
  • Jan Hudec
    Jan Hudec about 10 years
    @user1541776: Don't forget to redirect input into the loop. It could be done even without a temporary file, but probably not if you want to split it first.
  • user1541776
    user1541776 about 10 years
    you mean find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort > temp.txt; while read Subdir; do myFunction $Subdir; done < "temp.txt" ; rm temp.txt ?
  • Jan Hudec
    Jan Hudec about 10 years
    @user1541776: Yes, exactly. read just reads from standard input.
  • Jan Hudec
    Jan Hudec about 10 years
    @user1541776: You can also pipe to the loop, but it will then run in a subshell. Or, in a shell that has it (like bash, but not ash/dash), you can use process substitution, i.e. like <(find ... | sort).
  • user1541776
    user1541776 about 10 years
    thank you. So, the final version would be 1. piping to the loop: find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort | while read Subdir; do myFunction $Subdir; done or alternatively 2. using process substitution while read Subdir; do myFunction $Subdir; done <(find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort)
  • Jan Hudec
    Jan Hudec about 10 years
    @user1541776: Yes.
  • user1541776
    user1541776 about 10 years
    Thanks, Jan! Bugfix: while read Subdir; do myFunction $Subdir; done < <(find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort)
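The practical difference between the two final variants discussed in the comments can be sketched as follows. This is a bash-only illustration on a made-up demo tree; the counters piped and substituted are hypothetical names used just to make the subshell effect visible, and the loop bodies stand in for the myFunction call.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree standing in for ./I_am_a_dir_with_many_subdirs/
top=$(mktemp -d)
mkdir -p "$top/a/one" "$top/a/two" "$top/b/three"

# Variant 1: piping into the loop. With bash's default settings the loop
# runs in a subshell, so changes to variables are lost once it ends.
piped=0
find "$top" -mindepth 2 -maxdepth 2 -type d | sort | while IFS= read -r Subdir; do
    piped=$((piped + 1))
done
echo "after pipe: $piped"                           # prints 0 in bash

# Variant 2: process substitution. The loop runs in the current shell,
# so the counter keeps its value after the loop.
substituted=0
while IFS= read -r Subdir; do
    substituted=$((substituted + 1))
done < <(find "$top" -mindepth 2 -maxdepth 2 -type d | sort)
echo "after process substitution: $substituted"     # prints 3

rm -rf "$top"
```

When calling a function on each entry, quoting the variable as "$Subdir" keeps directory names with spaces intact, which the unquoted $Subdir in the comments above would split.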