`xargs` with spaces in filenames

Solution 1

Using xargs, it can be done in this way:

find . -type f -print0 | xargs -0 file | grep -v 'image' 
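To see what the null delimiter changes, here is a small sketch that swaps file for printf so that every argument xargs passes along becomes visible. The function names show_args_default and show_args_null, and the directory argument, are made up for this illustration.

```shell
#!/bin/sh
# Print each argument that xargs hands to the command, one per line, in
# brackets. printf stands in for file so argument boundaries are visible.

# Default xargs: input is split on blanks and newlines.
show_args_default() {
    (cd "$1" && find . -type f | xargs printf '[%s]\n')
}

# find -print0 with xargs -0: input is split on NUL bytes only.
show_args_null() {
    (cd "$1" && find . -type f -print0 | xargs -0 printf '[%s]\n')
}
```

For a directory that contains only the file "Sun Ra.mp3", show_args_default prints [./Sun] and [Ra.mp3] on separate lines, while show_args_null prints the single line [./Sun Ra.mp3].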

But xargs is so yesterday. The cool kids use parallel today. Using parallel, it would be:

find . -type f | parallel file | grep -v 'image'

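See: no -print0 and no -0. parallel reads one argument per input line, so spaces in file names are not a problem. File names containing newlines would still need find -print0 together with parallel -0.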

UPDATE

For listing only the most recent 500 files, your command would be:

ls -1t | head -500 | parallel file {} | grep -v image

Important

In case your parallel is old and the above syntax doesn't work, install a newer version of parallel as explained here: http://www.gnu.org/software/parallel/parallel_tutorial.html

Solution 2

Use find with the -print0 option and pipe the output to xargs with the -0 option.

Even though I know (and use) this technique, I see that user @Jens has answered a similar question, where you can find more details:

https://stackoverflow.com/questions/16758525/use-xargs-with-filenames-containing-whitespaces

Solution 3

For generic advice regarding processing of file names potentially containing spaces, see "Why does my shell script choke on whitespace or other special characters?"

The difficulty with what you're trying to do is that there's no nice way to list the N most recent files with standard tools.

The easiest way to do what you're doing here is to use zsh as your shell. It has glob qualifiers to sort files by date: om sorts by modification time, newest first, and [1,500] keeps the first 500 matches. To run file on the 500 most recent files:

file *(om[1,500])

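With the Linux file utility, pass the --mime-type option to get output that's easier to parse. (The -i option prints the MIME type too, but follows it with the character encoding, so the pattern below would no longer match.) Image files are identified by lines ending with image/something. The following command prints the names of the files identified as images: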

file --mime-type *(om[1,500]) | sed -n 's~: *image/[^ ]*$~~p'
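The sed command above keeps the image files, whereas the question asks for the non-image ones. A possible inverted filter is sketched below. The function name non_image_names is invented here, and the input is assumed to be the "name: type/subtype" lines that file --mime-type prints.

```shell
#!/bin/sh
# Read "name: type/subtype" lines on stdin (the format assumed from
# file --mime-type) and print only the names whose type is not image/*.
non_image_names() {
    # \~...~ is an address with ~ as the regex delimiter; ! negates it, so
    # the substitution strips the ": type" suffix on non-image lines only.
    sed -n '\~: *image/[^ ]*$~!s~: *[^ ]*$~~p'
}
```

In zsh this could be used as file --mime-type -- *(om[1,500]) | non_image_names. Like the original pattern, it works line by line, so file names containing newlines are not handled.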

If you need to cope with absolutely all file names, including those with a newline in their name, use null-delimited output: file has a -0 option that writes a null byte after each file name, and recent versions of GNU sed accept -z to use null bytes as the record delimiter instead of newlines.

file --mime-type -- *(om[1,500]) | sed -zn 's~: *image/[^ ]*$~~p'

If you don't have zsh, you can use ls and cope with file names that contain spaces, but not newlines, by making xargs split its input on newlines only. Note that -L1 is not enough for this: xargs still splits each line on blanks and interprets quotes and backslashes. With GNU xargs, pass -d '\n' instead:

ls -t | head -n 500 | xargs -d '\n' file --mime-type -- | sed -n 's~: *image/[^ ]*$~~p'
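Any pipeline that parses ls output breaks on newlines in file names. If GNU tools are available, the names can be kept null-delimited from start to finish. The sketch below assumes GNU find (-printf) and GNU coreutils with sort -z, head -z and cut -z. The function name newest_files_run is made up for this example.

```shell
#!/bin/sh
# Run a command on the N most recently modified regular files in the
# current directory, with NUL-delimited names from start to finish.
# Assumes GNU find (-printf) and GNU sort/head/cut with -z support.
# Usage: newest_files_run N command [args...]
newest_files_run() {
    n=$1
    shift
    find . -maxdepth 1 -type f -printf '%T@ %p\0' |  # "mtime ./name" records
        sort -z -k1,1nr |                             # newest first
        head -z -n "$n" |                             # keep the first N
        cut -z -d ' ' -f 2- |                         # drop the mtime field
        xargs -0 "$@"
}
```

For the task in the question this would be called as newest_files_run 500 file --mime-type --, followed by whatever filter selects the non-image lines.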

Solution 4

I have two crude suggestions that might help. Neither feels particularly satisfying though, so perhaps something better will come up.

First, use sed to wrap every name in double quotes, so you'd only run into trouble if a file name itself contains a double quote:

ls -t | head -500 | sed -e 's/\(.*\)/"\1"/' | xargs file | grep -v 'image'
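As one of the comments suggests, the quoting step can be replaced by turning the newlines from ls into NUL bytes and using xargs -0, which also copes with quotes in names. A minimal sketch follows. The function name newest_run is hypothetical, and names containing newlines still break because ls output is parsed line by line.

```shell
#!/bin/sh
# Run a command on the N most recently modified entries of the current
# directory. Newlines from ls become NUL bytes so that xargs -0 keeps
# spaces and quotes in names intact. Names containing newlines still break.
# Usage: newest_run N command [args...]
newest_run() {
    n=$1
    shift
    ls -t | head -n "$n" | tr '\n' '\0' | xargs -0 "$@"
}
```

The original pipeline would then become newest_run 500 file | grep -v image.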

The other is to use ls to find the 501st most recent file, then use find to list everything newer than it. Quote the command substitution so that a name with spaces stays a single argument:

find . -newer "$(ls -t | head -501 | tail -1)" -type f -exec file {} \; | grep -v image

Author by

isomorphismes

Updated on September 18, 2022

Comments

  • isomorphismes
    isomorphismes over 1 year

    I'm trying to list only non-image files, searching only in the most recent 500 files. So I run

    ls -t | head -500 | file | grep -v 'image'

    which isn't right: it displays a help message. Changing it to

    ls -t | head -500 | xargs file | grep -v 'image'

    I now sometimes get the output I want, but if the filename has spaces in it—for example Plutonian\ Nights\ -\ Sun\ Ra.mp3—then xargs will run file Plutonian, file Nights, etc.


    How do I either help xargs see the spaces, or otherwise accomplish what I'm trying to accomplish?

    • MattBianco
      MattBianco about 7 years
      In popular xargs implementations, the delimiter can be changed, for example to '\n'. This is often helpful when the input is not generated by find. See -d (GNU) and -E (OSX)
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    When I try this with a file with spaces in its name (e.g., Sun Ra), I get Sun\0Ra\0, so this doesn't solve the problem.
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    You got the low-hanging fruit.  How do you search only the most recent 500 files?
  • doneal24
    doneal24 almost 9 years
    Sorry, I missed a set of quotes:
  • dhag
    dhag almost 9 years
    This will not work; printf will consider each space-separated word as an argument. You can test this with printf "%s\n" $(printf "file #1\nfile2\n").
  • dhag
    dhag almost 9 years
    As long as we're going to parse the output of ls, I believe your first snippet would be improved by replacing newlines with nulls (tr \\n \\0) and using xargs -0.
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    @dhag: Yeah, I pointed that out 40 minutes ago.
  • doneal24
    doneal24 almost 9 years
    How about find . -maxdepth 1 -newer "$(ls -t | head -501 | tail -1)" -print0 | xargs -0 file
  • G-Man Says 'Reinstate Monica'
    G-Man Says 'Reinstate Monica' almost 9 years
    @Doug: If you're going to propose an incremental refinement of Eric's answer, it makes more sense to do it in a comment on Eric's answer — and explain why your answer is better than his.  Also, you missed the image part of the question.