How to check for non-empty files in Perl

perl operators


Solution 1

Refer to perldoc -f -X for a refresher on the Perl file test operators. The one you want is this:

-s  File has nonzero size (returns size in bytes).

Simple script showing how to use File::Find:

#!/usr/bin/perl -w
use strict;
use File::Find;
# $ARGV[0] is the first command line argument; stop early if it is missing
my $startingDir = $ARGV[0] or die "Usage: $0 <directory>\n";
finddepth(\&wanted, $startingDir);
sub wanted
{
    # if current path is a file and non-empty
    if (-f $_ && -s $_)
    {
        # print full path to the console
        print $File::Find::name . "\n";
    }
}

In this example I have the output going to the console. To pipe it to a file, you can just use shell output redirection, e.g. ./findscript.pl /some/dir > somefile.out.

Solution 2

Please have a look at the perldoc page for the file test operators: http://perldoc.perl.org/functions/-X.html

-z  File has zero size (is empty).
-s  File has nonzero size (returns size in bytes).

Sample usage to detect non-empty file:

# Two alternatives for a file that exists; use one of them, not both.
unless (-z $FILE) { process_file($FILE); }
if (-s $FILE) { process_file($FILE); }
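
The two forms above are not quite interchangeable when the path may not exist: both -z and -s return undef for a missing file, so the unless (-z ...) form would still call process_file, while the if (-s ...) form would not. Here is a minimal sketch of that difference. The helper names non_empty_by_s, non_empty_by_z and make_file are made up for illustration and are not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Hypothetical helper: "non-empty" decided by -s.
# -s is true only for an existing file whose size is greater than zero.
sub non_empty_by_s {
    my ($path) = @_;
    my $size = -s $path;        # undef if missing, false if empty
    return $size ? 1 : 0;
}

# Hypothetical helper: "non-empty" decided by negating -z,
# which mirrors the unless (-z $FILE) form.
sub non_empty_by_z {
    my ($path) = @_;
    my $is_empty = -z $path;    # undef if missing, so the negation is true
    return $is_empty ? 0 : 1;
}

# Hypothetical helper: create a file holding exactly $content.
sub make_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot create $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Build one empty file, one file with data, and one path that does not exist.
my $dir     = tempdir(CLEANUP => 1);
my $empty   = make_file("$dir/empty.txt", '');
my $full    = make_file("$dir/full.txt",  "some data\n");
my $missing = "$dir/no-such-file.txt";

for my $path ($empty, $full, $missing) {
    printf "%-18s -s says %d, not -z says %d\n",
        basename($path), non_empty_by_s($path), non_empty_by_z($path);
}
```

If missing paths are possible in your input, the -s form is probably the safer default.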


Author by

Grace

Updated on September 18, 2020

Comments

Grace almost 2 years

I'm using the find command for finding files in directories. I would like to check if the files in the directories are not empty (non 0 size) before proceeding. Thanks to the find manual, I know how to identify empty files using the -empty option.

However, I want to use Perl to check for non-empty files. How can I do that?

Thanks in advance.
Grace over 10 years

Thanks bobbymcr. Yes, I have read on the web about the -s option, but I'm not sure how to apply it in the code. I'm using the find command like this -> system "find $dir -type f ".. and I'm not sure how to insert the -s option, as it keeps giving me the error "find: invalid predicate `-s'" if I insert the -s option in the line.
bobbymcr over 10 years

@Grace: Don't do that. If you're using Perl, use Perl's version of the find command which is File::Find. See here: perldoc.perl.org/File/Find.html
Grace over 10 years

    sub get_DIR {
        my ($dir) = @_;
        use File::Basename;
        use File::Find;
        use constant DIR => "$dir";
        finddepth (\&wanted, DIR) > "tmp1";
        sub wanted {
            return if not -f $_;
            if (not exists $file_hash{$_}) {
                $file_hash{$_} = [];
            }
            push @{$file_hash{$_}}, $File::Find::dir;
        }
        foreach my $basename (@{$file_hash{$_}}) {
            open FH, '>log' or die $!;
            $basename = basename( $_ );
            print FH "$basename \n";
        }
        chmod (0750, "log");
        close (FH);
    }
Grace over 10 years

Sorry, I'm not sure how to format my code in a readable way in "add comment". Hopefully you can understand my code. May I know what is wrong here, as it doesn't seem to work correctly? What I want to do is search through the directories (which have sub and sub-sub directories) to find the file names, dump them to a file called "tmp1", and finally get the basename of each file. (And of course I also want to check that the file is not empty, and only then proceed with getting the basename of each file.) I hope my statement is clear. Thanks.
Grace over 10 years

Hi bobbymcr, thanks for the sample code. The code was able to run, but it seems it still captured the empty files even though there's a "-s" in the code. Is my understanding correct that if there's an empty file in any of the directories, it will just be ignored? If so, the result does not seem correct.
bobbymcr over 10 years

It worked for me (tested locally [on Windows] with some empty files). Are you sure the file is really empty? Does empty mean something else besides file size of 0?
Grace over 10 years

Yes, it is empty. Hm, let me try again on my end. Thanks.
Zaid over 10 years

Unless I'm missing something blatantly obvious, this example won't work because $_ inside the subroutine is not defined.
bobbymcr over 10 years

@Zaid: File::Find passes the name of the current file to your wanted function. This is the value of $_.
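
To make that concrete, here is a small sketch of what File::Find exposes inside the wanted callback: $_ holds the file name relative to the directory being visited, $File::Find::dir holds that directory, and $File::Find::name holds the full path. The helper names collect_find_info and make_file, and the temporary directory layout, are invented for this illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: record what File::Find provides for each plain file.
sub collect_find_info {
    my ($start) = @_;
    my @info;
    find(sub {
        return unless -f $_;                 # $_ is set by File::Find
        push @info, {
            basename => $_,                  # name within the current directory
            dir      => $File::Find::dir,    # directory being visited
            path     => $File::Find::name,   # dir and name joined with "/"
        };
    }, $start);
    my @sorted = sort { $a->{path} cmp $b->{path} } @info;
    return @sorted;
}

# Hypothetical helper: create a file holding exactly $content.
sub make_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot create $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Build a tiny directory tree to walk.
my $root = tempdir(CLEANUP => 1);
make_path("$root/sub");
make_file("$root/a.txt",     "top level\n");
make_file("$root/sub/b.txt", "nested\n");

for my $entry (collect_find_info($root)) {
    print "\$_ = $entry->{basename}\n";
    print "    \$File::Find::dir  = $entry->{dir}\n";
    print "    \$File::Find::name = $entry->{path}\n";
}
```

Because File::Find changes into each directory as it goes (unless no_chdir is set), file tests such as -f $_ and -s $_ work on the bare name.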
mplungjan over 10 years

@Ashish Kumar suggested using if (-f $_ && -s _) to save time. I rejected his edit since it was made directly in your answer.
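
For readers unfamiliar with the bare _ in that suggestion: it is Perl's special filehandle that reuses the stat data from the previous file test, so -s _ does not stat the file a second time. Below is a minimal sketch of the answer's idea written that way. The function name non_empty_files, the make_file helper and the temporary tree are assumptions for the example, not code from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: return the full paths of all non-empty plain files.
sub non_empty_files {
    my ($start) = @_;
    my @found;
    find(sub {
        # -f stats $_ once; the bare _ lets -s reuse that stat result.
        push @found, $File::Find::name if -f $_ && -s _;
    }, $start);
    my @sorted = sort @found;
    return @sorted;
}

# Hypothetical helper: create a file holding exactly $content.
sub make_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot create $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Build a small tree with one empty file and two files that contain data.
my $root = tempdir(CLEANUP => 1);
make_path("$root/sub");
make_file("$root/empty.txt",      '');
make_file("$root/full.txt",       "data\n");
make_file("$root/sub/nested.txt", "more data\n");

print "$_\n" for non_empty_files($root);
```

The saving is one stat call per file, which only matters on large trees or slow file systems.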
Grace over 10 years

Hi bobbymcr, I amended the code as below and tried to write to a file called "log", but it seems it only captured the first execution in the log. Can you please point out where it goes wrong?

    sub get_file {
        my ($dir) = @_;
        use File::Find;
        open LOG, '>log' or die $!;
        finddepth (\&wanted, $dir);
        sub wanted {
            if (-f $_ && -s $_) {
                my $basename = $_;
                print LOG "$basename \n";
            }
            chmod (0750, "log");
            close (LOG);
        }
Grace over 10 years

Hi bobbymcr, I changed it to the code below. This time it was able to capture all the repeated executions, but the problem is that it seems to append all the data to the "log" each time my main code calls the subroutine "sub get_file". Can you help point out the mistake? :(

    sub get_file {
        my ($dir) = @_;
        use File::Find;
        finddepth (\&wanted, $dir);
        sub wanted {
            if (-f $_ && -s $_) {
                my $basename = $_;
                push @new, $basename;
            }
        }
        open LOG, '>log' or die $!;
        foreach my $new (@new) {
            print LOG "$new\n";
        }
        chmod (0750, "log");
        close (LOG);
    }

Thanks in advance.
Grace over 10 years

Hi rpg, I want to check for non-empty files, not empty files. Anyway, thanks for your response.
Grace over 10 years

Hi bobbymcr, though I have already found what I want with the ! -empty option in my find command, I'm still interested to know what is going wrong with my code above. Please point out my error for my learning, if you don't mind. Thanks!
bobbymcr over 10 years

I took your code and put it here (slightly modified): pastebin.ca/2096886 . I was able to run this successfully; it created a file called 'log' containing the names of files that were non-empty. Running it again does not append, but rewrites the file. So I don't see any problem necessarily...
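
The pastebin content is not reproduced in this thread, so the following is only a sketch of what such a script might look like, not the code that was posted there. It keeps the list of matches in a lexical array, so each call starts empty, and opens the log with '>' so the file is rewritten rather than appended to. One plausible cause of the accumulation described earlier is a global array like @new that is never reset between calls. The names write_non_empty_log and make_file and the temporary directories are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: write the names of non-empty files under $dir to $log.
# Returns the number of names written.
sub write_non_empty_log {
    my ($dir, $log) = @_;
    my @found;    # lexical, so every call starts with an empty list
    finddepth(sub {
        push @found, $_ if -f $_ && -s _;
    }, $dir);

    # '>' truncates the log, so repeated calls rewrite it.
    open my $fh, '>', $log or die "Cannot open $log: $!";
    print {$fh} "$_\n" for sort @found;
    close $fh or die "Cannot close $log: $!";
    chmod 0750, $log or die "Cannot chmod $log: $!";
    return scalar @found;
}

# Hypothetical helper: create a file holding exactly $content.
sub make_file {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot create $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Build a small tree; keep the log outside it so it is never picked up itself.
my $tree = tempdir(CLEANUP => 1);
my $out  = tempdir(CLEANUP => 1);
make_path("$tree/sub");
make_file("$tree/a.txt",     "data\n");
make_file("$tree/b.txt",     '');
make_file("$tree/sub/c.txt", "data\n");

# Call twice to show that the second run does not add to the first.
for my $run (1, 2) {
    my $count = write_non_empty_log($tree, "$out/log");
    print "run $run wrote $count names\n";
}
```

Passing an anonymous sub to finddepth also avoids declaring a named sub inside another sub, which in Perl is still a package-level sub and does not see the outer sub's lexicals on later calls.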