Portable way to get file size (in bytes) in the shell
Solution 1
wc -c < filename
(wc is short for word count; -c prints the byte count) is a portable, POSIX solution. The only caveat is that the output format may not be uniform across platforms: some implementations prepend spaces to the count (which is the case on Solaris).
Do not omit the input redirection: when the file is passed as an argument instead, the file name is printed after the byte count.
I was worried it wouldn't work for binary files, but it works fine on both Linux and Solaris; you can try it with wc -c < /usr/bin/wc. Moreover, POSIX utilities are guaranteed to handle binary files unless explicitly specified otherwise.
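To make the possibly padded output safe to use in scripts, here is a minimal sketch (the filesize name is my own, not part of the answer) that normalizes the count with arithmetic expansion:

```shell
# A minimal POSIX sketch: byte count via wc -c, with the leading
# spaces some platforms (e.g. Solaris) prepend stripped by $((...)).
filesize() {
    size=$(wc -c < "$1") || return 1
    printf '%s\n' "$((size))"
}
```

Called as filesize /usr/bin/wc, this prints a bare number with no padding on any POSIX shell.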
Solution 2
I ended up writing my own program (really small) to display just the size. More information is in bfsize - print file size in bytes (and just that).
The two cleanest ways in my opinion with common Linux tools are:
stat -c %s /usr/bin/stat
50000

wc -c < /usr/bin/wc
36912
But I just don't want to be typing parameters or piping the output just to get a file size, so I'm using my own bfsize.
Solution 3
Even though du usually prints disk usage and not the actual data size, the GNU Core Utilities du can print a file's "apparent size" in bytes:
du -b FILE
But it won't work under BSD, Solaris, macOS, etc.
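Since du prints the file name after the size, cutting the first tab-separated field yields the bare number. A small sketch (the apparent_size name is my own), GNU du only:

```shell
# GNU-only sketch: apparent size in bytes with the file name stripped.
# du -b prints "<size><TAB><name>"; cut -f1 keeps only the size field.
apparent_size() {
    du -b "$1" | cut -f1
}
```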
Solution 4
BSD systems have a stat utility with different options from the GNU Core Utilities one, but with similar capabilities.
stat -f %z <file name>
This works on macOS (tested on 10.12), FreeBSD, NetBSD and OpenBSD.
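The two stat dialects and the POSIX wc fallback can be combined into a single function. This is my own sketch, not part of any answer here, and the sizeof name is arbitrary:

```shell
# Portability sketch (hypothetical helper): try GNU stat first,
# then BSD/macOS stat, then fall back to POSIX wc -c.
sizeof() {
    stat -c %s "$1" 2>/dev/null && return    # GNU coreutils
    stat -f %z "$1" 2>/dev/null && return    # BSD, macOS
    size=$(wc -c < "$1") && printf '%s\n' "$((size))"
}
```

Each candidate prints the size and succeeds, or fails silently so the next one is tried.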
Solution 5
When processing ls -n output, as an alternative to poorly portable shell arrays, you can use the positional arguments, which form the only array and are the only local variables in the standard shell. Wrap the overwriting of the positional arguments in a function to preserve the original arguments of your script or function.
getsize() { set -- $(ls -dn "$1") && echo $5; }
getsize FILE
This splits the output of ls -dn according to the current IFS variable settings, assigns it to the positional arguments, and echoes the fifth one. The -d ensures directories are handled properly, and -n ensures that user and group names do not need to be resolved, unlike with -l. Also, user and group names containing whitespace could theoretically break the expected line structure; they are usually disallowed, but this possibility still makes the programmer stop and think.
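To guard against a caller having changed IFS, the function body can run in a subshell that resets it. A variant sketch of the same idea (not from the original answer):

```shell
# Variant sketch: parentheses give the function a subshell body, so
# unsetting IFS (restoring default space/tab/newline splitting) does
# not leak into the caller's environment.
getsize() (
    unset IFS
    set -- $(ls -dn "$1") && printf '%s\n' "$5"
)
```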
Admin
Updated on July 14, 2022

Comments
-
Admin almost 2 years: On Linux, I use
stat --format="%s" FILE
, but the Solaris machine I have access to doesn't have the stat command. What should I use then? I'm writing Bash scripts and can't really install any new software on the system.
I've considered already using:
perl -e '@x=stat(shift);print $x[7]' FILE
or even:
ls -nl FILE | awk '{print $5}'
But neither of these looks sensible - running Perl just to get file size? Or running two programs to do the same?
-
Admin over 14 years: Well, I do a lot of Perl writing myself, but sometimes the tool is chosen for me, not by me :)
-
caf over 14 years: Or just wc -c < file if you don't want the filename appearing.
-
Admin about 13 years: The first line of the problem description states that stat is not an option, and wc -c has been the top answer for over a year now, so I'm not sure what the point of this answer is.
-
jmtd about 13 years: If I'm not mistaken, though, wc in a pipeline must read() the entire stream to count the bytes. The ls/awk solutions (and similar) use a system call to get the size, which should be constant time (versus O(size)).
-
Camilo Martin almost 12 years: I recall wc being very slow the last time I did that on a full hard disk. It was slow enough that I could rewrite the script before the first one finished; I came here to remember how I did it, lol.
-
yo' over 11 years: The point is for people like me who find this SO question on Google and for whom stat is an option.
-
Robert Calhoun about 11 years: I'm working on an embedded system where wc -c takes 4090 msec on a 10 MB file vs. "0" msec for stat -c %s, so I agree it's helpful to have alternative solutions even when they don't answer the exact question posed.
-
Orwellophile about 11 years: "stat -c" is not portable / does not accept the same arguments on macOS as it does on Linux. "wc -c" will be very slow for large files.
-
Haravikk almost 11 years: I wouldn't use wc -c; it looks much neater, but ls + awk is better for speed/resource use. Also, I just wanted to point out that you actually need to post-process the results of wc as well, because on some systems it will have whitespace before the result, which you may need to strip before you can do comparisons.
-
Palec about 10 years: FYI, maxdepth is not needed. It could be rewritten as size=$(test -f filename && find filename -printf '%s').
-
SourceSeeker about 10 years: @Palec: The -maxdepth is intended to prevent find from being recursive (since the stat which the OP needs to replace is not). Your find command is missing a -name and the test command isn't necessary.
-
Palec about 10 years: @DennisWilliamson find searches its parameters recursively for files matching given criteria. If the parameters are not directories, the recursion is… quite simple. Therefore I first test that filename is really an existing ordinary file, and then I print its size using find that has nowhere to recurse.
-
pbies almost 10 years: stat gives the size of a locked file, while wc does not (Cygwin under Windows on c:\pagefile.sys).
-
Rdpi over 9 years: What would then be the best option to print the result in a human-friendly format, e.g. MB or KB?
-
Admin almost 9 years: stat is not portable either.
stat -c %s /usr/bin/stat
stat: illegal option -- c
usage: stat [-FlLnqrsx] [-f format] [-t timefmt] [file ...]
-
Orwellophile almost 9 years: I did say that. Try my answer, based on ls; it should be quite portable: stackoverflow.com/a/15522969/912236
-
Silas over 8 years: wc -c is great, but it will not work if you don't have read access to the file.
-
Jose Alban about 8 years: On macOS, brew install coreutils and gdu -b will achieve the same effect.
-
Luciano about 8 years: Or put it in a shell script: ls -Lon "$1" | awk '{ print $4 }'
-
Orwellophile almost 8 years: @Luciano I think you have totally missed the point of not forking and doing a task in bash rather than using bash to string a lot of Unix commands together in an inefficient fashion.
-
CousinCocaine over 7 years: I prefer this method because wc needs to read the whole file before giving a result; du is immediate.
-
Palec almost 7 years: The stat and ls utilities just execute the lstat syscall and get the file length without reading the file. Thus, they do not need the read permission, and their performance does not depend on the file's length. wc actually opens the file and usually reads it, making it perform much worse on large files. But GNU coreutils wc optimizes when only the byte count of a regular file is wanted: it uses the fstat and lseek syscalls to get the count. See the comment with (dd ibs=99k skip=1 count=0; ./wc -c) < /etc/group in its source.
-
Palec almost 7 years: find . -maxdepth 1 -type f -name filename -printf '%s' works only if the file is in the current directory, and it may still examine each file in the directory, which might be slow. Better use (even shorter!) find filename -maxdepth 1 -type f -printf '%s'.
-
Palec almost 7 years: POSIX mentions du -b in a completely different context, in the du rationale.
-
Palec almost 7 years: Actually, units could be converted, but this shows disk usage instead of the file data size ("apparent size").
-
Palec almost 7 years: This shows disk usage instead of the file data size ("apparent size").
-
Palec almost 7 years: This uses just the lstat call, so its performance does not depend on file size. It is shorter than stat -c '%s', but less intuitive, and it works differently for folders (prints the size of each file inside).
-
Palec almost 7 years: FreeBSD du can get close using du -A -B1, but it still prints the result in multiples of 1024 B blocks. I did not manage to get it to print a byte count. Even setting BLOCKSIZE=1 in the environment does not help, because 512 B blocks are used then.
-
Palec almost 7 years: Solaris does not have a stat utility at all, though.
-
Ciro Santilli OurBigBook.com over 5 years: I always wondered why the stat CLI utility was never included in POSIX.
-
alper over 5 years: Would it be efficient to use wc -c < FILE for very large files, such as 100 GB? @Carl Smotricz
-
Carl Smotricz over 5 years: @alper: I haven't tested, but I suspect that redirecting a large file like this is terribly slow. My answer was about measuring file size portably, not efficiently. For a quick size based on the directory data, you'd probably be better off looking at some of the other answers here.
-
Jason Martin about 3 years: BusyBox doesn't support that structure: stat: unrecognized option: % BusyBox v1.32.1 () multi-call binary.
-
Andrew Henle over 2 years: Use the flag --b M or --b G for the output in megabytes or gigabytes. Note, though, that neither of those is portable. pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/…
-
Peter Mortensen over 2 years: Where does it work? Only on Linux?