How to count lines in a document?

Solution 1

Use wc:

wc -l <filename>

This will output the number of lines in <filename>:

$ wc -l /dir/file.txt
3272485 /dir/file.txt

Or, to omit the <filename> from the result, use wc -l < <filename>:

$ wc -l < /dir/file.txt
3272485

You can also pipe data to wc:

$ cat /dir/file.txt | wc -l
3272485
$ curl yahoo.com --silent | wc -l
63
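
If you need the count in a shell variable for later use in a script, here is a minimal sketch using command substitution (the variable name lines is an arbitrary choice; GNU wc prints the bare number, while some other implementations pad it with leading spaces):

$ lines=$(wc -l < /dir/file.txt)
$ echo "$lines"
3272485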

Solution 2

To count all lines, use:

$ wc -l file

To filter and count only the lines that contain a pattern, use:

$ grep -w "pattern" -c file  

Or use -v to invert the match:

$ grep -w "pattern" -c -v file 

See the grep man page for the -e, -i and -x options.
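
As a minimal sketch of how these counting options combine, assuming a hypothetical sample file errors.log created here with printf:

$ printf 'error: disk full\nok\nError: timeout\n' > errors.log
$ grep -w -c "error" errors.log      # case-sensitive whole-word match
1
$ grep -w -c -i "error" errors.log   # -i makes the match case-insensitive
2
$ grep -w -c -v "error" errors.log   # -v counts the lines that do NOT match
2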

Solution 3

wc -l <file.txt>

Or

command | wc -l

Solution 4

There are many ways; using wc is one.

wc -l file

Others include:

awk 'END{print NR}' file

sed -n '$=' file (GNU sed)

grep -c ".*" file
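
A quick sketch to confirm they agree, using a hypothetical three-line file sample.txt created with printf:

$ printf 'one\ntwo\nthree\n' > sample.txt
$ wc -l < sample.txt
3
$ awk 'END{print NR}' sample.txt
3
$ sed -n '$=' sample.txt
3
$ grep -c ".*" sample.txt
3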

Solution 5

wc -l does not count lines.

Yes, this answer may be a bit late to the party, but I haven't seen anyone document a more robust solution in the answers yet.

Contrary to popular belief, POSIX does not require files to end with a newline character at all. The POSIX 3.206 definition of a Line is as follows:

A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

However, what many people are not aware of is that POSIX also defines a 3.195 Incomplete Line as:

A sequence of one or more non-<newline> characters at the end of the file.

Hence, files without a trailing LF are perfectly POSIX-compliant.

If you choose not to support both complete and incomplete last lines, your program is not POSIX-compliant.

As an example, let's have a look at the following file.

1 This is the first line.
2 This is the second line.

No matter how the file ends, I'm sure you would agree that there are two lines. You figured that out by looking at how many lines have been started, not by looking at how many lines have been terminated. In other words, as per POSIX, these two files both have the same number of lines:

1 This is the first line.\n
2 This is the second line.\n

and

1 This is the first line.\n
2 This is the second line.

The man page is relatively clear about wc counting newlines, with a newline just being a 0x0a character:

NAME
       wc - print newline, word, and byte counts for each file

Hence, wc doesn't even attempt to count what you might call a "line". Using wc to count lines can very well lead to miscounts, depending on whether the last line of your input file ends with a newline.

POSIX-compliant solution

You can use grep to count lines just as in the example above. This solution is both more robust and precise, and it supports all the different flavors of what a line in your file could be:

$ grep -c ^ FILE
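
To see the difference in practice, here is a minimal sketch with a hypothetical two-line file nofinalnewline.txt whose last line has no trailing newline:

$ printf 'This is the first line.\nThis is the second line.' > nofinalnewline.txt
$ wc -l < nofinalnewline.txt
1
$ grep -c ^ nofinalnewline.txt
2

wc -l only counts the single newline character, while grep -c ^ counts both lines, including the incomplete last one.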

Comments

  • Alucard
    Alucard over 2 years

    I have lines like these, and I want to know how many lines I actually have...

    09:16:39 AM  all    2.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00   94.00
    09:16:40 AM  all    5.00    0.00    0.00    4.00    0.00    0.00    0.00    0.00   91.00
    09:16:41 AM  all    0.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00   96.00
    09:16:42 AM  all    3.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00   96.00
    09:16:43 AM  all    0.00    0.00    1.00    0.00    1.00    0.00    0.00    0.00   98.00
    09:16:44 AM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
    09:16:45 AM  all    2.00    0.00    6.00    0.00    0.00    0.00    0.00    0.00   92.00
    

    Is there a way to count them all using Linux commands?

    • Luv33preet
      Luv33preet over 6 years
      Open the file in vim, then type g <Ctrl-g>. It will show you the number of lines, words, columns and bytes.
  • ggb667
    ggb667 over 10 years
    Yes, but wc -l file gives you the number of lines AND the filename. To get just the number you can do: wc -l < /filepath/filename.ext
  • CheeHow
    CheeHow about 10 years
    This is great!! You might use awk to get rid of the file name appended to the line count, as such: wc -l <file> | awk '{print $1}'
  • Tensigh
    Tensigh about 10 years
    Even shorter, you could do wc -l < <filename>
  • CMCDragonkai
    CMCDragonkai almost 10 years
    This gives me one extra line compared to the actual number of lines?
  • VeikkoW
    VeikkoW over 9 years
    Does not work: dir | perl -lne 'END { print $. }' Can't find string terminator "'" anywhere before EOF at -e line 1.'
  • Buttle Butkus
    Buttle Butkus over 9 years
    Isn't that like using an F16 to kill garden weeds?
  • baptx
    baptx over 9 years
    @GGB667 you can also get rid of the file name with cat <file> | wc -l
  • tripleee
    tripleee about 9 years
    @VeikkoW Works for me. If you are on Windows, different quoting rules apply; but the OP asked about Linux / Bash.
  • Admin
    Admin almost 9 years
    The first and last method are the same. The last one is better because it doesn't spawn an extra process.
  • DarkSide
    DarkSide almost 9 years
    and with watch wc -l <filename> you can follow this file in real-time. That's useful for log files for example.
  • fedorqui
    fedorqui almost 9 years
    This answer was posted 3 years after the question was asked and it is just copying other ones. The first part is trivial and the second is all that ghostdog's answer was adding. Downvoting.
  • Tom Fenech
    Tom Fenech almost 9 years
    perl -lne '}{ print $. ' does the same.
  • Damien Roche
    Damien Roche about 8 years
    4 years on.. downvoting. Let's see if we can get a decade long downvote streak!
  • ghoti
    ghoti over 7 years
    None of the suggestions in this answer are actually bash answers. But while we're recommending other tools, you could avoid the whitespace by just using awk: awk 'END{print NR}' /dir/file.txt, or sed: sed -n '$=' /dir/file.txt. Or heck, if you wanted an actual bash solution, you could count the lines in a loop! while read _; do ((n++)); done < /dir/file.txt; echo $n.
  • MarkHu
    MarkHu over 7 years
    Oddly, sometimes grep -c works better for me, mainly due to wc -l's annoying "feature" of padding the count with leading spaces.
  • Zlemini
    Zlemini over 7 years
    Using the GNU grep -H argument returns filename and count. grep -Hc ".*" file
  • ggb667
    ggb667 over 7 years
    No, you are wrong; ghostdog's answer does not answer the original question. It gives you the number of lines AND the filename. To get just the number you can do: wc -l < /filepath/filename.ext. Which is why I posted the answer. awk, sed and grep are all slightly inferior ways of doing this. The proper way is the one I listed.
  • MitchellK
    MitchellK almost 7 years
    So simple, thanks... How could I write this into a variable inside a bash script so that I can collect line counts from various files and then use those variables later on in my script? So like $LINECOUNT1 (from file1.txt), $LINECOUNT2 (from file2.txt), etc. And then if I want to I can just take a sum of variable1 + variable2 + variable3, etc.
  • MitchellK
    MitchellK almost 7 years
    Never mind, figured it out: WC1=$(wc -l < file1.txt) WC2=$(wc -l < file2.txt)
  • Konstantin
    Konstantin almost 7 years
    Beware that wc -l counts "newlines". If you have a file with 2 lines of text and one "newline" symbol between them, wc will output "1" instead of "2".
  • Joshua Lawrence Austill
    Joshua Lawrence Austill almost 7 years
    ls -l | wc -l will actually give you the number of files in the directory +1 for the total size line. You can do ls -ld * | wc -l to get the correct number of files.
  • Scott Joudry
    Scott Joudry over 6 years
    This is the first answer I have found that works with a file that has a single line of text that does not end in a newline, which wc -l reports as 0. Thank you.
  • asdf
    asdf almost 6 years
    @user85509 wc -l gives the number of newlines, which might be different from the actual number of lines in a file (when the last line has no trailing newline, wc -l reports 1 less than the actual number of lines).
  • sveti petar
    sveti petar over 5 years
    In a bash script, how do I assign the output of wc -l < /dir/file.txt to a variable?
  • Dragas
    Dragas over 5 years
    @jovan I would use $() (evaluation) operator.
  • Theodore Murdock
    Theodore Murdock almost 5 years
    @asdf Actually, wc -l usually gives the real number of lines in a compliant Linux text file. The last line in a file is always supposed to end with \n, so that cat <file> prints the prompt on a new line, wc -l gives the right line count, etc. A lot of text editors (and IDEs) will always introduce a newline at the end of a text file when you save it for this reason. So you shouldn't assume you need to increment; if you care, you should check whether it's non-compliant (last char is not '\n'), and add one in that case.
  • growlingchaos
    growlingchaos over 4 years
    I upvoted this solution because wc -l counts newline characters and not the actual lines in a file. All the other commands included in this answer will give you the right number in case you need the lines.
  • Chiru
    Chiru over 4 years
    This answer is not POSIX-compliant and can easily miscount lines. wc counts newlines, the character, and not lines. This will lead to miscounts if your EOF is not \n, which POSIX does not require. I've answered this in detail here.
  • jeb
    jeb over 4 years
    Where is the benefit of repeating the accepted (ten years old) answer?
  • Harsh Sarohi
    Harsh Sarohi over 4 years
    Because I couldn't find a command in this thread that outputs only the line count.
  • jeb
    jeb over 4 years
    It's the second example in the accepted answer. wc -l < filename
  • Harsh Sarohi
    Harsh Sarohi over 4 years
    wc -l <filename> gives the filename as well as the number of lines in the output.
  • jeb
    jeb over 4 years
    No, wc -l < filename is different from wc -l filename; the first uses redirection and then there isn't any filename in the output, as shown in the answer from user85509.
  • Nexonus
    Nexonus about 3 years
    Additionally, when your last line does not end with an LF or CRLF, wc -l gives a wrong number of lines, as it only counts line endings. So grep with a pattern like ^.*$ will actually give you the true line count.
  • Eric
    Eric almost 3 years
    awk used this way is 16 times slower than grep -c '^'
  • Eric
    Eric almost 3 years
    This should be the accepted answer. Not only because it is correct but also because grep is more than twice as fast as wc.
  • smac89
    smac89 almost 3 years
    @Eric does grep also count the lines?
  • Eric
    Eric almost 3 years
    sure: grep -c -E ^ will count the number of "start of line" markers, hence the number of lines.
  • smac89
    smac89 almost 3 years
    @Eric Ah cool, cool. I was going to suggest you post that answer, but it looks like someone else already did so. Anyways, when I posted this answer, I just discovered awk, and this was one of the many things I discovered it could do. I also just tested with a 1GB file, and awk was only 4x slower, not 16x. I created the test file using base64 /dev/urandom | head -c 1000000000, but with smaller files (which is most likely what these answers will be used for), the speed is hardly variable
  • Eric
    Eric almost 3 years
    Yeah, I also get a ratio of 4 with this sort of file. So depending on the file, your mileage may vary. The point is that it's always in favour of grep.
  • netrox
    netrox over 2 years
    Wow, this is a good answer. It needs to be the accepted answer because of the good explanation, and the POSIX specs are clearly outlined.
  • kvantour
    kvantour over 2 years
    Very nice: you might want to comment on this
  • miken32
    miken32 over 2 years
    Don't use xargs. The find command has an -exec verb that is much simpler to use. Someone already suggested its use 6 years ago, although this question does not ask anything about multiple files. stackoverflow.com/a/28016686