How to count lines in a document?

Solution 1

Use wc:

wc -l <filename>

This will output the number of lines in <filename>:

$ wc -l /dir/file.txt
3272485 /dir/file.txt

Or, to omit the <filename> from the result, use wc -l < <filename>:

$ wc -l < /dir/file.txt
3272485

You can also pipe data to wc:

$ cat /dir/file.txt | wc -l
3272485
$ curl yahoo.com --silent | wc -l
63
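
If you need the count in a shell variable for later use in a script, here is a minimal sketch using command substitution (the variable name lines is an arbitrary choice; GNU wc prints the bare number, while some other implementations pad it with leading spaces):

$ lines=$(wc -l < /dir/file.txt)
$ echo "$lines"
3272485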

Solution 2

To count all lines, use:

$ wc -l file

To filter and count only the lines that contain a pattern, use:

$ grep -w "pattern" -c file  

Or use -v to invert the match:

$ grep -w "pattern" -c -v file 

See the grep man page for the -e, -i and -x options.
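
As a minimal sketch of how these counting options combine, assuming a hypothetical sample file errors.log created here with printf:

$ printf 'error: disk full\nok\nError: timeout\n' > errors.log
$ grep -w -c "error" errors.log      # case-sensitive whole-word match
1
$ grep -w -c -i "error" errors.log   # -i makes the match case-insensitive
2
$ grep -w -c -v "error" errors.log   # -v counts the lines that do NOT match
2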

Solution 3

wc -l <file.txt>

Or

command | wc -l

Solution 4

There are many ways; using wc is one.

wc -l file

Others include:

awk 'END{print NR}' file

sed -n '$=' file (GNU sed)

grep -c ".*" file
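
A quick sketch to confirm they agree, using a hypothetical three-line file sample.txt created with printf:

$ printf 'one\ntwo\nthree\n' > sample.txt
$ wc -l < sample.txt
3
$ awk 'END{print NR}' sample.txt
3
$ sed -n '$=' sample.txt
3
$ grep -c ".*" sample.txt
3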

Solution 5

wc -l does not count lines.

Yes, this answer may be a bit late to the party, but I haven't seen anyone document a more robust solution in the answers yet.

Contrary to popular belief, POSIX does not require files to end with a newline character at all. The POSIX 3.206 definition of a Line is as follows:

A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

However, what many people are not aware of is that POSIX also defines a 3.195 Incomplete Line as:

A sequence of one or more non-<newline> characters at the end of the file.

Hence, files without a trailing LF are perfectly POSIX-compliant.

If you choose not to support both complete and incomplete last lines, your program is not POSIX-compliant.

As an example, let's have a look at the following file.

1 This is the first line.
2 This is the second line.

No matter how the file ends, I'm sure you would agree that there are two lines. You figured that out by looking at how many lines have been started, not by looking at how many lines have been terminated. In other words, as per POSIX, these two files both have the same number of lines:

1 This is the first line.\n
2 This is the second line.\n

and

1 This is the first line.\n
2 This is the second line.

The man page is relatively clear about wc counting newlines, with a newline just being a 0x0a character:

NAME
       wc - print newline, word, and byte counts for each file

Hence, wc doesn't even attempt to count what you might call a "line". Using wc to count lines can very well lead to miscounts, depending on whether the last line of your input file ends with a newline.

POSIX-compliant solution

You can use grep to count lines just as in the example above. This solution is both more robust and precise, and it supports all the different flavors of what a line in your file could be:

$ grep -c ^ FILE
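
To see the difference in practice, here is a minimal sketch with a hypothetical two-line file nofinalnewline.txt whose last line has no trailing newline:

$ printf 'This is the first line.\nThis is the second line.' > nofinalnewline.txt
$ wc -l < nofinalnewline.txt
1
$ grep -c ^ nofinalnewline.txt
2

wc -l only counts the single newline character, while grep -c ^ counts both lines, including the incomplete last one.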

Comments

  • Alucard
    Alucard over 2 years

    I have lines like these, and I want to know how many lines I actually have...

    09:16:39 AM  all    2.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00   94.00
    09:16:40 AM  all    5.00    0.00    0.00    4.00    0.00    0.00    0.00    0.00   91.00
    09:16:41 AM  all    0.00    0.00    4.00    0.00    0.00    0.00    0.00    0.00   96.00
    09:16:42 AM  all    3.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00   96.00
    09:16:43 AM  all    0.00    0.00    1.00    0.00    1.00    0.00    0.00    0.00   98.00
    09:16:44 AM  all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
    09:16:45 AM  all    2.00    0.00    6.00    0.00    0.00    0.00    0.00    0.00   92.00
    

    Is there a way to count them all using Linux commands?

    • Luv33preet
      Luv33preet over 6 years
      Open the file in vim, then type g <Ctrl-g>. It will show you the number of lines, words, columns and bytes.
  • ggb667
    ggb667 over 10 years
    Yes, but wc -l file gives you the number of lines AND the filename. To get just the number you can do: wc -l < /filepath/filename.ext
  • CheeHow
    CheeHow about 10 years
    This is great!! You might use awk to get rid of the file name appended to the line count, as such: wc -l <file> | awk '{print $1}'
  • Tensigh
    Tensigh about 10 years
    Even shorter, you could do wc -l < <filename>
  • CMCDragonkai
    CMCDragonkai almost 10 years
    This gives me one extra line compared to the actual number of lines?
  • VeikkoW
    VeikkoW over 9 years
    Does not work: dir | perl -lne 'END { print $. }' Can't find string terminator "'" anywhere before EOF at -e line 1.'
  • Buttle Butkus
    Buttle Butkus over 9 years
    Isn't that like using an F16 to kill garden weeds?
  • baptx
    baptx over 9 years
    @GGB667 you can also get rid of the file name with cat <file> | wc -l
  • tripleee
    tripleee about 9 years
    @VeikkoW Works for me. If you are on Windows, different quoting rules apply; but the OP asked about Linux / Bash.
  • Admin
    Admin almost 9 years
    The first and last method are the same. The last one is better because it doesn't spawn an extra process.
  • DarkSide
    DarkSide almost 9 years
    and with watch wc -l <filename> you can follow this file in real-time. That's useful for log files for example.
  • fedorqui
    fedorqui almost 9 years
    This answer was posted 3 years after the question was asked and it is just copying other ones. The first part is trivial and the second is all that ghostdog's answer was adding. Downvoting.
  • Tom Fenech
    Tom Fenech almost 9 years
    perl -lne '}{ print $. ' does the same.
  • Damien Roche
    Damien Roche about 8 years
    4 years on.. downvoting. Let's see if we can get a decade long downvote streak!
  • ghoti
    ghoti over 7 years
    None of the suggestions in this answer are actually bash answers. But while we're recommending other tools, you could avoid the whitespace by just using awk: awk 'END{print NR}' /dir/file.txt, or sed: sed -n '$=' /dir/file.txt. Or heck, if you wanted an actual bash solution, you could count the lines in a loop! while read _; do ((n++)); done < /dir/file.txt; echo $n.
  • MarkHu
    MarkHu over 7 years
    Oddly, sometimes grep -c works better for me, mainly due to wc -l's annoying "feature" of padding the count with leading spaces.
  • Zlemini
    Zlemini over 7 years
    Using the GNU grep -H argument returns filename and count. grep -Hc ".*" file
  • ggb667
    ggb667 over 7 years
    No, you are wrong; ghostdog's answer does not answer the original question. It gives you the number of lines AND the filename. To get just the number you can do: wc -l < /filepath/filename.ext. Which is why I posted the answer. awk, sed and grep are all slightly inferior ways of doing this. The proper way is the one I listed.
  • MitchellK
    MitchellK almost 7 years
    So simple, thanks... How could I write this into a variable inside a bash script so that I can collect line counts from various files and then use those variables later on in my script? So like $LINECOUNT1 (from file1.txt), $LINECOUNT2 (from file2.txt), etc. And then if I want to I can just take a sum of variable1 + variable2 + variable3, etc.
  • MitchellK
    MitchellK almost 7 years
    Never mind, figured it out: WC1=$(wc -l < file1.txt) WC2=$(wc -l < file2.txt)
  • Konstantin
    Konstantin almost 7 years
    Beware that wc -l counts "newlines". If you have a file with 2 lines of text and one "newline" symbol between them, wc will output "1" instead of "2".
  • Joshua Lawrence Austill
    Joshua Lawrence Austill almost 7 years
    ls -l | wc -l will actually give you the number of files in the directory +1 for the total size line. You can do ls -ld * | wc -l to get the correct number of files.
  • Scott Joudry
    Scott Joudry over 6 years
    This is the first answer I have found that works with a file that has a single line of text that does not end in a newline, which wc -l reports as 0. Thank you.
  • asdf
    asdf almost 6 years
    @user85509 wc -l gives the number of newlines, which might be different from the actual number of lines in a file (when the last line has no trailing newline, wc -l reports 1 less than the actual number of lines).
  • sveti petar
    sveti petar over 5 years
    In a bash script, how do I assign the output of wc -l < /dir/file.txt to a variable?
  • Dragas
    Dragas over 5 years
    @jovan I would use $() (evaluation) operator.
  • Theodore Murdock
    Theodore Murdock almost 5 years
    @asdf Actually, wc -l usually gives the real number of lines in a compliant Linux text file. The last line in a file is always supposed to end with \n, so that cat <file> prints the prompt on a new line, wc -l gives the right line count, etc. A lot of text editors (and IDEs) will always introduce a newline at the end of a text file when you save it for this reason. So you shouldn't assume you need to increment; if you care, you should check whether it's non-compliant (last char is not '\n'), and add one in that case.
  • growlingchaos
    growlingchaos over 4 years
    I upvoted this solution because wc -l counts newline characters and not the actual lines in a file. All the other commands included in this answer will give you the right number in case you need the lines.
  • Chiru
    Chiru over 4 years
    This answer is not POSIX-compliant and can easily miscount lines. wc counts newlines, the character, and not lines. This will lead to miscounts if your EOF is not \n, which POSIX does not require. I've answered this in detail here.
  • jeb
    jeb over 4 years
    Where is the benefit of repeating the accepted (ten years old) answer?
  • Harsh Sarohi
    Harsh Sarohi over 4 years
    Because I couldn't find a command in this thread that outputs only the line count.
  • jeb
    jeb over 4 years
    It's the second example in the accepted answer. wc -l < filename
  • Harsh Sarohi
    Harsh Sarohi over 4 years
    wc -l <filename> gives the filename as well as the number of lines in the output.
  • jeb
    jeb over 4 years
    No, wc -l < filename is different from wc -l filename; the first uses redirection and then there isn't any filename in the output, as shown in the answer from user85509.
  • Nexonus
    Nexonus about 3 years
    Additionally, when your last line does not end with an LF or CRLF, wc -l gives a wrong number of lines, as it only counts line endings. So grep with a pattern like ^.*$ will actually give you the true line count.
  • Eric
    Eric almost 3 years
    awk used this way is 16 times slower than grep -c '^'
  • Eric
    Eric almost 3 years
    This should be the accepted answer. Not only because it is correct but also because grep is more than twice as fast as wc.
  • smac89
    smac89 almost 3 years
    @Eric does grep also count the lines?
  • Eric
    Eric almost 3 years
    sure: grep -c -E ^ will count the number of "start of line" markers, hence the number of lines.
  • smac89
    smac89 almost 3 years
    @Eric Ah cool, cool. I was going to suggest you post that answer, but it looks like someone else already did so. Anyways, when I posted this answer, I just discovered awk, and this was one of the many things I discovered it could do. I also just tested with a 1GB file, and awk was only 4x slower, not 16x. I created the test file using base64 /dev/urandom | head -c 1000000000, but with smaller files (which is most likely what these answers will be used for), the speed is hardly variable
  • Eric
    Eric almost 3 years
    Yeah, I also get a ratio of 4 with this sort of file. So depending on the file, your mileage may vary. The point is that it's always in favour of grep.
  • netrox
    netrox over 2 years
    Wow, this is a good answer. It needs to be the accepted answer because of the good explanation, and the POSIX specs are clearly outlined.
  • kvantour
    kvantour over 2 years
    Very nice: you might want to comment on this
  • miken32
    miken32 over 2 years
    Don't use xargs. The find command has an -exec verb that is much simpler to use. Someone already suggested its use 6 years ago, although this question does not ask anything about multiple files. stackoverflow.com/a/28016686