"head" only printing one line?
Solution 1
I think it is line-ending related. Excel saves files with carriage return/line feed (CRLF) endings, but head expects line feeds (LF) only.
What output does this display:
tr -d '\r' < messy.csv | head -10
If it displays the 10 lines correctly, that's your answer.
The file command can tell you the line ending for certain text files (it will print ..., with CRLF line terminators), but it doesn't do that for all text files (I believe it doesn't do it if it recognises the file as being something else, e.g. HTML).
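When file does not report the terminator, counting the CR and LF bytes directly gives the same information. The following is a minimal sketch, not part of the original answer; count_line_endings is a hypothetical helper name, and the sample file is made up for illustration.

```shell
# count_line_endings FILE: print how many CR and LF bytes FILE contains.
# (Hypothetical helper for illustration.)
count_line_endings() {
    cr=$(tr -cd '\r' < "$1" | wc -c)   # keep only CR bytes, then count them
    lf=$(tr -cd '\n' < "$1" | wc -c)   # keep only LF bytes, then count them
    # $((...)) strips the padding that some wc implementations add
    echo "CR=$((cr)) LF=$((lf))"
}

# Sample: one LF-terminated line followed by two CR-terminated lines
sample=$(mktemp)
printf 'ABC\nXYZ\r123\r' > "$sample"
count_line_endings "$sample"    # prints: CR=2 LF=1
rm -f "$sample"
```

Roughly equal counts suggest CRLF endings, many CRs with few or no LFs suggest CR-only endings, and no CRs at all means plain LF.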
Solution 2
You have \r only as the end-of-line character for lines 2 onwards (up to line 10 at least). Line 1 has \n as the end-of-line character. For example:
printf 'ABC\nXYZ\r123\r' | head
output (to the screen)
ABC
This is a display artifact related to terminal output. The \r moves the cursor back to the start of the line, so the next line overwrites the previous one, and the last line gets overwritten, fully or partially, by the terminal prompt.
When the last \r-delimited line is longer than the prompt, that line is partially revealed beyond the end of the prompt. In the following sample output, the terminal prompt is just nn $ followed by a space (5 characters), where nn is the number of the command issued.
72 $ printf 'ABC\nXYZ\rabcdefghijklmnop\r'
ABC
73 $ fghijklmnop
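To confirm that nothing is actually lost and only the display is misleading, the CRs in head's output can be translated to LFs before they reach the terminal. This is a small sketch using the same sample bytes as the first printf example; the variable name visible is just for illustration.

```shell
# head passes the CR bytes through untouched; tr then turns each CR into LF
visible=$(printf 'ABC\nXYZ\r123\r' | head | tr '\r' '\n')
printf '%s\n' "$visible"
# prints:
# ABC
# XYZ
# 123
```

All three records are present in what head emits, which shows that head treats everything after the first \n as one long line rather than dropping data.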
To fix it
sed -i.bak 's/\r$//; s/\r/\n/g' file
The -i.bak option causes the input file to be updated in place and makes a backup, file.bak. If you don't want a backup, just use -i.
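Before running this on real data, it may be worth trying the command on a throwaway file. The sketch below assumes GNU sed (the \r and \n escapes in the expressions are not guaranteed to work in other sed implementations); the scratch file name comes from mktemp and is purely illustrative.

```shell
# Scratch file with mixed endings: one LF line, then two CR-terminated lines
demo=$(mktemp)
printf 'ABC\nXYZ\r123\r' > "$demo"

# Same command as above: strip a trailing CR, turn the remaining CRs into LFs
sed -i.bak 's/\r$//; s/\r/\n/g' "$demo"

# With GNU sed, head now sees three separate lines: ABC, XYZ, 123
head -n 10 "$demo"

# The untouched original is kept alongside as "$demo.bak"
ls "$demo.bak"
```

The first expression also removes the CR of a CRLF pair, so the same command should not produce blank lines on files that use Windows-style endings.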
Solution 3
Analyse your problem
head doesn't behave as you expect.
Use a simple analysis tool, od, to see what is going on:
od -cx messy.csv
and then see how head deals with this file:
head -2 messy.csv | od -cx
You will notice that head passes the \r carriage-return ASCII code (0x0d) through unchanged, and the terminal then treats it as it was conceived: like the "carriage return" of the original typewriter. It just moves the cursor back to the beginning of the line, ready for the next character to be written there.
Fix it
See the correct sed command here:
fix '\r' from an Excel file
For the record
This Microsoft bug is a real winner: this Excel end-of-line encoding is wrong for Windows, all Unix variants, and Mac OS X alike.
You can't outperform it :).
Richard
Updated on September 18, 2022
Comments
-
Richard over 1 year
I've got a CSV file that's generated by saving as CSV from Excel. If I do "head" (or indeed "grep" or anything else) it only prints the first line:
head -n 10 messy.csv
10,15,11,21
But if I open the file in a text editor, or in Excel, it has many lines in it:
10,15,11,21
9,11,17,19
7,11,24,18
...
head works just fine on other files on the machine. Why is this? (I suspect it's something to do with line endings, but I don't know what.) And how can I fix it?
-
mjturner almost 9 years
The -n <count> option is included in the POSIX specification, so most, if not all, head variants should support it.
-
Richard almost 9 years
Thanks. The tr outputs the whole file as one long string! file messy.csv prints messy.csv: ASCII English text, with CR line terminators.
-
mjturner almost 9 years
@Richard Very strange that your file only has carriage returns! Try this then:
tr '\r' '\n' < messy.csv | head -10
-
Peter.O almost 9 years
This is almost certainly not a Windows issue. The behaviour described does not apply to a \r\n line ending. It is probably an Excel for Mac issue: "Basically, saving a file as comma separated values (csv) uses a carriage return \r". See Excel and line endings.
-
roaima almost 9 years
Mac systems use CR as line terminators. Fix the tr to swap CR for NL and it'll work.