r - read.csv - skip rows with different number of columns

r csv null skip

13,205

You could try:

read.csv(text=readLines('myfile.csv')[-(1:5)])

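`readLines()` reads the file into a character vector with one element per line. The negative index `[-(1:5)]` drops the first five elements, and `read.csv()` parses the remaining lines as CSV through its `text` argument.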
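To make the steps concrete, here is a minimal, self-contained sketch of the same approach. The temporary file and all of its contents are made up for illustration; they only mimic the layout described in the question (five 2-column information rows followed by an 8-column header and data).

```r
# Build a small stand-in for 'myfile.csv': five 2-column information rows,
# then an 8-column header and two data rows (all values are hypothetical).
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Report,Example",
  "Created,2020-01-01",
  "Source,Demo",
  "Units,None",
  "Notes,None",
  "a,b,c,d,e,f,g,h",
  "1,2,3,4,5,6,7,8",
  "9,10,11,12,13,14,15,16"
), path)

all_lines  <- readLines(path)       # one character element per line
data_lines <- all_lines[-(1:5)]     # negative index drops elements 1 to 5
df <- read.csv(text = data_lines)   # parse the remaining lines as CSV

str(df)
```

Splitting the one-liner into named steps makes it easy to inspect `all_lines` and check that exactly the intended rows are being dropped before parsing.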

Author by

datavoredan

Economist by training, Data Scientist by trade. Working towards being a more complete data scientist through learning about the best tools available. Working to become proficient in: Object Oriented Programming, Python (pandas etc + Django), R, SQL.

Updated on June 17, 2022

Comments

datavoredan almost 2 years
There are 5 rows at the top of my csv file which serve as information about the file, which I do not need.

These information rows have only 2 columns, while the header and the rows of data (from row 6 onwards) have 8. This appears to be the cause of the issue.

I have tried using the `skip` argument of read.csv to skip these lines, and the same with read.table:
```
df = read.csv("myfile.csv", skip=5)
df = read.table("myfile.csv", skip=5)
```
but this still gives me the same error message and warnings, which are:
```
Error in read.table("myfile.csv",  :empty beginning of file
In addition: Warning messages:
1: In readLines(file, skip) : line 1 appears to contain an embedded nul
2: In readLines(file, skip) : line 2 appears to contain an embedded nul
...
5: In readLines(file, skip) : line 5 appears to contain an embedded nul
```
How can I read this .csv into R without the embedded nuls in the first 5 rows causing this issue?
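One common cause of "embedded nul" warnings is a file saved as UTF-16, where every ASCII character is followed by a nul byte. Nothing in the question confirms that this is the case here, so the sketch below is only an illustration under that assumption. It writes a made-up UTF-16LE file and reads it by declaring the encoding through the `fileEncoding` argument of `read.csv()`; the file contents and the choice of `"UTF-16LE"` are hypothetical.

```r
# Hypothetical reproduction: store ASCII text as UTF-16LE bytes, which places
# a nul byte after every character (all values are made up).
path <- tempfile(fileext = ".csv")
txt <- paste0(paste(c(
  "Report,Example",
  "Created,2020-01-01",
  "Source,Demo",
  "Units,None",
  "Notes,None",
  "a,b,c,d,e,f,g,h",
  "1,2,3,4,5,6,7,8",
  "9,10,11,12,13,14,15,16"
), collapse = "\n"), "\n")
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], path)

# Declaring the encoding lets R convert the text before parsing, so that
# skip = 5 can drop the five information rows as intended.
df <- read.csv(path, skip = 5, fileEncoding = "UTF-16LE")

str(df)
```

If the real file uses a different encoding, the value passed to `fileEncoding` would need to match it. Checking the first few bytes with `readBin(path, "raw", 20)` is one way to look for the alternating nul bytes that suggest UTF-16.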
Bono about 9 years

That only gets rid of the messages, but does not actually solve the problem.
Michal aka Miki about 7 years

How can you skip columns if some of them are causing problems? Here is one example: stackoverflow.com/q/5788117/54964