r - read.csv - skip rows with different number of columns
You could try:
read.csv(text=readLines('myfile.csv')[-(1:5)])
readLines first stores each line of the file as its own element of a character vector. The [-(1:5)] index then drops the first five elements, and read.csv parses the remaining lines as CSV.
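To try this without the original file, here is a minimal sketch. It assumes a made-up stand-in for myfile.csv written to a temporary file; the column names (col1 to col8) and the contents of the info rows are hypothetical, chosen only to match the shape described in the question.

```r
# Hypothetical stand-in for 'myfile.csv': 5 two-column info rows,
# followed by a header and two data rows with 8 columns each.
path <- tempfile(fileext = ".csv")
info <- paste0("info", 1:5, ",value", 1:5)
header <- paste(paste0("col", 1:8), collapse = ",")
rows <- c(paste(1:8, collapse = ","), paste(9:16, collapse = ","))
writeLines(c(info, header, rows), path)

# Read every line as a string, drop the first five, parse the rest as CSV.
lines <- readLines(path)
df <- read.csv(text = lines[-(1:5)])

dim(df)  # 2 rows, 8 columns
```

If the real file does contain embedded nuls, readLines also accepts skipNul = TRUE. Embedded nuls are often a sign that the file is not in the expected encoding (UTF-16, for example), in which case passing fileEncoding to read.csv may be what is needed; that is a guess about the file, not something the question confirms.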
datavoredan
Updated on June 17, 2022
Comments
-
datavoredan almost 2 years
There are 5 rows at the top of my csv file which serve as information about the file, which I do not need.
These information rows have only 2 columns, while the header and the rows of data (from row 6 onwards) have 8. This appears to be the cause of the issue.
I have tried using the skip argument of read.csv to skip these lines, and the same with read.table:
df = read.csv("myfile.csv", skip=5)
df = read.table("myfile.csv", skip=5)
but this still gives me the same error message, which is:
Error in read.table("myfile.csv", :empty beginning of file
In addition: Warning messages:
1: In readLines(file, skip) : line 1 appears to contain an embedded nul
2: In readLines(file, skip) : line 2 appears to contain an embedded nul
...
5: In readLines(file, skip) : line 5 appears to contain an embedded nul
How can I read this .csv into R without the embedded nuls in the first 5 rows causing this issue?
-
Bono about 9 years
That only gets rid of the messages, but does not actually solve the problem.
-
Michal aka Miki about 7 years
How can you skip columns if some of them are causing problems? Here is one example: stackoverflow.com/q/5788117/54964