In Scala, how to read a simple CSV file having a header in its first line?

63,215

Solution 1

You can just use drop:

val iter = src.getLines().drop(1).map(_.split(":"))

From the documentation:

def drop (n: Int) : Iterator[A]: Advances this iterator past the first n elements, or the length of the iterator, whichever is smaller.
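Applied to the comma-separated case from the question, the same idea might look like the sketch below. The names `HeaderSkip` and `lookup` are made up for illustration, and the code assumes the simple format the question describes (plain commas, no quoted fields, no commas inside a field).

```scala
object HeaderSkip {
  // Returns the field at valueIdx from the first data row whose field at
  // keyIdx equals key. The first line is treated as a header and skipped.
  def lookup(lines: Iterator[String], keyIdx: Int, key: String, valueIdx: Int): Option[String] =
    lines
      .drop(1)               // skip the header line
      .map(_.split(",", -1)) // a negative limit keeps trailing empty fields
      .find(row => row.length > math.max(keyIdx, valueIdx) && row(keyIdx) == key)
      .map(_(valueIdx))
}
```

For example, `HeaderSkip.lookup(Iterator("name,shell,uid", "root,/bin/sh,0", "Guest,/bin/false,405"), 0, "Guest", 2)` yields `Some("405")`. Because the function takes any `Iterator[String]`, it should work equally with `src.getLines()`, and like `find` in the question's example it stops reading as soon as a match is found.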

Solution 2

Here's a CSV reader in Scala. Yikes.

Alternatively, you can look for a CSV reader in Java, and call that from Scala.

Parsing CSV files properly is not a trivial matter. Escaping quotes, for starters.
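To make the point concrete, here is a small sketch of two ways a plain `split(",")` can go wrong on real-world CSV. The object name `NaiveSplitPitfalls` and the sample strings are invented for the example.

```scala
object NaiveSplitPitfalls {
  // A quoted field that contains a comma is cut in two by a plain split:
  // the line holds 2 logical fields, but the array has 3 elements.
  val quoted: Array[String] = "\"Smith, John\",42".split(",")

  // String.split drops trailing empty strings unless given a negative limit.
  val dropped: Array[String] = "a,b,,".split(",")     // 2 elements
  val kept: Array[String]    = "a,b,,".split(",", -1) // 4 elements
}
```

Neither problem arises with the file described in the question (no quotes, no embedded commas), but the second one can still matter if the last columns of a row may be empty.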

Solution 3

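First I read the header line using next(), and then the remaining lines are still in the src iterator. (Calling take(1) and then continuing to use src is not safe in general: the Iterator documentation says an iterator should not be used again after calling any method on it other than next and hasNext.) This works fine for me.

import scala.io.Source

// f is the path (or java.io.File) of the CSV file
val source = Source.fromFile(f)
val src = source.getLines()

// assuming first line is a header
val headerLine = src.next()

// processing remaining lines
for (l <- src) {
  // split line by comma and process each field
  l.split(",").foreach { c =>
    // your logic here
  }
}

// release the file handle when done
source.close()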
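Once the header line has been read separately, it can also be used to address fields by column name instead of by position. The following is only a sketch under the question's assumptions (plain commas, no quoting); `CsvByHeader`, `rows` and `readAll` are hypothetical names, and `scala.util.Using` requires Scala 2.13 or later.

```scala
import scala.io.Source
import scala.util.Using

object CsvByHeader {
  // Turns lines into rows keyed by the column names found on the first line.
  def rows(lines: Iterator[String]): Iterator[Map[String, String]] =
    if (!lines.hasNext) Iterator.empty
    else {
      val header = lines.next().split(",", -1).map(_.trim)
      lines.map(line => header.zip(line.split(",", -1)).toMap)
    }

  // Reads a whole file and closes it even if processing throws.
  def readAll(path: String): List[Map[String, String]] =
    Using.resource(Source.fromFile(path)) { src =>
      rows(src.getLines()).toList // materialize before the source is closed
    }
}
```

With a header of `name,uid`, a lookup then reads as `CsvByHeader.rows(lines).find(_("name") == "Guest").map(_("uid"))`. Rows shorter than the header simply lack the missing keys, since `zip` stops at the shorter side.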
Author by

Ivan

Updated on February 28, 2020

Comments

  • Ivan about 4 years

    The task is to look up the value of a specific field (by its position in the line) given the value of a key field, in a simple CSV file (just commas as separators, no field-enclosing quotes, never a comma inside a field) that has a header in its first line.

    User uynhjl has given an example (but with a different character as a separator):

    
    val src = Source.fromFile("/etc/passwd")
    val iter = src.getLines().map(_.split(":"))
    // print the uid for Guest
    iter.find(_(0) == "Guest") foreach (a => println(a(2)))
    // the rest of iter is not processed
    src.close()
    
    

    The question in this case is: how do I skip the header line when parsing?