Count the length (number of lines) of a CSV file?

Solution 1

Another way to get the number of lines is:

file.readlines.size
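As the comments further down report, counting lines this way leaves the read position at the end of the file, so any later read comes back empty until the file is rewound. A minimal sketch, assuming a Tempfile as a stand-in for the uploaded file (the contents are invented for illustration):

```ruby
require "tempfile"

# Stand-in for the uploaded file; the contents are invented for illustration.
file = Tempfile.new(["upload", ".csv"])
file.write("111111111111\n222222222222\n333333333333\n")
file.rewind

line_count = file.readlines.size  # reads through to the end of the file

# The read position is now at EOF, so rewind before processing the contents.
file.rewind
first_line = file.gets

file.close
file.unlink
```

Without the second `rewind`, `file.gets` would return `nil`, which can look as if the file had been deleted.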

Solution 2

All of the solutions listed here load the entire file into memory to count its lines. On a Unix-based system, a faster and more memory-efficient solution is:

`wc -l #{your_file_path}`.to_i
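Interpolating the path into backticks sends it through the shell, which can break on paths containing spaces or quotes. One possible variation is to call `wc` directly through `Open3` from the standard library. This is only a sketch: `count_lines_wc` is a hypothetical helper name, and it assumes `wc` is available on the PATH. Note that `wc -l` counts newline characters, so it reports physical lines rather than parsed CSV rows.

```ruby
require "open3"
require "tempfile"

# Hypothetical helper: count physical lines with `wc -l`, bypassing the shell.
def count_lines_wc(path)
  output, status = Open3.capture2("wc", "-l", path)
  raise "wc failed for #{path}" unless status.success?
  output.to_i  # wc prints "<count> <path>"; to_i reads the leading number
end

# Usage with an invented three-line sample file.
Tempfile.create(["sample", ".csv"]) do |f|
  f.write("a,b\n1,2\n3,4\n")
  f.flush
  puts count_lines_wc(f.path)
end
```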

Solution 3

.length and .size are synonyms. To get the row count of a CSV file, you have to parse it. Simply counting the newlines in the file won't work, because string fields in a CSV can contain line breaks. A simple way to get the row count is:

CSV.read(params[:upcsv][:filename]).length
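To illustrate why counting newlines and counting parsed rows can disagree, here is a small sketch. The sample data is invented; its second record has a quoted field that contains a line break.

```ruby
require "csv"

# Invented sample: the second record has a quoted field containing a line break.
data = "id,note\n1,\"first line\nsecond line\"\n2,plain\n"

physical_lines = data.lines.count       # counts newline-terminated lines
parsed_rows    = CSV.parse(data).length # counts CSV records, header included

puts "physical lines: #{physical_lines}"
puts "parsed rows:    #{parsed_rows}"
```

Here the text has four physical lines but only three CSV records, so a plain line count overstates the number of rows.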

Solution 4

CSV.foreach(file_path, headers: true).count

The above excludes the header row from the count. To count every row, including the header:

CSV.read(file_path).count
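The difference between the two calls can be seen side by side in a short sketch. `row_counts` is a hypothetical helper and the sample file contents are invented.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: returns [rows excluding the header, rows including it].
def row_counts(path)
  data_rows = CSV.foreach(path, headers: true).count  # header row is not counted
  all_rows  = CSV.read(path).count                    # header row is counted
  [data_rows, all_rows]
end

# Usage with an invented file: one header row plus two data rows.
Tempfile.create(["rows", ".csv"]) do |f|
  f.write("name,qty\napple,3\npear,5\n")
  f.flush
  data_rows, all_rows = row_counts(f.path)
  puts "data rows: #{data_rows}, all rows: #{all_rows}"
end
```

Called without a block, `CSV.foreach` returns an enumerator, so the rows are counted as they are read instead of being collected into an array first.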

Solution 5

your_csv.count should do the trick.

Author by

Mathias

Updated on November 06, 2020

Comments

  • Mathias
    Mathias over 3 years

    I have a form (Rails) which allows me to load a .csv file using the file_field. In the view:

        <% form_for(:upcsv, :html => {:multipart => true}) do |f| %>
        <table>
            <tr>
                <td><%= f.label("File:") %></td>
                <td><%= f.file_field(:filename) %></td>
            </tr>
        </table>
            <%= f.submit("Submit") %>
        <% end %>
    

    Clicking Submit redirects me to another page (create.html.erb). The file was loaded fine, and I was able to read the contents just fine in this second page. I am trying to show the number of lines in the .csv file in this second page.

    My controller (semi-pseudocode):

    class UpcsvController < ApplicationController
        def index
        end
    
        def create
            file = params[:upcsv][:filename]
            ...
            #params[:upcsv][:file_length] = file.length # Show number of lines in the file
            #params[:upcsv][:file_length] = file.size
            ...
        end
    end
    

    Both file.length and file.size return '91' when my file only contains 7 lines. From the Rails documentation that I read, once the Submit button is clicked, Rails creates a temp file of the uploaded file, and params[:upcsv][:filename] contains the contents of the temp/uploaded file, not the path to it. I don't know how to extract the number of lines in my original file. What is the correct way to get the number of lines in the file?

    My create.html.erb:

    <table>
        <tr>
            <td>File length:</td>
            <td><%= params[:upcsv][:file_length] %></td>
        </tr>
    </table>
    

    I'm really new at Rails (just started last week), so please bear with my stupid questions.

    Thank you!

    Update: apparently that number '91' is the number of individual characters (including carriage return) in my file. Each line in my file has 12 digits + 1 newline = 13. 91/13 = 7.

  • Mathias
    Mathias over 13 years
    Thanks, guys! Alas, now I'm getting "can't convert Tempfile into String". This is the Request parameter: {"commit"=>"Submit","authenticity_token"=>"<-removed->","upcsv"=>{"filename"=>#<File:/tmp/RackMultipart20110111-14030-142mv1a-0>}} Is there any way that I can evaluate the actual .csv file rather than this Tempfile?
  • Mathias
    Mathias over 13 years
    Hey, that actually works! However, Rails deleted the Tempfile after I ran that line, so I can't process the contents of the file... weird behavior. Thank you!
  • cam
    cam over 13 years
    @Mathias, are you sure that the Tempfile is deleted? I suspect you just need to rewind (file.seek(0))
  • Mathias
    Mathias over 13 years
    @cam, I suspect the file was deleted after I did any read on it, since if I added some kind of line-counting code before my main code (to process the data), my main code failed (although I've now forgotten what the error was). I'm pretty sure my line-counting code was not destructive by itself. So I suspect it's just the way Rails works with tempfiles. But I do find that somewhat strange...
  • Zero Dragon
    Zero Dragon almost 12 years
    I had the same problem as @Mathias. I just added file.seek(0) between the file.readlines.size and the main code. That did the trick :D
  • boulder_ruby
    boulder_ruby almost 12 years
    ** to do the above in rails, something like this (" config.autoload_paths += Dir["#{config.root}/lib/**/"]") must be added to config/application.rb
  • chetang
    chetang about 8 years
    CSV.read(file_path, headers: true).count should also return count excluding header
  • user1051849
    user1051849 over 7 years
    Just in case anyone else needs this, you can get the file object by using: file = open("/yourpath/file.csv")
  • CanadianGirl827x
    CanadianGirl827x over 6 years
    It's a good idea, but readlines returns an enumerator, so it shouldn't read the whole thing into memory, anyway.
  • CanadianGirl827x
    CanadianGirl827x over 6 years
    A row in a CSV can contain newlines, you need to actually parse it.
  • prograils
    prograils over 5 years
    I've checked it on a CSV with 100k+ lines - it works with no problem.
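
The arithmetic in the question's update (91 characters at 13 per line gives 7 lines) can be reproduced with a short sketch. It assumes an in-memory `StringIO` as a stand-in for the upload, with seven invented lines of 12 digits each.

```ruby
require "stringio"

# Seven invented lines of 12 digits, each ending in a newline (13 bytes per line).
content = Array.new(7) { |i| format("%012d", i) + "\n" }.join
upload  = StringIO.new(content)

byte_count = upload.size            # size in bytes, not lines
line_count = upload.readlines.size  # number of lines
upload.rewind                       # reset the position for any later reads

puts "bytes: #{byte_count}, lines: #{line_count}"
```

This is why `size` and `length` reported 91 for a 7-line file: both measure the size of the content, while `readlines.size` counts lines.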