In Rails, what's the difference between find_each and where?

44,564

An ActiveRecord relation is lazy: building one with where does not run a query or load any records into memory until the results are actually needed.
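
To make the idea of deferred loading concrete, here is a small pure-Ruby sketch. The LazyRelation class is hypothetical and only imitates the behaviour described above; it is not how ActiveRecord is implemented, and the block stands in for the SQL query.

```ruby
# Toy model of a lazily loaded relation (hypothetical, not ActiveRecord).
class LazyRelation
  include Enumerable

  def initialize(&loader)
    @loader = loader   # the "query" is stored, but not run yet
    @records = nil
  end

  # True once the stored "query" has been executed.
  def loaded?
    !@records.nil?
  end

  # Enumerating is what triggers the load, and it loads every record at once.
  def each(&block)
    @records ||= @loader.call
    @records.each(&block)
  end
end

relation = LazyRelation.new { [{ name: "Ann" }, { name: "Bob" }] }
puts relation.loaded?                      # checked before any enumeration
names = relation.map { |row| row[:name] }  # enumeration runs the "query"
puts relation.loaded?                      # checked after enumeration
puts names.inspect
```

In this sketch, assigning the relation to a variable does no work; only the call to map (which goes through each) runs the stored block.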

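When you call #each, the query runs once and all matching records are loaded into memory at the same time. When you call #find_each, records are loaded in batches of the given batch size (1000 by default), with one query per batch, so only one batch needs to be in memory at a time.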
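
The difference in query count can be sketched in plain Ruby without a database. FakeTable, select_after, select_all and each_in_batches are hypothetical names invented for this illustration; the sketch assumes positive integer ids and only mimics the batching idea, not Rails' actual implementation.

```ruby
# Toy stand-in for a database table: rows are hashes with an :id key.
# FakeTable and its methods are hypothetical, not part of Rails.
class FakeTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows.sort_by { |row| row[:id] }
    @query_count = 0
  end

  # Simulates one "SELECT ... WHERE id > after_id ORDER BY id LIMIT limit".
  def select_after(after_id, limit)
    @query_count += 1
    @rows.select { |row| row[:id] > after_id }.first(limit)
  end

  # Simulates a single SELECT that returns every row at once.
  def select_all
    @query_count += 1
    @rows.dup
  end
end

# Mimics the batching idea: fetch up to batch_size rows, yield them one at
# a time, then continue after the last id seen (assumes ids are positive).
def each_in_batches(table, batch_size:)
  last_id = 0
  loop do
    batch = table.select_after(last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }
    break if batch.size < batch_size   # a short batch means we reached the end
    last_id = batch.last[:id]
  end
end

people = (1..120).map { |i| { id: i } }

batched = FakeTable.new(people)
seen = []
each_in_batches(batched, batch_size: 50) { |row| seen << row[:id] }

all_at_once = FakeTable.new(people)
loaded = all_at_once.select_all

puts "batched:  #{batched.query_count} queries for #{seen.size} rows"
puts "one load: #{all_at_once.query_count} query for #{loaded.size} rows"
```

Both approaches visit the same rows; the batched version trades a few extra queries for a bounded number of rows held at once.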

So when a query would return more records than the server can comfortably hold in memory, #find_each is a good choice.

Conceptually, it resembles Ruby's lazy enumeration (#to_enum, #lazy) combined with #each_slice and then #each, which is very convenient.
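
The analogy can be tried in plain Ruby. In this sketch the integers stand in for records and logged_source is a hypothetical helper; it shows only the lazy slicing idea, whereas #find_each issues a separate database query for each batch.

```ruby
# Builds a source of "record ids" and logs each id as it is produced,
# so we can see how much of the source has been consumed.
def logged_source(ids, log)
  Enumerator.new do |yielder|
    ids.each do |id|
      log << id
      yielder << id
    end
  end
end

log = []
batches = logged_source(1..7, log).lazy.each_slice(3)

# Pull only the first slice; the rest of the source is left untouched.
first_batch = batches.first

puts "first batch: #{first_batch.inspect}"
puts "ids produced so far: #{log.inspect}"
```

Only the ids needed for the first slice should appear in the log, which is the same pay-as-you-go behaviour that batching gives you.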

Author: coderz

Beginner.

Updated on July 05, 2022

Comments

  • coderz
    coderz almost 2 years

    In Rails, both find_each and where are used to retrieve data from a database through ActiveRecord.

    You can pass your query condition to where, like:

    c = Category.where(:name => 'Ruby', :position => 1)
    

    And you can pass batch size to find_each, like:

    Hedgehog.find_each(batch_size: 50).map{ |p| p.to_json }
    

    But what's the difference between the following two snippets?

    # code 1
    Person.where("age > 21").find_each(batch_size: 50) do |person|
      # processing
    end
    
    # code 2
    Person.where("age > 21").each do |person|
      # processing
    end
    

    Does code 1 retrieve 50 tuples at a time, while code 2 retrieves all tuples at once? A more detailed explanation is welcome.

    My opinion is:

    1. Both where and find_each can be used for batch retrieval, but the user can define the batch size when using find_each.
    2. find_each does not support passing a query condition.

    Please correct me if my understanding is wrong.

  • coderz
    coderz about 9 years
    So code 1 may execute SQL multiple times (depending on the number of records and the batch size), while code 2 executes SQL only once?
  • Admin
    Admin about 9 years
    Yes, that is my understanding. If you look at your development log, or at the SQL output in the Rails console, for something like User.find_each(batch_size: 50) { |u| }, you'll see queries roughly like SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 50, then SELECT "users".* FROM "users" WHERE ("users"."id" > 50) ORDER BY "users"."id" ASC LIMIT 50, and so on.
  • Jin Lim
    Jin Lim about 2 years
    users = User.where("birth_day < ?", Date.today) If we run only this line, we haven't called #each. But are you sure that all the data isn't loaded into the users variable?