How to Process Items in an Array in Parallel using Ruby (and open-uri)

Solution 1

There's also a gem called Parallel which is similar to Peach, but is actively updated.

Solution 2

I hope this gives you an idea:

def do_something(url, secs)
    sleep secs #just to see a difference
    puts "Done with: #{url}"
end

threads = []
urls_ary = ['url1', 'url2', 'url3']

urls_ary.each_with_index do |url, i|
    threads << Thread.new{ do_something(url, i+1) }
    puts "Out of loop #{i+1}"
end
threads.each{|t| t.join}

You could also wrap this in a method on Array:

class Array
    # Runs the block for each element in its own thread, then waits for all of them
    def thread_each(&block)
        map { |e| Thread.new { block.call(e) } }.each(&:join)
    end
end

[1, 2, 3].thread_each do |i|
    sleep 4-i #so first one ends later
    puts "Done with #{i}"
end
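
If you also need what each block returns (in the question, the parsed document for each URL), Thread#value waits for the thread and hands back the result of its block. Here is a minimal sketch of that idea. The name parallel_map is hypothetical and is not part of the answer above, and the sleep only stands in for a slow request:

```ruby
# Hypothetical helper (not from the original answer): runs the block for
# every item in its own thread and returns the results in input order.
def parallel_map(items)
  threads = items.map do |item|
    # Passing the item as a thread argument gives each thread its own local
    Thread.new(item) { |x| yield x }
  end
  # Thread#value joins the thread and returns the value of its block
  threads.map(&:value)
end

shouted = parallel_map(%w[url1 url2 url3]) do |url|
  sleep 0.1 # stand-in for a slow network request
  url.upcase
end

p shouted # => ["URL1", "URL2", "URL3"]
```

The results come back in the same order as the input, even if the threads finish in a different order, because each value is read from the thread that was started for that position.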

Solution 3

module MultithreadedEach
  def multithreaded_each
    each_with_object([]) do |item, threads|
      threads << Thread.new { yield item }
    end.each { |thread| thread.join }
    self
  end
end

Usage:

arr = [1,2,3]

arr.extend(MultithreadedEach)

arr.multithreaded_each do |n|
  puts n # Each block runs in its own thread
end
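
Since every block runs in its own thread, anything the blocks share should be guarded. The following is only a sketch of one way to do that with a Mutex. The module is repeated so the example runs on its own, and collect_squares is a hypothetical name used for illustration:

```ruby
# Same module as above, repeated so this sketch is self-contained.
module MultithreadedEach
  def multithreaded_each
    each_with_object([]) do |item, threads|
      threads << Thread.new { yield item }
    end.each(&:join)
    self
  end
end

# Hypothetical example method: squares each number in its own thread and
# collects the results in a shared array.
def collect_squares(numbers)
  results = []
  lock = Mutex.new # guards every write to the shared results array
  numbers.dup.extend(MultithreadedEach).multithreaded_each do |n|
    square = n * n # the actual work happens outside the lock
    lock.synchronize { results << square }
  end
  results.sort # completion order is not guaranteed, so sort for a stable result
end

p collect_squares([1, 2, 3]) # => [1, 4, 9]
```

Keeping the slow work outside synchronize means the threads only wait on each other for the brief moment they append to the array.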

Solution 4

A simple method using threads:

threads = []

[1, 2, 3].each do |i|
  threads << Thread.new { puts i }
end

threads.each(&:join)
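
One thread per item is fine for a handful of URLs, but for a long list you may want to cap the number of threads. The sketch below does that with a Queue and a fixed set of workers. The name each_in_pool and the default of 4 workers are assumptions made for illustration, not something from the answer above:

```ruby
# Hypothetical helper: processes items with at most `workers` threads.
# Caveat: a nil or false item would stop a worker early, because the loop
# uses nil as the "queue is drained" signal.
def each_in_pool(items, workers: 4)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once closed and empty, pop returns nil instead of blocking

  threads = Array.new(workers) do
    Thread.new do
      # Each worker keeps pulling items until the queue is drained
      while (item = queue.pop)
        yield item
      end
    end
  end
  threads.each(&:join)
  items
end

done = Queue.new # Queue is thread-safe, so workers can push to it directly
each_in_pool((1..10).to_a, workers: 3) { |i| done << i * 2 }
p Array.new(done.size) { done.pop }.sort
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```

With three workers, at most three blocks run at the same time, which keeps the number of open connections bounded when the block fetches a URL.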

Author: Mario Zigliotto

Updated on July 16, 2022

Comments

  • Mario Zigliotto almost 2 years

    I am wondering how I can open multiple concurrent connections using open-uri. I think I need to use threads or fibers somehow, but I'm not sure.

    Example code:

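    require 'open-uri'
    require 'nokogiri'

    def get_doc(url)
      # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
      Nokogiri::HTML(URI.open(url).read)
    rescue StandardError => ex # avoid rescuing Exception, which also catches Interrupt
      puts "Failed at #{Time.now}"
      puts "Error: #{ex}"
    end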
    
    array_of_urls_to_process = [......]
    
    # How can I iterate over items in the array in parallel (instead of one at a time?)
    array_of_urls_to_process.each do |url|
      x = get_doc(url)
      do_something(x)
    end
    
  • Mike Atlas about 9 years
    The gem is JRuby-only.
  • kraftydevil over 5 years
    Is this thread-safe? See stackoverflow.com/questions/17765102/…
  • Joshua Pinter over 5 years
    This gem is dope AF. If you need to get the index, make sure to use each_with_index instead of the start or finish callbacks. It's 10x - 50x more performant.