How to Process Items in an Array in Parallel using Ruby (and open-uri)
Solution 1
There's also a gem called Parallel which is similar to Peach, but is actively updated.
Solution 2
I hope this gives you an idea:
def do_something(url, secs)
  sleep secs # just to see a difference
  puts "Done with: #{url}"
end

threads = []
urls_ary = ['url1', 'url2', 'url3']
urls_ary.each_with_index do |url, i|
  threads << Thread.new { do_something(url, i + 1) }
  puts "Out of loop #{i + 1}"
end
threads.each { |t| t.join }
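The threads above only print. If you need what each thread computed, one option is Thread#value, which waits for the thread and returns the result of its block. A minimal sketch, assuming a hypothetical fetch_length method that stands in for the real per-URL work:

```ruby
# fetch_length is a hypothetical stand-in for real work such as fetching a URL.
def fetch_length(url)
  sleep 0.01 # simulate a slow operation
  url.length
end

urls = ['url1', 'url22', 'url333']

# Start one thread per URL; the block's last expression becomes the thread's value.
threads = urls.map { |url| Thread.new { fetch_length(url) } }

# Thread#value joins the thread and returns the block's result,
# so the results come back in the same order as the input.
lengths = threads.map(&:value)
puts lengths.inspect
```

Because the results are read from the threads in input order, they line up with the original array even if the threads finish in a different order.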
Alternatively, you could define a method on Array, like:
class Array
  def thread_each(&block)
    # Start one thread per element, then wait for all of them.
    map { |e| Thread.new { yield(e) } }.each(&:join)
  end
end

[1, 2, 3].thread_each do |i|
  sleep 4 - i # so the first one ends later
  puts "Done with #{i}"
end
Solution 3
module MultithreadedEach
  def multithreaded_each
    each_with_object([]) do |item, threads|
      threads << Thread.new { yield item }
    end.each { |thread| thread.join }
    self
  end
end
Usage:
arr = [1, 2, 3]
arr.extend(MultithreadedEach)
arr.multithreaded_each do |n|
  puts n # Each block runs in its own thread
end
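Since each block runs in its own thread, anything the blocks share needs some care. A minimal sketch, assuming the blocks append to a shared array (the names results and lock are illustrative only), that guards the shared state with a Mutex:

```ruby
results = []
lock = Mutex.new

threads = [1, 2, 3].map do |n|
  Thread.new do
    squared = n * n # do the actual work outside the lock
    # Only one thread at a time may run the synchronize block.
    lock.synchronize { results << squared }
  end
end
threads.each(&:join)

# Threads may finish in any order, so sort before relying on positions.
sorted = results.sort
puts sorted.inspect
```

The lock makes the intent explicit and does not rely on a particular Ruby implementation serializing threads for you, which matters on implementations without a global interpreter lock, such as JRuby.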
Solution 4
A simple method using threads:
threads = []
[1, 2, 3].each do |i|
  threads << Thread.new { puts i }
end
threads.each(&:join)
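One thread per item is fine for a handful of items, but with a long list of URLs you may want to cap how many run at once. A rough sketch of one way to do that with the standard library's Queue and a fixed number of worker threads; process_in_pool and pool_size are hypothetical names, and the doubling block stands in for real work such as fetching a page:

```ruby
# process_in_pool is a hypothetical helper, not part of any library.
# Note: items that are nil or false would stop a worker early in this sketch.
def process_in_pool(items, pool_size)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # Ruby 2.3+: pop returns nil once the queue is closed and empty

  results = Queue.new # Queue is thread-safe, so no extra lock is needed
  workers = Array.new(pool_size) do
    Thread.new do
      # Each worker keeps pulling items until the queue is drained.
      while (item = queue.pop)
        results << yield(item)
      end
    end
  end
  workers.each(&:join)

  # Drain the results queue into a plain array (order is not guaranteed).
  Array.new(results.size) { results.pop }
end

pool_size = 2
doubled = process_in_pool([1, 2, 3, 4, 5], pool_size) { |i| i * 2 }
puts doubled.sort.inspect
```

At most pool_size blocks run at the same time, which keeps the number of open connections bounded when the block does network I/O.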
Author by
Mario Zigliotto
The Ruby language makes me an even happier person.
Updated on July 16, 2022
Comments
-
Mario Zigliotto almost 2 years
I am wondering how I can go about opening multiple concurrent connections using open-uri. I think I need to use threading or fibers somehow, but I'm not sure.
Example code:
def get_doc(url)
  begin
    Nokogiri::HTML(open(url).read)
  rescue Exception => ex
    puts "Failed at #{Time.now}"
    puts "Error: #{ex}"
  end
end

array_of_urls_to_process = [......]

# How can I iterate over items in the array in parallel (instead of one at a time?)
array_of_urls_to_process.each do |url|
  x = get_doc(url)
  do_something(x)
end
-
Mike Atlas about 9 years
The gem is JRuby only.
-
kraftydevil over 5 years
Is this thread safe? See stackoverflow.com/questions/17765102/…
-
Joshua Pinter over 5 years
This gem is dope AF. If you need to get the index, make sure to use each_with_index instead of the start or finish callbacks. It's 10x - 50x more performant.