How can I filter an array of hashes to get only the keys in another array?
Solution 1
This should do what you want:
events.map do |hash|
  hash.select do |key, value|
    [:id, :start].include? key
  end
end
Potentially faster (but somewhat less pretty) solution:
events.map do |hash|
  { id: hash[:id], start: hash[:start] }
end
If you need return_keys to be dynamic:
return_keys = [:id, :start]
events.map do |hash|
  {}.tap do |new_hash|
    return_keys.each do |key|
      new_hash[key] = hash[key]
    end
  end
end
Note that, in your code, select picks out elements from the array, since that's what you called it on, but doesn't change the hashes contained within the array.
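To see why the original attempt came back empty, here is a small sketch with a shortened `events` array (sample data for illustration only). It swaps ActiveSupport's `in?` for core `include?` so that it runs in plain Ruby; the `seen` array is a hypothetical addition that records what the block actually receives.

```ruby
events = [
  { id: 2, start: "3:30", break: 30 },
  { id: 3, start: "3:40", break: 40 }
]
return_keys = ['id', 'start']

# Array#select yields each element of the array. With two block
# parameters and a Hash element, `key` is the whole hash and `val` is nil.
seen = []
filtered = events.select do |key, val|
  seen << [key.class, val]
  # The string form of an entire hash never matches 'id' or 'start'.
  return_keys.include?(key.to_s)
end

# seen     == [[Hash, nil], [Hash, nil]]
# filtered == []
```

Every element is rejected, so the result is an empty array, and even a matching element would have been returned unchanged with all of its keys.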
If you're concerned about performance, I've benchmarked all of the solutions listed here (code):
               user     system      total        real
amarshall 1  0.140000   0.000000   0.140000 (  0.140316)
amarshall 2  0.060000   0.000000   0.060000 (  0.066409)
amarshall 3  0.100000   0.000000   0.100000 (  0.101469)
tadman 1     0.140000   0.010000   0.150000 (  0.145489)
tadman 2     0.110000   0.000000   0.110000 (  0.111838)
mu           0.130000   0.000000   0.130000 (  0.128688)
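For anyone who wants to repeat a comparison like this on their own data, a rough timing sketch might look as follows. This is not the benchmark code linked above: the sample data, the `time_it` helper, the labels, and the run count are all assumptions made for illustration, and absolute numbers will vary by machine and Ruby version.

```ruby
# Hypothetical sample data: 10_000 hashes with four keys each.
events = Array.new(10_000) do |i|
  { id: i, start: "3:30", break: 30, num_attendees: 14 }
end
return_keys = [:id, :start]

# Hypothetical helper: runs the block `runs` times and returns
# [elapsed seconds, last result], measured with a monotonic clock.
def time_it(runs = 20)
  result = nil
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { result = yield }
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, result]
end

# The three approaches from this answer, wrapped in lambdas.
candidates = {
  "select + include?" => -> { events.map { |h| h.select { |k, _| return_keys.include?(k) } } },
  "literal hash"      => -> { events.map { |h| { id: h[:id], start: h[:start] } } },
  "tap + each"        => -> { events.map { |h| {}.tap { |n| return_keys.each { |k| n[k] = h[k] } } } }
}

timings = candidates.map do |label, code|
  seconds, result = time_it { code.call }
  [label, seconds, result]
end

timings.each { |label, seconds, _| puts format("%-20s %.4fs", label, seconds) }
```

Checking that every candidate returns the same result before comparing timings guards against accidentally benchmarking code that does different work.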
Solution 2
If you happen to be using Rails (or don't mind pulling in all or part of ActiveSupport) then you could use Hash#slice:
return_array = events.map { |h| h.slice(:id, :start) }
Hash#slice does some extra work under the covers but it is probably fast enough that you won't notice it for small hashes and the clarity is quite nice.
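On Ruby 2.5 and later, `Hash#slice` is also part of core Ruby, so the same call works without ActiveSupport. A minimal sketch, assuming a shortened `events` array as sample data and splatting `return_keys` so the key list can be dynamic:

```ruby
events = [
  { id: 2, start: "3:30", break: 30, num_attendees: 14 },
  { id: 3, break: 40, num_attendees: 4 }   # sample row with no :start key
]
return_keys = [:id, :start]

# Splat the array so each key is passed to slice as a separate argument.
sliced = events.map { |h| h.slice(*return_keys) }
# sliced == [{ id: 2, start: "3:30" }, { id: 3 }]
```

Note that `slice` simply skips keys a hash does not have, rather than adding them with a `nil` value.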
Solution 3
A better solution is to use a hash as your index instead of doing a linear array lookup for each key:
events = [{id:2, start:"3:30",break:30,num_attendees:14},{id:3, start:"3:40",break:40,num_attendees:4},{id:4, start:"4:40",break:10,num_attendees:40}]
return_keys = [ :id, :start ]
# Compute a quick hash to extract the right values: { key => true }
key_index = Hash[return_keys.collect { |key| [ key, true ] }]
return_array = events.collect do |event|
  event.select do |key, value|
    key_index[key]
  end
end
# => [{:id=>2, :start=>"3:30"}, {:id=>3, :start=>"3:40"}, {:id=>4, :start=>"4:40"}]
I've adjusted this to use symbols as the key names to match your definition of events.
This can be further improved by using the return_keys as a direct driver:
events = [{id:2, start:"3:30",break:30,num_attendees:14},{id:3, start:"3:40",break:40,num_attendees:4},{id:4, start:"4:40",break:10,num_attendees:40}]
return_keys = [ :id, :start ]
return_array = events.collect do |event|
  Hash[
    return_keys.collect do |key|
      [ key, event[key] ]
    end
  ]
end
The result is the same. If the subset you're extracting tends to be much smaller than the original, this might be the best approach.
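One difference worth knowing about: the results match only when every hash contains every requested key. Driving the loop from `return_keys` writes an entry for each key, so a missing key shows up with a `nil` value, while filtering the hash leaves it out. The sketch below uses made-up sample data with a missing `:start`, and `Array#to_h` with a block (Ruby 2.6+) as a shorter spelling of the `Hash[...]` form:

```ruby
events = [
  { id: 2, start: "3:30", break: 30 },
  { id: 3, break: 40 }   # sample row with no :start key
]
return_keys = [:id, :start]

# Key-driven: one [key, value] pair per requested key, present or not.
by_keys = events.map do |event|
  return_keys.to_h { |key| [key, event[key]] }
end
# by_keys == [{ id: 2, start: "3:30" }, { id: 3, start: nil }]

# Hash-driven: only keys that actually exist in the hash survive.
by_select = events.map do |event|
  event.select { |key, _| return_keys.include?(key) }
end
# by_select == [{ id: 2, start: "3:30" }, { id: 3 }]
```

Which behaviour is preferable depends on whether downstream code expects every key to be present.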
Solution 4
Considering that efficiency appears to be a concern, I would suggest the following.
Code
require 'set'
def keep_keys(arr, keeper_keys)
  keepers = keeper_keys.to_set
  arr.map { |h| h.select { |k,_| keepers.include?(k) } }
end
This uses Hash#select, which, unlike Enumerable#select, returns a hash. I've converted keeper_keys to a set for fast lookups.
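A quick way to see the difference between the two methods is to call `find_all` on a hash. `Hash` overrides `select` but not `find_all`, so `find_all` still runs the `Enumerable` version. The sample hash here is illustrative only:

```ruby
h = { id: 2, start: "3:30", break: 30 }

# Hash#select returns a Hash.
as_hash = h.select { |k, _| k == :id }
# as_hash == { id: 2 }

# Enumerable#find_all (the same method as Enumerable#select)
# returns an array of [key, value] pairs instead.
as_pairs = h.find_all { |k, _| k == :id }
# as_pairs == [[:id, 2]]
```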
Examples
arr = [{ id: 2, start: "3:30", break: 30 },
       { id: 3, break: 40, num_attendees: 4 },
       { break: 10, num_attendees: 40 }]
keep_keys arr, [:id, :start]
#=> [{:id=>2, :start=>"3:30"}, {:id=>3}, {}]
keep_keys arr, [:start, :break]
#=> [{:start=>"3:30", :break=>30}, {:break=>40}, {:break=>10}]
keep_keys arr, [:id, :start, :cat]
#=> [{:id=>2, :start=>"3:30"}, {:id=>3}, {}]
keep_keys arr, [:start]
#=> [{:start=>"3:30"}, {}, {}]
keep_keys arr, [:cat, :dog]
#=> [{}, {}, {}]
pedalpete
Updated on October 28, 2020
Comments
-
pedalpete over 3 years
I'm trying to get a subset of keys for each hash in an array.
The hashes are actually much larger, but I figured this is easier to understand:
[ { id: 2, start: "3:30", break: 30, num_attendees: 14 }, { id: 3, start: "3:40", break: 40, num_attendees: 4 }, { id: 4, start: "4:40", break: 10, num_attendees: 40 } ]
I want to get only the id and start values.
I've tried:
return_keys = ['id','start']
return_array = events.select{|key,val| key.to_s.in? return_keys}
but this returns an empty array.
-
tadman over 12 years
For N keys in events and M keys in each hash, and P keys in the inner array, this performs at O(MNP) speed, which could be cripplingly slow.
-
Andrew Marshall over 12 years
@tadman Though, I suppose it's really O(NP)? I don't think there's anything faster than that. Assuming P is very small though, it shouldn't really affect the time complexity.
-
Andrew Marshall over 12 years
I've also updated to include code for when return_keys needs to be dynamic.
-
Andrew Marshall over 12 years
Actually, you need to require 'active_support/core_ext' if you're not in Rails. Core extensions need to be loaded explicitly, so just require 'active_support' doesn't work. (I say this because the latter is what most would consider "pulling in all of ActiveSupport".)
-
Andrew Marshall over 12 years
In case you're curious, I benchmarked all the solutions here and posted the results in my answer :).
-
pedalpete over 12 years
Awesome andrew, #2 is significantly faster, and I don't think that code is particularly un-pretty. I don't have the need for return keys to be dynamic at the moment, and the hashes can get pretty big, so I'll go for door #2.
-
tadman over 12 years
Nice work. #2 is the optimal solution if the keys selected are small and predictable. These probably have wildly different properties if the numbers involved grow large, e.g. N=10e6, M=100, P=50, but that is only an academic consideration if the values are known to be small.