Convert an array into an index hash in Ruby
Solution 1
If all you need the hash for is membership, consider using a Set
:
Set
Set implements a collection of unordered values with no duplicates. This is a hybrid of Array's intuitive inter-operation facilities and Hash's fast lookup.
Set is easy to use with Enumerable objects (implementing
each
). Most of the initializer methods and binary operators accept generic Enumerable objects besides sets and arrays. An Enumerable object can be converted to Set using theto_set
method.Set uses Hash as storage, so you must note the following points:
- Equality of elements is determined according to
Object#eql?
andObject#hash
.- Set assumes that the identity of each element does not change while it is stored. Modifying an element of a set will render the set to an unreliable state.
- When a string is to be stored, a frozen copy of the string is stored instead unless the original string is already frozen.
Comparison
The comparison operators
<
,>
,<=
and>=
are implemented as shorthand for the {proper_,}{subset?,superset?} methods. However, the<=>
operator is intentionally left out because not every pair of sets is comparable. ({x,y} vs. {x,z} for example)Example
require 'set' s1 = Set.new [1, 2] # -> #<Set: {1, 2}> s2 = [1, 2].to_set # -> #<Set: {1, 2}> s1 == s2 # -> true s1.add("foo") # -> #<Set: {1, 2, "foo"}> s1.merge([2, 6]) # -> #<Set: {1, 2, "foo", 6}> s1.subset? s2 # -> false s2.subset? s1 # -> true
[...]
Public Class Methods
new(enum = nil)
Creates a new set containing the elements of the given enumerable object.
If a block is given, the elements of enum are preprocessed by the given block.
Solution 2
try this one:
a=[1,2,3]
Hash[a.zip]
Solution 3
You can do this very handy trick:
Hash[*[1, 2, 3, 4].map {|k| [k, nil]}.flatten]
=> {1=>nil, 2=>nil, 3=>nil, 4=>nil}
Solution 4
If you want to quickly ask "is X in the array?" you should use Array#include?
.
Edit (in response to addition in OP):
If you want speedy look up times, use a Set. Having a Hash that points to all nil
s is silly. Conversion is an easy process too with Array#to_set
.
require 'benchmark'
require 'set'
array = (1..1_000_000).to_a
set = array.to_set
Benchmark.bm(15) do |x|
x.report("Array.include?") { 1000.times { array.include?(500_000) } }
x.report("Set.include?") { 1000.times { set.include?(500_000) } }
end
Results on my machine:
user system total real
Array.include? 36.200000 0.140000 36.340000 ( 36.740605)
Set.include? 0.000000 0.000000 0.000000 ( 0.000515)
You should consider just using a set to begin with, instead of an array so that a conversion is never necessary.
Solution 5
I'm fairly certain that there isn't a one-shot clever way to construct this hash. My inclination would be to just be explicit and state what I'm doing:
hash = {}
array.each{|x| hash[x] = nil}
It doesn't look particularly elegant, but it's clear, and does the job.
FWIW, your original suggestion (under Ruby 1.8.6 at least) doesn't seem to work. I get an "ArgumentError: odd number of arguments for Hash" error. Hash.[] expects a literal, even-lengthed list of values:
Hash[a, 1, b, 2] # => {a => 1, b => 2}
so I tried changing your code to:
hash = Hash[*array.map {|x| [x, nil]}.flatten]
but the performance is dire:
#!/usr/bin/ruby -w
require 'benchmark'
array = (1..100_000).to_a
Benchmark.bm(15) do |x|
x.report("assignment loop") {hash = {}; array.each{|e| hash[e] = nil}}
x.report("hash constructor") {hash = Hash[*array.map {|e| [e, nil]}.flatten]}
end
gives
user system total real
assignment loop 0.440000 0.200000 0.640000 ( 0.657287)
hash constructor 4.440000 0.250000 4.690000 ( 4.758663)
Unless I'm missing something here, a simple assignment loop seems the clearest and most efficient way to construct this hash.
Related videos on Youtube
derobert
Updated on August 19, 2020Comments
-
derobert almost 4 years
I have an array, and I want to make a hash so I can quickly ask "is X in the array?".
In perl, there is an easy (and fast) way to do this:
my @array = qw( 1 2 3 ); my %hash; @hash{@array} = undef;
This generates a hash that looks like:
{ 1 => undef, 2 => undef, 3 => undef, }
The best I've come up with in Ruby is:
array = [1, 2, 3] hash = Hash[array.map {|x| [x, nil]}]
which gives:
{1=>nil, 2=>nil, 3=>nil}
Is there a better Ruby way?
EDIT 1
No, Array.include? is not a good idea. Its slow. It does a query in O(n) instead of O(1). My example array had three elements for brevity; assume the actual one has a million elements. Let's do a little benchmarking:
#!/usr/bin/ruby -w require 'benchmark' array = (1..1_000_000).to_a hash = Hash[array.map {|x| [x, nil]}] Benchmark.bm(15) do |x| x.report("Array.include?") { 1000.times { array.include?(500_000) } } x.report("Hash.include?") { 1000.times { hash.include?(500_000) } } end
Produces:
user system total real Array.include? 46.190000 0.160000 46.350000 ( 46.593477) Hash.include? 0.000000 0.000000 0.000000 ( 0.000523)
-
Sean Vikoren over 12 yearsDon't forget to consider the time it takes to do the conversion. Of course, if your situation allows, using a set to begin with (as suggested by @Zach Langley) obviates this cost.
-
drhenner almost 8 yearsTo be fair the benchmark above should include converting from an array to a hash
-
derobert almost 8 years@drhenner in theory, sure. In practice, not really—it's basically irrelevant. The conversion is done once, the lookup many, many, many times. I forget what I was working on when I asked this, but in the actual program the lookup was probably done many millions of times, after converting once.
-
-
derobert over 15 yearsPlease see my (newly added) section on why not Array#include? in the question.
-
derobert over 15 yearsNeat approach, but obscenely slow; on 1..1_000_000: 1.4s (my map code) vs. got-bored-waiting after 3 minutes.
-
derobert over 15 yearsEmpirically, that seems to run in O(n^2)
-
derobert over 15 yearsI realized I left off the closing ] in the code... Fixed. It works with ruby 1.8.7 here, and also 1.9.0
-
derobert over 15 yearsThe hash load is so quick, this would hardly ever make sense, unless you're really short on memory. The entire Hash load is O(n), a single a.include? call is also O(n).
-
zenazn over 15 yearsIn this case, yes, but this can be generalized to memoize any function of your choice.
-
Zach Langley over 15 yearsFor Ruby 1.8.6, you need the splat, i.e., hash = Hash[*array.map { |x| [x, nil] }.flatten]
-
covard over 11 yearsThis worked great for me. I changed it to be [k,k] instead of [k, nil]. Thanks for this post
-
denis.peplin over 9 yearsPerfect! Most elegant way.
-
l3thal about 8 yearsWhy not just do
Hash[ [1, 2, 3, 4].map {|k| [k, nil]} ]
? There's no need for the extra splat and flatten. -
drhenner almost 8 yearsUse merge! should be much faster
-
drhenner almost 8 yearsarray.inject(Hash.new) { |h,i| h.merge!({i => nil}) } turns out to be pretty fast but still
array.each{|e| hash[e] = nil}
is 3.5 times faster -
Alex Pan over 7 years
Hash[a.zip]
also returns the same response. -
Magne over 6 years"this certainly works... but i don't think it's the most efficient... you're iterating twice in this case. Not a concern of course with an array of length 2, but still worth noting." @brad from stackoverflow.com/a/9434078/380607