Is order of a Ruby hash literal guaranteed?

ruby loops sorting hash literals

14,684

Solution 1

There are a couple of places where this could be specified, i.e. a couple of things that are considered "The Ruby Language Specification":

The ISO spec doesn't say anything about Hash ordering. It was written in such a way that all existing Ruby implementations are automatically compliant with it without having to change, i.e. it was written to be descriptive of the Ruby implementations of the time, not prescriptive. At the time the spec was written, those implementations included MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, MacRuby, XRuby, Ruby.NET, Cardinal, tinyrb, RubyGoLightly, SmallRuby, BlueRuby, and others. Of particular interest are MRI (which only implements 1.8) and YARV (which, at the time, only implemented 1.9). This means the spec can only specify behavior that is common to 1.8 and 1.9, and Hash ordering is not.

The RubySpec project was abandoned by its developers out of frustration that the ruby-core developers and YARV developers never recognized it. It does, however, (implicitly) specify that Hash literals are ordered left-to-right:

new_hash(1 => 2, 4 => 8, 2 => 4).keys.should == [1, 4, 2]

That's the spec for Hash#keys. However, the other specs test that Hash#values has the same order as Hash#keys, that Hash#each_value and Hash#each_key have the same order as those, and that Hash#each_pair and Hash#each have the same order as well.
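As a rough illustration of what those specs assert, the same relationships can be checked in plain Ruby. This is only a sketch: it uses ordinary comparisons rather than RubySpec's should syntax and new_hash helper, the variable names are made up for the example, and it shows observed behavior on a 1.9+ implementation, not a guarantee.

```ruby
# Sketch only: plain Ruby 1.9+, not RubySpec syntax.
# The hash contents mirror the Hash#keys spec quoted above.
h = { 1 => 2, 4 => 8, 2 => 4 }

# Collect what each iterator yields, in the order it yields it.
# (All four array names are hypothetical, chosen for this example.)
seen_keys   = []
seen_values = []
seen_pairs  = []
seen_each   = []

h.each_key   { |k| seen_keys << k }
h.each_value { |v| seen_values << v }
h.each_pair  { |k, v| seen_pairs << [k, v] }
h.each       { |k, v| seen_each << [k, v] }

# Every view of the hash should agree with the left-to-right literal order.
same_order =
  seen_keys == h.keys &&
  seen_values == h.values &&
  seen_pairs == h.keys.zip(h.values) &&
  seen_each == seen_pairs

p h.keys      # [1, 4, 2] on an insertion-ordered implementation
p same_order
```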

I couldn't find anything in the YARV testsuite that specifies that ordering is preserved. In fact, I couldn't find anything at all about ordering in that testsuite. Quite the opposite: the tests go to great lengths to avoid depending on ordering!

The Flanagan/matz book kinda-sorta implicitly specifies Hash literal ordering in section 9.5.3.6 Hash iterators. First, it uses much the same formulation as the docs:

In Ruby 1.9, however, hash elements are iterated in their insertion order, […]

But then it goes on:

[…], and that is the order shown in the following examples:

And in those examples, it actually uses a literal:

h = { :a=>1, :b=>2, :c=>3 }

# The each() iterator iterates [key,value] pairs
h.each {|pair| print pair }    # Prints "[:a, 1][:b, 2][:c, 3]"

# It also works with two block arguments
h.each do |key, value|                
  print "#{key}:#{value} "     # Prints "a:1 b:2 c:3" 
end

# Iterate over keys or values or both
h.each_key {|k| print k }      # Prints "abc"
h.each_value {|v| print v }    # Prints "123"
h.each_pair {|k,v| print k,v } # Prints "a1b2c3". Like each

In his comment, @mu is too short mentioned that

h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2

and in another comment that

nothing else would make any sense

Unfortunately, that is not true:

module HashASETWithLogging
  def []=(key, value)
    puts "[]= was called with [#{key.inspect}] = #{value.inspect}"
    super
  end
end

class Hash
  prepend HashASETWithLogging
end

h = { a: 1, b: 2 }
# prints nothing

h = { }; h[:a] = 1; h[:b] = 2
# []= was called with [:a] = 1
# []= was called with [:b] = 2
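The logging above shows only that the literal does not go through []=; it says nothing either way about ordering. A minimal sketch, run separately from the monkey patch and with variable names invented for the example, compares the iteration order of the two forms. It demonstrates what one implementation does, not what the language guarantees.

```ruby
# Sketch only: compares a literal with a hash built by explicit []= calls.
# Keys are deliberately not in alphabetical order.
literal = { c: 3, a: 1, b: 2 }

built = {}
built[:c] = 3
built[:a] = 1
built[:b] = 2

# Hash#== compares contents and ignores order;
# comparing the [key, value] arrays from to_a is order-sensitive.
same_contents = (literal == built)
same_order    = (literal.to_a == built.to_a)

puts "same contents: #{same_contents}"
puts "same order:    #{same_order}"
```

On an implementation that preserves insertion order and fills the literal left to right, both lines should report true.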

So, depending on how you interpret that line from the book and depending on how "specification-ish" you judge that book, yes, ordering of literals is guaranteed.

Solution 2

From the documentation:

Hashes enumerate their values in the order that the corresponding keys were inserted.
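A short sketch of what "the order that the corresponding keys were inserted" means in practice on Ruby 1.9+; the variable names are invented for the example. Assigning to an existing key keeps its original position, while deleting a key and adding it again moves it to the end.

```ruby
# Sketch only: insertion-order behavior on Ruby 1.9+.
h = {}
h[:b] = 1
h[:a] = 2
h[:c] = 3
order_after_inserts = h.keys   # [:b, :a, :c]

# Updating the value of an existing key does not move the key.
h[:b] = 10
order_after_update = h.keys    # [:b, :a, :c]

# Deleting and re-inserting counts as a new insertion, so :b goes last.
h.delete(:b)
h[:b] = 1
order_after_reinsert = h.keys  # [:a, :c, :b]

p order_after_inserts, order_after_update, order_after_reinsert
```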


Author by

mahemoff


Updated on June 05, 2022

Comments

mahemoff about 2 years

Ruby, since v1.9, supports a deterministic order when looping through a hash; entries added first will be returned first.

Does this apply to literals, i.e. will { a: 1, b: 2 } always yield a before b?

I did a quick experiment with Ruby 2.1 (MRI) and it was in fact consistent, but to what extent is this guaranteed by the language to work on all Ruby implementations?
- Cary Swoveland almost 9 years
  
  Readers, including me, are asking themselves: "what does the order of a hash's keys have to do with the types of objects they are?".
- mu is too short almost 9 years
  
  h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2 so yes. Finding a specification that says that is another story.
- mahemoff almost 9 years
  
  @muistooshort That's a presumption you've made without citing any evidence.
- mu is too short almost 9 years
  
  Go find a Ruby specification and I'll point out the relevant section.
- mahemoff almost 9 years
  
  Sorry, I meant the first part was a presumption about any given implementation ("h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2"). I agree there's no formal spec afaik; the closest thing is probably the MRI tests and any statements from project leadership.
- mu is too short almost 9 years
  
  I say that those two versions are equivalent because nothing else would make any sense. There are even things that depend on that being true. At the end of the day, "Ruby" really means "whatever MRI does".
- Jörg W Mittag almost 9 years
  
  @muistooshort: You can easily test your assumption by monkey-patching Hash and replacing its []= method with one that logs its execution, and you will see that the two forms are most definitely not equivalent.
mahemoff almost 9 years

Yes, as the question mentions, but I'm referring to literal notation.
frostmatthew almost 9 years

There is nothing different/special about literal notation. It's adding them (in the order provided) to the hash (thus will be enumerated in that same order), see github.com/ruby/ruby/blob/trunk/hash.c#L550-L633
mahemoff almost 9 years

That's one Ruby implementation. I'm not aware there are any tests for this. Upvoted for the code ref anyway.
PJP almost 9 years

Why would literal notation make a difference? It's not like a literal notation creates a different kind of hash, it's still a hash.
mahemoff almost 9 years

The Ruby parser has to parse the literal. While the most obvious way to do that is to go from top to bottom, there could be any number of optimisation-related reasons why it could do so in a different order. Also, the Ruby 1.9 order rule applies to Ruby code; it doesn't necessarily apply to what Ruby's internals may do with a hash (i.e. even if Ruby inserts from top to bottom, it doesn't mean the order is preserved).
mahemoff almost 9 years

Thanks for this detailed answer.
mu is too short almost 9 years

They're functionally equivalent if you insist on picking nits. Hash literals are most likely handled in C in MRI to avoid the overhead of having to call []= over and over again; they're equivalent as far as ordering goes and that's all that matters here. No other handling of hash literals makes any sense given that hashes are ordered.
mlt over 3 years

The particular implementation link is old and dead. Here is the newer one. Search for rb_hash_s_create if it goes away.