Rails.cache error in Rails 3.1 - TypeError: can't dump hash with default proc

Solution 1

This might be a little verbose but I had to spend some time with the Rails source code to learn how the caching internals work. Writing things down aids my understanding and I figure that sharing some notes on how things work can't hurt. Skip to the end if you're in a hurry.


Why It Happens

This is the offending method inside ActiveSupport:

def should_compress?(value, options)
  if options[:compress] && value
    unless value.is_a?(Numeric)
      compress_threshold = options[:compress_threshold] || DEFAULT_COMPRESS_LIMIT
      serialized_value = value.is_a?(String) ? value : Marshal.dump(value)
      return true if serialized_value.size >= compress_threshold   
    end
  end
  false  
end

Note the assignment to serialized_value. If you poke around inside cache.rb, you'll see that it uses Marshal to serialize objects to byte strings before they go into the cache and then Marshal again to deserialize them. The compression issue isn't important here; what matters is the use of Marshal.

The problem is that:

Some objects cannot be dumped: if the objects to be dumped include bindings, procedure or method objects, instances of class IO, or singleton objects, a TypeError will be raised.

Some things have state (such as OS file descriptors or blocks) that can't be serialized by Marshal. The error you're seeing is this:

can't dump hash with default proc

So something in your model has an instance variable that is a Hash, and that Hash uses a block to supply default values. The column_methods_hash method uses such a Hash and even caches the Hash inside @dynamic_methods_hash; column_methods_hash will be called (indirectly) by public methods such as respond_to? and method_missing.
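The failure can be reproduced without Rails at all. The following is a minimal sketch in plain Ruby; the variable names (memo, plain, dump_error) are only for illustration and do not come from ActiveRecord:

```ruby
# A Hash built with a block carries a default proc, which Marshal refuses to dump.
memo = Hash.new { |hash, key| hash[key] = key.to_s.upcase }
memo[:name] # triggers the block and stores the generated value

dump_error =
  begin
    Marshal.dump(memo)
    nil
  rescue TypeError => e
    e
  end

puts dump_error.message

# A Hash without a default proc round-trips without trouble.
plain = { :name => "NAME" }
restored = Marshal.load(Marshal.dump(plain))
```

Any object that holds such a Hash in an instance variable fails the same way, because Marshal walks the instance variables of whatever it is asked to dump.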

One of respond_to? or method_missing will probably get called on every AR model instance sooner or later and calling either method makes your object unserializable. So, AR model instances are essentially unserializable in Rails 3.

Interestingly enough, the respond_to? and method_missing implementations in 2.3.8 are also backed by a Hash that uses a block for default values. The 2.3.8 cache "[...] is meant for caching strings", so either you were getting lucky with a backend that could handle whole objects, or Marshal ran before your objects had hashes with default procs in them, or you were using the MemoryStore cache backend, which is little more than a big Hash.

Using multiple scope-with-lambdas might end up storing Procs in your AR objects; I'd expect the lambdas to be stored with the class (or singleton class) rather than the objects but I didn't bother with an analysis as the problem with respond_to? and method_missing makes the scope issue irrelevant.

What You Can Do About It

I think you've been storing the wrong things in your cache and getting lucky. You can either start using the Rails cache properly (i.e. store simple generated data rather than whole models) or you can implement the marshal_dump/marshal_load or _dump/_load methods as outlined in the Marshal documentation. Alternatively, you can use one of the MemoryStore backends and limit yourself to one distinct cache per server process.
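To illustrate the marshal_dump/marshal_load route, here is a rough sketch using a plain Ruby class. Report, its attributes and has_attribute? are hypothetical stand-ins for a model, not ActiveRecord API; the point is only that Marshal serializes whatever marshal_dump returns instead of walking every instance variable:

```ruby
# Hypothetical plain-Ruby class standing in for a model; this is not ActiveRecord.
class Report
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
    # A memoizing Hash with a default proc: this is what makes the object undumpable.
    @lookup = Hash.new { |hash, key| hash[key] = @attributes.key?(key) }
  end

  def has_attribute?(name)
    @lookup[name]
  end

  # Marshal serializes this return value instead of the instance variables.
  def marshal_dump
    @attributes
  end

  # Marshal calls this on a freshly allocated object when loading.
  def marshal_load(attributes)
    initialize(attributes)
  end
end

report = Report.new(:title => "Q3", :total => 42)
report.has_attribute?(:title) # populates the proc-backed Hash
copy = Marshal.load(Marshal.dump(report))
```

The proc-backed Hash is simply rebuilt on load, so only the simple data ever reaches the cache.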


Executive Summary

You can't depend on storing ActiveRecord model objects in the Rails cache unless you're prepared to handle the marshalling yourself or you want to limit yourself to the MemoryStore cache backends.


The exact source of the problem has changed in more recent versions of Rails but there are still many instances of default_procs associated with Hashes.

Solution 2

Thanks to mu-is-too-short for his excellent analysis. I've managed to get my model to serialize now with this:

# Dump only the attributes, as a plain Hash.
def marshal_dump
  {}.merge(attributes)
end

# Rebuild the model from the dumped attributes Hash.
def marshal_load(stuff)
  send :initialize, stuff, :without_protection => true
end

I also have some "virtual attributes" set by a direct SQL join query using AS, e.g. SELECT DISTINCT posts.*, authors.name AS author_name FROM posts INNER JOIN authors ON authors.post_id = posts.id WHERE posts.id = 123. For these to work I need to declare an attr_accessor for each, then dump/load them too, like so:

VIRTUAL_ATTRIBUTES = [:author_name]

attr_accessor *VIRTUAL_ATTRIBUTES

def marshal_dump
  virtual_attributes = Hash[VIRTUAL_ATTRIBUTES.map {|col| [col, self.send(col)] }]
  {}.with_indifferent_access.merge(attributes).merge(virtual_attributes)
end

def marshal_load stuff
  stuff = stuff.with_indifferent_access
  send :initialize, stuff, :without_protection => true
  VIRTUAL_ATTRIBUTES.each do |attribute|
    self.send("#{attribute}=", stuff[attribute])
  end
end

Using Rails 3.2.18

Solution 3

I realized that using where or some scope created ActiveRecord::Relation objects. I then noticed that doing a simple Model.find worked. I suspected that it didn't like the ActiveRecord::Relation object so I forced conversion to a plain Array and that worked for me.

Rails.cache.fetch([self.id, 'relA']) do
  relA.where(
      attr1: 'some_value'
  ).order(
      'attr2 DESC'
  ).includes(
      :rel_1,
      :rel_2
  ).decorate.to_a
end

Solution 4

Just remove the default proc after you have finished altering the hash. Something like:

your_hash.default = nil # clear the default_proc
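
As a quick check of this approach, the sketch below (plain Ruby, with a hypothetical counts hash) builds a Hash using a default proc, clears the proc, and then dumps it. Note that once the proc is cleared, missing keys return nil instead of generating a value:

```ruby
# Build up a Hash using a default proc for convenience.
counts = Hash.new { |hash, key| hash[key] = 0 }
%w[a b a].each { |word| counts[word] += 1 }

# Assigning a default value also clears the default proc.
counts.default = nil

# With the proc gone, the Hash can be dumped and restored.
restored = Marshal.load(Marshal.dump(counts))
```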

shedd
Author by

shedd

Updated on September 08, 2020

Comments

  • shedd
    shedd almost 4 years

    I'm running into an issue with the Rails.cache methods on 3.1.0.rc4 (ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-darwin10]). The code works fine within the same application on 2.3.12 (ruby 1.8.7 (2011-02-18 patchlevel 334) [i686-linux], MBARI 0x8770, Ruby Enterprise Edition 2011.03), but started returning an error following the upgrade. I haven't been able to figure out why yet.

    The error seems to occur when trying to cache objects that have more than one scope on them.

    Also, any scopes using lambdas fail regardless of how many scopes.

    I have hit failures from these patterns:

    Rails.cache.fetch("keyname", :expires_in => 1.minute) do
        Model.scope_with_lambda
    end
    
    
    Rails.cache.fetch("keyname", :expires_in => 1.minute) do
        Model.scope.scope
    end
    

    This is the error that I receive:

    TypeError: can't dump hash with default proc
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:627:in `dump'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:627:in `should_compress?'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:559:in `initialize'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:363:in `new'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:363:in `block in write'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:520:in `instrument'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:362:in `write'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/activesupport-3.1.0.rc4/lib/active_support/cache.rb:299:in `fetch'
        from (irb):62
        from /project/shared/bundled_gems/ruby/1.9.1/gems/railties-3.1.0.rc4/lib/rails/commands/console.rb:45:in `start'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/railties-3.1.0.rc4/lib/rails/commands/console.rb:8:in `start'
        from /project/shared/bundled_gems/ruby/1.9.1/gems/railties-3.1.0.rc4/lib/rails/commands.rb:40:in `<top (required)>'
        from script/rails:6:in `require'
        from script/rails:6:in `<main>'
    

    I have tried using the :raw => true option as an alternative, but that isn't working because the Rails.cache.fetch blocks are attempting to cache objects.

    Any suggestions? Thanks in advance!

    • Linus Oleander
      Linus Oleander about 13 years
      Why would you cache a scope? Wouldn't it be better to cache the actual data? Try adding the all method to the end of each scope: Model.scope_with_lambda.all
    • shedd
      shedd about 13 years
      @Oleander - yep, based on the response below, it seems we got lucky with storing the model objects into cache before. We'll recode to cache the data as opposed to the objects. Thanks for the thoughts!
  • shedd
    shedd about 13 years
    Thanks so much for the awesome response! I think your diagnosis is exactly correct - one of the parameters in the scope was pulling a value from a Hash. I also agree with your larger point that our implementation of Rails.cache should be rewritten to avoid storing the model objects directly in the cache. We did get lucky before, but we should use this as an opportunity to clean it up. I very much appreciate the detailed analysis - it was extremely helpful!
  • mixonic
    mixonic over 11 years
    Was this only caused by the Ruby version change? "Programming Ruby" says of Ruby Marshal "this binary format has one major disadvantage: if the interpreter changes significantly, the marshal binary format may also change, and old dumped files may no longer be loadable." So your 1.8 objects were likely rotten. So this should work with a binary safe backend.
  • mixonic
    mixonic over 11 years
    Oh, marshaling an object will also result in the object's data being serialized, which will raise an exception if it includes an un-marshallable object. So I suppose Marshal can't even really be relied upon for many complex ActiveRecord objects.
  • mu is too short
    mu is too short over 11 years
    @mixonic: I think the caching change was the major culprit. AFAIK, the old cache was little more than a list of objects stored in memory as-is, the new one involved serialization. Marshal should only be used for temporary storage or transport, you should use something else for real persistence as the format can change and leave you without any pleasant way to upgrade.
  • cpuguy83
    cpuguy83 over 11 years
    I'm finding that an empty AR::Relation can't be dumped: TypeError: no _dump_data is defined for class Mutex
  • mu is too short
    mu is too short over 11 years
    @cpuguy83: I don't know what part of a relation is tied up with threading but I wouldn't expect any database-ish objects to be dumpable.
  • cpuguy83
    cpuguy83 over 11 years
    @muistooshort, it is weird... I can cache an AR::Relation as long as it's not empty.
  • ScotterC
    ScotterC almost 11 years
    I have found with ActiveRecord objects that if you clear their association_cache then Marshal Loading them becomes much more reliable. This is what was done with SimpleCacheable github.com/flyerhzm/simple_cacheable
  • Chris Bloom
    Chris Bloom over 10 years
    Is there a way to find out what attribute is causing the error? I have run into an issue where Marshal.dump Brand.first and Marshal.dump Brand.new work as expected, but Marshal.dump FactoryGirl.create(:brand) triggers an exception, even though the resulting object is valid and appears identical to all other brand objects. Other model instances instantiated with FactoryGirl.create can be dumped OK, but not all models apparently.
  • mu is too short
    mu is too short over 10 years
    @Chrisbloom7: My advice is to never Marshal AR objects, there are too many sharp edges to make it a sane thing to do. If you need to cache something then you'd be better off using a Struct with the same (read-only) interface as your model.
  • jtmarmon
    jtmarmon over 8 years
    FYI: I was running into this issue using the in memory cache. Turns out the in memory cache also marshalls the data github.com/rails/rails/blob/…
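
On the question raised in the comments about finding which attribute causes the error, one possible approach is to try dumping each instance variable separately. This is only a sketch: undumpable_ivars and the Sample class are hypothetical names, and the helper inspects top-level instance variables only, so a problem nested deeper would require walking into the reported variable.

```ruby
# Hypothetical helper: returns the names of the instance variables
# that Marshal cannot dump on their own.
def undumpable_ivars(object)
  object.instance_variables.reject do |name|
    begin
      Marshal.dump(object.instance_variable_get(name))
      true  # dumped fine, so drop it from the result
    rescue TypeError
      false # could not be dumped, so keep it in the result
    end
  end
end

# Hypothetical class with one dumpable and two undumpable instance variables.
class Sample
  def initialize
    @name = "brand"
    @cache = Hash.new { |hash, key| hash[key] = key }
    @lock = Mutex.new
  end
end

suspects = undumpable_ivars(Sample.new)
puts suspects.inspect
```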