ruby: how to require correctly (to avoid circular dependencies)


Solution 1

I used to keep one file in each of my libraries just for requiring things. After asking about this on the Ruby mailing list a while back, I changed to these two rules:

  1. If a file needs code from another in the same library, I use require_relative in the file that needs the code.

  2. If a file needs code from a different library, I use require in the file that needs the code.
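A minimal sketch of the two rules, assuming a made-up library called Greeter with one helper file (both names are hypothetical and exist only for this illustration). The files are written to a temporary directory so the snippet can be run as-is:

```ruby
require 'tmpdir'   # rule 2: code from a different library (here the standard library)

Dir.mktmpdir do |dir|
  dir = File.realpath(dir)
  Dir.mkdir(File.join(dir, 'greeter'))

  # greeter/formatter.rb: a helper file inside the same (hypothetical) library
  File.write(File.join(dir, 'greeter', 'formatter.rb'), <<~RUBY)
    module Greeter
      module Formatter
        def self.shout(text)
          text.upcase + '!'
        end
      end
    end
  RUBY

  # greeter.rb needs the helper, so it requires it itself (rule 1)
  File.write(File.join(dir, 'greeter.rb'), <<~RUBY)
    require_relative 'greeter/formatter'

    module Greeter
      def self.hello(name)
        Formatter.shout("hello " + name)
      end
    end
  RUBY

  require File.join(dir, 'greeter.rb')
  puts Greeter.hello('world')
end
```

The point of the sketch is only that each file asks for what it needs itself, so no central "requires" file is needed.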

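As far as I understand it, Ruby loads files in the order the requires are executed and loads each file only once, so a circular require does not loop forever. It only causes trouble when code that runs at load time uses something from a file that has not finished loading.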

(Ruby v1.9.2)

In answer to the comment about the example showing circular dependency problems:

Actually, the problem with the example isn't that the requires are circular, but that B.calling is called before the requires have completed. If you remove B.calling from b.rb, it works fine. For example, in irb, with B.calling removed from the file and run afterwards instead:

$ irb
require '/Volumes/RubyProjects/Test/stackoverflow8057625/b.rb'
=> true
B.calling
doing..
=> nil
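The same point as a self-contained sketch: two files that require each other load fine as long as nothing is called while they are loading. The names circ_a.rb, circ_b.rb, CircA and CircB are hypothetical stand-ins for the files and modules in the question.

```ruby
require 'tmpdir'

Dir.mktmpdir do |dir|
  dir = File.realpath(dir)

  File.write(File.join(dir, 'circ_a.rb'), <<~RUBY)
    require_relative 'circ_b'

    module CircA
      def self.do_something
        'doing..'
      end
    end
  RUBY

  File.write(File.join(dir, 'circ_b.rb'), <<~RUBY)
    require_relative 'circ_a'

    module CircB
      def self.calling
        ::CircA.do_something
      end
    end
  RUBY

  require File.join(dir, 'circ_b.rb')
end

# Called only after both files have finished loading
puts CircB.calling
```

If Ruby is run with warnings enabled, it may report the circular require, but both modules still end up defined.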

Solution 2

A couple of basic things that you hopefully already know:

  1. Ruby is interpreted, not compiled, so you can't execute any code that hasn't been seen by the interpreter.

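  2. require loads and runs the file at the point where the require statement is executed, and only the first time that file is required. In other words, a require at the top of the program will be interpreted before a require at the bottom, and a second require of the same file does nothing.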
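A small sketch of the load-once behaviour, assuming a throwaway file named once_demo.rb and a global counter $once_demo_runs (both made up for this illustration):

```ruby
require 'tmpdir'

Dir.mktmpdir do |dir|
  # once_demo.rb is a hypothetical file that counts how often it is executed
  path = File.join(File.realpath(dir), 'once_demo.rb')
  File.write(path, "$once_demo_runs = ($once_demo_runs || 0) + 1\n")

  puts require(path)     # true: the file is loaded and run
  puts require(path)     # false: already loaded, nothing runs
  puts $once_demo_runs   # 1
  load(path)             # load always runs the file again
  puts $once_demo_runs   # 2
end
```

The return value of require is a convenient way to check whether a given call actually loaded anything.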

(Note: edited to account for the behavior of require.) So if you were to run ruby a.rb, this is what the Ruby interpreter would see and execute:

#load file b.rb <- from require './b.rb' in 'a.rb' file

#load file a.rb <- from require './a.rb' in 'b.rb' file
  #this runs because a.rb has not yet been required

#second attempt to load b.rb but is ignored <- from require './b.rb' in 'a.rb' file

#finish loading the rest of a.rb

module A
  def self.do_something
    puts 'doing..'
  end
end

#finish loading the rest of b.rb

module B
  def self.calling
    ::A.do_something
  end
end
B.calling

#Works because everything is defined 

If instead you ran b first, ruby b.rb, the interpreter would see:

#load file a.rb <- from require './a.rb' in 'b.rb' file

#load file b.rb <- from require './b.rb' in 'a.rb' file
  #this runs because b.rb has not yet been required

#second attempt to load a.rb but is ignored <- from require './a.rb' in 'b.rb' file

#finish loading the rest of b.rb
module B
  def self.calling
    ::A.do_something
  end
end
B.calling #NameError: module A hasn't been defined yet, because the rest of a.rb has not run

Hopefully this explains the good answers the others have given you and, if you think about it, why it's hard to answer your last question about where to put require statements. With Ruby, you're requiring files, not modules, so where you put the require in your code depends on how your files are organized.
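To check the two traces above, here is a rough sketch that writes the question's a.rb and b.rb into a temporary directory and runs each one as the main script in a child Ruby process. The constants A_RB and B_RB and the helper run_as_main are names invented for this sketch.

```ruby
require 'open3'
require 'rbconfig'
require 'tmpdir'

# File contents copied from the question
A_RB = <<~RUBY
  require './b.rb'

  module A
    def self.do_something
      puts 'doing..'
    end
  end
RUBY

B_RB = <<~RUBY
  require './a.rb'

  module B
    def self.calling
      ::A.do_something
    end
  end

  B.calling
RUBY

# Runs one of the two files as the main script.
# Returns [success?, combined stdout and stderr].
def run_as_main(script)
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'a.rb'), A_RB)
    File.write(File.join(dir, 'b.rb'), B_RB)
    output, status = Open3.capture2e(RbConfig.ruby, script, chdir: dir)
    [status.success?, output]
  end
end

%w[a.rb b.rb].each do |script|
  ok, output = run_as_main(script)
  puts "ruby #{script}: #{ok ? 'ok' : 'failed'}"
  puts output
end
```

The script that is run directly is not treated as already required, which is why the outcome depends on which file you start with.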

If you absolutely need to be able to have modules defined and methods executed in arbitrary order, then you could implement something like this to collect calls on modules that don't yet exist, and then make those calls once the modules have been defined.

module Delay
  @@q = []  #queued calls, each stored as [mod_name, method_name, *args]

  def self.call_mod(mod_name, method_name, *args)
    mod = Object.const_get(mod_name.to_sym)
    mod.send(method_name.to_sym, *args)
  end

  def self.exec(mod_name, method_name, *args)
    call_mod(mod_name, method_name, *args)
  rescue NameError  #also covers NoMethodError, which is a subclass
    #the module (or something it calls) doesn't exist yet, so try again later
    @@q << [mod_name, method_name, *args]
  end

  def self.included(_mod)
    #a new module exists, so retry every queued call;
    #exec puts back any call that still fails
    pending, @@q = @@q, []
    pending.each { |call| exec(*call) }
  end
end

Be sure to define the Delay module first. Then, rather than calling B.calling, you would use Delay.exec(:B, :calling, any_other_args). So if you have this after the Delay module:

Delay.exec(:B, :calling)   #Module B is not defined

module B
  def self.calling
    ::A.do_something
  end
  include Delay #must be *after* any callable method defs
end

module A
  def self.do_something
    puts 'doing..'
  end
  include Delay #must be *after* any callable method defs
end

Results in:

#=> doing..

Final step is to break the code up into files. One approach could be to have three files

delay.rb   #holds just Delay module
a.rb       #holds the A module and any calls to other modules 
b.rb       #holds the B module and any calls to other modules

As long as require 'delay' is the first line of each module file (a.rb and b.rb) and Delay is included at the end of each module, things should work.

Final Note: This implementation only makes sense if you cannot decouple your definition code from the module execution calls.

user573335
Author by

user573335

Updated on June 04, 2022

Comments

  • user573335
    user573335 almost 2 years

    today i was facing a strange problem: got a 'missing method' error on a module, but the method was there and the file where the module was defined was required. After some searching i found a circular dependency, where 2 files required each other, and now i assume ruby silently aborts circular requires.


    Edit Begin: Example

    File 'a.rb':

    require './b.rb'
    
    module A
        def self.do_something
            puts 'doing..'
        end
    end
    

    File 'b.rb':

    require './a.rb'
    
    module B
        def self.calling
            ::A.do_something
        end
    end
    
    B.calling
    

    Executing b.rb gives b.rb:5:in 'calling': uninitialized constant A (NameError). The requires have to be there for both files as they are intended to be run on their own from the command line (i omitted that code to keep it short). So the B.calling has to be there. One possible solution is to wrap the requires in if __FILE__ == $0, but that does not seem the right way to go.

    Edit End


    to avoid these hard-to-find errors (wouldn't it be nicer if the require threw an exception, by the way?), are there some guidelines/rules on how to structure a project and where to require what? For example, if i have

    module MainModule
      module SubModule
        module SubSubModule
        end
      end
    end
    

    where should i require the submodules? all in the main, or only the sub in the main and the subsub in the sub?

    any help would be very nice.

    Summary

    An explanation of why this happens is discussed in forforf's answer and comments.

    So far best practice (as pointed out or hinted to by lain) seems to be the following (please correct me if i'm wrong):

    1. put every module or class in the top namespace in a file named after the module/class. in my example this would be 1 file named 'main_module.rb'. if there are submodules or subclasses, create a directory named after the module/class (in my example a directory 'main_module'), and put the files for the subclasses/submodules in there (in the example 1 file named 'sub_module.rb'). repeat this for every level of your namespace.
    2. require step-by-step (in the example, the MainModule would require the SubModule, and the SubModule would require the SubSubModule)
    3. separate 'running' code from 'defining' code. in the running code require once your top-level module/class, so because of 2. all your library functionality should now be available, and you can run any defined methods.

    thanks to everyone who answered/commented, it helped me a lot!

    • sheldonh
      sheldonh over 12 years
      Your example doesn't produce a NameError; it works for me. Looks like you've oversimplified your code to the point where it fails to demonstrate your problem. My answer to your question about load order would have been your example code! :-)
  • Phrogz
    Phrogz over 12 years
    Further, note that (unlike load) multiple requires to the same file will only result in a single loading of the file.
  • user573335
    user573335 over 12 years
    thanks for the answers. but as you can try with the example i added, there are troubles with circular requires.
  • user573335
    user573335 over 12 years
    Yes, that works, but these files should be runnable on their own, so the B.calling has to be there (see edit). Or is that just a very bad idea which only causes trouble?
  • ian
    ian over 12 years
    As @sheldonh says above, the code is so oversimplified now that if I had to comment on it I'd just say don't write it like that. If both files need to be run separately but share behaviour then I'd suggest taking that shared behaviour and putting it in a third file that both require. I also tend to keep "library" type code (stuff that doesn't run on its own) separate from "runnable" type code (stuff that does), so the B.calling would be in its own file.
  • user573335
    user573335 over 12 years
    ok, thanks, i think this is good advice. from now on i will separate the scripts that use the libraries from the libraries themselves. do you have any suggestions concerning the last part of my question?
  • ian
    ian over 12 years
    Most ruby libs tend to have a very flat namespace, so forget the Java/C#ish way of nesting everything. A module as a namespace for the library to wrap everything is normally enough and then modules to do things. I'd suggest you take a look at some gems, ones for small projects are good, so you get a feel for it. Take a look at github.com/yb66/Cicero or github.com/yb66/RandomPerson to see a fairly standard layout, or just have a browse of rubygems.org, and perhaps look at bigger libs like Sequel or Sinatra, which are particularly well thought out.
  • user573335
    user573335 over 12 years
    thanks again for all your help! as my current project seems to be not so small i was having a look at Sequel and some other, bigger projects, and most of the time they seem to require step-by-step (in my example, the MainModule would require the SubModule, and the Submodule would require the SubSubModule), which i like and will try to use from now on.
  • user573335
    user573335 over 12 years
    thanks for your explanations. i always get a little confused over what really happens on a circular require. at the moment it looks like this to me: on running b.rb, the parser encounters require 'a.rb' and starts parsing a.rb. there it encounters require 'b.rb' and, since b.rb has not been required yet, starts parsing b.rb. there is again a require 'a.rb', but a.rb has already been required, so it moves on in b.rb. at the end it executes B.calling, which tries to execute ::A.do_something. this fails because the parsing of a.rb has not yet been completed (or has been aborted completely?)
  • forforf
    forforf over 12 years
    Almost, I think the thing you are missing is that require only loads the file once. So it goes like this:
  • forforf
    forforf over 12 years
    (oh, also keep in mind that running and parsing are essentially the same thing in Ruby). So on running b.rb, the parser encounters require 'a.rb' and starts parsing a.rb. The first thing encountered is require 'b.rb', but b.rb is already in the parser, so require 'b.rb' is ignored. The rest of a.rb defines module A, then we go to the rest of b.rb which defines module B and then runs B.calling (which works). However, if you start with a.rb instead of b.rb, things don't work.
  • user573335
    user573335 over 12 years
    running b.rb does not work! (it produces a NameError. however, running a.rb 'works', it produces 'doing..'). that's what confused me at first. i think running is different from requiring (try puts 'c'; require './c.rb' in a file named c.rb. if you run it, you get 'c' twice!), which is the main cause for this trouble.
  • forforf
    forforf over 12 years
    Ah sorry, yes the require behavior is different. I edited the original answer to reflect this more accurately.