Difference between INCLUDE and modules in Fortran

module include fortran

17,177

Solution 1

The conceptual differences between the two map through to very significant practical differences.

An INCLUDE line operates at the source level - it accomplishes simple ("dumb") text inclusion. In the absence of any special processor interpretation of the "filename" in the include line (there is no requirement for it actually to be a file), the programmer could just as easily splice the complete source together by hand and feed it to the compiler, with no difference whatsoever in the semantics of the source. Included source has no real interpretation in isolation - its meaning depends completely on the context in which the include line that references it appears.

Modules operate at the much higher entity level of the program, i.e. at the level where the compiler is considering the things that the source actually describes. A module can be compiled in isolation from its downstream users, and once it has been compiled the compiler knows exactly what things the module can provide to the program.

Typically what someone using include lines is hoping to do is what modules were actually designed to do.

Example issues:

Because entity declarations can be spread over multiple statements the entities described by included source might not be what you expect. Consider the following source to be included:

INTEGER :: i

In isolation it looks like this declares the name i as an integer scalar (or perhaps a function? Who knows!). Now consider the following scope that includes the above:

INCLUDE "source from above"
DIMENSION :: i(10,10)

i is now a rank two array! Perhaps you want to make it a POINTER? An ALLOCATABLE? A dummy argument? Perhaps that results in an error, or perhaps it is valid source! Throw implicit typing into the mix to really compound the potential fun.
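
To make that concrete, here is a minimal sketch of what the compiler effectively sees once the included text has been spliced in. The program name is made up for illustration, and the first declaration simply stands in for the contents of the included source.

```fortran
program spliced_include
  implicit none
  ! This line stands in for the text pulled in by the INCLUDE line.
  integer :: i
  ! A later statement in the including scope adds an attribute to the same entity.
  dimension :: i(10,10)

  i = 0
  ! i is a 10 by 10 array, so its total size is 100.
  if (size(i) /= 100) error stop "i is not the array we expected"
  print *, "rank:", size(shape(i)), " size:", size(i)
end program spliced_include
```

Nothing in the included line itself hints that i will end up as an array; that is decided entirely by the including scope.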

An entity defined in a module is "completely" defined by the module. Attributes that are specific to the scope of use can be changed (VOLATILE, accessibility, etc), but the fundamental entity remains the same. Name clashes are explicitly called out and can be easily worked around with a rename clause on the USE statement.
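
As a rough sketch of the rename clause mentioned above (the module name settings_mod, the entity tolerance, and the local name module_tolerance are all invented for this example):

```fortran
module settings_mod
  implicit none
  private
  ! The entity is fully defined here, independent of any scope that uses it.
  real, parameter, public :: tolerance = 1.0e-6
end module settings_mod

program use_rename
  ! The rename clause gives the module entity a different local name,
  ! so it no longer clashes with the local variable declared below.
  use settings_mod, only: module_tolerance => tolerance
  implicit none
  real :: tolerance   ! an unrelated local entity

  tolerance = 0.5
  print *, module_tolerance, tolerance
end program use_rename
```

Without the rename, declaring a local tolerance in the same scope would be reported by the compiler as a clash rather than silently changing what the name means.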

Fortran has restrictions on statement ordering (specification statements must go before executable statements, etc.). Included source is also subject to those restrictions, again in the context of the point of inclusion, not the point of source definition.

Mix well with source ambiguity between statement function definitions (specification part) and assignment statements (executable part) for some completely obtuse error messages or, worse, silent acceptance by the compiler of erroneous code.

There are requirements on where the USE statement that references a module appears, but the source for the actual module program unit is completely independent of its point of use.
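
A small sketch of the statement function ambiguity mentioned above, assuming the line `f(i) = 2 * i` is the sort of text that might arrive via an include line (the names are illustrative only). Statement functions are an obsolescent feature, so some compilers will warn about this program.

```fortran
program statement_function_ambiguity
  implicit none
  integer :: f, i
  ! No array named f exists in this scope, so this line, sitting in the
  ! specification part, defines a statement function f with dummy argument i.
  f(i) = 2 * i

  if (f(21) /= 42) error stop "unexpected result"
  print *, f(21)
end program statement_function_ambiguity
```

Had the including scope declared f as an array instead, the identical text would be an assignment statement that belongs in the executable part, and i would be an ordinary (here undefined) variable.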

Fancy having some global state to be shared across related procedures and you want to use include? Let me introduce you to common blocks and the associated underlying concept of sequence association...

Sequence association is an unfortunate bleed-through of early underlying Fortran processor implementation that is an error-prone, inflexible, anti-optimisation anachronism.

Module variables make common blocks and their associated evils completely unnecessary.
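
For comparison, a minimal sketch of shared state held in a module variable instead of a common block. The module counter_state and its procedures are hypothetical names chosen for the example.

```fortran
module counter_state
  implicit none
  private
  public :: increment, current_count

  ! Shared state: one definition, one type, no sequence association.
  integer :: n_calls = 0

contains

  subroutine increment()
    n_calls = n_calls + 1
  end subroutine increment

  function current_count() result(n)
    integer :: n
    n = n_calls
  end function current_count

end module counter_state

program shared_state_demo
  use counter_state
  implicit none

  call increment()
  call increment()
  if (current_count() /= 2) error stop "state was not shared as expected"
  print *, current_count()
end program shared_state_demo
```

Every procedure that touches n_calls sees the same declaration, so there is no layout to keep consistent by hand across several scopes.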

If you were using include lines, then note that you don't actually include the source of a commonly used procedure (the suggestion in your first paragraph is just going to result in a morass of syntax errors from the compiler). What you would typically do is include source that describes the interface of the procedure. For any non-trivial procedure the source that describes the interface is different from the complete source of the procedure - implying that you now need to maintain two source representations of the same thing. This is an error-prone maintenance burden.

As mentioned - the compiler automatically gains knowledge of the interface of a module procedure (the compiler knowledge is "explicit" because it actually saw the procedure's code - hence the term "explicit interface"). No need for the programmer to do anything more.
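
A short sketch of what that looks like in practice, with invented names (vec_ops, normalise). The procedure is written once, inside the module, and the USE statement is all the caller needs.

```fortran
module vec_ops
  implicit none
contains

  subroutine normalise(v)
    ! An assumed-shape dummy argument requires an explicit interface,
    ! which the module provides without any separate interface block.
    real, intent(inout) :: v(:)
    v = v / sqrt(sum(v**2))
  end subroutine normalise

end module vec_ops

program explicit_interface_demo
  use vec_ops
  implicit none
  real :: a(2)

  a = [3.0, 4.0]
  call normalise(a)
  print *, a

  ! The following call is left commented out on purpose: with the explicit
  ! interface in scope, a compiler can reject it for type and rank mismatch.
  ! call normalise(1)
end program explicit_interface_demo
```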

A consequence of the above is that external subprograms should not be used at all unless there are very good reasons to the contrary (perhaps the existence of circular or excessively extensive dependencies) - the basic starting point should be to put everything in a module or main program.

Other posters have mentioned the source code organisation benefits of modules - including the ability to group related procedures and other "stuff" into the one package, with control over accessibility of internal implementation details.

I accept there is a valid use of INCLUDE lines as per the second paragraph of the question - where large modules become unwieldy in size. F2008 has addressed this with submodules, which also bring a number of other benefits. Once they become widely supported the include line work-around should be abandoned.
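
For reference, a minimal sketch of the F2008 submodule arrangement, assuming a compiler that supports it. The names geometry, geometry_impl and circle_area are made up; in a real project the submodule would normally live in its own source file.

```fortran
module geometry
  implicit none
  interface
    ! Only the interface lives in the parent module.
    module function circle_area(r) result(a)
      real, intent(in) :: r
      real :: a
    end function circle_area
  end interface
end module geometry

submodule (geometry) geometry_impl
contains
  ! The body lives in the submodule; its characteristics come from
  ! the interface declared in the parent module.
  module procedure circle_area
    a = 3.14159265 * r**2
  end procedure circle_area
end submodule geometry_impl

program submodule_demo
  use geometry
  implicit none
  print *, circle_area(1.0)
end program submodule_demo
```

Users of geometry depend only on the parent module, so the implementation can be split up (or changed) without stitching source together with include lines.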

A second valid use is to overcome a lack of support by the language for generic programming techniques (what templates provide in C++) - i.e. where the types of objects involved in an operation may vary, but the token sequence that describes what to do on those objects is essentially the same. It might be another decade or so before the language sorts that out.

Solution 2

Placing procedures into modules and using those modules makes the interface of the procedure explicit. It allows a Fortran compiler to check for consistency between the actual arguments in a call and the dummy arguments of the procedure. This guards against a variety of programmer mistakes. An explicit interface is also necessary for certain "advanced" features of Fortran 90 and later; for example, optional or keyword arguments. Without the explicit interface, the compiler won't generate the correct call. Merely including a file doesn't provide these advantages.
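
A minimal sketch of the optional and keyword arguments mentioned above, using hypothetical names (scaling_mod, scaled):

```fortran
module scaling_mod
  implicit none
contains

  function scaled(x, factor) result(y)
    real, intent(in) :: x
    real, intent(in), optional :: factor
    real :: y

    ! PRESENT only works correctly when the caller sees an explicit interface.
    if (present(factor)) then
      y = x * factor
    else
      y = x
    end if
  end function scaled

end module scaling_mod

program optional_args_demo
  use scaling_mod
  implicit none

  print *, scaled(2.0)                  ! optional argument omitted
  print *, scaled(factor=3.0, x=2.0)    ! keyword arguments, in any order
end program optional_args_demo
```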

Solution 3

M.S.B.'s answer is great and is probably the most important reason to prefer modules over include. I'd like to add a few more thoughts.

Using modules reduces your compiled binary size if that is something that is important to you. A module is compiled once, and when you use it you are symbolically loading that module to use the code. When you include a file, you are actually inserting the new code into your routine. If you use include a lot it can cause your binary to be large and also increase your compile time.

You can also use modules to fake OOP style coding in Fortran 90 through clever use of public and private functions and user defined types in a module. Even if you didn't want to do that, it provides a nice way to group functions that logically belong together.
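
As a rough illustration of that style, here is a sketch of a small stack type whose components are hidden from users of the module. All names are invented, and the default initialisation of top is a Fortran 95 feature rather than strict Fortran 90.

```fortran
module stack_mod
  implicit none
  private
  public :: stack, push, pop

  type :: stack
    private                      ! components are hidden outside this module
    integer :: items(16)
    integer :: top = 0
  end type stack

contains

  subroutine push(s, item)
    type(stack), intent(inout) :: s
    integer, intent(in) :: item
    s%top = s%top + 1
    s%items(s%top) = item
  end subroutine push

  function pop(s) result(item)
    type(stack), intent(inout) :: s
    integer :: item
    item = s%items(s%top)
    s%top = s%top - 1
  end function pop

end module stack_mod

program stack_demo
  use stack_mod
  implicit none
  type(stack) :: s
  integer :: last

  call push(s, 1)
  call push(s, 2)
  last = pop(s)
  if (last /= 2) error stop "stack did not return the last pushed item"
  print *, last
end program stack_demo
```

The sketch does no bounds checking; it is only meant to show how PUBLIC and PRIVATE separate the interface from the implementation details.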

Author by

Nordico

Updated on June 05, 2022

Comments

Nordico almost 2 years

What are the practical differences between using modules with the use statement or isolated files with the include statement? I mean, if I have a subroutine that is used a lot throughout a program: when or why should I put it inside a module or just write it in a separate file and include it in every other part of the program where it needs to be used?

Also, would it be good practice to write all subroutines intended to go in a module in separate files and use include inside the module? Especially if the code in the subroutines is long, so as to keep the code better organized (that way all subroutines are packed in the mod, but if I have to edit one I don't need to go through a maze of code).

High Performance Mark about 11 years

I'd go further, I don't think that there are any good reasons to use include files.

High Performance Mark about 11 years

Like I wrote, I don't know of any good reasons to use include files. But it's an opinion.

Nordico about 11 years

So, there is no disadvantage in separating subroutines in different files and then using include inside the module, right? I have never heard of submodules before.

IanH about 11 years

I would only consider it if I had very large source files or as part of legacy source migration. If possible I would first consider breaking the module into a number of "child" modules that are then aggregated together with USE statements in the parent module. However, with complicated type/procedure dependencies and/or with the way that Fortran's PUBLIC/PRIVATE accessibility works, using child modules may not always be possible. You may find that stitching the source for a module together with INCLUDE lines confuses some build systems.
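
A minimal sketch of the "child" module arrangement described in the comment above, with made-up module names. It relies on the default PUBLIC accessibility of the parent module to re-export what the children provide.

```fortran
module child_shapes
  implicit none
contains
  function square_area(side) result(area)
    real, intent(in) :: side
    real :: area
    area = side * side
  end function square_area
end module child_shapes

module child_units
  implicit none
  real, parameter :: cm_per_m = 100.0
end module child_units

module parent
  ! Entities accessed through USE are re-exported, because the default
  ! accessibility in a module is PUBLIC.
  use child_shapes
  use child_units
  implicit none
end module parent

program use_parent
  ! Client code only needs to know about the parent module.
  use parent
  implicit none
  print *, square_area(2.0) * cm_per_m**2
end program use_parent
```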