Does the C preprocessor strip comments or expand macros first?

Solution 1

Unfortunately, the original ANSI C Specification specifically excludes any Preprocessor features in section 4 ("This specification describes only the C language. It makes no provision for either the library or the preprocessor.").

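The C99 specification handles this explicitly, though. Each comment is replaced by a single space character in translation phase 3, which happens before preprocessing directives are executed and macros are expanded in phase 4. (See section 5.1.1.2 for the translation phases and section 6.10 for the preprocessing directives.)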

VC++ and the GNU C Compiler both follow this paradigm - older compilers may not be compliant, but if yours is C99 compliant, you should be safe.
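A quick way to see this ordering for yourself is to stringify a macro argument that contains a comment. This is only a minimal sketch: SHOW and comment_becomes_space are names made up for the illustration, and it assumes a compiler that follows the translation phases described above.

```c
#include <string.h>

/* Stringify the argument exactly as the preprocessor sees it. */
#define SHOW(x) #x

/* Returns 1 if the comment between a and b was already replaced by a
   single space when the macro argument was collected, 0 otherwise. */
int comment_becomes_space(void)
{
    return strcmp(SHOW(a/**/b), "a b") == 0;
}
```

If comments were still present during macro expansion, the string would not come out as "a b".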

Solution 2

As described in this copy-n-pasted description of the translation phases in the C99 standard, comments are removed (each one is replaced by a single whitespace character) in translation phase 3, while preprocessing directives are handled and macros are expanded in phase 4.

In the C90 standard (which I only have in hard copy, so no copy-n-paste) these two phases occur in the same order. The description of the translation phases differs from the C99 standard in some details, but comments are still replaced by a single whitespace character before preprocessing directives are handled and macros are expanded.

Again, the C++ standard has these 2 phases occur in the same order.

As far as how the '//' comments should be handled, the C99 standard says this (6.4.9/2):

Except within a character constant, a string literal, or a comment, the characters // introduce a comment that includes all multibyte characters up to, but not including, the next new-line character.

And the C++ standard says (2.7):

The characters // start a comment, which terminates with the next newline character.

So your first example is clearly an error on the part of that translator - the ';' character after the foo(a) should be retained when the foo() macro is expanded - the comment characters should not be part of the 'contents' of the foo() macro.

But since you're faced with a buggy translator, you might want to change the macro definition to:

#define foo(x) /* junk */

to work around the bug.
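To make the expected behavior concrete, here is a small sketch of the code from the question wrapped in a function. The counter bar_calls and the function run are hypothetical helpers added only so the result can be checked; it assumes a C99 (or later) compiler that removes comments before expanding macros.

```c
/* The macro from the question: on a conforming compiler the comment is
   removed in phase 3, so the replacement list is empty. */
#define foo(x) // commented out debugging code

static int bar_calls = 0;              /* hypothetical call counter */

static void bar(int a)
{
    (void)a;
    bar_calls++;
}

/* Returns how many times bar() ran for a given value of a. */
int run(int a)
{
    bar_calls = 0;
    if (a)
        foo(a);   /* expands to nothing, leaving ';' as an empty statement */
    bar(a);       /* not governed by the if, so it always runs */
    return bar_calls;
}
```

On such a compiler bar() runs once whether a is zero or not. A translator that swallows the ';' along with the comment would make bar() conditional instead.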

However (and I'm drifting off topic here...), since line splicing (backslashes just before a new-line) occurs before comments are processed, you can run into something like this bit of nasty code:

#include <stdio.h>

#define evil( x) printf( "hello "); // hi there, \
                 printf( "%s\n", x); // you!

int main( int argc, char** argv)
{
    evil( "bastard");

    return 0;
}

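Which might surprise whoever wrote it: the backslash splices the second line onto the first line's // comment, so the macro expands to the first printf() only and the program prints just "hello ".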
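The same effect can be checked without reading program output. In this rough sketch, note() and the calls counter are hypothetical stand-ins for printf() so the number of executed statements can be counted; expect a "multi-line comment" warning from compilers that diagnose this.

```c
static int calls = 0;                  /* hypothetical call counter */

static void note(const char *s)
{
    (void)s;
    calls++;
}

/* The trailing backslash is spliced in phase 2, so the second note() call
   becomes part of the // comment that phase 3 then removes. */
#define evil(x) note("hello "); // hi there, \
                note(x);        // you!

/* Returns how many note() calls one use of evil() actually performs. */
int count_evil_calls(void)
{
    calls = 0;
    evil("world");
    return calls;
}
```

Only one call survives, because the macro's replacement list ends where the spliced comment begins.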

Or even better, try the following, written by someone (certainly not me!) who likes box-style comments:

#include <stdio.h>

int main( int argc, char** argv)
{
                            //----------------/
    printf( "hello ");      // Hey, what the??/
    printf( "%s\n", "you"); // heck??         /
                            //----------------/
    return 0;
}

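Depending on whether your compiler processes trigraphs by default (compilers are supposed to, but since trigraphs surprise nearly everyone who runs across them, some compilers turn them off by default), you may or may not get the behavior you want - whatever behavior that is, of course. The catch is that ??/ is the trigraph for a backslash, so with trigraphs enabled the "Hey, what the" line ends in a line splice and the second printf() becomes part of the comment.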
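If you are not sure what your compiler does, a check along these lines can tell you. It is only a sketch: trigraphs_enabled is a made-up name, and it relies on ??= being the trigraph for #. Many compilers will also print a warning about the trigraph when they see this.

```c
/* With trigraph processing enabled, the three characters in the literal
   below are replaced by a single '#' in phase 1, so the literal shrinks
   from three characters to one. */
int trigraphs_enabled(void)
{
    return sizeof("??=") == 2;   /* one character plus the terminating NUL */
}
```

The result depends on the compiler and the language mode it is run in, so treat it as a diagnostic rather than something to build on.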

Solution 3

According to MSDN, comments are replaced with a single space in the tokenization phase, which happens before the preprocessing phase where macros are expanded.

Solution 4

Never put // comments in your macros. If you must put comments, use /* */. In addition, you have a mistake in your macro:

#define foo(x) do { } while(0) /* junk */

This way, foo is always safe to use. For example:

if (some condition)
    foo(x);

will never cause a compiler error, regardless of whether or not foo is defined to expand to some expression.
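Where the do { } while(0) form really pays off is an if with an else branch. A minimal sketch, with classify being a hypothetical function used only for illustration:

```c
#define foo(x) do { } while (0) /* junk */

/* foo(a); is exactly one statement, so the else still pairs with the if. */
int classify(int a)
{
    int result = 0;
    if (a)
        foo(a);
    else
        result = -1;
    return result;
}
```

If foo were instead defined as a brace block such as { }, the ';' after foo(a) would be a second statement and the else would no longer compile.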

Solution 5

#ifdef _TEST_
#define _cerr cerr
#else
#define _cerr / ## / cerr
#endif
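This will work on some compilers (VC++). When _TEST_ is not defined,

    _cerr ...

will be replaced by the comment line

    // cerr ...

It is not portable, though: as the other solutions explain, a conforming preprocessor removes comments before it expands macros, so the pasted // is not treated as a comment there.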
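A way to compile out debug output that does not depend on comment handling is a macro that expands to an empty statement. The following is a sketch in C rather than C++, and TEST_BUILD, DEBUG_LOG and debug_log_evaluations are all hypothetical names; it assumes C99 variadic macros.

```c
#include <stdio.h>

#ifdef TEST_BUILD                      /* hypothetical build flag */
#define DEBUG_LOG(...) do { fprintf(stderr, __VA_ARGS__); } while (0)
#else
#define DEBUG_LOG(...) do { } while (0)
#endif

/* Returns how many times the logging argument was evaluated:
   0 when logging is compiled out, 1 when TEST_BUILD is defined. */
int debug_log_evaluations(void)
{
    int evaluated = 0;
    DEBUG_LOG("value: %d\n", ++evaluated);
    return evaluated;
}
```

Note that when logging is compiled out the arguments are not evaluated at all, so they should not carry side effects the program depends on.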

Author: Phil Miller

Updated on July 08, 2022

Comments

  • Phil Miller, almost 2 years

    Consider this (horrible, terrible, no good, very bad) code structure:

    #define foo(x) // commented out debugging code
    
    // Misformatted to not obscure the point
    if (a)
    foo(a);
    bar(a);
    

    I've seen two compilers' preprocessors generate different results on this code:

    if (a)
    bar(a);
    

    and

    if (a)
    ;
    bar(a);
    

    Obviously, this is a bad thing for a portable code base.

    My question: What is the preprocessor supposed to do with this? Elide comments first, or expand macros first?