Regex to strip comments and multi-line comments and empty lines


Solution 1

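// Remove /* ... */ comments; the "s" modifier lets "." span newlines.
$text = preg_replace('!/\*.*?\*/!s', '', $text);
// Collapse blank (whitespace-only) lines.
$text = preg_replace('/\n\s*\n/', "\n", $text);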

Solution 2

Keep in mind that any regex you use will fail if the file you're parsing has a string containing something that matches these conditions. For example, it would turn this:

print "/* a comment */";

into this:

print "";

Which is probably not what you want. But maybe it is, I don't know. Anyway, regexes technically can't parse data in a way that avoids this problem. I say "technically" because modern PCRE regexes have tacked on a number of hacks that make them capable of doing this and, more importantly, mean they are no longer regular expressions, but whatever. If you want to avoid stripping these things inside quotes or in other situations, there is no substitute for a full-blown parser (although it can still be a pretty simple one).
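
As a rough illustration of how simple such a "parser" can be, here is a minimal sketch that matches quoted strings and comments in a single pass and drops only the comments. It assumes the input uses C/PHP-style quoting with backslash escapes; the constant STRING_OR_COMMENT and the function strip_comments_outside_strings() are hypothetical names made up for this example, and heredocs, regex literals and similar constructs are not handled.

```php
<?php
// Hypothetical names; assumes C/PHP-style quoted strings with backslash escapes.
const STRING_OR_COMMENT = <<<'REGEX'
~
    "(?:[^"\\]|\\.)*"    # double-quoted string
  | '(?:[^'\\]|\\.)*'    # single-quoted string
  | /\*.*?\*/            # block comment
  | //[^\r\n]*           # line comment, up to (not including) the line break
~xs
REGEX;

function strip_comments_outside_strings(string $text): string
{
    $result = preg_replace_callback(
        STRING_OR_COMMENT,
        static function (array $m): string {
            // Strings start with a quote and are kept; comments start with "/".
            return $m[0][0] === '/' ? '' : $m[0];
        },
        $text
    );
    return $result ?? $text; // preg_* returns null on a PCRE error
}

echo strip_comments_outside_strings('print "/* a comment */"; /* real comment */'), "\n";
```

Because a quoted string is matched (and kept) as a whole before the scanner ever reaches the /* or // inside it, the print "/* a comment */"; example survives unchanged, and so does a quoted URL. Plain text containing apostrophes would still confuse it, which is the kind of case where a real parser is needed.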

Solution 3

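//  Removes multi-line comments without leaving a blank line behind;
//  also consumes surrounding spaces/tabs. The "m" modifier is needed
//  so that ^ matches at the start of every line, not just the string.
$text = preg_replace('!^[ \t]*/\*.*?\*/[ \t]*[\r\n]!sm', '', $text);

//  Removes single-line '//' comments along with leading spaces/tabs.
//  Caveat: this also matches the '//' in URLs such as https://example.com.
$text = preg_replace('![ \t]*//.*[ \t]*[\r\n]!', '', $text);

//  Strip blank lines
$text = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $text);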

Solution 4

$string = preg_replace('#/\*[^*]*\*+([^/][^*]*\*+)*/#', '', $string);

Solution 5

It is possible, but I wouldn't do it. You need to parse the whole PHP file to make sure that you're not removing any necessary whitespace (strings, whitespace between keywords/identifiers (publicfunctiondoStuff()), etc.). Better to use PHP's tokenizer extension.
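
A minimal sketch of the tokenizer approach, assuming the input is PHP source and the tokenizer extension is available. token_get_all(), T_COMMENT, T_DOC_COMMENT and T_WHITESPACE are real; strip_php_comments() and tidy_whitespace() are hypothetical helper names invented for this example. Only whitespace between tokens is touched, so blank lines inside strings are left alone.

```php
<?php
// Hypothetical helpers for illustration; neither function is part of PHP.

// Tidy a run of whitespace that sat between two kept tokens.
function tidy_whitespace(string $ws, string $before): string
{
    $ws = preg_replace('/[ \t]+(?=\R)/', '', $ws); // trailing spaces/tabs
    $ws = preg_replace('/\R{2,}/', "\n", $ws);     // blank lines
    if (substr($before, -1) === "\n") {
        // The previous token (e.g. the opening tag) already ended the line.
        $ws = preg_replace('/^\R/', '', $ws);
    }
    return $ws;
}

function strip_php_comments(string $source): string
{
    $out = '';
    $ws = '';          // whitespace seen since the last kept token
    $dropped = false;  // was a comment dropped since the last kept token?

    foreach (token_get_all($source) as $token) {
        [$id, $text] = is_array($token) ? $token : [null, $token];

        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            // PHP 7 keeps the line break of a "//" comment inside the token;
            // PHP 8 puts it in the following T_WHITESPACE token instead.
            if (substr($text, -1) === "\n") {
                $ws .= "\n";
            }
            $dropped = true;
            continue;
        }
        if ($id === T_WHITESPACE) {
            $ws .= $text;
            continue;
        }
        if ($ws === '' && $dropped) {
            $ws = ' '; // keep neighbours apart, e.g. "function/* c */foo"
        }
        $out .= tidy_whitespace($ws, $out) . $text;
        $ws = '';
        $dropped = false;
    }
    return $out . tidy_whitespace($ws, $out);
}

$source = <<<'PHP'
<?php
/* header
   comment */

$url = "https://example.com"; // trailing comment
echo "/* not a comment */";
PHP;

echo strip_php_comments($source), "\n";
```

Since the tokenizer already knows where strings begin and end, neither the URL nor the comment-like string in the sample is altered; only the real comments and the blank line go away.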


Author: Ahmad Fouad

Updated on July 09, 2022

Comments

  • Ahmad Fouad almost 2 years

    I want to parse a file, and I want to use PHP and regex to strip:

    • blank or empty lines
    • single line comments
    • multi line comments

    Basically, I want to remove any line containing

    /* text */ 
    

    or multi line comments

    /***
    some
    text
    *****/
    

    If possible, I'd also like another regex to check whether a line is empty (to remove blank lines).

    Is that possible? Can somebody post a regex that does just that?

    Thanks a lot.

  • Ahmad Fouad about 15 years
    Thanks a lot! The first regex removed single line comments. However, the second regex made no change and didn't remove multi line comments. I appreciate your response. Thanks again.
  • chaos about 15 years
    Make sure you have the !s on the first regex; it wasn't in my initial answer. That's what makes it handle multiline comments. The second pattern removes empty lines.
  • St. John Johnson about 15 years
    The !s makes it work 100%. It works much better than my regex, +1 from me.
  • ascx almost 7 years
    The single line comment replace doesn't work when there are URLs involved. https://example.com is also replaced.
