Regular expression to remove one parameter from query string

35,341

Solution 1

If you want to do this in just one regular expression, you could do this:

/&foo(=[^&]*)?|^foo(=[^&]*)?&?/

This is because you need to match either an ampersand before the foo=..., or one after, or neither, but not both.

To be honest, I think it's better the way you did it: removing the trailing ampersand in a separate step.
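As a quick check, the pattern above can be run against the four example query strings from the question. This is only a minimal JavaScript sketch; the helper name `removeFoo` is made up for illustration and is not part of the answer.

```javascript
// Hypothetical helper: applies the single-regex approach from this answer.
function removeFoo(queryString) {
  // Left alternative: "&foo..." anywhere after the first parameter.
  // Right alternative: "foo..." at the very start, plus the "&" that follows it.
  return queryString.replace(/&foo(=[^&]*)?|^foo(=[^&]*)?&?/, '');
}

console.log(removeFoo('foo=123'));                 // ""
console.log(removeFoo('foo=123&bar=456'));         // "bar=456"
console.log(removeFoo('bar=456&foo=123'));         // "bar=456"
console.log(removeFoo('abc=789&foo=123&bar=456')); // "abc=789&bar=456"
```

Each alternative consumes exactly one ampersand at most, which is what keeps the remaining parameters correctly separated.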

Solution 2

Having a query string that starts with & is harmless--why not leave it that way? In any case, I suggest that you search for the trailing ampersand and use \b to match the beginning of foo without consuming the preceding character:

 /\bfoo\=[^&]+&?/
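To see what this pattern leaves behind, here is a small JavaScript sketch under the assumption that the input is a bare query string. The function name `removeFooWordBoundary` is hypothetical.

```javascript
// Hypothetical helper: \b anchors the start of "foo" without consuming
// the "&" in front of it; only a trailing "&" is removed.
function removeFooWordBoundary(queryString) {
  return queryString.replace(/\bfoo=[^&]+&?/, '');
}

console.log(removeFooWordBoundary('foo=123&bar=456'));         // "bar=456"
console.log(removeFooWordBoundary('abc=789&foo=123&bar=456')); // "abc=789&bar=456"
console.log(removeFooWordBoundary('bar=456&foo=123'));         // "bar=456&"
```

When foo is the last parameter, the separator in front of it stays in the result, so a cleanup step may still be wanted.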

Solution 3

It's a bit silly but I started trying to solve this with a regexp and wanted to finally get it working :)

$str = [
    'foo=123',
    'foo=123&bar=456',
    'bar=456&foo=123',
    'abc=789&foo=123&bar=456',
];

foreach ($str as $string) {
    // The /e modifier was removed in PHP 7, so a callback decides the replacement:
    // keep one '&' only when the parameter had an '&' on both sides.
    echo preg_replace_callback('#(?:^|\b)(&?)foo=[^&]+(&?)#', function ($m) {
        return ($m[1] === '&' && $m[2] === '&') ? '&' : '';
    }, $string), "\n";
}

The replacement part is messy because it has to special-case the captured '&' characters.

Also, it doesn't match afoo and the like.
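The same conditional-replacement idea can be sketched in JavaScript with a replacer function. This is an illustrative port, not the answer's own code; `removeFooConditional` is a made-up name.

```javascript
// Hypothetical port: capture the "&" on each side of foo=... and keep
// a single "&" only if both sides had one.
function removeFooConditional(queryString) {
  return queryString.replace(
    /(?:^|\b)(&?)foo=[^&]+(&?)/,
    (match, before, after) => (before === '&' && after === '&' ? '&' : '')
  );
}

console.log(removeFooConditional('foo=123'));                 // ""
console.log(removeFooConditional('foo=123&bar=456'));         // "bar=456"
console.log(removeFooConditional('bar=456&foo=123'));         // "bar=456"
console.log(removeFooConditional('abc=789&foo=123&bar=456')); // "abc=789&bar=456"
console.log(removeFooConditional('afoo=123'));                // "afoo=123"
```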

Solution 4

Thanks. Yes it uses backslashes for escaping, and you're right, I don't need the /'s.

This seems to work, though it doesn't do it in one line as requested in the original question.

    public static string RemoveQueryStringParameter(string url, string keyToRemove)
    {
        // Escape the key so regex metacharacters in it are matched literally.
        string key = Regex.Escape(keyToRemove);
        // If first parameter, leave ?, take away trailing &.
        // The (&|$) at the end stops "foo" from matching "foobar".
        string pattern = @"\?" + key + @"(=[^&]*)?(&|$)";
        url = Regex.Replace(url, pattern, "?");
        // If subsequent parameter, take away leading &
        pattern = "&" + key + @"(=[^&]*)?(?=&|$)";
        url = Regex.Replace(url, pattern, "");
        return url;
    }
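
For comparison with the JavaScript used elsewhere on this page, the two-step idea might look roughly like this. It is a sketch under assumptions: the URL has no `#fragment`, and both function names are invented for the example.

```javascript
// Escape regex metacharacters so the key is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical two-step removal on a full URL (no #fragment assumed).
function removeQueryStringParameter(url, keyToRemove) {
  const key = escapeRegExp(keyToRemove);
  // Step 1: key is the first parameter. Keep "?", drop the trailing "&".
  const first = new RegExp('\\?' + key + '(=[^&]*)?(&|$)');
  // Step 2: key is a later parameter. Drop it together with its leading "&".
  const later = new RegExp('&' + key + '(=[^&]*)?(?=&|$)', 'g');
  return url.replace(first, '?').replace(later, '');
}

console.log(removeQueryStringParameter('http://example.com/page?foo=1&bar=2', 'foo'));
// "http://example.com/page?bar=2"
console.log(removeQueryStringParameter('http://example.com/page?bar=2&foo=1', 'foo'));
// "http://example.com/page?bar=2"
```

If the removed key was the only parameter, a bare "?" is left at the end of the URL.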

Solution 5

I based a Java implementation on yours, and it seems to work:

  // Needs java.util.regex.Pattern and Guava's Preconditions on the classpath.
  public static String removeParameterFromQueryString(String queryString, String paramToRemove) {
    Preconditions.checkArgument(queryString != null, "Empty querystring");
    Preconditions.checkArgument(paramToRemove != null, "Empty param");
    // Quote the name so regex metacharacters in it are matched literally.
    String param = Pattern.quote(paramToRemove);
    String oneParam = "^" + param + "(=[^&]*)$";
    String begin = "^" + param + "(=[^&]*)(&?)";
    String end = "&" + param + "(=[^&]*)$";
    String middle = "(?<=[&])" + param + "(=[^&]*)&";
    // Remove middle occurrences first, then the leading, trailing, and sole ones.
    String removedMiddleParams = queryString.replaceAll(middle, "");
    String removedBeginParams = removedMiddleParams.replaceAll(begin, "");
    String removedEndParams = removedBeginParams.replaceAll(end, "");
    return removedEndParams.replaceAll(oneParam, "");
  }

I had trouble with your implementation in some cases because it sometimes did not delete an &, so I did it in multiple steps, which seems easier to understand.

I had a problem with your version, particularly when a param was in the query string multiple times (like param1=toto&param2=xxx&param1=YYY&param3=ZZZ&param1....)
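
The repeated-parameter case can also be handled by applying a single pattern until nothing changes. The JavaScript sketch below is not the Java code above; it reuses the pattern shape from the question's edit, and the function names are invented for illustration.

```javascript
// Escape regex metacharacters so the name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: remove every occurrence of a parameter by
// re-running the single-match pattern until the string stops changing.
function removeAllOccurrences(queryString, paramToRemove) {
  const name = escapeRegExp(paramToRemove);
  const pattern = new RegExp(
    '&' + name + '(=[^&]*)?(?=&|$)|^' + name + '(=[^&]*)?(&|$)'
  );
  let previous;
  do {
    previous = queryString;
    queryString = queryString.replace(pattern, '');
  } while (queryString !== previous);
  return queryString;
}

console.log(removeAllOccurrences('param1=toto&param2=xxx&param1=YYY&param3=ZZZ', 'param1'));
// "param2=xxx&param3=ZZZ"
console.log(removeAllOccurrences('foo=1&foo=2', 'foo')); // ""
```

A loop is used instead of the `g` flag because, after a leading `foo=1&` is removed, the next `foo=2` is only at the start of the string on a fresh pass.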

Kip
Author by

Kip

Updated on July 09, 2022

Comments

  • Kip
    Kip almost 2 years

    I'm looking for a regular expression to remove a single parameter from a query string, and I want to do it in a single regular expression if possible.

    Say I want to remove the foo parameter. Right now I use this:

    /&?foo\=[^&]+/
    

    That works as long as foo is not the first parameter in the query string. If it is, then my new query string starts with an ampersand. (For example, "foo=123&bar=456" gives a result of "&bar=456".) Right now, I'm just checking after the regex if the query string starts with ampersand, and chopping it off if it does.
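
The current two-step approach described above can be sketched as follows. The function name `removeFooTwoStep` is hypothetical; the regex is the one shown above.

```javascript
// Hypothetical helper mirroring the described workaround:
// run the regex, then chop off a leading "&" if one was left behind.
function removeFooTwoStep(queryString) {
  const result = queryString.replace(/&?foo=[^&]+/, '');
  return result.startsWith('&') ? result.slice(1) : result;
}

console.log('foo=123&bar=456'.replace(/&?foo=[^&]+/, '')); // "&bar=456"
console.log(removeFooTwoStep('foo=123&bar=456'));          // "bar=456"
console.log(removeFooTwoStep('bar=456&foo=123'));          // "bar=456"
```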

    Example edge cases:

    Input                    |  Expected Output
    -------------------------+--------------------
    foo=123                  |  (empty string)
    foo=123&bar=456          |  bar=456
    bar=456&foo=123          |  bar=456
    abc=789&foo=123&bar=456  |  abc=789&bar=456
    

    Edit

    OK, as pointed out in the comments, there are way more edge cases than I originally considered. I got the following regex to work with all of them:

    /&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)/
    

    This is modified from Mark Byers's answer, which is why I'm accepting that one, but Roger Pate's input helped a lot too.
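
To illustrate what the two added boundary checks do, this short sketch compares the pattern with a variant that lacks them. The variable names are made up for the comparison.

```javascript
// Variant without boundary checks, and the final pattern with them.
const withoutBoundaries = /&foo(=[^&]*)?|^foo(=[^&]*)?&?/;
const withBoundaries = /&foo(=[^&]*)?(?=&|$)|^foo(=[^&]*)?(&|$)/;

// Without a boundary, "foo" matches the start of a longer name.
console.log('foobar=123'.replace(withoutBoundaries, '')); // "bar=123"
console.log('foobar=123'.replace(withBoundaries, ''));    // "foobar=123"

// Left half: the lookahead checks for "&" or end without consuming it,
// so the separator before the next parameter survives.
console.log('abc=789&foo=123&bar=456'.replace(withBoundaries, '')); // "abc=789&bar=456"

// Right half: at the start of the string the following "&" is consumed.
console.log('foo=123&bar=456'.replace(withBoundaries, '')); // "bar=456"
```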

    Here is the full suite of test cases I'm using, and a Javascript snippet which tests them:

    $(function() {
        var regex = /&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)/;
        
        var escapeHtml = function (str) {
            var map = {
              '&': '&amp;',
              '<': '&lt;',
              '>': '&gt;',
              '"': '&quot;',
              "'": '&#039;'
            };
            
            return str.replace(/[&<>"']/g, function(m) { return map[m]; });
        };
    
        
        //test cases
        var tests = [
            'foo'     , 'foo&bar=456'     , 'bar=456&foo'     , 'abc=789&foo&bar=456'
           ,'foo='    , 'foo=&bar=456'    , 'bar=456&foo='    , 'abc=789&foo=&bar=456'
           ,'foo=123' , 'foo=123&bar=456' , 'bar=456&foo=123' , 'abc=789&foo=123&bar=456'
           ,'xfoo'    , 'xfoo&bar=456'    , 'bar=456&xfoo'    , 'abc=789&xfoo&bar=456'
           ,'xfoo='   , 'xfoo=&bar=456'   , 'bar=456&xfoo='   , 'abc=789&xfoo=&bar=456'
           ,'xfoo=123', 'xfoo=123&bar=456', 'bar=456&xfoo=123', 'abc=789&xfoo=123&bar=456'
           ,'foox'    , 'foox&bar=456'    , 'bar=456&foox'    , 'abc=789&foox&bar=456'
           ,'foox='   , 'foox=&bar=456'   , 'bar=456&foox='   , 'abc=789&foox=&bar=456'
           ,'foox=123', 'foox=123&bar=456', 'bar=456&foox=123', 'abc=789&foox=123&bar=456'
        ];
        
        //expected results
        var expected = [
            ''        , 'bar=456'         , 'bar=456'         , 'abc=789&bar=456'
           ,''        , 'bar=456'         , 'bar=456'         , 'abc=789&bar=456'
           ,''        , 'bar=456'         , 'bar=456'         , 'abc=789&bar=456'
           ,'xfoo'    , 'xfoo&bar=456'    , 'bar=456&xfoo'    , 'abc=789&xfoo&bar=456'
           ,'xfoo='   , 'xfoo=&bar=456'   , 'bar=456&xfoo='   , 'abc=789&xfoo=&bar=456'
           ,'xfoo=123', 'xfoo=123&bar=456', 'bar=456&xfoo=123', 'abc=789&xfoo=123&bar=456'
           ,'foox'    , 'foox&bar=456'    , 'bar=456&foox'    , 'abc=789&foox&bar=456'
           ,'foox='   , 'foox=&bar=456'   , 'bar=456&foox='   , 'abc=789&foox=&bar=456'
           ,'foox=123', 'foox=123&bar=456', 'bar=456&foox=123', 'abc=789&foox=123&bar=456'
        ];
        
        for(var i = 0; i < tests.length; i++) {
            var output = tests[i].replace(regex, '');
            var success = (output == expected[i]);
            
            $('#output').append(
                '<tr class="' + (success ? 'passed' : 'failed') + '">'
                + '<td>' + (success ? 'PASS' : 'FAIL') + '</td>'
                + '<td>' + escapeHtml(tests[i]) + '</td>'
                + '<td>' + escapeHtml(output) + '</td>'
                + '<td>' + escapeHtml(expected[i]) + '</td>'
                + '</tr>'
            );
        }
        
    });
    #output {
        border-collapse: collapse;
        
    }
    #output tr.passed { background-color: #af8; }
    #output tr.failed { background-color: #fc8; }
    #output td, #output th {
        border: 1px solid black;
        padding: 2px;
    }
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
    <table id="output">
        <tr>
            <th>Succ?</th>
            <th>Input</th>
            <th>Output</th>
            <th>Expected</th>
        </tr>
    </table>
  • catchmeifyoutry
    catchmeifyoutry over 14 years
    Using a trailing ampersand will cause a problem with the third example.
  • JSBձոգչ
    JSBձոգչ over 14 years
    Note that the trailing ampersand is optional in the regex that I gave.
  • Admin
    Admin over 14 years
    Why is both not valid? Input: ?blah&foo=abc&blah
  • Kip
    Kip over 14 years
    @Roger Pate: both is valid input, but you only want to match exactly one of them (because I'm replacing whatever is matched with an empty string)
  • Kip
    Kip over 14 years
    Yeah, I thought about leaving the extra &, but it looked a little sloppy to me. This regex will leave a trailing ampersand on the result, i.e. \bfoo\=[^&]+&? -> bar=456&. To get it to work with foo or foo=, and not with xfoo or foox, I modified it to this: /\bfoo(\=[^&]*)?(&|$)/
  • Greg Bacon
    Greg Bacon over 14 years
    Try running this pattern against Roger's test cases.
  • Kip
    Kip over 14 years
    Accepted this because the solution I got working for all my test cases (see the edit to my question) was a modified version of this idea: /&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)/
  • Mark Byers
    Mark Byers over 14 years
    gbacon: the only cases it failed on were those containing 'foo' without a value. I've updated the regex to handle this, and it passes all cases now.
  • Kip
    Kip over 14 years
    @MarkByers: this will change something like foobar=123 to bar=123. You need the non-matching (?=&|$) at the end of the left half, and (&|$) at the end of the right half.
  • Mirko Cianfarani
    Mirko Cianfarani about 11 years
    So is this a solution to this question?
  • Kip
    Kip over 8 years
    In the original question, the input to the regex is only the query string (i.e. everything after the ?), not the whole URL, so there is no ? in the string. That's why the accepted answer doesn't consider that scenario.
  • Ion Andrei Bara
    Ion Andrei Bara over 8 years
    That's correct. However, I don't see why this does not qualify, and I see no reason for downvoting, as this answer addresses a quite common, more general scenario. I have updated the answer with remarks.
  • Kip
    Kip over 8 years
    I wasn't the one who gave the downvote. But the reason why someone else might have is that, in your original answer, you were answering a different question from what was asked and saying the accepted answer was wrong because it didn't answer that question.
  • Kip
    Kip over 8 years
    Also, your answer fails to remove the parameter in several of the edge cases outlined in the original post: foo, foo&bar=456, bar=456&foo, abc=789&foo&bar=456, foo=, foo=123, xfoo=&bar=456, abc=789&xfoo=&bar=456, xfoo=123&bar=456, abc=789&xfoo=123&bar=456
  • Kip
    Kip over 8 years
    Here is a jsFiddle showing the answer (from the OP): jsfiddle.net/1b6ukaw9 Here is a jsFiddle showing the cases where your regex fails: jsfiddle.net/o0b2rrkd This regex works for your case in perl/php: /&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)|(?<=\?)foo(\=[^&]*)?(&|$)/. But it doesn't work in Javascript because it doesn't support look-behind assertions. Here is a version which does work in Javascript, but I had to change the replace code as well: jsfiddle.net/ba7m8wz8
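
The comment above predates look-behind support in JavaScript. Engines that implement the ES2018 regular expression features accept `(?<=...)`, so the three-way pattern can be tried directly there. This is a sketch that assumes such a runtime; the function name is hypothetical.

```javascript
// Hypothetical helper: handles both a bare query string and a full URL.
// Requires a JavaScript engine with ES2018 look-behind support.
function removeFooFromUrlOrQuery(input) {
  const pattern =
    /&foo(=[^&]*)?(?=&|$)|^foo(=[^&]*)?(&|$)|(?<=\?)foo(=[^&]*)?(&|$)/;
  return input.replace(pattern, '');
}

console.log(removeFooFromUrlOrQuery('http://example.com/?foo=1&bar=2')); // "http://example.com/?bar=2"
console.log(removeFooFromUrlOrQuery('http://example.com/?bar=2&foo=1')); // "http://example.com/?bar=2"
console.log(removeFooFromUrlOrQuery('bar=456&foo=123'));                 // "bar=456"
```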