How can we filter the elements of an array using an array of regexes in JavaScript?

Solution 1

I would probably go with something like this:

var regexs = [
    /rat/i,
    /cat/i,
    /dog/i,
    /[1-9]/i
]

var texts = [
    'the dog is hiding',
    'cat',
    'human',
    '1'
]

var goodStuff = texts.filter(function (text) {
    return !regexs.some(function (regex) {
         return regex.test(text);
    });
});

Realistically, though, performance differences between the possible approaches are negligible here unless you are doing this something like 10,000 times.

Please note that this uses ES5 methods (Array.prototype.filter and Array.prototype.some), which are easily shimmable (I made up a word, I know).
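
To make the performance point a bit more concrete, here is a minimal sketch of the same idea wrapped in a helper, with a counter showing that some() stops at the first matching regex. The names rejectMatching and testCount are made up for illustration and are not part of the answer above.

```javascript
// Illustration only: counts how many times regex.test is called.
var testCount = 0;

// Hypothetical helper: keep only the texts that match none of the regexes.
function rejectMatching(texts, regexes) {
    return texts.filter(function (text) {
        // some() returns as soon as one regex matches, skipping the rest
        return !regexes.some(function (regex) {
            testCount++;
            return regex.test(text);
        });
    });
}

var kept = rejectMatching(
    ['the dog is hiding', 'cat', 'human', '1'],
    [/rat/i, /cat/i, /dog/i, /[1-9]/i]
);

console.log(kept);      // [ 'human' ]
console.log(testCount); // 13, not 4 * 4 = 16, thanks to short-circuiting
```

Note that none of these regexes carries the g flag. With g, test() keeps state in lastIndex between calls, which can give surprising results when the same regex object is reused across several texts.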

Solution 2

Here's my solution:

var words = [ 'rat', 'cat', 'dog', '[1-9]' ];

var texts = [ ... ];

// normalise (and compile) the regexps just once
var regex = words.map(function(w) {
    return new RegExp('\\b' + w + '\\b', 'i');
});

// nested .filter calls: remove any text that matches
// at least one regex in the list
texts = texts.filter(function(t) {
    return regex.filter(function(re) {
        return re.test(t);
    }).length === 0;
});

http://jsfiddle.net/SPAKK/
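
One thing worth knowing about the '\\b' wrapping is that it changes what counts as a match compared with the bare patterns. The following is only a small sketch of that difference; the sample strings are my own and not from the question.

```javascript
// Bare pattern versus the same word wrapped in word boundaries.
var plain = /cat/i;
var bounded = new RegExp('\\b' + 'cat' + '\\b', 'i');

console.log(plain.test('concatenate'));   // true  ("cat" inside a longer word)
console.log(bounded.test('concatenate')); // false (no boundary around "cat")
console.log(bounded.test('the Cat sat')); // true

// The same wrapping applied to a character class.
var digit = /[1-9]/;
var boundedDigit = new RegExp('\\b' + '[1-9]' + '\\b', 'i');

console.log(digit.test('room 12'));        // true
console.log(boundedDigit.test('room 12')); // false (neither digit stands alone)
console.log(boundedDigit.test('1'));       // true
```

So with the boundaries in place, a text is only removed when the word or digit stands on its own, which may or may not be what you want.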

Solution 3

You clearly have to process the texts array element by element. However, you could combine your regexps into a single one by joining them with '|'.

The entries in the regexps array you show are actually plain strings. I would remove the leading and trailing / characters and then construct a single regexp, something like:

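function reduce (texts, re) {
  // combine all the pattern strings into one alternation
  re = new RegExp (re.join ('|'));
  // walk backwards and unshift, so the original order is preserved
  for (var r = [], t = texts.length; t--;)
    !re.test (texts[t]) && r.unshift (texts[t]);
  return r;
}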

alert (reduce (['the dog is hiding', 'cat', 'human', '1'], ['rat', 'cat', 'dog', '[1-9]']))
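
The function above expects the slashes to be gone already. As a rough sketch of that preprocessing step, the strings from the question could be cleaned up and combined as below. The helper name combinePatterns, the 'i' flag and the non-capturing groups are my own assumptions, not part of the original answer; the groups just keep any '|' inside one pattern from leaking into its neighbours.

```javascript
// Patterns as written in the question: strings wrapped in slashes.
var patterns = ['/rat/', '/cat/', '/dog/', '/[1-9]/'];

// Hypothetical helper: strip one leading and one trailing slash,
// wrap each source in a non-capturing group, and join with '|'.
function combinePatterns(patterns) {
    var sources = patterns.map(function (p) {
        return '(?:' + p.replace(/^\/|\/$/g, '') + ')';
    });
    // The 'i' flag is an assumption (case-insensitive matching)
    return new RegExp(sources.join('|'), 'i');
}

var combined = combinePatterns(patterns);
var texts = ['the dog is hiding', 'cat', 'human', '1'];
var result = texts.filter(function (t) {
    return !combined.test(t);
});

console.log(combined.source); // (?:rat)|(?:cat)|(?:dog)|(?:[1-9])
console.log(result);          // [ 'human' ]
```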

Be aware that if your re strings contain RegExp special characters such as . { [ ^ $ and you want them matched literally, you will need to escape them, either in the strings themselves or inside the function.
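
If you go the "process them in the function" route, a minimal sketch could look like this. escapeRegExp is a hypothetical helper name (JavaScript of this era has no built-in for it), and the sample strings are made up for illustration.

```javascript
// Hypothetical helper: backslash-escape every character that is
// special inside a RegExp, so the string is matched literally.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var literals = ['a.b', '1+1', '[x]'];
var re = new RegExp(literals.map(escapeRegExp).join('|'));

console.log(re.source);          // a\.b|1\+1|\[x\]
console.log(re.test('a.b'));     // true
console.log(re.test('axb'));     // false ("." no longer means "any character")
console.log(re.test('1+1 = 2')); // true
console.log(re.test('11'));      // false ("+" no longer means "one or more")
console.log(re.test('x'));       // false ("[x]" is no longer a character class)
```

Only do this for strings that are meant as literal text. Escaping an entry like '[1-9]' from the question would stop it from matching digits, since it would then only match the five characters [1-9] themselves.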

See jsfiddle : http://jsfiddle.net/jstoolsmith/D3uzW/

Author by

user1780413

Updated on June 13, 2022

Comments

  • user1780413 almost 2 years

    Let's say I have two arrays: one holds the regexes and the other holds the input texts. What, then, is the best way, in terms of performance and readability, to produce the end result shown further down?

    var regex = [
        '/rat/',
        '/cat/',
        '/dog/',
        '/[1-9]/'
    ]
    
    var texts = [
        'the dog is hiding',
        'cat',
        'human',
        '1'
    ]
    

    the end result is

    result = [
        'human'
    ]
    

    Well, what I was thinking of was something using reduce:

    // loop by text
    for (var i = texts.length - 1; i >= 0; i--) {
        // loop by regex
        texts[i] = regex.reduce(function (previousValue, currentValue) {
            var filterbyRegex = new RegExp("\\b" + currentValue + "\\b", "g");  
            if (previousValue.toLowerCase().match(filterbyRegex)) {
                delete texts[i];
            };
            return previousValue;
        }, texts[i]);
    }
    

    But that is not very readable. Maybe there is another way that I haven't thought of.