Count number of occurrences based on 2 conditions or Regexp


Solution 1

The most pedestrian solution to your problem (tested in Excel and Google Docs) is to simply add the results of several COUNTIF formulas:

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*")

This expression adds the number of cells that contain "yes" to the number of cells that contain "no". It double-counts a cell such as "yesno" or "noyes", since that cell matches both patterns. You could try to take out the doubles with

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*") - COUNTIF(B5:O5, "*no*yes*") - COUNTIF(B5:O5, "*yes*no*")

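But that will still get you in trouble with a string like noyesno: it matches both of the subtracted patterns, so it is added twice and then removed twice, and ends up not being counted at all.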
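To make the double counting concrete, here is a minimal Python sketch of the two COUNTIF formulas above. The `countif` helper is hypothetical: it only approximates COUNTIF's `*` wildcard using `fnmatch`, and the sample cells are made up for illustration.

```python
from fnmatch import fnmatchcase


def countif(cells, pattern):
    """Hypothetical stand-in for COUNTIF with * wildcards (case-insensitive)."""
    return sum(fnmatchcase(cell.lower(), pattern.lower()) for cell in cells)


# Made-up sample data; four of these cells contain "yes" and/or "no".
cells = ["yes", "no", "yesno", "noyesno", "maybe"]

# First formula: plain sum, counts "yesno" and "noyesno" twice.
naive = countif(cells, "*yes*") + countif(cells, "*no*")

# Second formula: subtracts the overlaps, but removes "noyesno" twice.
corrected = naive - countif(cells, "*no*yes*") - countif(cells, "*yes*no*")

# What we actually want: each matching cell counted exactly once.
expected = sum(("yes" in c) or ("no" in c) for c in cells)

print(naive, corrected, expected)
```

With this data the plain sum overshoots and the "corrected" formula undershoots, which is why a single test per cell is the more reliable route.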

However, there is a rather clever trick in Google Docs that may just be a hint of the solution you are looking for:

=COUNTA(QUERY(A1:A9, "select A where A matches '(.*yes.*)|(.*no.*)'"))

The QUERY function works like a miniature database query. In this case it looks at the table in range A1:A9 and selects only those elements of column A that match (in the preg regex sense of the word) the expression that follows: "anything followed by yes followed by anything, or anything followed by no followed by anything". The leading and trailing .* are needed because matches has to match the whole cell, not just part of it. In a simple example I made, this counts a yesnoyes only once, which makes it exactly what you were asking for (I think...)
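The whole-cell matching behaviour can be illustrated with Python's `re.fullmatch`. This is only an analogy: Python's regex engine is not the one QUERY uses, and the helper name and sample cells are assumptions made for the sketch.

```python
import re

# Made-up sample column, standing in for A1:A9.
cells = ["yes", "yesnoyes", "maybe", "no thanks", ""]


def count_full_matches(cells, pattern):
    """Count cells whose entire content matches the pattern (hypothetical helper)."""
    regex = re.compile(pattern)
    return sum(1 for cell in cells if regex.fullmatch(cell))


# Pattern from the QUERY formula: .* on both sides lets the cell contain other text.
with_wildcards = count_full_matches(cells, r"(.*yes.*)|(.*no.*)")

# Without the .* only cells that are exactly "yes" or "no" would match.
without_wildcards = count_full_matches(cells, r"yes|no")

print(with_wildcards, without_wildcards)
```

Each cell is tested once against a single pattern, so a cell like "yesnoyes" cannot be counted more than once.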

Right now your range B5:O5 is several columns wide, and only one row high; that makes it hard to use the QUERY trick. Something rather less elegant (but that works regardless of the shape of the range) is this:

=countif(arrayformula(isnumber(find("yes",A1:A9))+isnumber(find("no",A1:A9))),">0")

The sum of the isnumber functions acts as an element-wise OR - unfortunately, the regular OR function doesn't seem to work on individual elements of an array. As before, this finds the cells that contain either "yes" or "no" and counts each such cell once. Note that FIND is case sensitive, unlike the COUNTIF wildcards above; use SEARCH in its place if "Yes" and "NO" should count as well.
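The "sum of booleans as OR" idea in the formula above can be sketched in Python as follows. The sample cells are made up, and Python's `in` operator is used as a stand-in for FIND (both are case sensitive).

```python
# Made-up sample range; note the capitalised "Yes".
cells = ["yes", "yesno", "Yes", "nope", ""]

# Adding two booleans gives 0, 1 or 2; anything above 0 means
# "at least one of the substrings was found", i.e. an element-wise OR.
flags = [("yes" in c) + ("no" in c) for c in cells]

# Equivalent of COUNTIF(..., ">0"): each matching cell is counted once,
# even when both substrings are present.
count = sum(1 for f in flags if f > 0)

print(flags, count)
```

The cell "Yes" is not counted here because the substring test is case sensitive.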

Solution 2

This is heavily inspired by Floris's answer. See the comments, in particular. If you TRANSPOSE the row of items to compare against, QUERY works fine for horizontal data as well:

=COUNTA(QUERY(TRANSPOSE(B5:O5), "select * where Col1 matches '.*(yes|no).*'"))

As far as I can tell, Col1 is "special" and case sensitive!
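The same one-pattern-per-cell approach extends to the e-mail domain pattern given in the question. The sketch below checks that pattern with Python's `re.search` on made-up sample cells. It is not a tested QUERY formula: because matches tests the whole cell, a QUERY version would presumably need .* on both sides of the pattern, and that part is an untested assumption.

```python
import re

# Pattern copied from the question; compiled with Python's re module
# (an assumption: QUERY uses a different regex engine).
EMAIL_DOMAIN = re.compile(
    r"(\W|^)[\w.+\-]{0,25}@(yahoo|hotmail|gmail)\.com(\W|$)"
)

# Made-up sample cells.
cells = [
    "alice@gmail.com",
    "bob@example.com",
    "contact: carol@yahoo.com, thanks",
    "dave@hotmail.community",
]

# re.search looks for the pattern anywhere in the cell,
# so each cell is counted at most once.
count = sum(1 for cell in cells if EMAIL_DOMAIN.search(cell))

print(count)
```

The trailing `(\W|$)` is what rejects "dave@hotmail.community", where ".com" is only the start of a longer word.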


Author by

NGix

Updated on June 04, 2022

Comments

  • NGix
    NGix almost 2 years

    How can I get the number of occurrences for some range based on

    1. A regular expression

    2. 2+ conditions; let's say cells that contain "yes" and / or "no"

    What I've got for the moment:

    COUNTIF(B5:O5; "*yes*")
    

    I tried to use COUNTIF(B5:O5; {"*yes*", "*no*"}) or COUNTIF(B5:O5; "(*yes*)|(*no*)"), but neither of them worked.

    Or, how do I count cells that contain some domain names—yahoo.com, hotmail.com, and gmail.com—using regexp? e.g.:

    (\W|^)[\w.+\-]{0,25}@(yahoo|hotmail|gmail)\.com(\W|$)
    
  • Kris Khaira
    Kris Khaira about 6 years
    This doesn't work in Google Sheets. Note that the question is tagged with google-docs.
  • Michael
    Michael about 5 years
    Can you TRANSPOSE(B5:O5) before sending it to QUERY? Or does QUERY have problems with the column identifiers on anonymous matrices?
  • Floris
    Floris about 5 years
    @Michael have you tried? I can’t test it right now but I would be quite interested in the answer.
  • Michael
    Michael about 5 years
    I can't get it to work. Not with QUERY(TRANSPOSE(B5:O5), "select A"), with …, "select col1"), nor with QUERY({"foo"; TRANSPOSE(B5:O5)}, "select foo", 1). It always complains that the column identifier is missing. select * works, but you can't use * in the where clause.
  • Michael
    Michael about 5 years
    I tried again and got it working. See below: Apparently Col1 is the "magic" identifier, and it is case sensitive. Note that it does not have the right case in the documentation's sample identifiers.