R: What are operators like %in% called and how can I learn about them?


Solution 1

There are several different things going on here with the percent symbol:

Binary Operators

As several have already pointed out, things of the form %%, %in%, %*% are binary operators (respectively modulo, match, and matrix multiply), just like +, -, etc. They are functions of two arguments that R's parser treats specially because of the structure of their names (each starts and ends with a %). This allows you to use them in the form:

Argument1 %fun_name% Argument2

instead of the more traditional:

fun_name(Argument1, Argument2)

Keep in mind that the following are equivalent:

10 %% 2 == `%%`(10, 2)
"hello" %in% c("hello", "world") == `%in%`("hello", c("hello", "world"))
10 + 2 == `+`(10, 2)

R recognizes the standard operators, as well as the %x% operators, as special and lets you use them as traditional binary operators as long as you don't quote them. If you quote them (with backticks in the examples above), you can call them as standard two-argument functions.
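Because a backtick-quoted operator is an ordinary function, it can also be handed to other functions. A minimal sketch in base R (the variable names total, remainders, and found are illustrative only):

```r
# A backtick-quoted operator is an ordinary function object.
is.function(`%in%`)                      # TRUE

# Pass operators to higher-order functions like any other function.
total <- Reduce(`+`, 1:4)                # ((1 + 2) + 3) + 4 = 10
remainders <- sapply(1:5, `%%`, 2)       # each element modulo 2: 1 0 1 0 1

# do.call() takes the function plus a list of its two arguments.
found <- do.call(`%in%`, list("hello", c("hello", "world")))   # TRUE
```

This is the practical payoff of the quoted form: anything that expects a two-argument function will accept an operator.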

Custom Binary Operators

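The big difference between the standard binary operators and %x% operators is that the %x% form lets you create new operators: standard operators such as + can be redefined, but you cannot introduce a new operator symbol outside the %x% form. R recognizes any function named %x% as special and treats it as a binary operator: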

`%samp%` <- function(e1, e2) sample(e1, e2)
1:10 %samp% 2
# [1] 1 9

Here we defined a binary operator version of the sample function.
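Two properties of custom operators are worth knowing: they chain left to right, and they bind more tightly than * and +. A short sketch, where %join% and %plus% are hypothetical names made up for illustration:

```r
# Hypothetical operator: concatenate two strings.
`%join%` <- function(lhs, rhs) paste0(lhs, rhs)

"foo" %join% "bar"                 # "foobar"

# %x% operators are left-associative, so chains read left to right.
"a" %join% "b" %join% "c"          # ("a" %join% "b") %join% "c" gives "abc"

# All %x% operators share one precedence level, above * and +.
`%plus%` <- function(lhs, rhs) lhs + rhs   # hypothetical, for illustration
2 * 3 %plus% 4                     # parsed as 2 * (3 %plus% 4), i.e. 14
```

When in doubt about grouping, add parentheses; ?Syntax lists the full precedence table.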

"%" (Percent) as a token in special function

The meaning of "%" in functions like sprintf or format is completely different and has nothing to do with binary operators. The key thing to note is that in those functions the % character is part of a quoted string, and not a standard symbol on the command line (i.e. "%" and % are very different). In the context of sprintf, inside a string, "%" is a special character used to recognize that the subsequent characters have a special meaning and should not be interpreted as regular text. For example, in:

sprintf("I'm a number: %.2f", runif(3))
# [1] "I'm a number: 0.96" "I'm a number: 0.74" "I'm a number: 0.99"

"%.2f" means a floating point number (f) to be displayed with two decimals (.2). Notice how the "I'm a number: " piece is interpreted literally. The use of "%" allows sprintf users to mix literal text with special instructions on how to represent the other sprintf arguments.
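A few more format specifiers, as a small sketch with fixed inputs so the results are reproducible (the values are made up for illustration):

```r
# %d formats an integer.
sprintf("%d items", 3L)                 # "3 items"

# %s formats a string; %% produces a literal percent sign.
sprintf("%s scored %d%%", "Ann", 95L)   # "Ann scored 95%"

# A number before the dot sets the minimum field width.
sprintf("%5.1f", 3.14159)               # "  3.1"

# A leading 0 pads with zeros instead of spaces.
sprintf("%05d", 42L)                    # "00042"
```

The full list of specifiers is on the ?sprintf help page.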

Solution 2

The R Language Definition, section 3.1.4, refers to them as "special binary operators". One of the ways they're special is that users can define new binary operators using the %x% syntax (where x is any sequence of printable characters other than %).

The "Writing your own functions" section of An Introduction to R refers to them as binary operators (which is somewhat confusing because + is also a binary operator):

10.2 Defining new binary operators

Had we given the bslash() function a different name, namely one of the form

%anything%

it could have been used as a binary operator in expressions rather than in function form. Suppose, for example, we choose ! for the internal character. The function definition would then start as

> "%!%" <- function(X, y) { ... }

(Note the use of quote marks.) The function could then be used as X %!% y. (The backslash symbol itself is not a convenient choice as it presents special problems in this context.)

The matrix multiplication operator, %*%, and the outer product matrix operator %o% are other examples of binary operators defined in this way.
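To make the two built-in examples concrete, here is a brief sketch comparing %*% and %o% with plain * on a small vector (the variable names are illustrative):

```r
x <- 1:3

# %*% treats two vectors of equal length as an inner product (a 1 x 1 matrix).
inner <- x %*% x          # 1*1 + 2*2 + 3*3 = 14

# %o% is shorthand for outer(): every pairwise product, as a 3 x 3 matrix.
outer_prod <- x %o% x

# Contrast with *, which multiplies element by element.
elementwise <- x * x      # 1 4 9
```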

Solution 3

They don’t have a special name as far as I know. They are described in R operator syntax and precedence.

The %anything% operators are just normal functions, which can be defined by yourself. You do need to put the name of the operator in backticks (`…`), though: this is how R treats special names.

`%test%` = function (a, b) a * b

2 %test% 4
# 8

The sprintf format strings are entirely unrelated; they are not operators at all. Instead, they are just the conventional C-style format strings.

Solution 4

The help file, and the general entry, is indeed a good starting point: ?'%in%'

For example, you can see how the operator '%in%' is defined:

"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0

You can even create your own operators:

'%ni%' <- Negate('%in%')
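A quick usage sketch for the negated operator, plus two ways to explore further from the console. The definition is repeated with backticks so the block runs on its own:

```r
`%ni%` <- Negate(`%in%`)      # "not in"

c(1, 5) %ni% 1:3              # FALSE TRUE: 1 is in 1:3, 5 is not

# List the %x% operators that base R provides (the set depends on the R version).
ls("package:base", pattern = "%")

# Open the help pages interactively; the operator name must be quoted.
# ?`%in%`
# ?Syntax     # operator precedence table
```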

Author: vocaloidict

n00b programmer

Updated on July 28, 2022

Comments

  • vocaloidict
    vocaloidict almost 2 years

    I know the basics like == and !=, or even the difference (vaguely) between & and &&. But stuff like %in% and %% and some stuff used in the context of sprintf(), like sprintf("%.2f", x) stuff I have no idea about.

    Worst of all, they're hard to search for on the Internet because they're special characters and I don't know what they're called...

    • shadow
      shadow almost 10 years
      You can search for them in R with ?"%in%" or ?sprintf. When you have read the help page, you should either be able to use them or at least have some idea on how to search for them.
    • konvas
      konvas almost 10 years
      You can try ls("package:base", pattern = "%") (replacing "base" with any other package) to see these. The help for these functions can be invoked by "?`%in%`" i.e. when you want to call such a function you have to enclose it in "`" or quotation marks
    • Aaron McDaid
      Aaron McDaid almost 8 years
      It's also possible to define an infix := (for example, used in data.table). Does anyone have a full list of what infix operators are possible? For example, why is := possible while =: isn't?
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    @Rico Quotation marks work but are conceptually backwards: they just denote strings. R simply allows (all) function names to be put into strings (probably for historical reasons, it definitely does not make any sense nowadays), and uses match.fun internally to retrieve the actual function given a string with the function name. Backticks, on the other hand, are simply R’s syntactic mechanism for allowing otherwise invalid characters in variable names. This works for functions, but also for other variables (try it: `a b` = 42).
  • thelatemail
    thelatemail almost 10 years
    I was always under the impression they are called "infix operators" as per cran.r-project.org/doc/manuals/r-release/…
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    @thelatemail Other operators are also infix operators. “infix” just means that it’s between two operands, as opposed to prefix or postfix operators (with ! being a prefix operator, and the subscript x[y] usually being seen as postfix).
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    “binary operator” is any operator taking two operands. + is also a binary operator.
  • Joshua Ulrich
    Joshua Ulrich almost 10 years
    @KonradRudolph: yes, but users cannot define new binary operators outside of using the %...% syntax (short of re-compiling R from source).
  • BrodieG
    BrodieG almost 10 years
    @KonradRudolph, isn't [ actually infix? The arguments in x[y] are x and y, so [ is actually between x and y, with ] just being there for syntax reasons. I don't know of a postfix operator in R.
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    I’m not sure how that is relevant. The question was whether they had a name (= differentiating them from other operators), and this is not it.
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    @BrodieG Arguable, all I can say is that they are normally grouped with the postfix operators in parsing, and the way they are parsed actually differs fundamentally from infix operators (because the trailing ] removes ambiguities about precedence in chained operators: for a + b * c, the parser has to keep track of the precedence to group b * c together; with a[b] * c, no such track-keeping is required).
  • Joshua Ulrich
    Joshua Ulrich almost 10 years
    I fail to see the distinction. "What are operators like %% called?" They're called binary operators, just like +; and they happen to be the only way users can define binary operators. %in% and %*% exist because in is a reserved word and * does element-by-element multiplication. I didn't read the question as "Do the binary operators like %% have a special name?", in which case the answer is "no".
  • Konrad Rudolph
    Konrad Rudolph almost 10 years
    The OP was asking for a name specifically to google for these operators and learn more about them. Not operators in general, the %…% operators. And from your answer, without prior knowledge, I would conclude that “binary operator” is the specific name given to this kind of operator. Hence my comment.
  • BrodieG
    BrodieG almost 10 years
    Fair enough, agree that this is not a particularly clear cut case. And certainly the meaning of infix becomes peculiar if you consider cases such as x[i, j].
  • Joshua Ulrich
    Joshua Ulrich almost 10 years
    To be fair, the OP did not ask for a name specifically to search for. They asked how to learn about them and stated that they were hard to search for. I can see how you inferred that though. Also, a search for R binary operators reveals the "R Language Definition" calls them "special binary operators", which I have added to my answer. Thanks for prompting the clarification.
  • Nova
    Nova almost 6 years
    I came here looking for help with a tutorial I'm writing, and instead this helped me with a project I'm working on for a contract. I was trying to find a short way to write x < value, but have it ignore NA values. '%less.than%' <- function(x, val) {is.na(x) | x < val} does the trick perfectly -- e.g. 7 %less.than% 10 -- thank you!!!
  • pietrodito
    pietrodito over 3 years
    What you say in the paragraph custom binary operator is wrong. You can do the same with standard operator: just define '+' <- function(a, b) a * b and you have 3 + 7 == 21