What are the rules for evaluation order in Java?


Solution 1

Let me say this very clearly, because people misunderstand this all the time:

Order of evaluation of subexpressions is independent of both associativity and precedence. Associativity and precedence determine in what order the operators are executed but do not determine in what order the subexpressions are evaluated. Your question is about the order in which subexpressions are evaluated.

Consider A() + B() + C() * D(). Multiplication has higher precedence than addition, and addition is left-associative, so this is equivalent to (A() + B()) + (C() * D()). But knowing that only tells you that the first addition will happen before the second addition, and that the multiplication will happen before the second addition. It does not tell you in what order A(), B(), C() and D() will be called! (It also does not tell you whether the multiplication happens before or after the first addition.) It would be perfectly possible to obey the rules of precedence and associativity by compiling this as:

d = D()          // these four computations can happen in any order
b = B()
c = C()
a = A()
sum = a + b      // these two computations can happen in any order
product = c * d
result = sum + product // this has to happen last

All the rules of precedence and associativity are followed there -- the first addition happens before the second addition, and the multiplication happens before the second addition. Clearly we can do the calls to A(), B(), C() and D() in any order and still obey the rules of precedence and associativity!

We need a rule unrelated to the rules of precedence and associativity to explain the order in which the subexpressions are evaluated. The relevant rule in Java (and C#) is "subexpressions are evaluated left to right". Since A() appears to the left of C(), A() is evaluated first, regardless of the fact that C() is involved in a multiplication and A() is involved only in an addition.
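
As a quick way to see the left-to-right rule in action, here is a minimal sketch. The class name SubexpressionOrder and the calls log are made up for illustration; the four methods only record that they were called and return arbitrary values.

```java
import java.util.ArrayList;
import java.util.List;

public class SubexpressionOrder {
    // Records the order in which the operand methods are actually called.
    static final List<String> calls = new ArrayList<>();

    static int A() { calls.add("A"); return 1; }
    static int B() { calls.add("B"); return 2; }
    static int C() { calls.add("C"); return 3; }
    static int D() { calls.add("D"); return 4; }

    // Evaluates the expression from the text and returns the recorded call order.
    static List<String> trace() {
        calls.clear();
        int result = A() + B() + C() * D(); // grouped as (A() + B()) + (C() * D())
        calls.add("result=" + result);
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints [A, B, C, D, result=15]
    }
}
```

The multiplication still binds tighter (the result is 1 + 2 + 12), yet C() and D() are called after A() and B().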

So now you have enough information to answer your question. In a[b] = b = 0 the rules of associativity say that this is a[b] = (b = 0); but that does not mean that the b=0 runs first! The rules of precedence say that indexing is higher precedence than assignment, but that does not mean that the indexer runs before the rightmost assignment.

(UPDATE: An earlier version of this answer had some small and practically unimportant omissions in the section which follows which I have corrected. I've also written a blog article describing why these rules are sensible in Java and C# here: https://ericlippert.com/2019/01/18/indexer-error-cases/)

Precedence and associativity only tell us that the assignment of zero to b must happen before the assignment to a[b], because the assignment of zero computes the value that is assigned in the indexing operation. Precedence and associativity alone say nothing about whether the a[b] is evaluated before or after the b=0.

Again, this is just the same as: A()[B()] = C() -- All we know is that the indexing has to happen before the assignment. We don't know whether A(), B(), or C() runs first based on precedence and associativity. We need another rule to tell us that.

The rule is, again, "when you have a choice about what to do first, always go left to right". However, there is an interesting wrinkle in this specific scenario. Is the side effect of a thrown exception caused by a null collection or out-of-range index considered part of the computation of the left side of the assignment, or part of the computation of the assignment itself? Java chooses the latter. (Of course, this is a distinction that only matters if the code is already wrong, because correct code does not dereference null or pass a bad index in the first place.)

So what happens?

  • The a[b] is to the left of the b=0, so the a[b] runs first, resulting in a[1]. However, checking the validity of this indexing operation is delayed.
  • Then the b=0 happens.
  • Then the verification that a is valid and a[1] is in range happens.
  • The assignment of the value to a[1] happens last.
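
The steps above can be checked with a small sketch. The class name ChainedArrayAssignment and the helper rhs() are hypothetical; the second method uses a deliberately bad index only to show that the right-hand side runs before the range check fails.

```java
import java.util.Arrays;

public class ChainedArrayAssignment {
    static boolean rhsRan = false;

    // Stand-in for the right-hand side; notes that it was evaluated.
    static int rhs() { rhsRan = true; return 0; }

    // The statement from the question: the index is read as 1 before b becomes 0.
    static String questionCase() {
        int[] a = {4, 4};
        int b = 1;
        a[b] = b = 0;
        return Arrays.toString(a) + " b=" + b;
    }

    // With an invalid index, the right-hand side is still evaluated
    // before the ArrayIndexOutOfBoundsException is thrown.
    static boolean rhsRunsBeforeRangeCheck() {
        int[] a = {4, 4};
        rhsRan = false;
        try {
            a[-1] = rhs();
        } catch (ArrayIndexOutOfBoundsException e) {
            // expected: the range check happens after rhs() has run
        }
        return rhsRan;
    }

    public static void main(String[] args) {
        System.out.println(questionCase());            // prints [4, 0] b=0
        System.out.println(rhsRunsBeforeRangeCheck()); // prints true
    }
}
```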

So, though in this specific case there are some subtleties to consider for those rare error cases that should not be occurring in correct code in the first place, in general you can reason: things to the left happen before things to the right. That's the rule you're looking for. Talk of precedence and associativity is both confusing and irrelevant.

People get this stuff wrong all the time, even people who should know better. I have edited far too many programming books that stated the rules incorrectly, so it is no surprise that lots of people have completely incorrect beliefs about the relationship between precedence/associativity and evaluation order. In reality there is no such relationship; they are independent.

If this topic interests you, see my articles on the subject for further reading:

http://blogs.msdn.com/b/ericlippert/archive/tags/precedence/

They are about C#, but most of this stuff applies equally well to Java.

Solution 2

Eric Lippert's masterful answer is nonetheless not properly helpful because it is talking about a different language. This is Java, where the Java Language Specification is the definitive description of the semantics. In particular, §15.26.1 is relevant because that describes the evaluation order for the = operator (we all know that it is right-associative, yes?). Cutting it down a little to the bits that we care about in this question:

If the left-hand operand expression is an array access expression (§15.13), then many steps are required:

  • First, the array reference subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason; the index subexpression (of the left-hand operand array access expression) and the right-hand operand are not evaluated and no assignment occurs.
  • Otherwise, the index subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and the right-hand operand is not evaluated and no assignment occurs.
  • Otherwise, the right-hand operand is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and no assignment occurs.

[… it then goes on to describe the actual meaning of the assignment itself, which we can ignore here for brevity …]

In short, Java has a very closely defined evaluation order that is pretty much exactly left-to-right within the arguments to any operator or method call. Array assignments are one of the more complex cases, but even there it's still L2R. (The JLS does recommend that you don't write code that needs these sorts of complex semantic constraints, and so do I: you can get into more than enough trouble with just one assignment per statement!)
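
To make the three quoted steps concrete, here is a rough sketch that logs each step. The names ArrayAssignmentSteps, array(), index() and rhs() are invented for this example, and IllegalStateException simply stands in for any abrupt completion of the index subexpression.

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayAssignmentSteps {
    static final List<String> log = new ArrayList<>();
    static final int[] data = new int[2];

    // Array reference subexpression.
    static int[] array() { log.add("array"); return data; }

    // Index subexpression; optionally completes abruptly.
    static int index(boolean fail) {
        log.add("index");
        if (fail) throw new IllegalStateException("index failed");
        return 1;
    }

    // Right-hand operand.
    static int rhs() { log.add("rhs"); return 7; }

    // Runs array()[index()] = rhs() and returns the steps that were reached.
    static List<String> run(boolean indexFails) {
        log.clear();
        try {
            array()[index(indexFails)] = rhs();
            log.add("assigned");
        } catch (IllegalStateException e) {
            log.add("aborted");
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [array, index, rhs, assigned]
        System.out.println(run(true));  // prints [array, index, aborted]
    }
}
```

When the index subexpression throws, the right-hand operand is never evaluated, which matches the second bullet of the quoted text.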

C and C++ are definitely different from Java in this area: their language definitions deliberately leave the evaluation order of most operands unspecified to enable more optimizations. C# is apparently like Java, but I don't know its literature well enough to be able to point to the formal definition. (This really varies by language, though: Ruby is strictly L2R, as is Tcl, although that lacks an assignment operator per se for reasons not relevant here, and Python is L2R but R2L in respect of assignment, which I find odd, but there you go.)

Solution 3

a[b] = b = 0;

1) The array indexing operator has higher precedence than the assignment operator (see this answer):

(a[b]) = b = 0;

2) According to 15.26. Assignment Operators of JLS

There are 12 assignment operators; all are syntactically right-associative (they group right-to-left). Thus, a=b=c means a=(b=c), which assigns the value of c to b and then assigns the value of b to a.

(a[b]) = (b=0);

3) According to 15.7. Evaluation Order of JLS

The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.

and

The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated.

So:

a) (a[b]) evaluated first to a[1]

b) then (b=0) evaluated to 0

c) (a[1] = 0) evaluated last
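
The "left-hand operand is fully evaluated first" rule can also be seen with plain int variables. This is only an illustrative sketch; the class and method names are made up.

```java
public class LeftOperandFirst {
    // The left operand (i = 3) is fully evaluated first, so the right operand sees i == 3.
    static int leftAssignsFirst() {
        int i = 2;
        return (i = 3) * i;
    }

    // The left operand i is read as 2 before the assignment on the right runs.
    static int rightAssignsLater() {
        int i = 2;
        return i * (i = 3);
    }

    // Same reasoning applied to the question: the index b is read as 1 before b = 0 runs.
    static int[] questionCase() {
        int[] a = {4, 4};
        int b = 1;
        a[b] = b = 0;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(leftAssignsFirst());  // prints 9
        System.out.println(rightAssignsLater()); // prints 6
        int[] a = questionCase();
        System.out.println(a[0] + "," + a[1]);   // prints 4,0
    }
}
```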

Solution 4

Your code is equivalent to:

int[] a = {4,4};
int b = 1;
int c = b;   // the index is captured first (c == 1)
b = 0;       // then the inner assignment runs
a[c] = b;    // finally a[1] = 0

which explains the result.

Author by

ipkiss

Updated on January 19, 2020

Comments

  • ipkiss
    ipkiss over 4 years

    I am reading some Java text and got the following code:

    int[] a = {4,4};
    int b = 1;
    a[b] = b = 0;
    

    In the text, the author did not give a clear explanation and the effect of the last line is: a[1] = 0;

    I am not so sure that I understand: how did the evaluation happen?

  • Mat
    Mat almost 13 years
    The question is why that is the case.
  • Jérôme Verstrynge
    Jérôme Verstrynge almost 13 years
    @Mat The answer is because this is what happens under the hood considering the code provided in the question. That's how the evaluation happens.
  • Mat
    Mat almost 13 years
    Yes, I know. Still not answering the question though IMO, which is why this is how the evaluation happens.
  • Jérôme Verstrynge
    Jérôme Verstrynge almost 13 years
    @Mat 'Why this is how the evaluation happens?' is not the asked question. 'how the evaluation happened?' is the asked question.
  • CodesInChaos
    CodesInChaos almost 13 years
    Personally I prefer the mental model where in the first step you build an expression tree using precedence and associativity. And in the second step recursively evaluate that tree starting with the root. With the evaluation of a node being: Evaluate the immediate child nodes left to right and then the node itself. | One advantage of this model is it trivially handles the case where binary operators have a side-effect. But the main advantage is that it simply fits my brain better.
  • Neil G
    Neil G almost 13 years
    Am I correct that C++ doesn't guarantee this? What about Python?
  • Donal Fellows
    Donal Fellows almost 13 years
    @Neil: C++ does not guarantee anything about the order of evaluation, and never did. (Nor does C.) Python guarantees it strictly by precedence order; unlike everything else, assignment is R2L.
  • Neil G
    Neil G almost 13 years
    @Donal: I don't understand what you're saying about Python: isn't Eric's point that precedence is separate from subexpression evaluation order?
  • aroth
    aroth over 12 years
    And one last one: "The rule is, again, 'when you have a choice about what to do first, always go left to right': the a[b] is to the left of the b=0, so the a[b] runs first, resulting in a[1]"...so you're saying that if b=0 was the leftmost expression, the assignment would be made into a[0] instead of a[1]? Because that is demonstrably not what happens. No matter the order of assignments, a[b] is evaluated first, and then assignments proceed from right to left. As the precedence and associativity rules say they should.
  • CodesInChaos
    CodesInChaos over 12 years
    @aroth to me you sound just confused. And the precedence rules only imply that children need to be evaluated before parents. But they say nothing about the order in which children are evaluated. Java and C# chose left to right, C and C++ chose undefined behavior.
  • Donal Fellows
    Donal Fellows over 12 years
    @Neil: In general, Eric's right (as he should be as an expert on this). For Python, it's different; I checked it (by googling it, natch, so find the link yourself) before asserting this. Insights from one language do not apply to another and so the rules must be checked when switching.
  • Donal Fellows
    Donal Fellows over 12 years
    Rereading that bit I quoted, I just realized that it means that array assignment is really a concealed ternary operator (at the logical level; syntactically it's a composition of several operators).
  • Donal Fellows
    Donal Fellows over 12 years
    And of course the operands have to be evaluated before the operator itself. (Well, there are exceptions; for Python, and, or, any and all. They're documented as special in this regard.)
  • Neil G
    Neil G over 12 years
    @Donal: Do you have a link to documentation saying that and, or, any, and all have special subexpression evaluation order? I think you just mean that they early-out. If so, that's unrelated. Anyway, I checked the Python documentation, and it appears that it is also left-to-right subexpression evaluation, except for assignments, when the RHS is evaluated first.
  • aroth
    aroth over 12 years
    A better example of where this matters is int r = i++ + a[i];(ideone.com/qgwYi). The other examples don't give a result different from the precedence rules. Also, in the original expression (a[b] = b = 0) there is only a single sub-expression (in terms of how this answer means "sub-expression"): a[b]. The b = 0 is not a sub-expression in the same sense, which is why it doesn't matter if it happens to the left or to the right of assignment to a[b]. And as the OP expression has just one sub-expression, evaluation order of sub-expressions is not really relevant in that case.
  • Jérôme Verstrynge
    Jérôme Verstrynge over 12 years
    My code is just explaining the application of Java evaluation rules. Lippert's approved answer is not correct and Fellows' answer (or mine) should be the approved instead. "When you have a choice about what to do first, always go left to right" is not equivalent to "First, the array reference subexpression of the left-hand operand array access expression is evaluated". With all due respect, there is no choice in this case.
  • Jérôme Verstrynge
    Jérôme Verstrynge over 12 years
    @Eric Lippert "When you have a choice about what to do first, always go left to right" is not the right explanation to this question regarding Java. "First, the array reference subexpression of the left-hand operand array access expression is evaluated." as reported by Donal Fellows is the right answer.
  • configurator
    configurator over 12 years
    So what you're saying is Eric's answer is wrong because Java defines it specifically to be exactly what he said?
  • Donal Fellows
    Donal Fellows over 12 years
    @configurator: No, I'm saying it's wrong for a question about Java because he talks about C#.
  • configurator
    configurator over 12 years
    The relevant rule in Java (and C#) is "subexpressions are evaluated left to right" - Sounds to me like he's talking about both.
  • Donal Fellows
    Donal Fellows over 12 years
    @configurator: The point is not only that they are L2R (which Eric wrote about) but that the Java spec (which I cited) states specifically what has to happen in this exact case.
  • noober
    noober over 12 years
    If "people get this stuff wrong all the time", the stuff is wrong itself. Obviously, the subexpressions order should be somehow related to precedence and associativity.
  • GreenieMeanie
    GreenieMeanie over 12 years
    A little confused here - does this make the above answer by Eric Lippert any less true, or is it just citing a specific reference as to why it is true?
  • Donal Fellows
    Donal Fellows over 12 years
    @Greenie: Eric's answer is true, but as I stated you cannot take an insight from one language in this area and apply it to another without being careful. So I cited the definitive source.
  • Eric Lippert
    Eric Lippert over 12 years
    @JVerstry: How are they not equivalent? The array reference subexpression of the left hand operand is the leftmost operand. So saying "do the leftmost first" is exactly the same as saying "do the array reference first". If the Java spec authors chose to be unnecessarily wordy and redundant in explaining this particular rule, good for them; this sort of thing is confusing and should be more rather than less wordy. But I don't see how my concise characterization is semantically different from their prolix characterization.
  • Eric Lippert
    Eric Lippert over 12 years
    @noober: OK, consider: M(A() + B(), C() * D(), E() + F()). Your wish is for the subexpressions to be evaluated in what order? Should C() and D() be evaluated before A(), B(), E() and F() because multiplication is higher precedence than addition? It's easy to say that "obviously" the order should be different. Coming up with an actual rule that covers all cases is rather more difficult. The designers of C# and Java chose a simple, easy-to-explain rule: "go left-to-right". What is your proposed replacement for it, and why do you believe your rule is better?
  • Museful
    Museful over 10 years
    @Eric, I wish you would come and say something about this over here.
  • ZhongYu
    ZhongYu over 8 years
    the odd thing is, the right-hand is evaluated before the left-hand variable is resolved; in a[-1]=c, c is evaluated, before -1 is recognized as invalid.
  • Marko Topolnik
    Marko Topolnik over 7 years
    The most ironic aspect of this answer is that it "ignores for brevity" precisely the rule that makes Java break the strict "left-to-right" order, which would make the whole point of criticizing Eric's answer legitimate. Only subcomponents (array reference and array index) are evaluated, but the array itself is not dereferenced and, by implication, the array element isn't dereferenced, either, before evaluating the RHS. This, as opposed to anything actually stated in the answer, gives fundamental substance to the claim that array assignment is a ternary, non-L2R expression.
  • Marko Topolnik
    Marko Topolnik over 7 years
    "a[b] runs first, resulting in a[1]. Then the b=0 happens"---this is actually incorrect. To be more precise: in A()[B()] = C(), this is the order: 1. a = A() 2. b = B() 3. c = C() 4. resolve a[b] into an lvalue 5. assign c to that lvalue. This means that, even if the evaluation of a[b] completes abruptly due to a == null or b out of range, the RHS will nevertheless get evaluated.
  • Pshemo
    Pshemo over 7 years
    You are saying that "associativity between the operators + and - operators" is "RIGHT TO LEFT". Try to use that logic and evaluate 10 - 4 - 3.
  • Pshemo
    Pshemo over 7 years
    I suspect that this mistake may be caused by the fact that at the top of introcs.cs.princeton.edu/java/11precedence + is a unary operator (which has right-to-left associativity), but additive + and -, just like multiplicative * / %, have left-to-right associativity.
  • Mark
    Mark over 7 years
    Well spotted and amended accordingly, thank you Pshemo
  • user207421
    user207421 over 6 years
    This answer doesn't even say how, let alone why. 'Why' is answered by citing relevant portions of the language specification. 'How' is answered by posting bytecode, not the imaginary output of some imaginary language processor.
  • Fabio says Reinstate Monica
    Fabio says Reinstate Monica over 5 years
    This answer explains precedence and associativity, but, as Eric Lippert explained, the question is about evaluation order, which is very different. As a matter of fact, this doesn't answer the question.