C/C++: Pointer Arithmetic

16,422

Solution 1

Pointer subtraction yields the number of array elements between two pointers of the same type, provided both point into the same array (or one past its last element).

For example,

int buf[10] = {0};  // any initializer works

&buf[10] - &buf[0];  // yields 10: the two pointers are 10 elements apart

Pointer comparison works along the same lines. For the > relational operator, for example, the result is 1 if the left operand points to an array element or structure member that comes after the one the right operand points to, and 0 otherwise. Both pointers must point into the same array (or one past its end) or into the same structure. Remember that arrays and structures are ordered sequences.

 &buf[10] > &buf[0];  // 1, &buf[10] element is after &buf[0] element

Solution 2

Several answers here have stated that pointers are numbers. This is not an accurate description of pointers as specified by the C standard.

In large part, you can think of pointers as numbers, and as addresses in memory, provided (a) you understand that pointer subtraction converts the difference from bytes to elements (of the type of the pointers being subtracted), and (b) you understand the limits where this model breaks.

The following uses the 1999 C standard (ISO/IEC 9899, Second edition, 1999-12-01). I expect the following is more detailed than the asker requested, but, given some of the misstatements here, I judge that precise and accurate information should be given.

Per 6.5.6 paragraph 9, you may subtract two pointers that point to elements of the same array or to one past the last element of the array. So, if you have int a[8], b[4];, you may subtract a pointer to a[5] from a pointer to a[2], because a[5] and a[2] are elements in the same array. You may also subtract a pointer to a[5] from a pointer to a[8], because a[8] is one past the last element of the array. (a[8] is not in the array; a[7] is the last element.) You may not subtract a pointer to a[5] from a pointer to b[2], because a[5] is not in the same array as b[2]. Or, more accurately, if you do such a subtraction, the behavior is undefined. Note that it is not merely the result that is unspecified; you cannot expect that you will get some possibly nonsensical number as a result: The behavior is undefined. According to the C standard, this means that the C standard does not say anything about what occurs as a consequence. Your program could give you a reasonable answer, or it could abort, or it could delete files, and all those consequences would be in conformance to the C standard.
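As a rough illustration of why the one-past-the-end rule matters, here is a minimal sketch of the common half-open range idiom. The helper names sum_range and range_length are invented for this example and do not come from the standard or from the answer above.

```c
#include <assert.h>
#include <stddef.h>  /* ptrdiff_t */

/* Sums the half-open range [first, last). Both pointers must refer to the
   same array; last may be the one-past-the-end pointer, which is compared
   against but never dereferenced. */
long sum_range(const int *first, const int *last)
{
    long total = 0;
    for (const int *p = first; p != last; ++p)
        total += *p;
    return total;
}

/* Number of elements from first to last. The subtraction is defined only
   when both pointers refer to the same array (or one past its end). */
ptrdiff_t range_length(const int *first, const int *last)
{
    return last - first;
}
```

With int a[8], calling sum_range(a, a + 8) visits a[0] through a[7], and range_length(&a[5], a + 8) is 3. Passing a pointer into a different array, such as b, as one of the two arguments would make the subtraction undefined.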

If you do an allowed subtraction, then the result is the number of elements from the second pointed-to element to the first pointed-to element. Thus, &a[5] - &a[2] is 3, and &a[2] - &a[5] is −3. This is true regardless of the element type of a. The C implementation is required to convert the distance from bytes (or whatever units it uses) into elements of the appropriate type. If a is an array of double of eight bytes each, then &a[5] - &a[2] is 3, for 3 elements. If a is an array of char of one byte each, then &a[5] - &a[2] is 3, for 3 elements.
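The element-count rule can be checked with a short sketch. The function names below are made up for illustration; the result type ptrdiff_t comes from <stddef.h>.

```c
#include <assert.h>
#include <stddef.h>  /* ptrdiff_t */

/* Distance between elements 2 and 5 of an array of double. */
ptrdiff_t double_distance(void)
{
    double a[8] = {0};
    return &a[5] - &a[2];   /* 3 elements, whatever sizeof(double) is */
}

/* The same indices in an array of char: still 3 elements. */
ptrdiff_t char_distance(void)
{
    char a[8] = {0};
    return &a[5] - &a[2];
}

/* Swapping the operands flips the sign. */
ptrdiff_t reversed_distance(void)
{
    double a[8] = {0};
    return &a[2] - &a[5];   /* -3 */
}
```

Note that the operands are pointers (&a[5], &a[2]), not the elements themselves. Writing a[5] - a[2] would subtract the stored values instead.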

Why would pointers ever not be just numbers? On some computers, especially older computers, addressing memory was more complicated. Early computers had small address spaces. When the manufacturers wanted to make bigger address spaces, they also wanted to maintain some compatibility with old software. They also had to implement various schemes for addressing memory, due to hardware limitations, and those schemes may have involved moving data between memory and disk or changing special registers in the processor that controlled how addresses were converted to physical memory locations. For pointers to work on machines like that, they have to contain more information than just a simple address. Because of this, the C standard does not just define pointers as addresses and let you do arithmetic on the addresses. Only a reasonable amount of pointer arithmetic is defined, and the C implementation is required to provide the necessary operations to make that arithmetic work, but no more.

Even on modern machines, there can be complications. On Digital’s Alpha processors, a pointer to a function does not contain the address of the function. It is the address of a descriptor of the function. That descriptor contains the address of the function, and it contains some additional information that is necessary to call the function correctly.

With regard to relational operators, such as >, the C standard says, in 6.5.8 paragraph 5, that you may compare the same pointers you may subtract, as described above, and you may also compare pointers to members of an aggregate object (a struct or union). Pointers to elements of an array (or its end address) compare in the expected way: Pointers to higher-indexed elements are greater than pointers to lower-indexed elements. Pointers to two members of the same union compare equal. For pointers to two members of a struct, the pointer to the member declared later is greater than the pointer to the member declared earlier.
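The comparison rules above can be sketched as follows. The types struct pair and union number and the helper names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>

struct pair {
    int first;   /* declared earlier */
    int second;  /* declared later */
};

union number {
    int i;
    float f;
};

/* True if p lies in the half-open range [first, last). Meaningful only
   when all three pointers refer to the same array (or one past its end). */
bool in_range(const int *p, const int *first, const int *last)
{
    return p >= first && p < last;
}

/* A struct member declared later compares greater than an earlier one. */
bool later_member_is_greater(void)
{
    struct pair s = {0, 0};
    return &s.second > &s.first;
}

/* All members of a union start at the same place, so pointers to them
   compare equal once converted to a common type. */
bool union_members_compare_equal(void)
{
    union number u = {0};
    return (void *)&u.i == (void *)&u.f;
}
```

For int a[8], in_range(&a[3], a, a + 8) is true, while in_range(a + 8, a, a + 8) is false because the end pointer is not less than itself.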

As long as you stay within the constraints above, then you can think of pointers as numbers which are memory addresses.

Usually, it is easy for a C implementation to provide the behavior required by the C standard. Even if a computer has a compound pointer scheme, such as a base address and offset, usually all elements of an array will use the same base address as each other, and all elements of a struct will use the same base address as each other. So the compiler can simply subtract or compare the offset parts of the pointer to get the desired difference or comparison.

However, if you subtract pointers to different arrays on such a computer, you can get strange results. It is possible for the bit pattern formed by a base address and offset to appear greater (when interpreted as a single integer) than another pointer even though it points to a lower address in memory. This is one reason you must stay within the rules set by the C standard.
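To make the base-and-offset problem concrete, here is a small model of such a pointer. It is only a sketch: the answer does not name a specific machine, so the layout below assumes an 8086 real-mode style scheme (location = segment * 16 + offset), and struct far_ptr, as_integer, and physical_address are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compound pointer: a 16-bit base (segment) plus a 16-bit
   offset, modeled on 8086 real-mode addressing (an assumption). */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* The pointer's bits read naively as a single 32-bit integer. */
uint32_t as_integer(struct far_ptr p)
{
    return ((uint32_t)p.segment << 16) | p.offset;
}

/* The memory location the pointer actually designates in this model. */
uint32_t physical_address(struct far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}
```

In this model, {0x1000, 0x0010} has the larger bit pattern (0x10000010 versus 0x0FFF0030) but designates the lower location (0x10010 versus 0x10020) when compared with {0x0FFF, 0x0030}. Likewise, {0x1000, 0x0010} and {0x1001, 0x0000} have different bit patterns yet designate the same location.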

Solution 3

Subtracting two pointers returns the number of elements of the pointed-to type between them.

So if you have an array of integers and two pointers into it, subtracting those pointers will return the number of int values between them, not the number of bytes. The same holds for char types, where one element happens to be one byte. So you need to be careful, especially if you are working with a byte buffer or wide characters, that your expression calculates the value you intend. If you need byte-based buffer offsets for something that does not use a single byte for storage (int, short, etc.), you need to cast your pointers to char* first.
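A minimal sketch of the difference between element offsets and byte offsets follows. The helper names element_offset and byte_offset are made up for this example.

```c
#include <assert.h>
#include <stddef.h>  /* ptrdiff_t */

/* Distance from base to p, counted in int elements. */
ptrdiff_t element_offset(const int *base, const int *p)
{
    return p - base;
}

/* Distance from base to p, counted in bytes: convert both pointers to
   character pointers before subtracting. */
ptrdiff_t byte_offset(const int *base, const int *p)
{
    return (const char *)p - (const char *)base;
}
```

For int buf[10], element_offset(buf, &buf[4]) is 4, while byte_offset(buf, &buf[4]) is 4 * sizeof(int), which depends on the platform.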

Author: Mohamed Ahmed Nabil

Updated on June 15, 2022

Comments

  • Mohamed Ahmed Nabil
    Mohamed Ahmed Nabil almost 2 years

    I was reading a bit about pointer arithmetic, and I came upon two things that I couldn't understand and whose use I don't know:

    address_expression - address_expression
    

    and also

    address_expression > address_expression
    

    Can someone please explain them to me: how do they work, and when are they used?

    Edit:

    What I meant to ask is: what do they produce if I just take two addresses and subtract them?

    And if I take two addresses and compare them, what is the result, and what is the comparison based on?

    Edit: I now understand the result of subtracting addresses, but I still don't get comparing addresses.

    I understand that 1<2, but how can one address be greater than another, and what is the comparison based on?

  • Mohamed Ahmed Nabil
    Mohamed Ahmed Nabil almost 12 years
    What I meant to say is what do they produce if I just take two addresses and subtract them And If I take two addresses and compare them what is the result or comparing based upon
  • pb2q
    pb2q almost 12 years
    the subtraction produces a number. in my example the number means the distance between the address, or number of memory addresses between. The comparison is true or false based on the arguments
  • Mohamed Ahmed Nabil
    Mohamed Ahmed Nabil almost 12 years
    What about comparing, what are the two addresses compared based upon?
  • Dietrich Epp
    Dietrich Epp almost 12 years
    Two things: strLength has an off-by-one error, and it would be nice to explain the difference between C pointer arithmetic and the equivalent arithmetic in assembly -- i.e., subtracting two int * pointers will give you a different result than if you cast them to char * first.
  • Dietrich Epp
    Dietrich Epp almost 12 years
    Minor detail: with char types, subtracting always counts bytes because char is defined to be one byte by the C standard.
  • Eric Postpischil
    Eric Postpischil almost 12 years
    Pointers are not necessarily simple memory addresses. The C standard allows room for more complicated forms of addressing that some platforms use. Furthermore, pointer subtraction in C does not merely subtract one address from another. It also divides the address difference by the size of the pointed-to objects. More accurately, the result of the subtraction operator in C, applied to pointers to two objects in the same array (or an end address for the array) is the number of elements from one object to the next.
  • Eric Postpischil
    Eric Postpischil almost 12 years
    @MohamedAhmedNabil: If you compare pointers to two objects within an array (or an end address for the array, that is, the address of an element one beyond the last element actually in the array), then the pointer to the greater-indexed element in the array is greater than the pointer to the lesser-indexed element in the array. If you compare pointers to two members within a struct object, then the pointer to the later element is greater than the pointer to the earlier element. If you compare pointers to things other than the above, then the behavior is undefined.
  • Eric Postpischil
    Eric Postpischil almost 12 years
    Pointers are not just numbers. On some platforms, pointers are base addresses and offsets, and different combinations of base addresses and offsets can point to the same location.
  • David Rodríguez - dribeas
    David Rodríguez - dribeas almost 12 years
    Additionally, it is important to note that the result of p1 - p2 and p1 < p2 is undefined if the two pointers do not refer to subobjects inside the same superobject (elements inside the same array).
  • Mohamed Ahmed Nabil
    Mohamed Ahmed Nabil almost 12 years
    Thank You. Although Other answers offered a lot more detail and explained to me a lot of things I didn't know. This is the most straight forward answer, answering my main question
  • vedosity
    vedosity almost 12 years
    I was editing this after you commented on the above post. Does that help at all?
  • Mohamed Ahmed Nabil
    Mohamed Ahmed Nabil almost 12 years
    @EricPostpischill That is a great answer
  • fredoverflow
    fredoverflow almost 12 years
    - and > only work for pointers into the same array. It is undefined behavior to use them on anything else.
  • ouah
    ouah almost 12 years
    @FredOverflow Or one past the last element of the array (like in my two examples) and for the relational operators you can also use the operators for the same structure or union object.
  • fredoverflow
    fredoverflow almost 12 years
    Interesting, I just verified the struct rule, and the standard indeed guarantees it. Is this a C++11 extension? Anyway, +1 from me.
  • ouah
    ouah almost 12 years
    @FredOverflow actually I read the C rather than the C++ standard, and the structure rule for relational operator is there since C89 so I suspect this is also the case since the first C++ standard.
  • Tim Seguine
    Tim Seguine almost 8 years
    @DietrichEpp That is slightly misleading since the C definition of a byte is not necessarily an octet.
  • Lorenzo Gatti
    Lorenzo Gatti almost 8 years
    int * p=234 is terribly wrong, and dangerous if allowed by misguided compilers. In the words of g++ 5.3.0, it is an invalid conversion from 'int' to 'int*'. Assuming sizeof(int)==4 is equally wrong
  • Cosine
    Cosine almost 8 years
    Yes, of course. I meant if the internal value of int * p happens to be 234 after some instruction (such as p = new int[12];), we can do pointer arithmetic with it safely.
  • chux - Reinstate Monica
    chux - Reinstate Monica over 6 years
    "between two pointers of the same type." is not a strong enough condition. It should be "between two pointers of the same type and elements of the same array (or one past)."
  • chux - Reinstate Monica
    chux - Reinstate Monica over 6 years
    "Memory addresses are just numbers, so they can be compared and manipulated in the same way as integral data types." --> Hmm I can multiply (or / % & | ^) two integers together, yet not two pointers.
  • pb2q
    pb2q over 6 years
    @chux thanks for the comment. Your comment prompted me to review my answer, and more importantly review the accepted answer and Eric PostPischil's excellent answer above. I agree that my answer is lacking. I'll either remove my answer or edit it so that it's better
  • pb2q
    pb2q over 6 years
    Really great answer: I can say that I've learned something after reading this. You've really succeeded in providing concrete and instructive reasoning on why it isn't as simple as "addresses are just numbers", and why the spec is specific (or rather, leaves it to the implementation) on this point. I'll be editing my own answer to make it better, or removing it entirely. Thanks
  • jww
    jww almost 6 years
    When you subtract two pointers what is the resulting type? A ptrdiff_t? A uintptr_t? Something else?
  • Eric Postpischil
    Eric Postpischil almost 6 years
    @jww: The result of subtracting two pointers has the type ptrdiff_t.
  • Eric Postpischil
    Eric Postpischil about 3 years
    @dreamcrash: No, “let you do arithmetic with them” is part of “just.” The C standard does not define pointers as addresses, and the C standard does not let you do arithmetic with them (pointers as addresses). But I will rephrase a bit.