andi vs. addi instruction in MIPS with negative immediate constant

assembly mips sign-extension zero-extension immediate-operand

10,746

Your expectation is correct, but your interpretation of your experimental results is not.

$t2 becomes 0x00005550. This is confirmed by the MIPS emulator.

No, this is incorrect. So, one of the following:

- Somehow, you're misreading what the emulator is doing. The actual value from the emulator is what you expected it to be.
- Or, you don't have 0x55555550 in $t2 before the andi as you assume, but 0x5550 instead (i.e., your test program doesn't set up $t2 correctly).

However, it is not what I expected. I think the answer should be 0x55555550 & 0xFFFFFFFF = 0x55555550. I think the constant -1 was sign extended to 0xFFFFFFFF before the and logic.

Yes, this is correct. And, I'll explain what is happening and why below.

But it appears that the answer was 0x55555550 & 0x0000FFFF. Why is -1 sign extended to 0x0000FFFF instead of 0xFFFFFFFF?

It wasn't. It was sign extended to 0xFFFFFFFF. Again, you're reading the experimental results incorrectly [or your test program has a bug].

MIPS simulators and assemblers have pseudo-ops.

These are instructions that may or may not exist as real, physical instructions. However, they are interpreted by the assembler to generate a sequence of physical/real instructions.

An example of a "pure" pseudo-op is li ("load immediate"). It has no corresponding instruction, but usually generates a two instruction sequence: lui, ori (which are physical instructions).
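The lui/ori split for li can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not of how any particular assembler is implemented, and the function names `split_li` and `run_lui_ori` are made up for illustration:

```python
def split_li(value):
    """Split a 32-bit constant into the (upper, lower) 16-bit halves
    that an lui/ori pair would carry as immediates."""
    value &= 0xFFFFFFFF          # treat negative inputs as 32-bit patterns
    upper = value >> 16          # immediate for lui
    lower = value & 0xFFFF       # immediate for ori
    return upper, lower


def run_lui_ori(upper, lower):
    """Model lui (place imm in the top 16 bits) followed by
    ori (OR in a zero-extended 16-bit imm)."""
    at = (upper << 16) & 0xFFFFFFFF
    return at | (lower & 0xFFFF)


upper, lower = split_li(0x55555550)
print(hex(upper), hex(lower))          # 0x5555 0x5550
print(hex(run_lui_ori(upper, lower)))  # 0x55555550
```

The two halves match the immediates in the `lui $1,0x00005555` / `ori $10,$1,0x00005550` pair shown in the listings.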

Pseudo-ops should not be confused with assembler directives, such as .text, .data, .word, .eqv, etc.

Some pseudo-ops can overlap with actual physical instructions. That is what is happening with your example.

In fact, the assembler examines any given instruction as a potential pseudo-op. It may determine that it can fulfill the intent with a single physical instruction. If not, it will generate a 1-3 instruction sequence and may use the [reserved] $at register [which is $1] as part of that sequence.

In MARS, to see the actual real instructions, look in the Basic column of the source window.

For the sake of the completeness of my answer, all that follows is prefaced by the top comments.

I've created three example programs:

1. The addi, as in your original post
2. The andi, as in your corrected post
3. An andi that uses an unsigned argument

(1) Here is the assembler source for your original question using addi:

    .text
    .globl  main
main:
    li      $t2,0x55555550
    addi    $t3,$t2,-1
    nop

Here is how mars interpreted it:

 Address    Code        Basic                     Source

0x00400000  0x3c015555  lui $1,0x00005555         4     li      $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x214bffff  addi $11,$10,0xffffffff   5     addi    $t3,$t2,-1
0x0040000c  0x00000000  nop                       6     nop

addi will sign extend its 16 bit immediate (0xFFFF), so we have 0xFFFFFFFF. Then, doing a two's complement add operation (0x55555550 + 0xFFFFFFFF, with the carry out of bit 31 discarded), we have a final result of 0x5555554F.

Thus, the assembler didn't need to generate extra instructions for the addi, so the addi pseudo-op generated a single real addi.
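The addi arithmetic can be checked with a small Python model. This is a sketch only: `sign_extend16` and `addi` are hypothetical helper names, and the model ignores the overflow trap that a real addi raises on signed overflow (it does not occur for these operands):

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Sign extend a 16-bit field to 32 bits, as addi does with its immediate."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:               # bit 15 set: negative value
        return imm16 | 0xFFFF0000
    return imm16


def addi(rs, imm16):
    """Model of addi: add the sign-extended immediate, keep 32 bits.
    Assumption: the signed-overflow trap is not modelled."""
    return (rs + sign_extend16(imm16)) & MASK32


print(hex(sign_extend16(0xFFFF)))     # 0xffffffff
print(hex(addi(0x55555550, 0xFFFF)))  # 0x5555554f
```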

(2) Here is the andi source:

    .text
    .globl  main
main:
    li      $t2,0x55555550
    andi    $t3,$t2,-1
    nop

Here is the assembly:

 Address    Code        Basic                     Source

0x00400000  0x3c015555  lui $1,0x00005555         4     li      $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x3c01ffff  lui $1,0xffffffff         5     andi    $t3,$t2,-1
0x0040000c  0x3421ffff  ori $1,$1,0x0000ffff
0x00400010  0x01415824  and $11,$10,$1
0x00400014  0x00000000  nop                       6     nop

Whoa! What happened? The andi generated three instructions.

A real andi instruction does not sign extend its immediate argument; it zero extends it. So, the largest unsigned value we can use in a real andi is 0xFFFF.

But, by specifying -1, we told the assembler that we did want sign extension (i.e. 0xFFFFFFFF).

So, the assembler could not fulfill the intent with a single instruction and we get the sequence above. And the generated sequence could not use andi but had to use the register form: and. Here is the andi generated code converted back into more friendly asm source:

    lui     $at,0xFFFF
    ori     $at,$at,0xFFFF
    and     $t3,$t2,$at

As for the result, we're ANDing 0x55555550 with 0xFFFFFFFF, which leaves the value unchanged at 0x55555550.
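The three-instruction sequence can be stepped through in a small Python model. This is a sketch of the register arithmetic only; the helper names (`lui`, `ori`, `and_`, `expanded_andi`) are made up here and do not correspond to any MARS API:

```python
def lui(imm16):
    """lui: place the 16-bit immediate in the upper half, lower half zero."""
    return (imm16 & 0xFFFF) << 16


def ori(rs, imm16):
    """ori: OR with the zero-extended 16-bit immediate."""
    return rs | (imm16 & 0xFFFF)


def and_(rs, rt):
    """and: register-register bitwise AND."""
    return rs & rt


def expanded_andi(rs, imm32):
    """Model of the lui/ori/and expansion for a mask that does not
    fit in an unsigned 16-bit immediate."""
    imm32 &= 0xFFFFFFFF              # -1 becomes the pattern 0xFFFFFFFF
    at = lui(imm32 >> 16)            # lui $at,0xFFFF     -> 0xFFFF0000
    at = ori(at, imm32 & 0xFFFF)     # ori $at,$at,0xFFFF -> 0xFFFFFFFF
    return and_(rs, at)              # and $t3,$t2,$at


print(hex(expanded_andi(0x55555550, -1)))  # 0x55555550
```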

(3) Here is the source for an unsigned version of andi:

    .text
    .globl  main
main:
    li      $t2,0x55555550
    andi    $t3,$t2,0xFFFF
    nop

Here is the assembler output:

 Address    Code        Basic                     Source

0x00400000  0x3c015555  lui $1,0x00005555         4     li      $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x314bffff  andi $11,$10,0x0000ffff   5     andi    $t3,$t2,0xFFFF
0x0040000c  0x00000000  nop                       6     nop
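As a cross-check on the listings, the Code words for the real addi (0x214bffff) and the real andi (0x314bffff) can be decoded by hand. The Python sketch below assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); `decode_itype` is a hypothetical helper name:

```python
def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31..26
        "rs":     (word >> 21) & 0x1F,   # bits 25..21
        "rt":     (word >> 16) & 0x1F,   # bits 20..16
        "imm":    word & 0xFFFF,         # bits 15..0
    }


print(decode_itype(0x214BFFFF))  # opcode 8 (addi), rs 10, rt 11, imm 0xffff
print(decode_itype(0x314BFFFF))  # opcode 12 (andi), rs 10, rt 11, imm 0xffff
```

Both words carry the same 16-bit immediate field, 0xFFFF. Only the opcode differs, and the opcode decides whether the hardware sign extends (addi) or zero extends (andi) that field.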

When the assembler sees that we're using a hex constant (i.e. the 0x prefix), it tries to fulfill the value as an unsigned operation. So, it doesn't need to sign extend. And, the real andi can fulfill the request.

The result of this is 0x5550, because 0x55555550 & 0x0000FFFF keeps only the low 16 bits.

Note that if we had used a mask value of 0x1FFFF, that would be unsigned. But, it's larger than 16 bits, so the assembler would generate a multi-instruction sequence to fulfill the request.

And, the result here would be 0x15550.
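Both mask cases can be reproduced with a short Python model. It is a simplified sketch of the behavior described above, not MARS's actual logic, and the names `andi` and `fits_real_andi` are hypothetical:

```python
def andi(rs, imm16):
    """Model of a real andi: AND with the zero-extended 16-bit immediate."""
    return rs & (imm16 & 0xFFFF)


def fits_real_andi(imm):
    """Assumed rule: a single real andi can only encode 0..0xFFFF."""
    return 0 <= imm <= 0xFFFF


t2 = 0x55555550

print(fits_real_andi(0xFFFF), hex(andi(t2, 0xFFFF)))   # True 0x5550
print(fits_real_andi(0x1FFFF), hex(t2 & 0x1FFFF))      # False 0x15550
print(fits_real_andi(-1), hex(t2 & 0xFFFFFFFF))        # False 0x55555550
```

The last two lines use a plain 32-bit AND, standing in for the multi-instruction sequence the assembler emits when the mask does not fit.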


Lin Yu Cheng

Updated on June 04, 2022

Comments

Lin Yu Cheng almost 2 years
Assume $t2=0x55555550, then executing the following instruction:
```
andi $t2, $t2, -1
```
$t2 becomes 0x00005550

This is confirmed by the MIPS emulator¹

However, it is not what I expected. I think the answer should be 0x55555550 & 0xFFFFFFFF = 0x55555550. I think the constant -1 was sign extended to 0xFFFFFFFF before the and logic. But it appears that the answer was 0x55555550 & 0x0000FFFF

Why is -1 sign extended to 0x0000FFFF instead of 0xFFFFFFFF?

Footnote 1: Editor's note: MARS with "extended pseudo-instructions" enabled does expand this to multiple instructions to generate 0xffffffff in a tmp register, thus leaving $t2 unchanged. Otherwise MARS and SPIM both reject it with an error as not encodable. Other assemblers may differ.
- Craig Estey over 7 years
  
  Your expectation for value is correct. Which mips emulator? I've tested this in both mars and spim [single stepping] and they both produce 0x5555554F. Are you sure you are preloading the original value correctly (e.g. li $t2,0x55555550)? If you are not checking the post addi value with single step, how are you getting the 5550 value? At worst, with 16 bit truncation, you'd get 554F. This is such a basic operation, you can verify it on a hex calculator.
- Lin Yu Cheng over 7 years
  
  @CraigEstey My big big apology. I was dizzy late last night and totally have my thought wrong. It should be andi instead of addi. Could you please take some time to read my question again. Sorry.......
- Michael Foukarakis over 7 years
  
  Well, MIPS immediate values are 16-bits wide, anyway.
- Craig Estey over 7 years
  
  @LinYuCheng I hope you're feeling better. Again, your expectation for value is correct. But, you're misreading the emulator results. See my answer as to what is really going on.
Craig Estey over 7 years

This is correct as far as a real/physical instruction goes. But, MIPS simulators/assemblers interpret instructions as pseudo-ops and may generate something completely different. In OP's code, the assembler doesn't generate a real andi. It generates a 3 instruction sequence. See my answer for details.
Lin Yu Cheng over 7 years

Thanks for your excellent explanation. I got it clear now. It seems I have to read more about the "physical" machine code.
Craig Estey over 7 years

You're welcome. Here's a good list of instructions: mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html As an example, notice that there are six real single register forms for branch: bltz/blez/beqz/bnez/bgez/bgtz. But, only two real two register versions: bne/beq. The remaining [not shown] blt/ble/bge/bgt are pseudo-ops. They use slt in some fashion to set $at. BTW, it is customary to upvote a good answer and accept the best one.
Peter Cordes almost 4 years

@CraigEstey: I wouldn't be surprised if some assembler simply truncates -1 to fit in a zero-extended immediate. The current versions of SPIM and MARS don't, though. MARS with extended pseudo-instructions accepts it as a pseudo, otherwise MARS and SPIM both reject -1. So does clang's built-in assembler. (Also, this would be a better answer if it mentioned that other MIPS instructions sign-extend their immediate, including addiu. What does "extend immediate to 32 bits" mean in MIPS?)
Craig Estey almost 4 years

@PeterCordes This is covered in my answer [for andi, at least]. If you do, andi $t2,0xFFFF, you get a real andi that is truncated to 16 bits (i.e. 0x0000FFFF). If you do, andi $t2,-1, mars treats this as pseudo op, and generates three instructions to produce a full, sign extended, 32 bit and (equivalent to 0xFFFFFFFF).