Identifying signed and unsigned values in assembly

assembly x86 reverse-engineering

15,346

Solution 1

Your best bet is to look for comparisons and the actions that follow them, such as a conditional branch. The compiler generates different code depending on the type, because most (relevant) architectures provide flags for dealing with signed values. Taking x86 as an example:

jg, jge, jl, jle = branch based on a signed comparison (they test SF and OF, plus ZF for jg/jle)
ja, jae, jb, jbe = branch based on an unsigned comparison (they test CF, plus ZF for ja/jbe)
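
As a rough illustration, the two C functions below differ only in the signedness of their parameters. The function names are made up for this sketch, and the instructions named in the comments are what mainstream x86 compilers typically emit, not a guarantee; check your own disassembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed compare: typically cmp followed by jl/jge (or setl),
   which look at SF and OF. */
bool less_signed(int32_t a, int32_t b) {
    return a < b;
}

/* Unsigned compare: the same cmp, but typically followed by jb/jae
   (or setb), which look at CF. */
bool less_unsigned(uint32_t a, uint32_t b) {
    return a < b;
}
```

The bit pattern 0xFFFFFFFF is -1 to less_signed and 4294967295 to less_unsigned, so less_signed(-1, 1) is true while less_unsigned(0xFFFFFFFFu, 1u) is false.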

Most instructions on a CPU will be the same for signed/unsigned operations, because we're using a Two's-Complement representation these days. But there are exceptions.

Let's take right shifting as an example. With unsigned values on x86 you would use SHR to shift something to the right. This fills every newly created bit on the left with zeros.

But for signed values SAR will usually be used, because it copies the MSB (the sign bit) into all new bits. This replication of the sign bit is called "sign extension" (the shift itself is an arithmetic shift), and again it only works because we're using two's complement.
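
A minimal sketch of the same distinction in C, with hypothetical function names. Note the assumption: right-shifting a negative signed value is implementation-defined in C, although GCC, Clang and MSVC on x86 all perform an arithmetic shift.

```c
#include <stdint.h>

/* Unsigned right shift: typically SHR, vacated high bits become 0.
   n must be less than 32. */
uint32_t shift_unsigned(uint32_t x, unsigned n) {
    return x >> n;
}

/* Signed right shift: typically SAR, vacated high bits copy the sign bit.
   Implementation-defined for negative x; n must be less than 32. */
int32_t shift_signed(int32_t x, unsigned n) {
    return x >> n;
}
```

With these, shift_unsigned(0x80000000u, 4) yields 0x08000000, while shift_signed(INT32_MIN, 4) stays negative on the compilers mentioned above.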

Last but not least there are different instructions for signed/unsigned multiplication/division.

idiv or one-operand imul = signed
div or mul/mulx = unsigned
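
For division, a small sketch (function names are illustrative only). With a divisor that is not a compile-time constant, compilers usually emit the instructions named in the comments; with constant divisors they often replace the division with multiplies and shifts, so the hint can disappear.

```c
#include <stdint.h>

/* Signed division: typically cdq (sign-extend eax into edx) + idiv. */
int32_t div_signed(int32_t a, int32_t b) {
    return a / b;
}

/* Unsigned division: typically xor edx, edx (zero the high half) + div. */
uint32_t div_unsigned(uint32_t a, uint32_t b) {
    return a / b;
}
```

The same bit pattern 0xFFFFFFF8 divided by 2 gives -4 through div_signed (as -8 / 2) and 0x7FFFFFFC through div_unsigned.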

As noted in the comments, imul with 2 or 3 operands doesn't imply anything, because like addition, non-widening multiply is the same for signed and unsigned. Only imul exists in a form that doesn't waste time writing a high-half result, so compilers (and humans) use imul regardless of signedness, except when they specifically want a high-half result, e.g. to optimize uint64_t = u32 * (uint64_t)u32. The only difference will be in the flags being set, which are rarely looked at, especially by compiler-generated code.
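
To make the non-widening versus widening point concrete, here is a hedged sketch with made-up function names. Which instruction a compiler picks depends on the target: on 32-bit x86 the widening cases tend to show up as mul or one-operand imul, while on x86-64 they are often a zero- or sign-extension followed by a 64-bit imul.

```c
#include <stdint.h>

/* Non-widening multiply: the low 32 bits are the same whether the
   operands are viewed as signed or unsigned, so imul is typical. */
uint32_t mul_low(uint32_t a, uint32_t b) {
    return a * b;
}

/* Widening unsigned multiply: operands are zero-extended. */
uint64_t mul_wide_unsigned(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}

/* Widening signed multiply: operands are sign-extended. */
int64_t mul_wide_signed(int32_t a, int32_t b) {
    return (int64_t)a * b;
}
```

For the inputs 0xFFFFFFFF and 2, all three functions agree on the low 32 bits (0xFFFFFFFE), but the full results differ: 0x1FFFFFFFE unsigned versus -2 signed.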

Also the NEG instruction will usually only be used on signed values, because it's a two's complement negation. (If used as part of an abs(), the result may be considered unsigned to avoid overflow on INT_MIN.)
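
One way the abs() case might look in source, as a sketch: uabs is a hypothetical helper, not a standard function. The negation is done in unsigned arithmetic so that INT32_MIN does not overflow; a compiler may implement it with NEG, or with a branchless sar/xor/sub sequence instead.

```c
#include <stdint.h>

/* Absolute value with an unsigned result, so INT32_MIN maps to
   0x80000000 instead of overflowing. */
uint32_t uabs(int32_t x) {
    uint32_t ux = (uint32_t)x;
    return x < 0 ? 0u - ux : ux;   /* 0 - ux is a two's complement negation */
}
```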

Solution 2

In general, you won't be able to. Many things that happen to integral values happen the same way for signed or unsigned values. Assignment, for example. The only way to tell is if the code happens to do arithmetic or comparisons whose instructions differ by signedness. You absolutely can't tell by looking at the value; all possible bit patterns are valid either way.
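
A short sketch of why the value alone tells you nothing: the helper below (as_signed is an illustrative name) copies the same 32 bits into a signed variable. Every input is a valid unsigned value and every output is a valid signed value.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the same 32 bits as a signed integer. memcpy is used so
   the result does not depend on conversion rules. */
int32_t as_signed(uint32_t bits) {
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For instance, 0xFFFFFFFF is 4294967295 as a uint32_t and -1 through as_signed. A set MSB in a register therefore means "negative" only if the code treats the value as signed.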

Solution 3

In most processors (at least those that use two's complement math), there is no inherent sign-ness for the integers stored in registers or memory. The interpretation depends on the instructions used. A short summary:

Addition and subtraction produce exactly the same bit patterns for signed and unsigned numbers, so usually there is no separate signed addition or subtraction. (However, MIPS has separate instructions which cause a trap if the operation overflows.)
Division and multiplication do produce different results for signed vs. unsigned numbers (for multiplication, only in the high half of a widening result), so if the processor supports them, they come in pairs (x86: mul/imul, div/idiv).
Conditional branches may also differ depending on the interpretation of the comparison result (usually implemented as subtraction). For example, on x86 there is jg for signed greater and ja for unsigned above.
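
The first point in the summary above can be checked with a small sketch (function names are illustrative). Both functions typically compile to the same add or lea instruction. The signed inputs used here are chosen so that nothing overflows, since signed overflow is undefined behavior in C.

```c
#include <stdint.h>

/* Signed addition: typically a plain add (or lea). */
int32_t add_signed(int32_t a, int32_t b) {
    return a + b;
}

/* Unsigned addition: the same instruction, wrapping modulo 2^32. */
uint32_t add_unsigned(uint32_t a, uint32_t b) {
    return a + b;
}
```

Adding -3 and -4 gives -7, whose bit pattern 0xFFFFFFF9 is exactly what add_unsigned(0xFFFFFFFDu, 0xFFFFFFFCu) returns.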

Note that floating-point numbers (at least in IEEE format) use an explicit sign bit, so the above does not apply to them.
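
A minimal sketch of reading that sign bit, assuming float is a 32-bit IEEE 754 binary32 value (true on x86, but an assumption nonetheless). float_sign_bit is a made-up helper name.

```c
#include <stdint.h>
#include <string.h>

/* Return the top bit of a float's representation: 1 for negative
   values (including -0.0f), 0 otherwise. Assumes IEEE 754 binary32. */
unsigned float_sign_bit(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits >> 31;
}
```

Unlike two's complement integers, this format has both +0.0 and -0.0, which differ only in that bit.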

Author by

user1466594

Updated on June 15, 2022

Comments

user1466594 almost 2 years

I always find this confusing when I am looking at the disassembly of code written in C/C++.

There is a register with some value. I want to know if it represents a signed number or an unsigned number. How can I find this out?

My understanding is that if it's a signed integer, the MSB will be set if it is negative and not set if it is positive. If I find that it's an unsigned integer, the MSB doesn't matter. Is this correct?

Regardless, this doesn't seem to help: I still need to identify if the integer is signed before I can use this information. How can this be done?