Bit-reverse a byte on 68HC12

Solution 1

If you can spare the 256 bytes extra code size, a lookup table is probably the most efficient way to reverse a byte on a 68HCS12. But I am pretty sure this is not what your instructor is expecting.

For the "normal" solution, consider the data bits individually. Rotates and shifts allow you to move bits around. For a first solution, isolate the eight bits (with bitwise "and" operations), move them to their destination positions (shifts, rotates...), then combine them together again (with bitwise "or" operations). This will not be the most efficient or simplest implementation, but you should first concentrate on getting a correct result -- optimization can wait.
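As a rough illustration of the isolate, move, combine idea, here is a minimal sketch in C rather than HCS12 assembly. The function name `reverse_mask_shift` is made up for this example; each line corresponds to one "and", one shift, and one "or" that you would write in assembly.

```c
#include <stdint.h>

/* Reverse a byte by isolating each bit with a mask, shifting it to its
   mirrored position, and OR-ing the eight pieces together.
   (reverse_mask_shift is a hypothetical name used only for this sketch.) */
uint8_t reverse_mask_shift(uint8_t x)
{
    uint8_t r = 0;
    r |= (uint8_t)((x & 0x01) << 7);  /* bit 0 -> bit 7 */
    r |= (uint8_t)((x & 0x02) << 5);  /* bit 1 -> bit 6 */
    r |= (uint8_t)((x & 0x04) << 3);  /* bit 2 -> bit 5 */
    r |= (uint8_t)((x & 0x08) << 1);  /* bit 3 -> bit 4 */
    r |= (uint8_t)((x & 0x10) >> 1);  /* bit 4 -> bit 3 */
    r |= (uint8_t)((x & 0x20) >> 3);  /* bit 5 -> bit 2 */
    r |= (uint8_t)((x & 0x40) >> 5);  /* bit 6 -> bit 1 */
    r |= (uint8_t)((x & 0x80) >> 7);  /* bit 7 -> bit 0 */
    return r;
}
```

With the examples from the question, `reverse_mask_shift(0x01)` gives `0x80` and `reverse_mask_shift(0x2B)` (00101011) gives `0xD4` (11010100).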

Solution 2

Hints: If you do a shift, one bit gets shifted out and a zero (probably) gets shifted in. Where does that shifted-out bit go? You need to shift that bit into the opposite end of the destination register or memory location.

I'm sure that 25 years ago I could do this in Z80 machine code without an assembler :)

Solution 3

Consider two registers as stacks of bits. What happens if you move one bit at a time from one to another?

Solution 4

When you do a right shift, what was the least significant bit goes into the carry flag.

When you do a rotate, the carry flag is used to fill in the vacated bit of the result (LSB for a ROL, MSB for a ROR).
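The shift-then-rotate pairing can be modeled in C to check the idea before writing the assembly. This is only a sketch: the variable `carry` stands in for the CPU's carry flag, and `reverse_via_carry` is a name invented for this example.

```c
#include <stdint.h>

/* Model of "shift right out of the source, rotate left into the
   destination", repeated eight times. The variable carry plays the
   role of the carry flag; the function name is hypothetical. */
uint8_t reverse_via_carry(uint8_t src)
{
    uint8_t dst = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t carry = src & 1u;             /* right shift: LSB goes to carry */
        src >>= 1;
        dst = (uint8_t)((dst << 1) | carry);  /* rotate left: carry fills the LSB */
    }
    return dst;
}
```

The first bit taken from the source is pushed seven more places to the left by the later iterations, so it ends up as the MSB of the result, which is exactly the reversal.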

Solution 5

First of all, work out the algorithm for what you need to do. Express it as pseudocode, C, plain English, diagrams, or whatever you are comfortable with. Once you have cleared this conceptual hurdle, the actual implementation should be quite simple.

Your CPU probably has instructions which let you shift and/or rotate a register, perhaps including the carry flag as an additional bit. These instructions will be very useful.

dohlfhauldhagen

Updated on April 10, 2020

Comments

dohlfhauldhagen about 4 years

I'm in a microprocessors class and we are using assembly language in Freescale CodeWarrior to program a 68HCS12 microcontroller. Our assignment this week is to reverse a byte: if the input byte is 00000001, the output should be 10000000, and 00101011 should become 11010100. We have to use assembly language, and were told we could use rotates and shifts (but are not limited to them) to accomplish this task. I'm really at a loss as to where I should start.
Spacedman about 13 years

There are hackier methods: graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious (in C, but could be done in assembly...)
dohlfhauldhagen about 13 years

Ok, I figured out how to do it using shifts and rotates. However, we can get extra credit for having the most efficient code. He doesn't care how we do it. I honestly don't know how to do a lookup table. I was kind of reading about them when researching an answer to this problem before, but I didn't really understand how to implement them.
Spacedman about 13 years

Suppose your program in its lifetime is going to be doing zillions of bit reversals. When you start the program you just do all 256 possibilities by bit rotates and store the results in 256 bytes of contiguous memory, starting at, say, BASE. Now, whenever you need to flip the bits of a register, you just look at (BASE+value). That's a lookup table (LUT). You can even hardcode the LUT into the assembly as a chunk of 256 constants if you can precompute them. Then there's no init needed. Win.
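A minimal C sketch of that scheme, assuming the table is filled once at startup. The names `rev_table`, `rev_table_init`, `reverse_slow`, and `reverse_lut` are invented for illustration; `rev_table` plays the role of BASE.

```c
#include <stdint.h>

/* 256 bytes of contiguous memory; rev_table[v] holds v with its bits reversed. */
static uint8_t rev_table[256];

/* Slow bit-by-bit reversal, used only to fill the table. */
static uint8_t reverse_slow(uint8_t x)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (x & 1u));
        x >>= 1;
    }
    return r;
}

/* Call once at program start. */
void rev_table_init(void)
{
    for (unsigned v = 0; v < 256; v++)
        rev_table[v] = reverse_slow((uint8_t)v);
}

/* Every later reversal is a single indexed load: BASE + value. */
uint8_t reverse_lut(uint8_t x)
{
    return rev_table[x];
}
```

In assembly the lookup would be one indexed load with the input byte as the offset from the table's base address. Hardcoding the 256 constants instead removes the need for `rev_table_init`.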
Olof Forshell about 13 years

To save space over the 256 byte table you can have a 16 byte table containing the values for four bits (nibbles) at a time. The algorithm then would be "revval=revdigit[inval&0x0f]<<4|revdigit[inval>>4]". If I were a prof I'd like the two parts where one shift is in the indexing and the other outside.
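Written out in C, the nibble-table version might look like the sketch below. `revdigit` and `inval` are the names from the comment; the wrapper function `reverse_nibbles` and the table contents are filled in here for illustration.

```c
#include <stdint.h>

/* revdigit[n] is the 4-bit value n with its bits reversed,
   e.g. 0x1 (0001) -> 0x8 (1000), 0xB (1011) -> 0xD (1101). */
static const uint8_t revdigit[16] = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

/* The reversed low nibble becomes the high nibble of the result,
   and the reversed high nibble becomes the low nibble. */
uint8_t reverse_nibbles(uint8_t inval)
{
    return (uint8_t)((revdigit[inval & 0x0f] << 4) | revdigit[inval >> 4]);
}
```

This trades 240 bytes of table for two lookups, one mask, two shifts, and an "or" per reversal.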
Peter Cordes over 4 years

It's actually easier in assembly than C because you have rotate-through-carry. Shift a bit into the carry flag then do something like adc same,same to shift the carry flag into the bottom of a register.
Peter Cordes about 4 years

You don't need a slow rcl; you can use adc ax, ax. Both are 2 bytes (in 16-bit mode) and adc is faster on modern CPUs. Your breakdown in terms of shl eax,1 and adc al,0 is wrong: it would be shl ax,1 and adc ax,0, both with the same operand-size as the rcl, and shifting before adding in the carry. Also, dec cl is 2 bytes; maybe you were thinking of dec cx? The one-byte inc/dec opcodes are only for 16 or 32-bit operand-size, not 8-bit. (If you were really optimizing for size over speed, you'd use the slow loop instruction. But don't do that.)
Peter Cordes about 4 years

Also you could make it non-destructive by using ror di, 1 instead of shr. Oh also, you only set CX=8, but DI and AX are 16-bit registers.
Antonin GAVREL about 4 years

By the way, I am puzzled by your comment about the slow loop instruction; could you elaborate? I also checked ror vs shr at stackoverflow.com/questions/4976636/…, but what is the point of making it non-destructive?
Peter Cordes about 4 years

See "Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?" And re: non-destructive: if you used ror di,1 16 times in a loop, the final value of DI would be the same as the initial value. But your shr loop leaves DI=0 (if you did 16 iterations). So you have a choice of which one is more useful: a zeroed register or the original value.
Antonin GAVREL about 4 years

Thank you, very interesting; I edited the answer! Also, in this specific situation, do you see any other reason to prefer ror over shr, given that we do not really care about keeping the value of di?
Peter Cordes about 4 years

How do you know whether someone using this code wants the original value as well? Sometimes you do want to keep the original for later use when bit-reversing. But anyway, why did you waste an instruction by using shl and adc ax,0 separately instead of adc ax,ax like I initially recommended? Adding a number to itself is a left-shift by 1, and using ADC shifts in CF. So it's identical to RCL except for avoiding partial-flag stuff (adc writes all the flags).
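The answer's x86 code is not reproduced here, so the following is only a C model of the loop these comments describe, assuming 16 iterations of ror di,1 followed by adc ax,ax with AX starting at zero. The function name `reverse16_model` and the variable `cf` (standing in for the carry flag) are invented for this sketch.

```c
#include <stdint.h>

/* C model of: repeat 16 times { ror di,1 ; adc ax,ax }.
   cf stands in for the carry flag. After 16 rotates *di is back to its
   original value, which is the non-destructive property discussed above. */
uint16_t reverse16_model(uint16_t *di)
{
    uint16_t ax = 0;
    for (int i = 0; i < 16; i++) {
        uint16_t cf = *di & 1u;                     /* ror puts the rotated-out bit in CF */
        *di = (uint16_t)((*di >> 1) | (cf << 15));  /* ror di,1 */
        ax = (uint16_t)(ax + ax + cf);              /* adc ax,ax: shift left, CF into bit 0 */
    }
    return ax;
}
```

The line modeling adc ax,ax shows why it behaves like rcl ax,1 in this loop: adding a value to itself shifts it left by one, and the carry-in lands in the vacated low bit.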
Antonin GAVREL about 4 years

Good point, and sorry I missed that; edited. Also, about the loop instruction: it seems it is not worthwhile to use at all?
Peter Cordes about 4 years

Also, you broke your explanation in the last paragraph: it's shift then adc so the new bit is in the bottom position. And rcl ax, 1 is a 2-byte instruction while adc ax, 0 is 3. RCL by 1 is often less efficient than adc ax,ax, but on some CPUs it's break-even with separate shift and adc. See uops.info/…. On AMD those are all 1-uop instructions so separate shift-and-add are worse than a single RCL by 1.