Is there any alternative to using % (modulus) in C/C++?

58,141

Solution 1

Ah, the joys of bitwise arithmetic. A side effect of many division routines is the modulus - so in few cases should division actually be faster than modulus. I'm interested to see the source you got this information from. Processors with multipliers have interesting division routines using the multiplier, but you can get from division result to modulus with just another two steps (multiply and subtract) so it's still comparable. If the processor has a built in division routine you'll likely see it also provides the remainder.

Still, there is a small branch of number theory devoted to Modular Arithmetic which requires study if you really want to understand how to optimize a modulus operation. Modular arithmatic, for instance, is very handy for generating magic squares.

So, in that vein, here's a very low level look at the math of modulus for an example of x, which should show you how simple it can be compared to division:


Maybe a better way to think about the problem is in terms of number bases and modulo arithmetic. For example, your goal is to compute DOW mod 7 where DOW is the 16-bit representation of the day of the week. You can write this as:

 DOW = DOW_HI*256 + DOW_LO

 DOW%7 = (DOW_HI*256 + DOW_LO) % 7
       = ((DOW_HI*256)%7  + (DOW_LO % 7)) %7
       = ((DOW_HI%7 * 256%7)  + (DOW_LO%7)) %7
       = ((DOW_HI%7 * 4)  + (DOW_LO%7)) %7

Expressed in this manner, you can separately compute the modulo 7 result for the high and low bytes. Multiply the result for the high by 4 and add it to the low and then finally compute result modulo 7.

Computing the mod 7 result of an 8-bit number can be performed in a similar fashion. You can write an 8-bit number in octal like so:

  X = a*64 + b*8 + c

Where a, b, and c are 3-bit numbers.

  X%7 = ((a%7)*(64%7) + (b%7)*(8%7) + c%7) % 7
      = (a%7 + b%7 + c%7) % 7
      = (a + b + c) % 7

since 64%7 = 8%7 = 1

Of course, a, b, and c are

  c = X & 7
  b = (X>>3) & 7
  a = (X>>6) & 7  // (actually, a is only 2-bits).

The largest possible value for a+b+c is 7+7+3 = 17. So, you'll need one more octal step. The complete (untested) C version could be written like:

unsigned char Mod7Byte(unsigned char X)
{
    X = (X&7) + ((X>>3)&7) + (X>>6);
    X = (X&7) + (X>>3);

    return X==7 ? 0 : X;
}

I spent a few moments writing a PIC version. The actual implementation is slightly different than described above

Mod7Byte:
       movwf        temp1        ;
       andlw        7        ;W=c
       movwf        temp2        ;temp2=c
       rlncf   temp1,F        ;
       swapf        temp1,W ;W= a*8+b
       andlw   0x1F
       addwf        temp2,W ;W= a*8+b+c
       movwf        temp2   ;temp2 is now a 6-bit number
       andlw   0x38    ;get the high 3 bits == a'
       xorwf        temp2,F ;temp2 now has the 3 low bits == b'
       rlncf   WREG,F  ;shift the high bits right 4
       swapf   WREG,F  ;
       addwf        temp2,W ;W = a' + b'

 ; at this point, W is between 0 and 10


       addlw        -7
       bc      Mod7Byte_L2
Mod7Byte_L1:
       addlw        7
Mod7Byte_L2:
       return

Here's a liitle routine to test the algorithm

       clrf    x
       clrf    count

TestLoop:
       movf        x,W
       RCALL   Mod7Byte
       cpfseq count
        bra    fail

       incf        count,W
       xorlw   7
       skpz
        xorlw        7
       movwf   count

       incfsz        x,F
       bra        TestLoop
passed:

Finally, for the 16-bit result (which I have not tested), you could write:

uint16 Mod7Word(uint16 X)
{
 return Mod7Byte(Mod7Byte(X & 0xff) + Mod7Byte(X>>8)*4);
}

Scott


Solution 2

If you are calculating a number mod some power of two, you can use the bit-wise and operator. Just subtract one from the second number. For example:

x % 8 == x & 7
x % 256 == x & 255

A few caveats:

  1. This only works if the second number is a power of two.
  2. It's only equivalent if the modulus is always positive. The C and C++ standards don't specify the sign of the modulus when the first number is negative (until C++11, which does guarantee it will be negative, which is what most compilers were already doing). A bit-wise and gets rid of the sign bit, so it will always be positive (i.e. it's a true modulus, not a remainder). It sounds like that's what you want anyway though.
  3. Your compiler probably already does this when it can, so in most cases it's not worth doing it manually.

Solution 3

There is an overhead most of the time in using modulo that are not powers of 2. This is regardless of the processor as (AFAIK) even processors with modulus operators are a few cycles slower for divide as opposed to mask operations.

For most cases this is not an optimisation that is worth considering, and certainly not worth calculating your own shortcut operation (especially if it still involves divide or multiply).

However, one rule of thumb is to select array sizes etc. to be powers of 2.

so if calculating day of week, may as well use %7 regardless if setting up a circular buffer of around 100 entries... why not make it 128. You can then write % 128 and most (all) compilers will make this & 0x7F

Solution 4

Unless you really need high performance on multiple embedded platforms, don't change how you code for performance reasons until you profile!

Code that's written awkwardly to optimize for performance is hard to debug and hard to maintain. Write a test case, and profile it on your target. Once you know the actual cost of modulus, then decide if the alternate solution is worth coding.

Solution 5

@Matthew is right. Try this:

int main() {
  int i;
  for(i = 0; i<=1024; i++) {
    if (!(i & 0xFF)) printf("& i = %d\n", i);
    if (!(i % 0x100)) printf("mod i = %d\n", i);
  }
}
Share:
58,141
JeffV
Author by

JeffV

Electronics Technologist, currently working as a Software Engineer on VR Simulation Systems. Primarily work in Modern C++, C (embedded), Java, C#. Involved in full engineering life cycle including requirements analysis, system decomposition, high level and low level design, implementation and system verification. Designed and implemented sonar processing systems for beam forming line arrays using distributed services and message passing with ZeroMQ. Designed and implemented VR training simulators with hardware stimulation. Designed and implemented wireless blasting receivers for through the earth communications. Extreme low power and reliability design goals: 6mos on a single C-cell battery. Spend my spare time with my wife and daughter, and enjoy mountain biking, boating, scuba, photography and flying. I like to ask the good questions so all can benefit from the answers. Hard to beat the fastest guns in the west around here. ;-)

Updated on July 09, 2022

Comments

  • JeffV
    JeffV almost 2 years

    I read somewhere once that the modulus operator is inefficient on small embedded devices like 8 bit micro-controllers that do not have integer division instruction. Perhaps someone can confirm this but I thought the difference is 5-10 time slower than with an integer division operation.

    Is there another way to do this other than keeping a counter variable and manually overflowing to 0 at the mod point?

    const int FIZZ = 6;
    for(int x = 0; x < MAXCOUNT; x++)
    {
        if(!(x % FIZZ)) print("Fizz\n"); // slow on some systems
    }
    

    vs:

    The way I am currently doing it:

    const int FIZZ = 6;
    int fizzcount = 1;
    for(int x = 1; x < MAXCOUNT; x++)
    {
        if(fizzcount >= FIZZ) 
        {
            print("Fizz\n");
            fizzcount = 0;
        }
    }
    
  • JeffV
    JeffV over 15 years
    How would you handle the case where it wasn't a neat power of two number?
  • JeffV
    JeffV over 15 years
    I agree with you 100%, however, do you have an alternative to modulus in the case where performance matters?
  • JeffV
    JeffV over 15 years
    I meant division and modulus is 5-10 times faster on a micro with an integer division op code in the instruction set.
  • Rob Rolnick
    Rob Rolnick over 15 years
    Oh, sorry, I misunderstood. (Clearly.) Have you profiled the other suggestion?
  • JeffV
    JeffV over 15 years
    Yes, I see that too. I changed it to 7 because 6 evently divisible by two. I'm looking for an answer that could be used for any number known up front.
  • JeffV
    JeffV over 15 years
    Wow, if I ever have to do it for a number other than 7, I'll have to come back and re-read this post.
  • shmuelp
    shmuelp over 15 years
    I don't have a specific alternative. If I couldn't use a bitmask or right shift like @Paul-Tomblin suggested, I'd keep a counter, just like you suggested in your question. Of course, I'm used to working with more overhead (we're running CORBA on our processors).
  • Paul Tomblin
    Paul Tomblin over 15 years
    Jeff V: Interesting fact about math: x * 7 == x + (x * 2) + (x * 4), or x + x >> 1 + x >> 2. Integer addition is usually pretty cheap.
  • JeffV
    JeffV almost 12 years
    Power of two is the easy case.
  • Johan Lundberg
    Johan Lundberg almost 11 years
    "The C and C++ standards don't specify the sign of the modulus when the first number is negative." This has now been resolved with C++11. stackoverflow.com/questions/12710801/c-operator-guarantees
  • étale-cohomology
    étale-cohomology over 7 years
    This is precisely what I was looking for!!
  • MSalters
    MSalters almost 4 years
    Modulus is slow on microcontrollers that lack a hardware division instruction. So this doesn't actually solve the problem.