Is BitArray faster in C# for getting a bit value than a simple conjuction with bitwise shift?

31,985

Solution 1

@Jonathon Reinhart,

your benchmark is unfortunately inconclusive. It does not take into account the effects of possible lazy-loading, caching and/or prefetching (by the CPU, the host OS and/or the .NET runtime).

Shuffle the order of the tests (or call the test methods multiple times) and you might notice different time measurments.

I did your original benchmark built with "Any CPU" platform target and .NET 4.0 client profile, running on my machine with a i7-3770 CPU and 64-bit Windows 7.

What i got was this:

Testing with 10000000 operations:
   A UInt32 bitfield took 484 ms.
   A BitArray (32) took 459 ms.
   A List<bool>(32) took 393 ms.

which is pretty much in line with your observations.

However, executing the BitArray test before the UInt32 test yielded this:

Testing with 10000000 operations:
   A BitArray (32) took 513 ms.
   A UInt32 bitfield took 456 ms.
   A List<bool>(32) took 417 ms.

By looking at the times for the UInt32 and BitArray tests you will notice that the measured time does not seem to be connected to the tests themselves, but rather to the order in which the tests are run.

To alleviate these side effects at least a little bit, i executed the test methods twice in each program run with the following results.

Test order UInt32, BitArray, BoolArray, UInt32, BitArray, BoolArray:

Testing with 10000000 operations:
   A UInt32 bitfield took 476 ms.
   A BitArray (32) took 448 ms.
   A List<bool>(32) took 367 ms.

   A UInt32 bitfield took 419 ms.  <<-- Watch this.
   A BitArray (32) took 444 ms.    <<-- Watch this.
   A List<bool>(32) took 388 ms.

Test order BitArray, UInt32, BoolArray, BitArray, UInt32, BoolArray:

Testing with 10000000 operations:
   A BitArray (32) took 514 ms.
   A UInt32 bitfield took 413 ms.
   A List<bool>(32) took 379 ms.

   A BitArray (32) took 444 ms.    <<-- Watch this.
   A UInt32 bitfield took 413 ms.  <<-- Watch this.
   A List<bool>(32) took 381 ms.

Looking at the second invocations of the test methods, it appears that at least on i7 CPUs with up-to-date .NET runtime, the UInt32 test is faster than the BitArray test, while the BoolArray test is still being the fastest.

(I apologize that i had to write my response to Jonathon's benchmark as an answer, but as a new SO user i am not allowed to comment...)

EDIT:

Instead of shuffling the order of test methods, you might try putting a Thread.Sleep(5000) or similar right before calling the first test...

Also the original test seems to put the UInt32 test at disadvantage, because it includes a boundary check "if (bitnum < 0 || bitnum > 31)", which is executed 30 million times. None of the other two tests include such a boundary check. However, this is actually not the whole truth, since both the BitArray and the bool array do boundary checks internally.

Although i didn't test, i expect that eliminating the boundary checks will make the UInt32 and BoolArray tests perform similarly, but that would not be a good proposition for a public API.

Solution 2

BitArray is going to be able to handle an arbitrary number of boolean values, whereas a byte will hold only 8, int only 32, etc. This is going to be the biggest difference between the two.

Also, BitArray implements IEnumerable, where a integral type obviously does not. So it all depends on the requirements of your project; if you need an IEnumerable or array-like interface, then go with the BitArray.

I would actually use a bool[] over either solution, simply because it is more explicit in what kind of data you're keeping track of. T

BitArray or bitfield will use approximately 1/8th the space of a bool[] because they "pack" 8 boolean values into a single byte, whereas a bool by itself will take up the whole 8-bit byte. The space advantage of using a bitfield or BitArray isn't going to matter though until you being storing lots of bools. (The math is left up to the reader :-))


Benchmark

Results: For my primitive test environment, it appears that BitArray is a bit faster, but is on the same order of magnitude as doing it yourself with an integral type. Also tested was a bool[], which was unsurprisingly the fastest. Accessing single bytes in memory is going to be less complex than accessing individual bits in different bytes.

Testing with 10000000 operations:
   A UInt32 bitfield took 808 ms.
   A BitArray (32) took 574 ms.
   A List<bool>(32) took 436 ms.

Code:

class Program
{
    static void Main(string[] args)
    {
        Random r = new Random();
        r.Next(1000);

        const int N = 10000000;

        Console.WriteLine("Testing with {0} operations:", N);

        Console.WriteLine("   A UInt32 bitfield took {0} ms.", TestBitField(r, N));
        Console.WriteLine("   A BitArray (32) took {0} ms.", TestBitArray(r, N));
        Console.WriteLine("   A List<bool>(32) took {0} ms.", TestBoolArray(r, N));

        Console.Read();
    }


    static long TestBitField(Random r, int n)
    {
        UInt32 bitfield = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) {

            SetBit(ref bitfield, r.Next(32), true);
            bool b = GetBit(bitfield, r.Next(32));
            SetBit(ref bitfield, r.Next(32), b);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static bool GetBit(UInt32 x, int bitnum) {
        if (bitnum < 0 || bitnum > 31)
            throw new ArgumentOutOfRangeException("Invalid bit number");

        return (x & (1 << bitnum)) != 0;
    }

    static void SetBit(ref UInt32 x, int bitnum, bool val)
    {
        if (bitnum < 0 || bitnum > 31)
            throw new ArgumentOutOfRangeException("Invalid bit number");

        if (val)
            x |= (UInt32)(1 << bitnum);
        else
            x &= ~(UInt32)(1 << bitnum);
    }



    static long TestBitArray(Random r, int n)
    {
        BitArray b = new BitArray(32, false);     // 40 bytes

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) {

            b.Set(r.Next(32), true);
            bool v = b.Get(r.Next(32));
            b.Set(r.Next(32), v);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }



    static long TestBoolArray(Random r, int n)
    {
        bool[] ba = new bool[32];

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) {

            ba[r.Next(32)] = true;
            bool v = ba[r.Next(32)];
            ba[r.Next(32)] = v;
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
Share:
31,985

Related videos on Youtube

Secret
Author by

Secret

Updated on July 09, 2022

Comments

  • Secret
    Secret almost 2 years

    1). var bitValue = (byteValue & (1 << bitNumber)) != 0;

    2). using System.Collections.BitArray with a Get(int index) method

    • What is faster?
    • In what situations for the .NET projects BitArray could be more useful than a simple conjunction with the bitwise shift?
    • Corey Ogburn
      Corey Ogburn almost 11 years
      Using System.Diagnostics.Stopwatch you could time if it's faster. It's best to try it in as close to the production environment as possible.
  • Yinda Yin
    Yinda Yin almost 11 years
    I've removed the second question from the original post, and reopened. Interestingly, I've got a bunch of SetBit and GetBit functions in a current project that look just like these.
  • Yinda Yin
    Yinda Yin almost 11 years
    Also, it looks like your code tests the speed of the random number generator as well as the bit shifting. It wouldn't surprise me if the random number generation takes significantly longer.
  • Seng Cheong
    Seng Cheong almost 11 years
    @RobertHarvey You're correct, but I wasn't too concerned about it. A) The random number generation should be pretty constant, and it's done the same between all methods, so it can be ignored. B) To do this without timing the random number generation would be much more complex, and also not test the functionality is well. If you've got a different idea, I'd appreciate hearing it for sure.
  • Yinda Yin
    Yinda Yin almost 11 years
    I ran your code unchanged on my machine, and got results of 1525ms and 1185ms, respectively. Then I changed your random r to an int everywhere, set it to zero, and ran it again. The results were 581ms and 209ms, suggesting that the BitArray is more than twice as fast, and Random is consuming 2/3 of the processing time. I got substantially the same results setting r to 31 (I used zero and 31 for the two runs).
  • Seph
    Seph over 10 years
    You should really run all your tests entirely separate and independent of each other and not just run one the next then the next.
  • Admin
    Admin over 10 years
    @Seph, you are correct. For a proper benchmark, this would be the way to go. However, the code i wrote was just meant to demonstrate the famous principle "You don't measure what you think you are measuring" ;)
  • Trap
    Trap over 10 years
    One thing I've learned over the years about benchmarking jit-ed languages is that no one seems to do it right. There's a lot of 'external forces' hidden by the compiler that will affect your results. I pretty much doubt these tests are valid or conclusive.
  • Seng Cheong
    Seng Cheong about 10 years
    @Trap Have any better ideas?
  • mmg666
    mmg666 over 9 years
    I just came across this and wanted to add that saying that the space difference between an array of booleans and a BitArray will not matter until storing lots of bools can be misleading. I mean it depends on what we mean by lots of. Because a list or an array may run out of allocable memory before the theoretical limit. And most importantly, for me, the largest advantage of a BitArray is not the smaller space it requires to store the values ; it is that thanks to its tiny spacing it can store much more values than a bool array could.
  • Neil
    Neil over 9 years
    It would be interesting to run the test on larger arrays. As you start outgrowing memory caches and eventually start paging to disk, BitArray's performance will look better and better.
  • Neil
    Neil over 9 years
    Also, I'm not sure why you say it's more explicit to use bool[] than a BitArray to store boolean values. To me, both structures seem equally intended for storing boolean values. I suppose people may be less familiar with BitArrays.
  • Seng Cheong
    Seng Cheong over 9 years
    @NeilWhitaker How often do you really run your systems to the point of zero free RAM? Time is money, and memory is very cheap.
  • RBT
    RBT about 6 years
    Solution of this problem on HackerRank times out if I create the solution using BitArray. It works in acceptable time limits only when I use bool[]. I'm getting a strong evidence from my pratical experience that bitArray is slower. Only advantage in case of BitArray is space saving ~ 1 byte per bool flag.