How are bitmap indexes helpful?

11,314

Solution 1

A better representation of a bitmap index, is if given the sample above:

Identifier    Gender          RowID
1             Female          R1
2             Male            R2
3             Male            R3
4             Unspecified     R4
5             Female          R5

the a bitmap index on the gender column would (conceptually) look like this:

Gender       R1    R2   R3   R4   R5
Female       1     0    0    0    1
Male         0     1    1    0    0
Unspecified  0     0    0    1    0

Bitmap indexes are used when the number of distinct values in a column is relatively low (consider the opposite where all values are unique: the bitmap index would be as wide as every row, and as long making it kind of like one big identity matrix.)

So with this index in place a query like

SELECT * FROM table1 WHERE gender = 'Male'

the database looks for a match in the gender values in the index, finds all the rowids where the bit was set to 1, and then goes and gets the table results.

A query like:

SELECT * FROM table1 WHERE gender IN ('Male', 'Unspecified')

would get the 1 bits for Male, the 1 bits for Unspecified, do a bitwise-OR then go get the rows where the resulting bits are 1.

So, the advantages of using a bitmap index over a b*tree index are storage (with low cardinality, bitmap indexes are pretty compact), and the ability to do bitwise operations before resolving the actual rowids which can be pretty quick.

Note that bitmap indexes can have performance implications with inserts/deletes (conceptually, you add/remove a column to/from the bitmap and rejig it accordingly...), and can create a whole lot of contention as an update on a row can lock the entire corresponding bitmap entry and you can't update a different row (with the same bitmap value) until the first update is committed/rolled back.

Solution 2

The benefit comes when filtering on multiple columns, then the corresponding indexes can be merged with bitwise operations before actually selecting the data. If you have gender, eye_colour, hair_colour then the query

select * from persons where
                      gender = 'male' and 
                      (eye_colour = 'blue' or hair_colour = 'blonde')

would first make a bitwise or between the eye_colour['blue'] index and the hair_colour['blonde'] index and finally bitwise and between the result and the gender['male'] index. This operation performs really fast both computationally and I/O.
The resulting bit stream would be used for picking the actual rows.

Bitmap indexes are typically used in "star joins" in data warehouse applications.

Solution 3

As indicated in the Wikipedia article, they use bitwise operations, which can perform better than comparing data types such as integers, so the short answer is increased speed of queries.

Theoretically, it should take up less computations and less time to select all males or all females from your example.

Just thinking about how this works under the hood should make why this is faster obvious. A bit is logically either true or false. If you want to do a query using a WHERE clause, this will eventually evaluate to either a true or a false for the records in order to determine whether to include them in your results.

Preface - the rest of this is meant to be layman's terns and non-techie

So the next question is what does it take to evaluate to true? Even comparing numeric values means that the computer has to...

  1. Allocate memory for the value you want to evaluate
  2. Allocate memory for the control value
  3. Assign the value to each (count this as two steps)
  4. Compare the two - for a numeric this should be quick, but for strings, there are more bytes to compare.
  5. Assign the results to a 0(false) or 1 (true) value.

repeat if you're using a multiple part where clause such as Where "this = this AND that = that"

  1. perform bitwise operations on the results generated in step 5
  2. Come up with the final value
  3. deallocate the memory allocated in steps 1-3

But using bitwise logic, you're just looking at 0 (false) and 1 (true) values. 90% of the overhead for the comparison work is eliminated.

Share:
11,314
Moeb
Author by

Moeb

Updated on June 01, 2022

Comments

  • Moeb
    Moeb almost 2 years

    Wikipedia gives this example

    Identifier    Gender         Bitmaps
                                  F    M
    1           Female            1    0
    2           Male              0    1
    3           Male              0    1
    4           Unspecified       0    0
    5           Female            1    0
    

    But I do not understand this.

    • How is this an index first of all? Isn't an index supposed to point to rows (using rowid's) given the key?
    • What would be the typical queries where such indexes would be useful? How are they better than B-tree indexes? I know that if we use a B-tree index on Gender here, we will get a lot of results if for example, we look for Gender = Male, which need to be filtered out further (so not very useful). How does a Bitmap improve the situation?
  • Beginner
    Beginner about 7 years
    Would the database scan the whole bitmap for 'Unspecified' to search for all corresponding rows or is there some case of lookup structure?
  • Patrick Marchand
    Patrick Marchand about 7 years
    @Beginner, see "Bitmap Storage Structure" here: docs.oracle.com/database/121/CNCPT/indexiot.htm#CNCPT88851
  • Dan Lenski
    Dan Lenski about 7 years
    This is the best explanation I've yet read on why bitmap indexes may be useful. What I'm still unclear on, however, is why a bitmap index would be better than a normal b-tree index when searching on only one column. A b-tree index should also allow me to quickly determine the subset of rows that correspond to Male or Female or Male | Unspecified, right?
  • Patrick Marchand
    Patrick Marchand about 7 years
    @DanLenski - Yes, absolutely, b-tree indexes are very quick and bitmap indexes are not a general replacement for them. I'm not sure where you got the impression that bitmap indexes are better than a b-tree in performance; as stated, the advantages bitmap indexes have over b-trees are storage (where cardinality is low on the indexed column) and the ability to do bitwise operations with other bitmap indexes. (In fact, bitmap indexes can actually cause trouble when you have a lot of DML going on: akadia.com/services/ora_bitmapped_index.html)
  • Dan Lenski
    Dan Lenski about 7 years
    Thanks, @Patrick Marchand. I've read a couple other articles about bitmap indexes that suggested they would be faster than B-tree for single-column search, which never made any sense to me, since I could only understand how they would accelerate the search for multiple low-cardinality columns (thanks to bitwise ops and efficient storage).
  • vaughan
    vaughan about 3 years
    Thanks, many answers mentioned bitwise without explaining its applicability like you have.