Difference between BooleanClause.Occur.MUST and BooleanClause.Occur.SHOULD in Lucene


Solution 1

BooleanClause.Occur.SHOULD means that the clause is optional, whereas BooleanClause.Occur.MUST means that the clause is compulsory.

However, if a boolean query only has optional clauses, at least one clause must match for a document to appear in the results.

For finer control over which documents match a BooleanQuery, there is also a minimumShouldMatch parameter: a document appears in the results only if at least minimumShouldMatch of the BooleanClause.Occur.SHOULD clauses match.
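The rule described in this answer can be sketched in plain Java. This is only a simplified model of the matching decision, not Lucene's implementation: the class `ShouldMatchModel` and its `isHit` method are hypothetical names invented for illustration, and the model assumes a query made up solely of SHOULD clauses. For one document, it takes one boolean per clause saying whether that clause matched.

```java
import java.util.List;

/** Hypothetical model (not Lucene code) of a query that has only SHOULD clauses. */
final class ShouldMatchModel {

    private ShouldMatchModel() {}

    /**
     * @param shouldResults      one entry per SHOULD clause: did it match this document?
     * @param minimumShouldMatch required number of matching SHOULD clauses (0 = not set)
     * @return true if the document would appear in the results under this model
     */
    static boolean isHit(List<Boolean> shouldResults, int minimumShouldMatch) {
        long matched = shouldResults.stream().filter(Boolean::booleanValue).count();
        // With only optional clauses, at least one of them has to match even when
        // the parameter is left at 0, so the effective threshold is never below 1.
        int required = Math.max(1, minimumShouldMatch);
        return matched >= required;
    }
}
```

With three SHOULD clauses of which only one matches, this model accepts the document when minimumShouldMatch is 0 or 1 and rejects it when minimumShouldMatch is 2.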

Solution 2

I will try to explain using an example:

Let's assume that there are two clauses: clause A and clause B. The effect of the BooleanClause.Occur value is then as follows:

  • In the first case, both clause A and clause B have the BooleanClause.Occur.SHOULD flag set. A document is then a hit if it satisfies at least one of the clauses (A or B).

  • In the second case, clause A has the BooleanClause.Occur.MUST flag set and clause B has the BooleanClause.Occur.SHOULD flag set.

    In this case, a document is a hit only if it satisfies clause A. Whether it also satisfies clause B has no effect on whether it is a hit, although a match on B can still raise its score.

    If the document does not satisfy clause A, it is not a hit, regardless of whether it satisfies clause B.

  • In the third case, both clause A and clause B have the BooleanClause.Occur.MUST flag set.

    In this case, a document is a hit only if it satisfies both clauses. If it fails to satisfy either one, it is not a hit.
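The three cases above can be checked with a small plain-Java model. It is a sketch under stated assumptions, not Lucene code: the `Occur` enum here is a local stand-in for BooleanClause.Occur limited to MUST and SHOULD, the names `OccurCases`, `Clause`, and `isHit` are hypothetical, and minimumShouldMatch is assumed to be left at its default.

```java
/** Hypothetical model (not Lucene code) of how MUST and SHOULD decide matching. */
final class OccurCases {

    private OccurCases() {}

    /** Local stand-in for Lucene's BooleanClause.Occur, limited to two values. */
    enum Occur { MUST, SHOULD }

    /** One clause of the query plus whether it matched the document being checked. */
    static final class Clause {
        final Occur occur;
        final boolean matched;

        Clause(Occur occur, boolean matched) {
            this.occur = occur;
            this.matched = matched;
        }
    }

    /** Returns true if the document is a hit for the given clauses. */
    static boolean isHit(Clause... clauses) {
        boolean hasMust = false;
        boolean anyShouldMatched = false;
        for (Clause c : clauses) {
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!c.matched) {
                    return false; // a failed MUST clause always rejects the document
                }
            } else if (c.matched) {
                anyShouldMatched = true;
            }
        }
        // With a MUST clause present, SHOULD clauses no longer decide matching.
        // Without one, at least one SHOULD clause has to match.
        return hasMust || anyShouldMatched;
    }
}
```

For example, `isHit(new Clause(Occur.MUST, true), new Clause(Occur.SHOULD, false))` corresponds to the second case with a document that satisfies A but not B, and the model reports it as a hit.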

Solution 3

BooleanClause.Occur.MUST stands for a mandatory clause: the clause must be satisfied for a document to be returned. It behaves essentially like AND.

BooleanClause.Occur.SHOULD stands for an optional clause and behaves like OR.

Solution 4

The SHOULD clause is the most important feature in Lucene when your main concern is ranking.

When you use SHOULD clauses, Lucene ranks the retrieved documents by the sum of the scores of the SHOULD clauses that match. You can therefore combine several queries as SHOULD clauses with different boosts, according to their importance. This is the concept behind the Extended DisMax (eDisMax) query parser in Solr.
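The additive idea can be illustrated with a toy calculation in plain Java. This is a deliberately simplified sketch, not Lucene's scoring: real clause scores come from the configured similarity, whereas here each clause carries a made-up raw score, and the names `ShouldScoreModel`, `ScoredClause`, and `score` are hypothetical. The only point shown is that matching clauses contribute their boosted scores to a sum.

```java
/** Hypothetical toy model (not Lucene scoring) of summing boosted SHOULD clauses. */
final class ShouldScoreModel {

    private ShouldScoreModel() {}

    /** A SHOULD clause with an assumed raw score, a boost, and a match flag. */
    static final class ScoredClause {
        final double rawScore;
        final double boost;
        final boolean matched;

        ScoredClause(double rawScore, double boost, boolean matched) {
            this.rawScore = rawScore;
            this.boost = boost;
            this.matched = matched;
        }
    }

    /** Sums rawScore * boost over the clauses that matched the document. */
    static double score(ScoredClause... clauses) {
        double total = 0.0;
        for (ScoredClause c : clauses) {
            if (c.matched) {
                total += c.rawScore * c.boost;
            }
        }
        return total;
    }
}
```

If a title clause is given a boost of 3.0 and a body clause a boost of 1.0, a document matching only the title clause outranks one matching only the body clause in this model, and a document matching both ranks highest.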

Author: Jagadesh

Updated on June 03, 2022

Comments

  • Jagadesh, almost 2 years ago

    Can anyone explain the difference between BooleanClause.Occur.MUST and BooleanClause.Occur.SHOULD in a Lucene BooleanQuery, with an example?

  • physicsmichael, about 8 years ago
    According to this Lucene documentation, if a query has only optional clauses, they all remain optional. setMinimumNumberShouldMatch controls that, but by default the query is treated as if 0 had been passed.