Solr Did you mean (Spell check component)

For your first problem, you could use WordBreakSpellChecker.

As for your second problem, you could change <str name="spellcheck.onlyMorePopular">true</str> to <str name="spellcheck.onlyMorePopular">false</str> and see whether this gives the expected result.
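
As a rough sketch of both suggestions together, assuming a Solr version that ships `solr.WordBreakSolrSpellChecker` (4.0 or later; on 3.5 it is only available as a patch), the configuration might look like the fragment below. The field `spell`, the field type `textSpell`, and the `/spell` handler come from the question. The dictionary name `wordbreak` and the `maxChanges` value are illustrative assumptions.

```xml
<!-- solrconfig.xml fragment: a sketch, not a drop-in replacement -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>

  <!-- Existing index-based dictionary from the question -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">spellchecker</str>
  </lst>

  <!-- Assumed second dictionary: joins or splits adjacent terms,
       so "wat ters" can be suggested as "watters" -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">spell</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int> <!-- illustrative value -->
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <!-- Query both dictionaries (multiple values need Solr 4.0+) -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <!-- false: do not replace a term that already exists in the index
         just because a more frequent neighbour exists -->
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With `spellcheck.onlyMorePopular` set to `false`, a correctly spelled term such as "conventions" should no longer be swapped for a more frequent one such as "conversions", though the exact behaviour depends on the Solr version and the contents of the index.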

Author: ZendMind

Updated on June 14, 2022

Comments

  • ZendMind
    ZendMind almost 2 years

    I use Solr for my apps and have integrated the spellcheck component, but I have some problems:

    First: when I type a term that is split by a space, it gives me a correction for each part separately.

    E.g. "wat ters" => "what term", but the correct result is "watters".

    Second: when I type a phrase that contains some misspelled terms, the spellchecker changes all the terms, even the ones that are already correct.

    E.g. "Difreences in lankuage use conventions" => "Differences in language use conversions".

    The correct result is "Differences in language use conventions".

    This is my config in solrconfig.xml:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
        <str name="queryAnalyzerFieldType">textSpell</str>
        <lst name="spellchecker">
            <str name="name">default</str>
            <str name="field">spell</str>
            <str name="spellcheckIndexDir">spellchecker</str>
        </lst>
    </searchComponent>

    <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
        <lst name="defaults">
            <str name="spellcheck.onlyMorePopular">true</str>
            <str name="spellcheck.extendedResults">false</str>
            <str name="spellcheck.count">1</str>
        </lst>
        <arr name="last-components">
            <str>spellcheck</str>
        </arr>
    </requestHandler>
    

    schema.xml:

    Field types:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory" language="English"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        </analyzer>
        <analyzer type="multiterm">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
    </fieldType>

    <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.StandardFilterFactory"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <!--<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>-->
            <filter class="solr.StandardFilterFactory"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
    </fieldType>
    

    Fields:

    <field name="title" type="text" indexed="true" stored="true" termVectors="true"/>
    <field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
    

    Copy fields:

    <copyField source="title" dest="spell"/>
    

    I would appreciate your help.

    Cheers

  • ZendMind
    ZendMind about 12 years
    Hi Klein, thanks for your response. I am using Apache Solr 3.5 together with PHP. Can you tell me how I can use this patch? Cheers
  • ZendMind
    ZendMind about 12 years
    Is WordBreakSpellChecker standard in Apache Solr 3.5?
  • Okke Klein
    Okke Klein about 12 years
    It's not standard. You need to apply the patch to the source and build a new jar/war.
  • ZendMind
    ZendMind about 12 years
    Hi, I am a PHP coder on Windows platforms, so Java development is new to me. Can you tell me how to build the new jar/war for Apache Solr? After searching, I found the SpellCheckComponent.class file, but there is no SpellCheckComponent.java file.
  • ZendMind
    ZendMind about 12 years
    I opened the apache-solr-3.5.0.war that exists in the /dist path.
  • ZendMind
    ZendMind about 12 years
    Thanks very much, Mr. Klein. I'll check it out and keep you informed. Cheers