Sitecore Standard Analyzer : Turn off the stop words filter

In a previous post, my colleague and friend Sheetal Jain wrote about managing the stop words list: Sitecore Standard Analyzer : Managing your own stop words filter. If you only want to turn stop words off, you can use the following patch.config file instead of dropping an empty stopwords.txt file on the server:

<configuration xmlns:x="http://www.sitecore.net/xmlconfig/">

  <sitecore>

    <contentSearch>

      <indexConfigurations>

        <defaultLuceneIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">

          <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.PerExecutionContextAnalyzer, Sitecore.ContentSearch.LuceneProvider">

            <param desc="defaultAnalyzer" type="Sitecore.ContentSearch.LuceneProvider.Analyzers.DefaultPerFieldAnalyzer, Sitecore.ContentSearch.LuceneProvider">

              <param desc="defaultAnalyzer" type="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net">

                <param desc="stopWords" type="System.IO.StringReader, mscorlib">

                  <param hint="s"></param>

                </param>

              </param>

            </param>

          </analyzer>

        </defaultLuceneIndexConfiguration>

      </indexConfigurations>

    </contentSearch>

  </sitecore>

</configuration>

For the Lucene.Net.Analysis.Standard.StandardAnalyzer, Sheetal’s previous post used the following constructor:

public StandardAnalyzer(Version matchVersion, FileInfo stopwords)

Instead of using the constructor mentioned above for the Lucene.Net.Analysis.Standard.StandardAnalyzer, this patch.config will now call the following constructor:

public StandardAnalyzer(Version matchVersion, TextReader stopwords)

No explicit conversion is needed: StringReader inherits from TextReader, so a System.IO.StringReader can be passed wherever a System.IO.TextReader is expected. Line 19 of the patch.config file (<param hint="s"></param>) passes an empty string to the System.IO.StringReader constructor, so no stop words are passed to the StandardAnalyzer.
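To see why an empty StringReader results in no stop words, here is a minimal sketch that uses only System.IO. The StopWordDemo class and its LoadStopWords method are hypothetical stand-ins for the analyzer's stop word loading, not Lucene.Net code. The sketch assumes the list is read one word per line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StopWordDemo
{
    // Hypothetical stand-in for the analyzer's stop word loading (assumption):
    // reads one word per line from the reader and skips blank lines.
    public static HashSet<string> LoadStopWords(TextReader stopwords)
    {
        var words = new HashSet<string>();
        string line;
        while ((line = stopwords.ReadLine()) != null)
        {
            string word = line.Trim();
            if (word.Length > 0)
            {
                words.Add(word);
            }
        }
        return words;
    }

    public static void Main()
    {
        // A StringReader is accepted wherever a TextReader is expected.
        TextReader empty = new StringReader(string.Empty);
        Console.WriteLine(LoadStopWords(empty).Count);   // prints 0

        // A non-empty reader yields one stop word per line.
        TextReader custom = new StringReader("the\nand\n");
        Console.WriteLine(LoadStopWords(custom).Count);  // prints 2
    }
}
```

The empty reader reaches the end of its input on the first ReadLine call, so the resulting set is empty. In the patch.config this is what the empty <param hint="s"></param> value is meant to achieve.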

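Note: the constructor takes the version first and the stop words second, so the parameters should appear in that order in the merged configuration. Check your /sitecore/admin/showconfig.aspx and make sure that: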

<param hint="version">Lucene_30</param>

comes before:

<param desc="stopWords" type="System.IO.StringReader, mscorlib" patch:source="LuceneComputedFields.config">
  <param hint="s"/>
</param>
