Maintaining Lucene Index Uptime in Sitecore Production Environments

Greetings Readers,

In this post I share some thoughts on how to keep Lucene indexes highly available in Sitecore production environments. Everything here was tested on Sitecore 8.0 Update 1, but it should also apply to Sitecore 7.2 and later.

Production Environment Set up

Before diving in, I would like to go over my production environment setup. (For the sake of simplicity, I have chosen to show only the Master, Web and Core database configurations.)

The production environment is broken into 3 major parts:

  • Content Management Server (CM) : This box is used primarily for content authoring
  • Content Delivery Servers (CD1 and CD2) : Load-balanced delivery servers
  • Database Server : This box holds all of my databases
    • CM points to (Master, CMWeb, Core*)
    • CD1 and CD2 point to (CDWeb, Core*)
    • Core* : The core database is shared across all 3 boxes (CM, CD1 and CD2)
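
As a rough illustration of the layout above, here is a minimal sketch of what ConnectionStrings.config could look like on each role. The server name DBSERVER, the database names and the credentials are placeholders I made up for this example, and I am assuming the standard connection string names (core, master, web); adjust them to your own environment.

[code language="xml"]
<!-- CM server: App_Config\ConnectionStrings.config (sketch, placeholder values) -->
<connectionStrings>
  <add name="core" connectionString="Data Source=DBSERVER;Initial Catalog=Sitecore_Core;User ID=sc_user;Password=CHANGE_ME" />
  <add name="master" connectionString="Data Source=DBSERVER;Initial Catalog=Sitecore_Master;User ID=sc_user;Password=CHANGE_ME" />
  <add name="web" connectionString="Data Source=DBSERVER;Initial Catalog=Sitecore_CMWeb;User ID=sc_user;Password=CHANGE_ME" />
</connectionStrings>

<!-- CD1 and CD2: same file, no master entry, and the same shared core database -->
<connectionStrings>
  <add name="core" connectionString="Data Source=DBSERVER;Initial Catalog=Sitecore_Core;User ID=sc_user;Password=CHANGE_ME" />
  <add name="web" connectionString="Data Source=DBSERVER;Initial Catalog=Sitecore_CDWeb;User ID=sc_user;Password=CHANGE_ME" />
</connectionStrings>
[/code]

The two fragments belong in separate files on separate servers; they are shown together only to make the shared core entry easy to compare.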

Challenge 1 : How to maintain Lucene index availability during a rebuild?

Our Lucene indexes became unavailable for a short time whenever we rebuilt them in production, whether because an index was corrupted or because content editors complained that their updates were not showing up on the site. The cause is that the Lucene index implementation deletes the file system subdirectory that contains the index before rebuilding it, so the index returned empty results until the rebuild finished.

Solution 1 : Using SwitchOnRebuildLuceneIndex Provider

To avoid this downtime, we needed a way to leave the existing index directory in place, rebuild the index in a separate directory, and make the new directory the active index only once the rebuild completes.

This is what the SwitchOnRebuildLuceneIndex provider does. The class inherits from LuceneIndex and adds the ability to maintain two directories for a particular index. This solves the problem with the default Lucene index implementation, which resets (deletes) the index directory before a full index rebuild.

Note : After you make the change, you will need to rebuild your indexes twice so that both the primary and the secondary index folders are created in your file system (the secondary folder has _sec appended to its name).

To use this implementation, change the type reference on a particular search index to
Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex:

[code language="xml"]
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration>
        <indexes>
          <index id="your_custom_index">
            <patch:attribute name="type">Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex,Sitecore.ContentSearch.LuceneProvider</patch:attribute>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
[/code]

Challenge 2 : Ensuring that content/index updates on the CM server reach the CD servers

I have always had trouble keeping the Lucene indexes in sync with the changes made on the CM server. This matters a great deal if your website's functionality, such as search, relies on the indexes. The problems I have run into range from delays in indexing to an index going stale on one of the delivery boxes.

Solution 2 : Index Update Strategies to your rescue

Index Update Strategies are designed to provide a transparent and flexible model for index maintenance. Each index can be configured with its own set of index update strategies.

Below is the set of index update strategies that I have configured and tested:

Strategy 1 : onPublishEndAsync 

This strategy is used to incrementally update the index when a content author has published an item. During initialization, it subscribes to the OnPublishEnd event and triggers an incremental index rebuild. With separate CM and CD servers, this event is delivered through the EventQueue, which means the EventQueue needs to be enabled for this strategy to work in such an environment.

Note: If you look at the file App_Config\Include\Sitecore.ContentSearch.DefaultConfigurations.config, there are a few things to note about the onPublishEndAsync strategy:

  • The "database" parameter defines the database in which to look up the item changes for processing (default: web).
  • To prevent excessive processing of the Event Queue, the strategy forces a full index rebuild when the number of entries in the history table exceeds the number defined in the following setting: Indexing.FullRebuildItemCountThreshold. In most cases, this means that a substantial publish or deployment occurred, which should always trigger a full index rebuild. This behavior only applies when the CheckForThreshold property of the strategy is set to true (which is the default).
  • The Indexing.FullRebuildItemCountThreshold setting is not set out of the box and defaults to 100000.

[code language="xml"]
<!-- REINDEX ON PUBLISH END
     This strategy is triggered on publish:end and uses the EventQueue to incrementally rebuild the index.
-->
<onPublishEndAsync type="Sitecore.ContentSearch.Maintenance.Strategies.OnPublishEndAsynchronousStrategy, Sitecore.ContentSearch">
  <param desc="database">web</param>
  <!-- Whether or not a full index rebuild should be triggered when the number of items in the EventQueue exceeds the number specified
       in ContentSearch.FullRebuildItemCountThreshold. -->
  <CheckForThreshold>true</CheckForThreshold>
</onPublishEndAsync>
[/code]
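
If the default threshold does not suit your publishing volume, the setting can in principle be overridden from a patch file. The following is only a sketch: it assumes the setting name used in the configuration comment above (the prefix differs between the bullet text and that comment, so please verify the exact name against your Sitecore version), and the value 50000 is purely illustrative.

[code language="xml"]
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Assumed setting name; check it against your version's configuration files -->
      <setting name="ContentSearch.FullRebuildItemCountThreshold">
        <!-- Illustrative value: force a full rebuild once more than 50000 changes are queued -->
        <patch:attribute name="value">50000</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
[/code]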

Strategy 2 : RemoteRebuildStrategy

This strategy is used to trigger rebuild of indexes on remote servers (CD1 and CD2). This strategy subscribes to the OnIndexingEndedRemote event which is triggered when a particular index is rebuilt. This strategy will react only when a full index rebuild is performed.

This strategy can be combined with any other strategy and is quite handy in multi-server environments where each Sitecore instance maintains its own copy of the index. This way a full rebuild can be triggered from the CM server, and the event will be raised on all remote servers where the index is configured with this strategy.

Important considerations to be aware of while using RemoteRebuildStrategy:

  • The database that is assigned for system event queue storage (core by default) should be shared between the Sitecore instance where the rebuild happened (Content Management) and the instances where it needs to be replayed (Content Delivery).
  • The index name should be identical across all the environments.
  • Event Queues should be enabled across all the environments.
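
For the last point, here is a minimal sketch of a patch file that switches the event queues on. It assumes the standard EnableEventQueues setting and would need to be deployed to every server (CM, CD1 and CD2); it may already be enabled in your installation, in which case the patch changes nothing.

[code language="xml"]
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Remote events such as publish:end:remote and the remote rebuild notification travel through the event queue -->
      <setting name="EnableEventQueues">
        <patch:attribute name="value">true</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
[/code]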

Index Configuration with the Update Strategies

[code language="xml"]
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="your_custom_index" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <param desc="name">$(id)</param>
            <param desc="folder">$(id)</param>
            <!-- This initializes index property store. Id has to be set to the index id -->
            <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
            <configuration ref="contentSearch/indexConfigurations/CareersSearchConfiguration" />
            <strategies hint="list:AddStrategy">
              <!-- NOTE: order of these controls the execution order -->
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/remoteRebuild" />
            </strategies>
            <commitPolicyExecutor type="Sitecore.ContentSearch.CommitPolicyExecutor, Sitecore.ContentSearch">
              <policies hint="list:AddCommitPolicy">
                <policy type="Sitecore.ContentSearch.TimeIntervalCommitPolicy, Sitecore.ContentSearch" />
              </policies>
            </commitPolicyExecutor>
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>master</Database>
                <Root>/sitecore/content/home/</Root>
              </crawler>
            </locations>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
[/code]

Summary

Using the two solutions above, my Lucene indexes stay in sync and highly available in the production environments.

Should you have any questions, please do not hesitate to tweet me @sjain_hi or leave a comment on this post.