Creating a custom index update strategy in Sitecore

Index update strategies provide a way for you to customize how and when a Sitecore index gets updated. In a recent project we had an index that contained computed fields based on related items. We needed a way to update the index entry for one item when a related item was published.

As with many other things Sitecore-related, it all begins with a configuration change. If you take a look at the definition of an index in Sitecore, you can see that it contains a strategies node. Any strategies used by the index are listed here. When making a custom strategy you can either define it here or add it to contentSearch/indexConfigurations/indexUpdateStrategies and then reference it. For the purpose of this demo we’ll choose the former, simply to keep the demo code more compact.

[sourcecode language="xml"]
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <databases>
      <database id="web">
        <Engines.HistoryEngine.Storage>
          <obj type="Sitecore.Data.SqlServer.SqlServerHistoryStorage, Sitecore.Kernel">
            <param connectionStringName="$(id)"/>
            <EntryLifeTime>30.00:00:00</EntryLifeTime>
          </obj>
        </Engines.HistoryEngine.Storage>
      </database>
    </databases>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.ContentSearchConfiguration, Sitecore.ContentSearch">
        <indexes hint="list:AddIndex">
          <index id="sitesearch_web" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <param desc="name">$(id)</param>
            <param desc="folder">$(id)</param>
            <!--
              This initializes index property store. Id has to be set to the index id
            -->
            <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)"/>
            <configuration ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration"/>
            <strategies hint="list:AddStrategy">
              <!--
                NOTE: the order of these controls the execution order
              -->
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync"/>
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/rebuildAfterFullPublish"/>
              <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/remoteRebuild"/>

              <!-- This is our new strategy -->
              <MyUpdateStrategy type="LaunchSitecore.IndexStrategy.MyUpdateStrategy, LaunchSitecore">
                <Database>web</Database>
              </MyUpdateStrategy>
            </strategies>
            <commitPolicyExecutor type="Sitecore.ContentSearch.CommitPolicyExecutor, Sitecore.ContentSearch">
              <policies hint="list:AddCommitPolicy">
                <policy type="Sitecore.ContentSearch.TimeIntervalCommitPolicy, Sitecore.ContentSearch"/>
              </policies>
            </commitPolicyExecutor>
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>web</Database>
                <Root>/sitecore/content/Home</Root>
              </crawler>
            </locations>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
[/sourcecode]

As you can see, we’re creating a new index for this demo. This index is a copy of sitecore_web_index but adds a MyUpdateStrategy node to the strategies. The MyUpdateStrategy node has two main parts: the type attribute and a Database element. The type attribute references the class that implements our strategy, while the Database value is passed into that class, which gives us an easy way to manage settings from the configuration file.
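For comparison, the other option mentioned earlier (defining the strategy once under contentSearch/indexConfigurations/indexUpdateStrategies and referencing it from the index) might look roughly like the fragment below. The node name myUpdateStrategy is an assumption made for illustration; the ref path simply has to match wherever the node is defined.

[sourcecode language="xml"]
<contentSearch>
  <indexConfigurations>
    <indexUpdateStrategies>
      <!-- Hypothetical node name, chosen for this example -->
      <myUpdateStrategy type="LaunchSitecore.IndexStrategy.MyUpdateStrategy, LaunchSitecore">
        <Database>web</Database>
      </myUpdateStrategy>
    </indexUpdateStrategies>
  </indexConfigurations>
</contentSearch>

<!-- ...and inside the index's <strategies hint="list:AddStrategy"> node: -->
<strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/myUpdateStrategy"/>
[/sourcecode]

This keeps the index definition shorter and lets several indexes share the same strategy definition, at the cost of one more place to look when reading the config.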

Next we’ll create our MyUpdateStrategy class. This class will need to implement IIndexUpdateStrategy but should also have a public property named “Database”. This property will be set from the configuration values.

So far we should have the following. Let’s take a break and test this out.

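[sourcecode language="csharp"]
using System;
using Sitecore.ContentSearch.Maintenance.Strategies;

namespace LaunchSitecore.IndexStrategy
{
    public class MyUpdateStrategy : IIndexUpdateStrategy
    {
        // Set from the <Database> element in the configuration.
        public string Database { get; set; }

        public void Initialize(Sitecore.ContentSearch.ISearchIndex searchIndex)
        {
            // Temporary: proves the class is loaded and the config value arrives.
            throw new NotImplementedException("But our database is: " + Database);
        }
    }
}
[/sourcecode]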

Build your solution, start Sitecore, and check your logs. You should now see an error similar to:

7472 09:09:42 ERROR Error loading hook: <hook type="Sitecore.ContentSearch.Hooks.Initializer, Sitecore.ContentSearch" patch:source="Sitecore.ContentSearch.config" xmlns:patch="http://www.sitecore.net/xmlconfig/" />

Exception: System.Reflection.TargetInvocationException

Message: Exception has been thrown by the target of an invocation.

Nested Exception

 

Exception: System.NotImplementedException

Message: But our database is: web

Source: LaunchSitecore

at LaunchSitecore.IndexStrategy.MyUpdateStrategy.Initialize(ISearchIndex searchIndex) in c:\inetpub\wwwroot\SC75CommerceConnectDemo\Website\IndexStrategy\MyUpdateStrategy.cs:line 16

at Sitecore.ContentSearch.LuceneProvider.LuceneIndex.AddStrategy(IIndexUpdateStrategy strategy)

Since we’re intentionally throwing an exception, this is exactly what we should expect at this point. From here we can begin our actual implementation. As a sample use case, we’ll configure our strategy to update the index for any items that a published item links to in our droptree field named “Foo”.

The first step will be to set up a handler for the publish end event. We’ll do this by subscribing through the EventHub in our Initialize method, in place of the exception we were throwing.

[sourcecode language="csharp"]
EventHub.PublishEnd += (sender, args) => HandlePublishEnd(sender, args); //Who knew such magic existed!
[/sourcecode]

And, of course, we’ll need a HandlePublishEnd method.

[sourcecode language="csharp"]
private void HandlePublishEnd(object sender, EventArgs args)
{
    throw new NotImplementedException("Not yet…");
}
[/sourcecode]
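
Putting the snippets so far together, the class might look roughly like the sketch below. It assumes the subscription is made inside Initialize, and it adds an Index property (an assumption, not part of the snippets above) so that later code can refer to the index the strategy was attached to.

[sourcecode language="csharp"]
using System;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Maintenance;
using Sitecore.ContentSearch.Maintenance.Strategies;

namespace LaunchSitecore.IndexStrategy
{
    public class MyUpdateStrategy : IIndexUpdateStrategy
    {
        // Set from the <Database> element in the configuration.
        public string Database { get; set; }

        // Assumption for this sketch: remember the index we were attached to.
        protected ISearchIndex Index { get; set; }

        public void Initialize(ISearchIndex searchIndex)
        {
            Index = searchIndex;

            // Run our handler whenever a publish ends.
            EventHub.PublishEnd += (sender, args) => HandlePublishEnd(sender, args);
        }

        private void HandlePublishEnd(object sender, EventArgs args)
        {
            // Still a placeholder at this stage of the walkthrough.
            throw new NotImplementedException("Not yet…");
        }
    }
}
[/sourcecode]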

Let’s stop again and test things. Build your solution and publish an item in Sitecore. Be sure to re-publish rather than smart publish. Again, if everything is working so far we should see our error in the logs.

[Screenshot: the index strategy error in the Sitecore log]

We’re now running our code when a publish ends, but we’ll also want to know what that publish contained. That’s where Sitecore’s history engine comes into play.

Once we know we’re able to run our code on publish end, let’s register, then trigger, an action with the OperationMonitor.

[sourcecode language="csharp"]
private void HandlePublishEnd(object sender, EventArgs args)
{
    OperationMonitor.Register(new Action(this.Run));
    OperationMonitor.Trigger();
}

public void Run()
{

}
[/sourcecode]

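For my example I’ll implement the Run method by getting the database from the name specified in the config, reading from the database history, extracting items from our “Foo” field, then using the IndexCustodian to refresh said items. Note that the code refers to an Index property, so Initialize needs to store the searchIndex it receives in a property of that name on our class.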

[sourcecode language="csharp"]
public void Run()
{
    // Check the index first: every other log message reads Index.Name.
    if (Index == null)
    {
        CrawlingLog.Log.Error("MyUpdateStrategy index is null", null);
        return;
    }

    CrawlingLog.Log.Info(string.Format("[Index={0}] MyUpdateStrategy Publish ended", Index.Name), null);

    var database = Sitecore.Data.Database.GetDatabase(Database);
    if (database == null)
    {
        CrawlingLog.Log.Error(string.Format("[Index={0}] MyUpdateStrategy unable to find database: {1}", Index.Name, Database), null);
        return;
    }

    // One history key per machine and strategy type.
    var historyKey = string.Format("{0}{1}", Environment.MachineName, GetType().FullName);

    var historyItems = database.DatabaseHistory().GetLatestHistoryEntries(historyKey);
    var itemsToReIndex = historyItems.Select(x => x.ItemId)
        .Where(x => !x.IsNull)
        .Distinct()
        .Select(x => database.GetItem(x))
        // The item may no longer exist in this database (for example, after a delete).
        .Where(x => x != null)
        .Where(x =>
        {
            var template = TemplateManager.GetTemplate(x);
            return template != null && template.DescendsFromOrEquals(TestTemplateID);
        })
        .Select(x => (Sitecore.Data.Fields.LookupField)x.Fields["Foo"])
        .Where(x => x != null)
        .Select(x => x.TargetItem)
        // An empty or broken link has no target item.
        .Where(x => x != null)
        .ToList();

    foreach (var indexableItem in itemsToReIndex)
    {
        CrawlingLog.Log.Info(string.Format("[Index={0}] MyUpdateStrategy refreshing {1}", Index.Name, indexableItem.Paths.FullPath), null);
        IndexCustodian.Refresh(Index, new SitecoreIndexableItem(indexableItem));
    }
}
[/sourcecode]
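
One thing to be aware of: Distinct() in the query above is applied to the IDs of the published items, not to the targets, so two published items that point at the same “Foo” target would cause it to be refreshed twice. A minimal sketch of one way to collapse those duplicates follows. DistinctByKey is a hypothetical helper written for this example, not a Sitecore or LINQ API.

[sourcecode language="csharp"]
using System;
using System.Collections.Generic;

public static class EnumerableSketch
{
    // Hypothetical helper: yields the first element seen for each key, in order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet.Add returns false when the key was already present.
            if (seen.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
[/sourcecode]

In Run this could be applied just before ToList(), for example as .DistinctByKey(x => x.ID), assuming the item ID type compares by value.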

You should now be able to fully customize how and when an index is updated. You’ll be able to key off different Sitecore events, pull data from other sources and specify which items in the index should be refreshed.

Another great reference to use is Dan Cruickshank’s blog post.