Creating Custom Sorts In Apache Solr
I attended Drupalcon for the first time this year in San Francisco, and of all the sessions I attended, one that really stood out for me was Apache Solr Search Mastery, put on by Robert Douglass and Peter Wolanin of Acquia and James McKinney of Evolving Web. It was like the first time I read "Pro Drupal Development", in that it really opened my eyes as to what is possible with the Solr module. On top of that, I was almost immediately able to take what I learned back to a project I was working on and start using it.
One of the first use cases I had was the need to add a custom sort to the search results. Out of the box, the module has the following five sorts available:
- Relevancy
- Title
- Type
- Author
- Date
However, the client also wanted to sort by the number of times a node had been viewed, as provided by the core Statistics module. To make this data available in Solr as a sortable field, we have to follow a two-step process:
- Add the field to the Solr index
- Tell Solr that the field is sortable
Adding the field to the index
The apachesolr module has multiple available hooks that can be used to alter the search data. In this case, the one that we want is hook_apachesolr_update_index. This hook is run for each node, and is passed the $document object by reference. $document is the XML document that is passed to Solr for indexing. The source for our data is the totalcount field in the node_counter table.
function mymodule_apachesolr_update_index(&$document, $node) {
  // Get count from node_counter table
  $count = db_result(db_query("SELECT totalcount FROM {node_counter} WHERE nid = %d", $node->nid));
  // Add field to index
  if ($count !== FALSE) {
    $document->tis_hit_count = (int) $count;
  }
}
So all we do is run a query to get our count value, and add it to the Solr index in the tis_hit_count field, but only if the query returns a value. If you try to add an empty value, such as the case where there is no record in node_counter for a node yet, Solr will throw an error.
One important thing to notice is the field name. The 'tis_' at the beginning of the name tells Solr what kind of data is stored in the field. If you look in the schema.xml file that comes with the module (and that you use to replace the default Solr file when installing the module), there is a section that lists all of the dynamic field naming patterns.
<!-- Dynamic field definitions. If a field name is not found, dynamicFields
     will be used if the name matches any of the patterns.
     RESTRICTION: the glob-like pattern in the name attribute must have
     a "*" only at the start or the end.
     EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i)
     Longer patterns will be matched first. if equal size patterns
     both match, the first appearing in the schema will be used. -->
<dynamicField name="is_*" type="integer" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="im_*" type="integer" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="sis_*" type="sint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="sim_*" type="sint" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="sm_*" type="string" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tm_*" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"/>
<dynamicField name="ss_*" type="string" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="ts_*" type="text" indexed="true" stored="true" multiValued="false" termVectors="true"/>
<dynamicField name="tsen2k_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitNorms="true" omitTermFreqAndPositions="true" />
<dynamicField name="ds_*" type="date" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="dm_*" type="date" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tds_*" type="tdate" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="tdm_*" type="tdate" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bm_*" type="boolean" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="bs_*" type="boolean" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fs_*" type="sfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="fm_*" type="sfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="ps_*" type="sdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="pm_*" type="sdouble" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tis_*" type="tint" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="tim_*" type="tint" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tls_*" type="tlong" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="tlm_*" type="tlong" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tfs_*" type="tfloat" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="tfm_*" type="tfloat" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="tps_*" type="tdouble" indexed="true" stored="true" multiValued="false"/>
<dynamicField name="tpm_*" type="tdouble" indexed="true" stored="true" multiValued="true"/>
<!-- Sortable version of the dynamic string field -->
<dynamicField name="sort_ss_*" type="sortString" indexed="true" stored="false"/>
<copyField source="ss_*" dest="sort_ss_*"/>
In this case, the 'tint' field type is most appropriate for this data. For more information on naming fields you can read the schema.xml document, or watch the video from the Drupalcon session (Peter talks about dynamic fields starting at 15:50 into the video).
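To make the naming convention concrete, here is a minimal sketch of how a field name could be matched against a few of the prefix patterns in schema.xml. It is plain PHP and only an illustration: the function name example_dynamic_field_type() is made up for this post, the pattern list is a hand-picked subset of the schema, and Solr does the real matching itself.

```php
<?php
// Illustration only: a hand-picked subset of the dynamicField patterns from
// schema.xml, keyed by prefix. The function name is hypothetical and is not
// part of the apachesolr module or of Solr.
function example_dynamic_field_type($field_name) {
  $patterns = array(
    'is_' => 'integer',
    'sis_' => 'sint',
    'ss_' => 'string',
    'ts_' => 'text',
    'ds_' => 'date',
    'tis_' => 'tint',
    'tls_' => 'tlong',
    'sort_ss_' => 'sortString',
  );
  $best_type = NULL;
  $best_length = 0;
  foreach ($patterns as $prefix => $type) {
    $length = strlen($prefix);
    // Prefer the longest matching prefix, mirroring the "longer patterns
    // will be matched first" rule described in the schema comment.
    if ($length > $best_length && strncmp($field_name, $prefix, $length) === 0) {
      $best_type = $type;
      $best_length = $length;
    }
  }
  // NULL means no dynamic field pattern in this subset matched.
  return $best_type;
}

print example_dynamic_field_type('tis_hit_count') . "\n"; // prints "tint"
```

A name with no recognized prefix, such as 'hit_count', returns NULL in this sketch, which is a reminder that the prefix is what ties the field to a type.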
Once this has been saved, you will need to re-index the content at admin/settings/apachesolr/index. You can then go to admin/reports/apachesolr and verify that your field has been indexed.
Telling Solr that the field is sortable
Now that the data has been added to the index, we need to tell Solr that the field is sortable. To do this, we use another hook, hook_apachesolr_prepare_query(). This hook is run after the module has built the query, and it allows you to make any modifications before the query is run.
function mymodule_apachesolr_prepare_query(&$query, &$params) {
  $query->set_available_sort('tis_hit_count', array(
    'title' => t('Number of Views'),
    'default' => 'asc',
  ));
}
As you can see in the code above, the $query object is passed by reference, and we use the set_available_sort method to add this as a sort field that will be displayed in the Apache Solr Sorting: Core block. The first parameter is the field name that was defined in hook_apachesolr_update_index(), and the second is an array that contains the string that will be displayed in the Sorting block and the default sort direction (ascending or descending). In addition, if you do not want to use any of the default sorts, they can be removed at the same time with the remove_available_sort method, passing the field name:
$query->remove_available_sort('solr_field_name');
And voila! You should now have a Number of Views link in your Sort block.
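To see how adding and removing sorts fit together, here is a small sketch that can be run outside Drupal. Several things in it are assumptions made for the sake of a runnable example: ExampleSortQuery is a made-up stand-in that only mimics the two methods used in this post (the real query class does much more), the t() stub exists only so the code runs without Drupal, 'desc' is assumed to be accepted as the default direction, and 'type' is assumed to be the field name under which the Type sort is registered, which you should verify against your version of the module.

```php
<?php
// Stub for Drupal's t() so this sketch runs outside Drupal.
if (!function_exists('t')) {
  function t($string) {
    return $string;
  }
}

// Hypothetical stand-in for the module's query object. It only mimics the
// two methods discussed in this post.
class ExampleSortQuery {
  public $sorts = array(
    'score' => 'Relevancy',
    'type' => 'Type',
  );

  function set_available_sort($field, $sort) {
    $this->sorts[$field] = $sort['title'];
  }

  function remove_available_sort($field) {
    unset($this->sorts[$field]);
  }
}

function mymodule_apachesolr_prepare_query(&$query, &$params) {
  // Add the view-count sort. Assumption: 'desc' is accepted here and would
  // list the most-viewed nodes first.
  $query->set_available_sort('tis_hit_count', array(
    'title' => t('Number of Views'),
    'default' => 'desc',
  ));
  // Assumption: the Type sort is registered under the field name 'type'.
  $query->remove_available_sort('type');
}

$query = new ExampleSortQuery();
$params = array();
mymodule_apachesolr_prepare_query($query, $params);
print implode(', ', $query->sorts) . "\n"; // prints "Relevancy, Number of Views"
```

In a real module only the hook function would be kept, in place of the earlier implementation, since a module can define each hook only once.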
One additional thing to note is that there are actually two different hooks for modifying the $query object: hook_apachesolr_prepare_query and hook_apachesolr_modify_query. The one you want to use is determined by whether or not you want the user to see the change. Anything modified in prepare_query will be visible to your users, and anything modified in modify_query will not be visible to your users (a more detailed explanation is given by James McKinney in the DCSF session video from 37:20-39:15). In this case, since we want the change to be visible to our users, we use hook_apachesolr_prepare_query.
There are many more customizations that can be done in Solr with the available hooks. In subsequent posts, I will show how to add a field so that it is displayed with the search results, and a more complicated use case of creating a custom search path that automatically applies a filter to the results.






Hey, thanks for this post, but could you please format the text so that someone can actually read this cool information? It's horrible to read! What a pity for such cool info :-)
Sorry, had some issues with input formats and I didn't get the format of this one changed back. Thanks for the heads up.
Hi Steve,
Nice stuff, it helped me a lot in understanding Solr. I also used this for my project.
But there is one problem: my field 'im_cck_field_work_date_earliest' is shown as indexed at admin/reports/apachesolr,
but when I use
function mymodule_apachesolr_prepare_query(&$query, &$params) {
  $query->set_available_sort('im_cck_field_work_date_earliest', array(
    'title' => t('Number of Views'),
    'default' => 'asc',
  ));
}
and click on the "Number of Views" option, it gives me the error "The Apache Solr search engine is not available. Please contact your site administrator."
The other core options work fine.
Can you please help me figure out what I am doing wrong? I am new to Drupal,
and sorry for my language.
Thanks Steve, the problem is solved. Thanks again for the information about Solr; it helped me a lot.
Wonderful!! Thank you very much for this post. It helped me a lot. A great, clear explanation that everyone can understand. Without your help I couldn't have moved forward. Thank you once again.
Keep rocking like this.
--Sudharshan
Specifically, in what directory and in what file do you save your functions so that they're found by Solr?
As with any Drupal hook implementation, you can place them in a .module file in any module. Usually this will be in a custom module specific to the site. In the example above, I created a separate module for all of my Solr functions and hook implementations.
Hello
I have a problem that I cannot solve, and I don't understand why.
I have a content type "article" to which I added a field "president":
label: president
field: field_president
type: text
widget: textfield
I just want to create a specific attribute in my index in order to store this field (later I will facet on this field).
So I created a custom_solr module
with an implementation of hook_apachesolr_update_index.
The ss_field_president is ... empty.
Could anyone help me?
Here is my code:
function custom_solr_apachesolr_update_index(&$document, $node) {
  if (isset($node->field_president)) {
    $document->addField('ss_field_president', $node->field_president);
  }
}
Drupal core: 7.x