Question
I want to implement Solr hierarchical faceting for my application, which has a two-level hierarchy between Category and SubCategory. I want to use the pivot facets solution described at http://wiki.apache.org/solr/HierarchicalFaceting#Pivot_Facets.
The flattened data will be as below:
Doc#1: NonFic > Law
Doc#2: NonFic > Sci
Doc#3: NonFic > Sci > Phys
This data should be split into a separate field for each level of the hierarchy at index time, as below.
Indexed Terms
Doc#1: category_level0: NonFic; category_level1: Law
Doc#2: category_level0: NonFic; category_level1: Sci
Doc#3: category_level0: NonFic; category_level1: Sci; category_level2: Phys
Can anyone please suggest ways to implement this? How do I define the Solr schema to achieve it? I could not find any reference for splitting data as shown above at index time.
Thanks,
Priyanka
Answer 1:
Do you need to display those individual fields as part of the returned documents? If so, the split values must be in the 'stored' form of each field. If you only need them during search or faceting, you can ignore the 'stored' form and concentrate on the 'indexed' form.
In either case, you can split one field into several with copyField or with an UpdateRequestProcessor.
With copyField, the 'stored' form will be the same for all fields, but you can configure different analysis for each field, each picking a different part of the hierarchy for the 'indexed' form.
With an UpdateRequestProcessor, you can write a custom processor that takes one field and emits several fields, each holding only its part of the path. Alternatively, skip the custom code: make a couple of field copies and then apply a different regex processor to each copy.
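One way to sketch the custom-processor route without writing Java is a script for Solr's StatelessScriptUpdateProcessorFactory. The example below is only an illustration under assumptions: the source field name `category` is hypothetical, the script still has to be registered in an update processor chain in solrconfig.xml, and the availability of the script processor depends on your Solr version.

```javascript
// Sketch of an update script for StatelessScriptUpdateProcessorFactory.
// Assumption: each incoming document carries the flattened path
// (e.g. "NonFic > Sci > Phys") in a field named "category" (hypothetical name).
function processAdd(cmd) {
  var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  var raw = doc.getFieldValue("category");
  if (raw === null || raw === undefined) {
    return; // no path on this document, nothing to split
  }
  // String(...) makes sure we split a JavaScript string, not a Java one.
  var pieces = String(raw).split(">");
  for (var i = 0; i < pieces.length; i++) {
    // Strip the spaces that surround the ">" delimiter.
    var value = pieces[i].replace(/^\s+|\s+$/g, "");
    doc.setField("category_level" + i, value);
  }
}

// The script processor looks for these hooks as well; they are no-ops here.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

Because the fields are set on the document before it is indexed, both the 'stored' and the 'indexed' forms hold the split values, which is the main difference from the copyField approach.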
Answer 2:
To split the data, use a DataImportHandler ScriptTransformer, which lets you transform the data using JavaScript within your config files.
Add the following to your db-data-config at the same level as dataSource and document. It defines a function that splits the string in a field on the delimiter, >, and adds one field per split value, named category_level0, category_level1, and so on.
<script><![CDATA[
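function CategoryPieces(row) {
  var value = row.get('ColumnToSplit');
  if (value == null) {
    return row; // nothing to split for this row
  }
  var pieces = value.split('>');
  for (var i = 0; i < pieces.length; i++) {
    // Trim so that "NonFic > Law" yields "NonFic" and "Law" without stray spaces.
    row.put('category_level' + i, pieces[i].trim());
  }
  return row;
}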
]]></script>
Then, in your main <entity> tag, add transformer="script:CategoryPieces", and add the columns to your field list.
<field column="category_level0" name="Category_Level0" />
<field column="category_level1" name="Category_Level1" />
Last, in your schema.xml, add the new fields.
<field name="Category_Level0" type="string" indexed="true" stored="true" multiValued="false" />
<field name="Category_Level1" type="string" indexed="true" stored="true" multiValued="false" />
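Once the fields are indexed, the pivot facet from the question can be requested with the `facet.pivot` parameter. The snippet below is a minimal sketch that only builds the request URL; the host `http://localhost:8983` and the core name `books` are assumptions for illustration, not part of the original setup.

```javascript
// Build a Solr select URL that asks for a pivot facet over the hierarchy levels.
// Assumptions: baseUrl and core are placeholders for your own Solr instance.
function buildPivotFacetUrl(baseUrl, core, fields) {
  var params = [
    ["q", "*:*"],                       // match all documents
    ["rows", "0"],                      // only facet counts are needed
    ["facet", "true"],
    ["facet.pivot", fields.join(",")],  // top level first, then sub level
    ["wt", "json"]
  ];
  var query = params.map(function (pair) {
    return encodeURIComponent(pair[0]) + "=" + encodeURIComponent(pair[1]);
  }).join("&");
  return baseUrl + "/solr/" + core + "/select?" + query;
}

var url = buildPivotFacetUrl("http://localhost:8983", "books",
  ["Category_Level0", "Category_Level1"]);
console.log(url);
```

The field names passed in must match the names declared in schema.xml, here Category_Level0 and Category_Level1.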
Source: https://stackoverflow.com/questions/15089549/how-to-create-solr-schema-for-hierarchical-facet-by-splitting-data-into-multiple