Customized Indexing Leveraging Lucene

Topics: General
Oct 30, 2014 at 3:56 PM
Edited Oct 30, 2014 at 4:48 PM
I am working on a site right now with lots of custom content types. I would like the content of each content type to be included in the index so that Lucene can do its magic appropriately.

What I am attempting to do is iterate through the list of content types that I have set up for indexing, grab the HTML from each URL, strip out the tags (using the CsQuery plugin), and add a Lucene index field containing the raw text to the appropriate Lucene Document.
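
To make the tag-stripping step above concrete, here is a rough, language-neutral sketch in Python rather than Orchard's C#. It is not the Orchard or Lucene API: `strip_tags` and `build_index_fields` are hypothetical names, the pages are assumed to be already fetched, and a plain dict stands in for a Lucene Document. It uses the standard library's HTML parser in place of CsQuery.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_tags(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_index_fields(pages):
    """pages: mapping of URL -> already-fetched HTML string.

    Returns one dict per page; each dict is only a stand-in for a
    Lucene Document with a raw-text field, not a real Lucene object.
    """
    return [{"url": url, "body": strip_tags(html)} for url, html in pages.items()]
```

Joining chunks with a space keeps words from adjacent elements apart, at the cost of splitting a word that has inline markup in the middle of it; for search indexing that trade-off seemed acceptable to me.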

I have nearly everything in place at this point. The only struggle is determining how to update the index (i.e., which interface needs to be implemented, if any).

Has anyone attempted something similar in the past?
Oct 30, 2014 at 4:57 PM
It looks to me like there is no code to write at all. Everything is already in place. Worst case, you would clone the search controller to create your own query field targeting a specific index. But it's in 1.8.x right now, or maybe just 1.x, TBC.

Enable Lucene, Search, and Indexing. Create new indexes from the admin if necessary. Go to the content type definitions and add the types you want to the indexes you want. Done.
Oct 30, 2014 at 5:05 PM
Thanks for your quick response, Sebastien. I have Lucene, Search, and Indexing turned on. However, can I be 100% certain that the search functionality will scour all content items within a page, regardless of their type? For instance, if I have a containable content item that contains another containable item within it (say, for N levels), will the search functionality iterate through all nested content items to find the search term?
Oct 30, 2014 at 5:12 PM
Edited Oct 30, 2014 at 5:14 PM
Also, do I have to make sure that I am checking "Include in the index" on all content types that have a Content Item Picker?

Here is the checkbox I'm referring to: Example