Wednesday, January 07, 2009
 Lucene Query Cookbook  

Lucene Query Cooking Tips

   Query cooking is the art of tweaking queries to achieve more satisfying results. Most of the time, what you get out of the box will not be satisfactory. The stakes can be high: when Google changes its query cooking formulas, many web-based shops can be blacklisted and subsequently go out of business.

   In terms of quality control, an information retrieval system has three primary modes of failure.

  • False positives - when irrelevant results are displayed.

  • False negatives - when relevant results are not retrieved.

  • Ranking issues - when relevance does not degrade gracefully according to the scheme below.
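
   The first two failure modes can be counted once a person has judged which documents are relevant for a test query. What follows is a minimal sketch, not part of Lucene: the function name `failure_counts` and the document ids are made up for illustration, and the relevance judgments are assumed to come from a human reviewer.

```python
def failure_counts(retrieved, relevant):
    """Count false positives and false negatives for one query.

    retrieved: list of document ids returned by the search (hypothetical ids).
    relevant:  set of document ids a human judged relevant.
    """
    retrieved_set = set(retrieved)
    false_positives = retrieved_set - relevant   # displayed but irrelevant
    false_negatives = relevant - retrieved_set   # relevant but not retrieved
    hits = retrieved_set & relevant

    # Precision falls as false positives grow; recall falls as false negatives grow.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "precision": precision,
        "recall": recall,
    }


report = failure_counts(["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
print(report["precision"])  # 0.5
```

   Ranking issues are not captured by these counts, since both sets ignore the order of the results.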


   When running a query, your users will quickly distinguish between exact results and approximate ones. The levels of relevance in a typical search scenario, in decreasing order, are:

  1. "I'm feeling lucky" match - a result so relevant that it can be followed blindly.

  2. Exact match within a qualifying field, which indicates a highly relevant result.

  3. Documents containing all the search terms, generated using the AND operator.

  4. Documents with a high incidence of most search terms, generated using an OR operator.

  5. Documents with no search terms but with semantically equivalent terms, generated using a thesaurus expansion.

  6. Documents with no equivalent terms but with high similarity to other results, generated using document term vector similarity.

  7. Noise - irrelevant documents that contain search terms.

  8. Static - completely irrelevant results, generated by a different expansion option or through bugs.
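
   One way to approximate levels two to five in a single query is to combine a clause per level and boost the stricter clauses more heavily. The sketch below only builds a query string in the Lucene classic query parser syntax. The function name `cook_query`, the default field `title`, the boost values and the thesaurus dictionary are all assumptions chosen for illustration, and the terms are assumed to be single words with no special characters that would need escaping.

```python
def cook_query(terms, field="title", thesaurus=None):
    """Build one boosted query string covering several relevance levels.

    terms:     list of plain single-word search terms.
    field:     hypothetical "qualifying" field for the exact-match clause.
    thesaurus: optional dict mapping a term to single-word synonyms.
    """
    thesaurus = thesaurus or {}
    phrase = " ".join(terms)
    clauses = [
        f'{field}:"{phrase}"^8',              # level 2: exact phrase in the field
        "(" + " AND ".join(terms) + ")^4",    # level 3: all terms present
        "(" + " OR ".join(terms) + ")^2",     # level 4: any of the terms
    ]
    synonyms = [s for t in terms for s in thesaurus.get(t, [])]
    if synonyms:
        # level 5: thesaurus expansion, boosted below the literal terms
        clauses.append("(" + " OR ".join(synonyms) + ")^0.5")
    return " OR ".join(clauses)


print(cook_query(["apache", "lucene"], thesaurus={"lucene": ["search"]}))
# title:"apache lucene"^8 OR (apache AND lucene)^4 OR (apache OR lucene)^2 OR (search)^0.5
```

   The boosts only bias the scoring towards the stricter clauses; they do not guarantee a strict ordering of the levels, so the values would need tuning against real data. Level six, document term vector similarity, cannot be expressed in query syntax and is outside this sketch.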

   Items seven and eight, namely noise and static, will invariably creep into the results. However, using the query cooking techniques in this cookbook, one hopes that for the most part the noisier results will not be viewed by users, as they will be outranked by more precise results.
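
   Whether noise really is outranked can be checked on a sample query by hand-labelling each returned result with its relevance level (1 to 8) and inspecting the order. The following is a rough sketch under that assumption; the helper names `ranking_inversions` and `first_noise_rank` are invented here and are not Lucene functions.

```python
def ranking_inversions(levels):
    """Count pairs where a worse level is ranked above a better one.

    levels: relevance level (1 = best, 8 = worst) of each result, in rank order.
    A result list that degrades gracefully has zero inversions.
    """
    return sum(
        1
        for i in range(len(levels))
        for j in range(i + 1, len(levels))
        if levels[i] > levels[j]
    )


def first_noise_rank(levels, noise_levels=(7, 8)):
    """Return the 1-based rank of the first noise or static result, or None."""
    for rank, level in enumerate(levels, start=1):
        if level in noise_levels:
            return rank
    return None


print(ranking_inversions([2, 3, 3, 4, 7]))  # 0
print(first_noise_rank([3, 7, 2, 4]))       # 2
```

   A low inversion count and a late first noise rank suggest that users will rarely scroll far enough to see the noisier results.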



 Comments
SuperUser Account
Great material.

Consider moving the discussion on graceful degradation of ranking to its own entry, and just explain what query cooking is.
Posted At 27-12-2008 04:51:27



