jilohydro.blogg.se

Boolean search operators precedence








EDITED: Code samples and examples were updated to reflect Lucene 6.5 APIs and the new default Similarity.

The following is written with Solr users in mind, but the principles apply to Lucene users as well.

I really dislike the so-called “Boolean Operators” (“AND”, “OR”, and “NOT”) and generally discourage people from using them. It’s understandable that novice users may tend to think about the queries they want to run in those terms, but as you become more familiar with IR concepts in general, and with what Solr specifically is capable of, I think it’s a good idea to try to “set aside childish things” and start thinking (and encouraging your users to think) in terms of the superior “Prefix Operators” (“+”, “-”).

Background: Boolean Logic Makes For Terrible Scores

Boolean Algebra is (as my father would put it) “pretty neat stuff” and the world as we know it most certainly wouldn’t exist without it. But when it comes to building a search engine, boolean logic tends to not be very helpful. Depending on how you look at it, boolean logic is all about truth values and/or set intersections. In either case, there is no concept of “relevancy”: either something is true or it’s false; either it is in a set, or it is not in the set.

When a user is looking for “all documents that contain the word ‘Alligator’” they aren’t going to be very happy if a search system applies simple boolean logic to just identify the unordered set of all matching documents. Instead, algorithms like TF/IDF are used to try to identify the ordered list of matching documents, such that the “best” matches come first. Likewise, if a user is looking for “all documents that contain the words ‘Alligator’ or ‘Crocodile’”, a simple boolean logic union of the sets of documents from the individual queries would not generate results as good as a query that took into account the term and document statistics for the individual queries, as well as considering which documents match both queries. (The user is probably more interested in a document that discusses the similarities and differences between Alligators and Crocodiles than in documents that only mention one or the other a great many times.)

This brings us to the crux of why I think it’s a bad idea to use the “Boolean Operators” in query strings: it’s not how the underlying query structures actually work, and it’s not as expressive as the alternative for describing what you want.

To really understand why the boolean operators are inferior to the prefix operators, you have to start by considering the underlying implementation. The BooleanQuery class is probably one of the most misleading class names in the entire Lucene code base because it doesn’t model simple boolean logic query operations at all. A BooleanQuery consists of one or more BooleanClauses, each of which contains two pieces of information: a nested Query, and an Occur flag, which has one of four values.
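To make the contrast between a plain boolean union and a scored list of clauses a bit more concrete, here is a minimal Python sketch. It is a toy model, not Lucene code: `Clause`, `term_score`, `search`, and the sample `docs` are hypothetical names invented for illustration, the scoring formula is a simplified TF-IDF rather than Lucene's actual Similarity, and only the four flag names (MUST, SHOULD, MUST_NOT, FILTER) are borrowed from Lucene's Occur enum.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Occur(Enum):
    """Toy stand-in for the four Occur values (names mirror Lucene's)."""
    MUST = "must"          # clause is required and adds to the score ("+")
    SHOULD = "should"      # clause is optional and adds to the score if it matches
    MUST_NOT = "must_not"  # clause must not match ("-")
    FILTER = "filter"      # clause is required but does not add to the score


@dataclass
class Clause:
    term: str      # stands in for the nested query (here just a single term)
    occur: Occur


def term_score(term, doc, docs):
    """Simplified TF-IDF weight; an assumption, not Lucene's Similarity."""
    tf = doc.count(term)
    if tf == 0:
        return 0.0
    df = sum(1 for d in docs if term in d)
    return math.sqrt(tf) * (1.0 + math.log(len(docs) / df))


def search(clauses, docs):
    """Return (doc_index, score) pairs, best match first."""
    has_required = any(c.occur in (Occur.MUST, Occur.FILTER) for c in clauses)
    hits = []
    for i, doc in enumerate(docs):
        score, matched_optional, rejected = 0.0, False, False
        for c in clauses:
            present = c.term in doc
            if c.occur is Occur.MUST_NOT:
                rejected = rejected or present
            elif c.occur in (Occur.MUST, Occur.FILTER):
                rejected = rejected or not present
            elif present:  # SHOULD clause that matched
                matched_optional = True
            if present and c.occur in (Occur.MUST, Occur.SHOULD):
                score += term_score(c.term, doc, docs)
        # With no required clause, at least one optional clause must match.
        if rejected or (not has_required and not matched_optional):
            continue
        hits.append((i, score))
    return sorted(hits, key=lambda hit: -hit[1])


docs = [
    "alligator alligator alligator swamp".split(),  # one term, repeated
    "alligator crocodile differences".split(),      # both terms
    "crocodile river".split(),                      # the other term only
    "heron swamp".split(),                          # neither term
]

# Plain boolean logic: an unordered set with no notion of a "best" match.
union = {i for i, d in enumerate(docs) if "alligator" in d or "crocodile" in d}

# Two optional clauses: the same documents match, but they come back ranked.
ranked = [i for i, _ in search(
    [Clause("alligator", Occur.SHOULD), Clause("crocodile", Occur.SHOULD)], docs)]

# Prefix-operator style "+alligator -swamp": one required, one prohibited clause.
required = [i for i, _ in search(
    [Clause("alligator", Occur.MUST), Clause("swamp", Occur.MUST_NOT)], docs)]

print(sorted(union))  # [0, 1, 2]
print(ranked)         # [1, 0, 2]
print(required)       # [1]
```

In this toy data the document that mentions both terms ranks ahead of the one that repeats a single term, which the unordered union cannot express. That ordering depends on the weights chosen here; a real Similarity will score differently.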








