Lucene Vs Solr
Lucene is a search library written in Java. Solr is a web application built on top of Lucene. In short, Solr = Lucene + added features. A common question is when to choose Solr and when to choose Lucene.
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search.
- Use Lucene when you need more control. It is a plain JAR, so you can embed it in whatever way the application requires.
- Use it when the application cannot depend on a web server.
- Use it when you need low-level APIs such as TermVector and TermDocs, for example to find the most frequently indexed term in a given period of time. A term vector records the terms in a document and their occurrences.
- Many contrib modules are available, such as a spell checker and hit highlighting.
- It supports near-real-time search.
- It is widely used in open source projects, and many derivative search products are built on top of Lucene.
- Incremental updates: whenever new documents are added, the IndexReader must be reopened before the new documents show up in search results.
- Warming the searcher: whenever a searcher is opened or reopened, it should be warmed by running a couple of searches. This loads the caches, so subsequent searches are faster.
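The last two points, reopening after updates and warming before use, can be sketched as a small pattern. This is a minimal sketch in plain Java rather than Lucene code: `SearcherHolder`, `Searcher`, and the warm-up queries are hypothetical names chosen for illustration, and the `opener` stands in for whatever call your Lucene version uses to open or reopen a reader.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder: opens a fresh searcher, warms it, then publishes it.
class SearcherHolder {
    /** Hypothetical stand-in for a searcher over an index; not a Lucene type. */
    interface Searcher {
        int search(String query); // returns the number of hits
    }

    private final Supplier<Searcher> opener;      // opens a searcher over the latest index state
    private final List<String> warmupQueries;     // typical queries used to fill the caches
    private final AtomicReference<Searcher> current = new AtomicReference<>();

    SearcherHolder(Supplier<Searcher> opener, List<String> warmupQueries) {
        this.opener = opener;
        this.warmupQueries = warmupQueries;
        refresh();
    }

    /** Call after new documents are added so that searches can see them. */
    void refresh() {
        Searcher fresh = opener.get();
        // Warm the new searcher before any user query reaches it.
        for (String query : warmupQueries) {
            fresh.search(query);
        }
        // Publish only after warming; until then callers keep using the old searcher.
        current.set(fresh);
    }

    Searcher get() {
        return current.get();
    }
}
```

The design choice here is to swap the searcher only after the warm-up queries have run, so user queries never hit a cold searcher. Solr performs this kind of housekeeping for you.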
Solr's major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. This site is powered by Solr.
Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs.
- Use Solr to index and search documents easily by writing very little code.
- Solr is a standalone application and takes care of most of the housekeeping, such as incremental updates and warming the reader.
- Solr can be extended to multiple nodes. It supports distributed search. Releases before Solr 4.0 did not support distributed indexing; SolrCloud added it in 4.0.
- Use Solr for faceted search and hit highlighting.
- For Java developers, the SolrJ library is available for communicating with the server through a Java API.
- Solr can be used from any programming language that can speak HTTP and parse XML or JSON.
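Because Solr is reached over HTTP, a query is just a URL. The sketch below builds one using only the JDK, with no SolrJ dependency. The `SolrQueryUrl` class is a hypothetical helper, and the host, port, and core name (`localhost:8983`, `collection1`) are assumptions about a local install; `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a Solr select URL; it does not send the request.
class SolrQueryUrl {
    /**
     * @param coreUrl    base URL of the Solr core (assumed layout: http://host:port/solr/core)
     * @param query      query string, e.g. "title:lucene"
     * @param rows       maximum number of results to return
     * @param facetField field to facet on, or null for no faceting
     */
    static String build(String coreUrl, String query, int rows, String facetField) {
        StringBuilder url = new StringBuilder(coreUrl)
                .append("/select?q=").append(encode(query))
                .append("&rows=").append(rows)
                .append("&wt=json"); // ask for a JSON response
        if (facetField != null) {
            url.append("&facet=true&facet.field=").append(encode(facetField));
        }
        return url.toString();
    }

    // URL-encode a parameter value so characters such as ':' and spaces are safe.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Assumed local core URL; adjust to your own install.
        System.out.println(build("http://localhost:8983/solr/collection1", "title:lucene", 10, "category"));
    }
}
```

Any HTTP client in any language can fetch the resulting URL and parse the JSON response, which is what makes Solr usable outside Java.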
Summary: for more control, use Lucene. For faster development and an easier learning curve, choose Solr.