Solr
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
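As a rough illustration of the HTTP API described above, the following Python sketch builds a query URL for Solr's `/select` handler without sending it. The `q`, `rows`, and `wt` parameters are standard Solr query parameters; the host, port, and the core name `articles` are assumptions made for the example, and `build_select_url` is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10, response_format="json"):
    """Return a URL for Solr's /select handler (nothing is sent over the network)."""
    params = {
        "q": query,             # main query string
        "rows": rows,           # maximum number of documents to return
        "wt": response_format,  # response writer, e.g. json or xml
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local install and a hypothetical core called "articles".
url = build_select_url("http://localhost:8983/solr", "articles", "title:lucene")
print(url)
```

Any language with an HTTP client can issue a GET request to a URL of this shape, which is why the API is easy to use outside of Java.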
Its feature set includes rich document parsing and indexing (PDF, Word, HTML, etc.) using Apache Tika; an administration interface; monitorable logging; fast incremental updates and index replication; highly scalable distributed search with a sharded index across multiple hosts; and an HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby, PHP, Velocity).
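To show what a JSON-format response might look like on the client side, here is a minimal Python sketch that parses a response body. The sample payload is invented for illustration (the document IDs, titles, and the `category` field are hypothetical), and the sketch assumes the default layout in which facet counts arrive as a flat list alternating value and count.

```python
import json

# Invented sample payload in the general shape of a Solr JSON response.
SAMPLE_RESPONSE = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "doc-1", "title": "Intro to Lucene"},
      {"id": "doc-2", "title": "Scaling search"}
    ]
  },
  "facet_counts": {
    "facet_fields": {"category": ["search", 2, "java", 1]}
  }
}
"""


def summarize(raw):
    """Return (total hits, document ids, facet counts per field)."""
    data = json.loads(raw)
    docs = data["response"]["docs"]
    facets = {}
    facet_fields = data.get("facet_counts", {}).get("facet_fields", {})
    for field, flat in facet_fields.items():
        # Pair up the alternating [value, count, value, count, ...] list.
        facets[field] = dict(zip(flat[0::2], flat[1::2]))
    return data["response"]["numFound"], [d["id"] for d in docs], facets


total, ids, facets = summarize(SAMPLE_RESPONSE)
print(total, ids, facets)
```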
SolrJ is the Java client library for Solr. It provides a Java-based API and takes care of constructing and sending HTTP requests and of receiving and parsing the responses.
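The kind of request construction a client library handles can be sketched in a few lines. The example below is written in Python rather than with SolrJ itself, and it only builds an indexing request without sending it. It assumes a Solr version whose `/update` handler accepts a JSON array of documents; the host, the core name `articles`, and the helper `build_update_request` are hypothetical.

```python
import json
from urllib.request import Request


def build_update_request(base_url, core, docs, commit=True):
    """Build (but do not send) a POST request that would index `docs`."""
    url = f"{base_url.rstrip('/')}/{core}/update"
    if commit:
        url += "?commit=true"  # ask Solr to make the changes searchable immediately
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local install and a hypothetical core called "articles".
req = build_update_request(
    "http://localhost:8983/solr",
    "articles",
    [{"id": "doc-1", "title": "Intro to Lucene"}],
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a client library wraps these steps, plus response parsing, behind typed method calls.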
http://lucene.apache.org/solr/
Related Products
IndexTank - Search engine that powers Reddit
IndexTank is the search engine that powers search on Reddit, the social bookmarking site. IndexTank was acquired by LinkedIn, which released the project as open source. Its features include variable boosts, faceted search, snippeting, custom scoring functions, suggestions, and autocomplete.
SenseiDB - Search engine used in LinkedIn
Sensei is a distributed data system that was built to support many product initiatives at LinkedIn, including the real-time faceted search in LinkedIn Signal and the news feed and tabs on the Homepage. Sensei is both a search engine and a database. It is designed to query and navigate through documents that consist of unstructured text and well-formed and structured metadata.
Constellio - Enterprise Search engine
Constellio Open Source Enterprise Search is based on Apache Solr and uses the Google Search Appliance connector architecture. With a single click, it lets you find all relevant content in your organization (web, email, ECM, CRM, etc.).
Bobo - Faceted search library based on Lucene
Bobo Browse is an information retrieval technology that provides navigational browsing into a semi-structured dataset. Beyond the result set from queries and selections, Bobo Browse also provides the facets from this point of browsing. It provides support to sort documents on fields that have multiple values. It is stable and used by LinkedIn.
Sphinx
Sphinx is a free, open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Grub
Grub Next Generation is a distributed web crawling system (clients/servers) that helps build and maintain an index of the Web. It uses a client-server architecture in which clients crawl the web and send updates to the server. The peer-to-peer grubclient software crawls during computer idle time.
Xapian - Search Engine Library
Xapian is an Open Source Search Engine Library. It is written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby. Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators.
ElasticSearch
ElasticSearch is an Open Source (Apache 2 license), distributed, RESTful Search Engine built for the cloud.
Open Search Server
Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built using open source technologies such as Lucene, ZK (zkoss), Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software.
Jumper - Collaborative search engine in PHP
Jumper 2.0 is a collaborative community search platform that revolutionizes search by crowdsourcing knowledge management powered by a shared bookmarking engine. It is easily and quickly deployed into a community of practice that benefits users with complex and specialized search requirements. Jumper delivers universal search of any databases, flat files, fileshares, content systems, web pages, blogs and wikis, even people - through one simple search box.