ElasticSearch
ElasticSearch could be opted if, We build a web site or an application and want to add search to it, and then it hits us: getting search working is hard. We want our search solution to be fast, we want a painless setup and a completely free search schema, we want to be able to index data simply using JSON over HTTP, we want our search server to be always available, we want to be able to start with one machine and scale to hundreds, we want real-time search, we want simple multi-tenancy, and we want a solution that is built for the cloud.
Its feature include:
- Distributed and Highly Available Search Engine.
- Each index is fully sharded with a configurable number of shards.
- Each shard can have one or more replicas.
- Read / Search operations performed on either one of the replica shard.
- Multi Tenant with Multi Types.
- Support for more than one index.
- Support for more than one type per index.
- Index level configuration (number of shards, index storage, …).
- Various set of APIs
- HTTP RESTful API
- Native Java API.
- All APIs perform automatic node operation rerouting.
- Document oriented
- No need for upfront schema definition.
- Schema can be defined per type for customization of the indexing process.
- Reliable, Asynchronous Write Behind for long term persistency.
- (Near) Real Time Search.
- Built on top of Lucene
- Each shard is a fully functional Lucene index
- All the power of Lucene easily exposed through simple configuration / plugins.
- Per operation consistency
- Single document level operations are atomic, consistent, isolated and durable.
comments powered by Disqus
Related Products
Solr
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
SenseiDB - Search engine used in LinkedIn
Sensei is a distributed data system that was built to support many product initiatives at LinkedIn, including the real-time faceted search in LinkedIn Signal and the news feed and tabs on the Homepage. Sensei is both a search engine and a database. It is designed to query and navigate through documents that consist of unstructured text and well-formed and structured metadata.
Constellio - Enterprise Search engine
Constellio Open Source Enterprise Search is based on Apache Solr and using Google Search Appliances connectors architecture, it allows, with a single click, to find all relevant content in your organization (Web, email, ECM, CRM etc.).
Katta - Lucene and more in the cloud.
Katta is a scalable, failure tolerant, distributed, data storage for real time access. Katta serves large, replicated, indices as shards to serve high loads and very large data sets. These indices can be of different type. Currently implementations are available for Lucene and Hadoop mapfiles.
Sphinix
Sphinix is free open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Open Search Server
Open Search Server is both a modern crawler and search engine and a suite of high-powered full text search algorithms. Built using the best open source technologies like lucene, zkoss, tomcat, poi, tagsoup. Open Search Server is a stable, high-performance piece of software.
MongoDB
MongoDB (from "humongous") is a scalable, high-performance, open source, dynamic-schema, document-oriented database. MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems.
Lucene
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Xapian - Search Engine Library
Xapian is an Open Source Search Engine Library. It is written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby. Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators.
IndexTank - Search Engine powers Reddit
IndexTank search engine powers search in Reddit, Social bookmarking site. IndexTank is acquired by LinkedIn and released the project as open source. It includes features like Variables boosts, Facets, Faceted search, Snippeting, Custom scoring functions, Suggest, and Autocomplete.