Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected. 
 
Its features include:
 <ul>
	<li>Distributed and Highly Available Search Engine.
	<ul>
		<li>Each index is fully sharded with a configurable number of shards.</li>
		<li>Each shard can have one or more replicas.</li>
		<li>Read / search operations can be performed on any of the replica shards.</li>
	</ul></li>
	<li>Multi Tenant with Multi Types.
	<ul>
		<li>Support for more than one index.</li>
		<li>Support for more than one type per index.</li>
		<li>Index level configuration (number of shards, index storage, &#8230;).</li>
	</ul></li>
	<li>A varied set of APIs
	<ul>
		<li>HTTP RESTful API</li>
		<li>Native Java API.</li>
		<li>All APIs perform automatic node operation rerouting.</li>
	</ul></li>
	<li>Document oriented
	<ul>
		<li>No need for upfront schema definition.</li>
		<li>Schema can be defined per type for customization of the indexing process.</li>
	</ul></li>
	<li>Reliable, asynchronous write-behind for long-term persistence.</li>
	<li>(Near) Real Time Search.</li>
	<li>Built on top of Lucene
	<ul>
		<li>Each shard is a fully functional Lucene index</li>
		<li>All the power of Lucene easily exposed through simple configuration / plugins.</li>
	</ul></li>
	<li>Per operation consistency
	<ul>
		<li>Single document level operations are atomic, consistent, isolated and durable.</li>
	</ul></li>
</ul>
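
The sharding and index-level configuration bullets above can be illustrated with a small sketch. It assumes a simplified routing rule (a hash of the document ID modulo the number of primary shards); Elasticsearch uses its own hash function and routing details internally, so CRC32 is only a stand-in here, and the function names `pick_shard` and `index_settings` are hypothetical helpers, not part of any client library.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a primary shard (simplified, illustrative rule)."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # CRC32 is a stand-in hash, chosen only because it is stable across runs.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def index_settings(num_shards: int, num_replicas: int) -> dict:
    """Build an index-level settings body with shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": num_shards,
            "number_of_replicas": num_replicas,
        }
    }


if __name__ == "__main__":
    print(index_settings(3, 1))
    print(pick_shard("doc-42", 3))
```

The point of the sketch is that the same ID always lands on the same shard, which is why the number of primary shards is fixed per index while the number of replicas can vary.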

Elasticsearch - Distributed, RESTful search and analytics engine

OpenGrok is a fast and usable source code search and cross reference engine, written in Java. It helps you search, cross-reference and navigate your source tree. It can understand various program file formats and version control histories of many source code management systems.
 
It can search full text, definitions, symbols, paths, and revision history, and it supports a Google-like query syntax. It can update its index incrementally, that is, reindex only the files that have changed since the last update. It also provides a read-only web interface for version control systems such as Mercurial, CVS, SVN, SCCS, or TeamWare.
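
The idea of incremental indexing can be sketched as follows. This is not how OpenGrok itself is implemented; it is a minimal illustration that assumes file modification times are a good enough change signal, and `changed_files` and its `last_indexed` state dictionary are hypothetical names.

```python
import os


def changed_files(root: str, last_indexed: dict) -> list:
    """Return files under root whose modification time differs from the recorded one.

    last_indexed maps a file path to the mtime seen at the previous run and is
    updated in place, so a second call with no changes returns an empty list.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            if last_indexed.get(path) != mtime:
                changed.append(path)
                last_indexed[path] = mtime
    return sorted(changed)


if __name__ == "__main__":
    state = {}
    print(len(changed_files(".", state)), "files to index on the first run")
    print(len(changed_files(".", state)), "files to index on the second run")
```

Only the paths returned by each call would need to be re-parsed and re-indexed; everything else keeps its existing index entries.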

OpenGrok - Fast and usable source code search and cross reference engine, written in Java

Qdrant (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points, that is, vectors with an additional payload. Qdrant is tailored to extended filtering support, which makes it useful for all sorts of neural-network or semantic-based matching, faceted search, and other applications. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more.
 
Neural search uses semantic embeddings instead of keywords and works best with short texts. With Qdrant and a pre-trained neural network, you can build and deploy semantic neural search on your data in minutes.
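
To make the notion of points (vectors with a payload) and filtered similarity search concrete, here is a minimal brute-force sketch. It does not use the Qdrant client or API; the `search` function, the `must` filter argument, and the sample points are all assumptions made for illustration, and a real engine would use an index rather than scanning every point.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query, must=None, limit=3):
    """Return (score, id) pairs for points whose payload matches every key in must."""
    must = must or {}
    candidates = [
        p for p in points
        if all(p["payload"].get(key) == value for key, value in must.items())
    ]
    scored = [(cosine(p["vector"], query), p["id"]) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical sample data: each point is a vector plus a payload.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"city": "Berlin"}},
    {"id": 2, "vector": [0.0, 1.0], "payload": {"city": "London"}},
    {"id": 3, "vector": [0.6, 0.8], "payload": {"city": "Berlin"}},
]

if __name__ == "__main__":
    print(search(points, [0.0, 1.0], must={"city": "Berlin"}))
```

Filtering on the payload before scoring is what lets the same vector store serve faceted or constrained queries instead of pure nearest-neighbor lookups.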

Qdrant - Neural Search Engine, Vector Similarity Search Engine with extended filtering support

Usergrid is an open-source Backend-as-a-Service (“BaaS” or “mBaaS”) composed of an integrated distributed NoSQL database, application layer and client tier with SDKs for developers looking to rapidly build web and/or mobile applications. It provides elementary services (user registration & management, data storage, file storage, queues) and retrieval features (full text search, geolocation search, joins) to power common app features.
 
It is a multi-tenant system designed for deployment to public cloud environments (such as Amazon Web Services, Rackspace, etc.) or to run on traditional server infrastructures so that anyone can run their own private BaaS deployment. For architects and back-end teams, it aims to provide a distributed, easily extendable, operationally predictable and highly scalable solution. For front-end developers, it aims to simplify the development process by enabling them to rapidly build and operate mobile and web applications without requiring backend expertise.

Usergrid - The BaaS Framework you run

Solr is the popular, blazing-fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture for when more advanced customization is required.

Its feature set includes:
<ul>
	<li>Rich document parsing and indexing (PDF, Word, HTML, etc.) using Apache Tika</li>
	<li>An administration interface</li>
	<li>Monitorable logging</li>
	<li>Fast incremental updates and index replication</li>
	<li>Highly scalable distributed search with a sharded index across multiple hosts</li>
	<li>HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby, PHP, Velocity)</li>
</ul>

SolrJ is Solr's Java client. It provides a Java-based API and takes care of constructing, sending, receiving, and parsing HTTP requests and responses.
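
Because Solr is queried over HTTP, a search with faceting boils down to a URL with a handful of query parameters. The sketch below only builds such a URL and does not send it; the helper name `solr_select_url`, the core name `books`, and the field names are assumptions for illustration, while `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, facet_field=None, rows=10):
    """Build a /select URL asking for JSON results, optionally faceted on one field."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core and fields; adjust to your own schema.
    print(solr_select_url("http://localhost:8983/solr", "books",
                          "title:lucene", facet_field="author"))
```

Any language with an HTTP client can issue the resulting request, which is what makes the HTTP interface usable from virtually any programming language.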

Solr - Blazing-fast, open source enterprise search platform

The IndexTank search engine powers search on Reddit, the social bookmarking site. IndexTank was acquired by LinkedIn, which released the project as open source. Its features include variable boosts, facets, faceted search, snippeting, custom scoring functions, suggest, and autocomplete.
 Homepage: <A HREF="http://indextank.com/" target="_blank">http://indextank.com/</A>

IndexTank - Search engine that powers Reddit

Lucene is a very popular Java-based search engine library. It offers near-real-time search. Its features include ranked search; many powerful query types (phrase queries, wildcard queries, proximity queries, range queries, and more); fielded searching (e.g., title, author, contents); date-range searching; sorting by any field; multiple-index searching with merged results; and simultaneous update and searching.
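
The data structure behind phrase queries is a positional inverted index. The following sketch is a toy version written from scratch, not Lucene's API; `build_index` and `phrase_query` are hypothetical names, and tokenization is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_query(index, phrase):
    """Return sorted IDs of documents that contain the terms of phrase consecutively."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    for doc_id, positions in index.get(terms[0], {}).items():
        for start in positions:
            # Every following term must appear at the next position in the same doc.
            if all(start + i in index.get(term, {}).get(doc_id, [])
                   for i, term in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


if __name__ == "__main__":
    docs = {1: "fast full text search", 2: "text search is fast", 3: "search full text"}
    print(phrase_query(build_index(docs), "text search"))
```

Storing positions, not just document IDs, is what makes phrase and proximity queries possible on top of a plain term lookup.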

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Lucene - A high-performance, full-featured text search engine library

Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time. Vespa is the serving platform for Yahoo.com, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, and Flickr.
 Queries can use both structured filters and unstructured text search to select data. All the matching data is then ranked according to a ranking function - typically machine learned - to implement such use cases as search relevance, recommendation, targeting and personalization.
 
Vespa is scalable. System sizes up to hundreds of nodes handling tens of billions of documents are not uncommon, and are no harder to set up and modify than single-node systems. Since all system components, as well as the stored data, are redundant and self-correcting, hardware failures are not operational emergencies and can be handled by re-adding capacity when convenient.
Its features include:
<ul>
<li>Text search - Combine structured query and text search to select data</li>
<li>Advanced Ranking - Machine learned ranking</li>
<li>Aggregation</li>
<li>Elastic, Scalable</li>
<li>High Availability</li>
<li>Automatic repair of data corruption</li>
<li>Simple HTTP API</li>
<li>And a lot more</li>
</ul>

Vespa - Yahoo's big data serving engine

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want to perform question answering or semantic document search, you can use the state-of-the-art NLP models in Haystack to provide unique search experiences and allow your users to query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects like Hugging Face's Transformers, Elasticsearch, or Milvus.

Haystack can perform semantic search and retrieve documents according to meaning, not keywords. It lets you ask questions in natural language and find granular answers in your documents. It can also automate processes by applying a list of questions to new documents and using the extracted answers.

Haystack - Build a natural language interface for your data

OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards. OpenSearch enables people to easily ingest, secure, search, aggregate, view, and analyze data. These capabilities are popular for use cases such as application search, log analytics, and more.
 
Its features include: 
<ul>
<li>Log analytics</li>
<li>Real-time application monitoring</li>
<li>Clickstream analytics</li>
<li>Use SQL or a piped processing language to query your data</li>
<li>Automate index operations</li>
<li>Monitor and optimize your cluster</li>
<li>Run search requests in the background</li>
<li>k-NN: find “nearest neighbors” in your vector data</li>
<li>Authentication and access control for your cluster</li>
<li>Anomaly detection</li>
</ul>

OpenSearch - Open source distributed and RESTful search engine

TNTSearch is a full-text search (FTS) engine written entirely in PHP. A simple configuration allows you to add an amazing search experience in just minutes. Its features include fuzzy search, geo-search, text classification, stemming, the BM25 ranking algorithm, result highlighting, Boolean search, and a lot more.
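
BM25 ranking, mentioned above, scores a document by how often the query terms occur in it, how rare those terms are across the collection, and how long the document is. The sketch below is a generic textbook-style version in Python, not TNTSearch's PHP code; the function name `bm25_scores`, the default parameters `k1=1.2` and `b=0.75`, and the non-negative IDF variant are assumptions.

```python
import math
from collections import Counter


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = ["php search engine", "php web framework", "cooking recipes"]
    print(bm25_scores(corpus, "search engine"))
```

Sorting documents by these scores in descending order gives the ranked result list; documents that contain none of the query terms score zero.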

TNTSearch - A fully featured full text search engine written in PHP

Lunr.js is a small, full-text search library for use in the browser. It indexes JSON documents and provides a simple search interface for retrieving documents that best match text queries. A bit like Solr, but much smaller and not as bright. Lunr enables you to provide a great search experience without the need for external, server-side, search services. Lunr has no external dependencies and works in your browser or on the server with node.js.
 
For web applications with all their data already sitting in the client, it makes sense to be able to search that data on the client too. A local search index is quicker because there is no network overhead, and it remains available and usable even without a network connection.

Lunr.js - A bit like Solr, but much smaller and not as bright

Summa is a fast, modular, and scalable search engine written in Java. Summa is characterized by:
<ul>
<li>Integrated search. Summa can simultaneously access a number of different data sources and expose them through a unified interface.</li>
<li>Modular design. The Summa search system consists of a set of independent modules, which makes it simple and easy to maintain and upgrade.</li>
<li>Scalable. Summa supports a distributed architecture and can be scaled up or down to handle any amount of data.</li>
<li>Open standards. Summa is based upon modern web technologies and standards, and does not include any proprietary code or elements.</li>
<li>Failure tolerant. If a single source of data or service should fail, Summa will continue without that specific source.</li>
</ul>

Summa is a fast, modular, and scalable search engine written in Java. It sports a flexible workflow system, incremental indexing, and faceting. Summa is primarily developed by the State and University Library of Denmark.

Summa

An open source .NET web crawler written in C# using SQL Server 2005/2008. Arachnode.net is a complete and comprehensive .NET web crawler for downloading, indexing and storing Internet content including e-mail addresses, files, hyperlinks, images, and Web pages. Its features include:
 <UL>
	<LI>.NET architecture</LI>
	<LI>Configurable Rules and Actions</LI>
	<LI>Lucene.NET Integration</LI>
	<LI>SQL Server 2008 and full-text indexing</LI>
	<LI>.DOC/.PDF/.PPT/.XLS Indexing</LI>
	<LI>HTML to XML and XHTML</LI>
	<LI>Multi-threading and Throttling</LI>
	<LI>Respectful Crawling</LI>
	<LI>Analysis Services</LI>
	<LI>SQL Server 2008 and SSIS</LI>
	<LI>EXIF data extraction</LI>
 </UL>

Arachnode.net

Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built using open source technologies such as Lucene, zkoss, Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software. Its features include:
 <UL>
	<LI>Multi-language indexing</LI>
	<LI>The crawlers go through web sites and file systems to rapidly and easily build your index.</LI>
	<LI>Numerous document formats are supported, such as XML, HTML/XHTML, Adobe™ PDF, Microsoft™ Word™, PowerPoint™, OpenOffice™, etc.</LI>
	<LI>Quick integration thanks to an XML interface via HTTP queries (XML over HTTP) and PHP classes</LI>
	<LI>The web interface is built on the Zkoss framework and runs in the major Ajax-capable browsers. This RIA-type interface is as comfortable to use as that of a thick client.</LI>
 </UL>

Open Search Server
