
Vespa - Yahoo's big data serving engine


Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data so that queries, selection, and processing over the data can be performed at serving time. Vespa is the serving platform for Yahoo.com, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, and Flickr.

Lucene - A high-performance, full-featured text search engine library


Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.

Solr - Blazing-fast, open source enterprise search platform


Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
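
Faceted search, mentioned above, returns per-field value counts alongside the matching documents so that users can narrow results. The following is a minimal sketch of that idea in plain Python; it does not use Solr or its API. The function name `facet_counts`, the sample documents, and the field names `lang` and `type` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def facet_counts(docs, query, facet_fields):
    """Return (matching docs, per-field value counts) for a substring query.

    Hypothetical helper: a real engine would match against an inverted
    index rather than scanning every document.
    """
    needle = query.lower()
    matches = [d for d in docs if needle in d["text"].lower()]
    # For each facet field, count how often each value occurs among the matches.
    facets = {
        field: Counter(d[field] for d in matches if field in d)
        for field in facet_fields
    }
    return matches, facets

# Made-up sample data.
docs = [
    {"text": "Full-text search library", "lang": "Java", "type": "library"},
    {"text": "Enterprise search platform", "lang": "Java", "type": "server"},
    {"text": "Python search library", "lang": "Python", "type": "library"},
    {"text": "Distributed SQL database", "lang": "Java", "type": "server"},
]

matches, facets = facet_counts(docs, "search", ["lang", "type"])
for field, counts in facets.items():
    print(field, dict(counts))
```

A user interface would typically render each count as a clickable filter that re-runs the query restricted to that value.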

IndexTank - Search Engine powers Reddit


The IndexTank search engine powers search on Reddit, the social bookmarking site. IndexTank was acquired by LinkedIn, which released the project as open source. Its features include variable boosts, faceted search, snippeting, custom scoring functions, suggest, and autocomplete.




Elasticsearch - Distributed, RESTful search and analytics engine


Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

Norconex HTTP Collector - Enterprise Web Crawler


Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate collected data and store it in a repository of your choice (e.g., a search engine). It is very flexible, powerful, easy to extend, and portable.

Yioop - Open Source Search Engine Software


Yioop is an open source PHP search engine capable of crawling, indexing, and providing search results for hundreds of millions of pages on relatively low-end hardware. It can index a variety of text formats (HTML, RSS, PDF, RTF, DOC) and image formats (GIF, JPEG, PNG, etc.). It can import data from ARC, WARC, MediaWiki, and Open Directory RDF. It is easily localized to many languages. It has built-in support for news feeds, discussion groups, blogs, and wikis. It also supports mixing indexes to create mash-ups.

Gigablast - Web and Enterprise search engine in C++


Gigablast is one of the four remaining search engines in the United States that maintain their own searchable index of over a billion pages. It is scalable to thousands of servers and has scaled to over 12 billion web pages on over 200 servers. It supports distributed web crawling, document conversion, automated data corruption detection and repair, clustering of results from the same site, synonym search, spell checking, and a lot more.



YaCy - Decentralized Web Search


YaCy (read "ya see") is a free distributed search engine built on the principles of peer-to-peer (P2P) networks. It runs on several hundred computers, so-called YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database that is shared with other YaCy-peers using P2P principles.

Searchdaimon - Enterprise Search


Searchdaimon is an open source search engine for corporate data and websites. It comes with a powerful administrator interface and can index websites and several common enterprise systems such as SharePoint, Exchange, SQL databases, and Windows file shares. It also supports many document formats (e.g., Word, PDF, Excel) and offers faceted search, attribute navigation, and collection sorting.

Lemur - Search Engine


The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri search engine, Lemur Toolbar, and ClueWeb09 dataset.

Terrier - Information Retrieval Platform


Terrier is a highly flexible, efficient, and effective open source search engine, readily deployable on large-scale collections of documents. Terrier implements state-of-the-art indexing and retrieval functionalities, and provides an ideal platform for the rapid development and evaluation of large-scale retrieval applications. Terrier can index large corpora of documents, and provides multiple indexing strategies, such as multi-pass, single-pass and large-scale MapReduce indexing.

MG4J - Managing Gigabytes for Java


MG4J (Managing Gigabytes for Java) is a free full-text search engine, written in Java, for large document collections. MG4J is a highly customisable, high-performance, full-fledged search engine providing state-of-the-art features (such as BM25/BM25F scoring) and new research algorithms. Its main features include powerful indexing, multi-index interval semantics, virtual fields, clustering, and a lot more.
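
For readers unfamiliar with the BM25 scoring mentioned above, here is a minimal sketch of the generic Okapi BM25 formula in plain Python. It is not MG4J's implementation and does not cover BM25F. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant (an assumption) that stays non-negative.
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Made-up, pre-tokenized sample documents.
docs = [
    "open source search engine".split(),
    "distributed sql database".split(),
    "search engine library for search".split(),
]
scores = bm25_scores(["search", "engine"], docs)
print(scores)
```

The parameter `k1` controls how quickly repeated occurrences of a term stop adding to the score, and `b` controls how strongly longer documents are penalized.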

Crate - The fast, scalable, easy to use SQL database with native full text search


Crate is an open source, highly scalable, shared-nothing distributed SQL database. Crate offers the scalability and performance of a modern NoSQL database with the power of standard SQL. Crate's distributed SQL query engine lets you use the same syntax that already exists in your applications or integrations, and executes queries seamlessly across the Crate cluster, including any aggregations, if needed.

Zebra - Indexing and Retrieval Engine


Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It can index records in XML/SGML, MARC, e-mail archives and many other formats and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.
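
Exact boolean search of the kind described above is usually built on an inverted index that maps each term to the set of records containing it. The sketch below shows the idea in plain Python; it is not Zebra's API or query syntax. The function names (`build_index`, `search_and`, `search_or`, `search_not`) and the sample records are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of record IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search_and(index, *terms):
    """Records that contain every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def search_or(index, *terms):
    """Records that contain at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))

def search_not(index, all_ids, term):
    """Records that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())

# Made-up sample records keyed by ID.
docs = {
    1: "MARC record for a library catalogue",
    2: "XML record with metadata",
    3: "e-mail archive message",
}
index = build_index(docs)

print(search_and(index, "record", "xml"))
print(search_or(index, "marc", "xml"))
print(search_not(index, docs, "record"))
```

Relevance-ranked free-text queries would add a scoring step on top of the candidate set instead of returning an unordered set.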

Whoosh - Python Search Library


Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. Programmers can use it to easily add search functionality to their applications and websites. It supports fielded indexing and search, scoring, text analysis, storage, pluggable scoring algorithms, a powerful query language, and spell-checking.

Xapian - Search Engine Library


Xapian is an Open Source Search Engine Library. It is written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby. Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators.

Constellio - Enterprise Search engine


Constellio Open Source Enterprise Search is based on Apache Solr and uses the Google Search Appliance connector architecture. It lets you find, with a single click, all relevant content in your organization (Web, email, ECM, CRM, etc.).

CLucene - Lucene C++ Port


CLucene is a C++ port of the very popular Java Lucene text search engine API. CLucene aims to be a good alternative to Java Lucene when performance really matters or if you want to stick to good old C++. The project describes CLucene as faster than Lucene because it is written in C++: it is compiled to machine code, has no background GC operations, and requires no extra setup procedures.

Jumper - Collaborative search engine in PHP


Jumper 2.0 is a collaborative community search platform that crowdsources knowledge management through a shared bookmarking engine. It can be deployed quickly into a community of practice and benefits users with complex and specialized search requirements. Jumper delivers universal search of databases, flat files, file shares, content systems, web pages, blogs, wikis, and even people, all through one simple search box.