LuMongo - Realtime Time Distributed Search

  •        192

LuMongo is a real-time distributed search and storage system based on Lucene. LuMongo is designed from the ground up to scale both vertically and horizontally across servers. LuMongo stores Lucene indexes directly into MongoDB. Documents can be stored natively into MongoDB. When stored natively document can be queried as normal out of MongoDB and use of Map-Reduce and the Aggregation Framework is possible.

http://lumongo.org/
https://github.com/lumongo/lumongo

Tags
Implementation
License
Platform

   




Related Projects

SenseiDB - Distributed, Realtime, Semi-Structured Database from LinkedIn


Sensei is a distributed data system that was built to support many product initiatives at LinkedIn, including the real-time faceted search in LinkedIn Signal and the news feed and tabs on the Homepage. Sensei is both a search engine and a database. It is designed to query and navigate through documents that consist of unstructured text and well-formed and structured metadata. Sensei is both a search engine and a database.

compass - Searchengine built on top of Lucene


Compass is a real time searchengine. It is built on top of lucene. It is transactional, distributed, supports Spring MVC, integrates with Hibernate.

ElasticSearch - Distributed, RESTful search and analytics engine


Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

Vespa - Yahoo's big data serving engine


Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time. Vespa is serving platform for Yahoo.com, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, Flickr.

Solr - Blazing-fast, open source enterprise search platform


Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.



Jumper - Collaborative search engine in PHP


Jumper 2.0 is a collaborative community search platform that revolutionizes search by crowdsourcing knowledge management powered by a shared bookmarking engine. It is easily and quickly deployed into a community of practice that benefits users with complex and specialized search requirements. Jumper delivers universal search of any databases, flat files, fileshares, content systems, web pages, blogs and wikis, even people - through one simple search box.

YaCy - Decentralized Web Search


YaCy (read "ya see") is a free distributed search engine, built on principles of peer-to-peer (P2P) networks. It is distributed on several hundred computers so-called YaCy-peers. Each YaCy-peer independently crawls through the Internet, analyzes and indexes found web pages, and stores indexing results in a common database which is shared with other YaCy-peers using principles of P2P networks.

Gigablast - Web and Enterprise search engine in C++


Gigablast is one of the remaining four search engines in the United States that maintains its own searchable index of over a billion pages. It is scalable to thousands of servers. Has scaled to over 12 billion web pages on over 200 servers. It supports Distributed web crawler, Document conversion, Automated data corruption detection and repair, Can cluster results from same site, Synonym search, Spell checker and lot more.

riot - Go Open Source, Distributed, Simple and efficient Search Engine


Supporting riot, buy me a coffee.Riot is primarily distributed under the terms of the Apache License (Version 2.0), base on wukong.

Sphinix - Search server


Sphinix is free open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.

IndexTank - Search Engine powers Reddit


IndexTank search engine powers search in Reddit, Social bookmarking site. IndexTank is acquired by LinkedIn and released the project as open source. It includes features like Variables boosts, Facets, Faceted search, Snippeting, Custom scoring functions, Suggest, and Autocomplete.

Crate - The fast, scalable, easy to use SQL database with native full text search


Crate is an open source, highly scalable, shared-nothing distributed SQL database. Crate offers the scalability and performance of a modern No-SQL database with the power of Standard SQL. Crate’s distributed SQL query engine lets you use the same syntax that already exists in your applications or integrations, and have queries seamlessly executed across the crate cluster, including any aggregations, if needed.

Katta - Lucene and more in the cloud.


Katta is a scalable, failure tolerant, distributed, data storage for real time access. Katta serves large, replicated, indices as shards to serve high loads and very large data sets. These indices can be of different type. Currently implementations are available for Lucene and Hadoop mapfiles.

Summa


Summa is a fast modular and scalable search engine written in Java. Sports a flexible workflow system and incremental indexing and facetting. Summa is primarily developed by the State and University Library of Denmark.

Nutch - Highly extensible, highly scalable Web crawler


Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.

Lux - XML Search engine


Lux is an open source XML search engine using Lucene /Solr and Saxon XQuery/XSLT processor. Lux provides XML-aware indexing, an XQuery 1.0 optimizer that rewrites queries to use the indexes, and a function library for interacting with Lucene via XQuery. These capabilities are tightly integrated with Solr, and leverage its application framework in order to deliver a REST service, application server, and supporting tools.

Lucene - A high-performance, full-featured text search engine library


Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Data-SearchEngine-Zebra - Zebra search engine abstraction (Data::SearchEngine)


Zebra search engine abstraction (Data::SearchEngine)

Pinot - A realtime distributed OLAP datastore


Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally, so that it can scale to larger data sets and higher query rates as needed.

Bobo - Faceted search library based on Lucene


Bobo Browse is an information retrieval technology that provides navigational browsing into a semi-structured dataset. Beyond the result set from queries and selections, Bobo Browse also provides the facets from this point of browsing. It provides support to sort documents on fields that have multiple values. It is stable and used by LinkedIn.