Lucene / Solr as NoSQL database

Lucene and Solr are among the most popular and widely used search engines. They index content and return search results quickly, and they have the capabilities expected of a NoSQL database. This article describes the pros and cons of using them that way.

A NoSQL database should have the following capabilities:

  1. Schema-less
  2. Does not use SQL as its query language
  3. May not give full ACID guarantees
  4. Stores semi structured data
  5. Fast storage and retrieval
  6. No relationships between records
  7. Distributed and scalable
  8. Clients can be written in any programming language

Lucene / Solr has these capabilities and can reasonably be considered a NoSQL document store. Using it this way also spares you the learning curve of yet another NoSQL database.
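To make the "schema-less document store" idea concrete, here is a minimal sketch in plain Python. It is not the Lucene or Solr API: `TinyDocumentIndex` and its methods are hypothetical names invented for illustration, and the whitespace tokenizer is a deliberate simplification. It only shows how records with different fields can be stored and then found by a field and term, which is roughly what a search index does when it is used as a document store.

```python
from collections import defaultdict


class TinyDocumentIndex:
    """Toy stand-in for a search index; not the Lucene or Solr API."""

    def __init__(self):
        self.docs = {}                    # doc id -> stored fields
        self.postings = defaultdict(set)  # (field, token) -> doc ids

    def add(self, doc_id, fields):
        # Re-adding an id replaces the old document, like an upsert.
        self.delete(doc_id)
        self.docs[doc_id] = dict(fields)
        for field, value in fields.items():
            for token in str(value).lower().split():
                self.postings[(field, token)].add(doc_id)

    def delete(self, doc_id):
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for field, value in old.items():
            for token in str(value).lower().split():
                self.postings[(field, token)].discard(doc_id)

    def search(self, field, term):
        # Look up the posting list for one field/term pair.
        ids = self.postings.get((field, term.lower()), set())
        return [self.docs[i] for i in sorted(ids)]


index = TinyDocumentIndex()
# Two records with different fields: no schema is declared up front.
index.add("1", {"title": "Lucene in Action", "author": "Hatcher"})
index.add("2", {"title": "Solr guide", "tags": "search lucene", "pages": 320})

lucene_titles = index.search("title", "lucene")  # matches document "1" only
tagged_lucene = index.search("tags", "lucene")   # matches document "2" only
```

Note that there are no joins or relationships between the two records; each document is self-contained, which matches capability 6 in the list above.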

A dedicated SQL or NoSQL database will certainly have more database-related features than Lucene / Solr, and it offers better storage persistence.

Lucene / Solr is designed as a search engine, but it has some of the capabilities of a database. You can use it as a document store, provided the data is also persisted somewhere else. Lucene / Solr should not be used as the primary database or data store. Keep the data in a persistent store, which could be a file system, a database, or an archival store, and also store and index it in Lucene / Solr. The application then retrieves information from the Lucene index. In the end, the goal is to make the application faster.
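The arrangement described above can be sketched as follows. This is an illustration under stated assumptions, not a real Lucene or Solr client: `PrimaryStore`, `SearchIndex`, `store_document`, and `rebuild_index` are hypothetical names, the primary store is simply one JSON file per document, and the index is an in-memory stand-in. The point is the order of operations: write to the persistent store first, index second, and keep the ability to rebuild the index from the persistent copy.

```python
import json
import tempfile
from pathlib import Path


class PrimaryStore:
    """Durable system of record: one JSON file per document (hypothetical layout)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, doc_id, doc):
        (self.root / f"{doc_id}.json").write_text(json.dumps(doc))

    def load_all(self):
        for path in sorted(self.root.glob("*.json")):
            yield path.stem, json.loads(path.read_text())


class SearchIndex:
    """In-memory stand-in for the Lucene / Solr index the application reads from."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, doc):
        self.docs[doc_id] = doc

    def search(self, field, term):
        term = term.lower()
        return [d for _, d in sorted(self.docs.items())
                if term in str(d.get(field, "")).lower().split()]


def store_document(primary, index, doc_id, doc):
    primary.save(doc_id, doc)  # persist first, so the index is never the only copy
    index.add(doc_id, doc)     # then make the document searchable


def rebuild_index(primary):
    # If the index is lost or corrupted, replay the system of record.
    index = SearchIndex()
    for doc_id, doc in primary.load_all():
        index.add(doc_id, doc)
    return index


primary = PrimaryStore(tempfile.mkdtemp())
index = SearchIndex()
store_document(primary, index, "a1", {"title": "Solr as a document store"})
store_document(primary, index, "a2", {"title": "Archiving records"})

hits = index.search("title", "solr")                          # served by the index
rebuilt_hits = rebuild_index(primary).search("title", "solr")  # same answer after a rebuild
```

Because every document reaches the persistent store before it reaches the index, losing the index costs only the time needed to re-index, not the data itself.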

Reference:
Guardian is using Solr as its database
Lucene and Solr as NoSQL database







Related Products

OrientDB - The NoSQL Graph-Document DBMS

OrientDB has the flexibility of the Document databases and the power of the Graph databases to manage relationships. It can work in schema-less mode, schema-full or a mix of both. It can store up to 150,000 records per second on common hardware. OrientDB has been designed to be very fast. It inherits the best features and concepts from the Object Databases, Graph DBMS and the modern NoSQL engines.

Solr

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Infinispan - Key value NOSQL data store and data grid

Infinispan is an extremely scalable, highly available key/value NoSQL datastore and distributed data grid platform. The purpose of Infinispan is to expose a data structure that is highly concurrent, designed ground-up to make the most of modern multi-processor/multi-core architectures while at the same time providing distributed cache capabilities. Infinispan offers enterprise features such as efficient eviction algorithms to control memory usage as well as JTA compatibility.

Hypertable - A high performance, scalable, distributed storage and processing system for structured

Hypertable is based on Google's Bigtable Design, which is a proven scalable design that powers hundreds of Google services. Many of the current scalable NoSQL database offerings are based on a hash table design which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key and it is well suited for Analytics.

Carrot2 - Search Results Clustering Engine

Carrot2 is an open source search results clustering engine. It clusters search results from various sources into small collections of documents. Carrot2 offers ready-to-use components for fetching search results from various sources, including Yahoo API, Google API, Bing API, eTools Meta Search, Lucene, Solr, Google Desktop, and more.

HBase - Hadoop database

HBase is designed to handle very large tables: billions of rows by millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable that runs on top of HDFS (Hadoop Distributed File System). It features compression and in-memory operation on a per-column basis. Data can be replicated between nodes. HBase is used at Facebook and Twitter.

HyperGraphDB - Database for Storing Strongly-Typed Hypergraphs

HyperGraphDB is a general-purpose, open source data storage mechanism based on a powerful knowledge management formalism known as directed hypergraphs. While it is a persistent memory model designed mostly for knowledge management, artificial intelligence, and semantic web projects, it can also be used as an embedded object-oriented database for Java projects of all sizes. It can also be used as a graph database or as a (non-SQL) relational database.

Terrastore - Scalable, elastic, consistent document store.

Terrastore is a modern document store which provides advanced scalability and elasticity features without sacrificing consistency. Terrastore is based on Terracotta, so it relies on an industry-proven, fast (and cool) clustering technology. Terrastore is accessed through the universally supported HTTP protocol. Terrastore is a distributed document store supporting single-cluster and multi-cluster deployments.

ArangoDB - The Multi-purpose NoSQL DB

ArangoDB is a multi-purpose open source database with a flexible data model for documents, graphs, and key-values. Build high-performance applications using a convenient SQL-like query language or JavaScript/Ruby extensions. Its key features include a schema-free design, convenient querying with AQL, extensibility through JavaScript, space efficiency, support for modern storage hardware such as SSDs and large caches, and more.

SenseiDB - Search engine used in LinkedIn

Sensei is a distributed data system that was built to support many product initiatives at LinkedIn, including the real-time faceted search in LinkedIn Signal and the news feed and tabs on the Homepage. Sensei is both a search engine and a database. It is designed to query and navigate through documents that consist of unstructured text and well-formed and structured metadata.
