Lucene / Solr as NoSQL database


Lucene and Solr are among the most popular and widely used search engines. They index content and return search results quickly. They also offer the capabilities expected of a NoSQL database. This article describes the pros and cons of using them that way.

A NoSQL database is generally expected to have the following capabilities:

  1. Schema-less
  2. Does not use SQL as its query language
  3. May not give full ACID guarantees
  4. Stores semi-structured data
  5. Fast storage and retrieval
  6. No relationships between records
  7. Distributed, Scalable
  8. Clients could be written in any programming language
Lucene / Solr has these capabilities and can reasonably be considered a NoSQL document store. Using it this way saves you the learning curve of yet another NoSQL database.
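To make the schema-less and "no SQL" points concrete, here is a minimal sketch of how a client might prepare requests for Solr's HTTP interface. It only builds the requests and does not send them. The base URL, the core name `articles`, and the helper names are assumptions made for illustration, and the `_s` / `_i` / `_ss` field suffixes assume the dynamic-field rules that ship in Solr's default configset.

```python
import json
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumption: a local Solr instance
CORE = "articles"                          # hypothetical core name


def build_update_request(docs, commit=True):
    """Return (url, headers, body) for a JSON post to the core's /update handler."""
    url = f"{SOLR_BASE}/{CORE}/update"
    if commit:
        url += "?" + urlencode({"commit": "true"})
    headers = {"Content-Type": "application/json"}
    body = json.dumps(docs).encode("utf-8")
    return url, headers, body


def build_select_url(query, rows=10):
    """Return the URL for a query against the core's /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{CORE}/select?{params}"


# Two documents with different fields: nothing forces them to share a schema.
docs = [
    {"id": "1", "title_s": "Solr as a document store", "year_i": 2012},
    {"id": "2", "title_s": "Another note", "tags_ss": ["nosql", "search"]},
]

update_url, update_headers, update_body = build_update_request(docs)
select_url = build_select_url("id:1")  # queries use Lucene query syntax, not SQL
```

Because the interface is plain HTTP and JSON, the same requests could be issued from any programming language, which matches the last item in the list above.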

A dedicated SQL or NoSQL database will certainly offer more database-related features than Lucene / Solr, and it will offer better storage persistence.

Lucene / Solr is designed as a search engine, but it has some of the capabilities of a database. You can use it as a document store, but make sure the data is also persisted somewhere else: Lucene / Solr should not be used as the primary database or data store. Keep the data in a persistent store, which could be a file system, a database, or an archive, and also store and index it in Lucene / Solr. The application then retrieves information from the Lucene index. The end goal is to make the application faster.
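The arrangement described above can be sketched as follows. This is an illustration under stated assumptions, not a Lucene or Solr API: `PrimaryStore` uses SQLite as the persistent store, and `ToySearchIndex` is a hypothetical in-memory stand-in for the search index. All class and function names are invented for the example.

```python
import json
import sqlite3
from collections import defaultdict


class PrimaryStore:
    """System of record. SQLite here; ':memory:' is used only for the demo."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, doc):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO docs (id, body) VALUES (?, ?)",
                (doc["id"], json.dumps(doc)),
            )

    def all_docs(self):
        for (body,) in self.conn.execute("SELECT body FROM docs ORDER BY id"):
            yield json.loads(body)


class ToySearchIndex:
    """Hypothetical stand-in for Lucene/Solr: stored documents plus an inverted index."""

    def __init__(self):
        self.stored = {}
        self.postings = defaultdict(set)

    def add(self, doc):
        self.remove(doc["id"])  # re-adding a document replaces the old version
        self.stored[doc["id"]] = doc
        for term in doc.get("text", "").lower().split():
            self.postings[term].add(doc["id"])

    def remove(self, doc_id):
        old = self.stored.pop(doc_id, None)
        if old is not None:
            for term in old.get("text", "").lower().split():
                self.postings[term].discard(doc_id)

    def search(self, term):
        ids = sorted(self.postings.get(term.lower(), ()))
        return [self.stored[doc_id] for doc_id in ids]


def save_document(store, index, doc):
    store.save(doc)  # persist first, so the index is never the only copy
    index.add(doc)   # then store and index the document for fast retrieval


def rebuild_index(store):
    """Recreate the index from the primary store, e.g. after index loss."""
    index = ToySearchIndex()
    for doc in store.all_docs():
        index.add(doc)
    return index


store = PrimaryStore()
index = ToySearchIndex()
save_document(store, index, {"id": "a1", "text": "Solr as a document store"})
save_document(store, index, {"id": "a2", "text": "Lucene index internals"})

hits = index.search("solr")     # the application reads from the index
rebuilt = rebuild_index(store)  # the index can always be regenerated
```

Writing to the persistent store first is the design choice that matters here: if the index is lost or corrupted, `rebuild_index` can regenerate it, whereas the reverse is not guaranteed.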

Reference:
Guardian is using Solr as its database
Lucene and Solr as NoSQL database




Related Projects

Solr - Blazing-fast, open source enterprise search platform


Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

ElasticSearch - Distributed, RESTful search and analytics engine


Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

Solr/Lucene on Azure


This project hosts Solr/Lucene in Windows Azure, using multi-instance replication for index serving and a single instance for index generation, with a persistent index mounted in Azure storage. Typical scenarios are commercial and publisher sites that need to scale with increasing query volume, index at most 16 TB of data, and require a couple of index updates per day.

coffeenode-solr - A minimal Solr distro + HTTP / JSON API for building Lucene document stores

Spatial Solr Plugin for Lucene and Solr


With continuous efforts to tailor search results to focused target audiences, there is an increasing demand for incorporating geographical location information into standard search functionality. Spatial Solr Plugin (SSP) is a free, standalone plug-in which enables Geo / Location Based Search, and is built on top of the open source projects Apache Solr and Apache Lucene.



SQLToNoSQLImporter - Import data from SQL to NoSQL systems


SQLToNoSQLImporter is a Solr-like data import handler for importing SQL (MySQL, Oracle, PostgreSQL) data into NoSQL systems (MongoDB, CouchDB, Elasticsearch). Migration is now completely configuration driven. SQLToNoSQLImporter reads records from SQL databases, converts them, and then batch inserts them into the NoSQL datastore.

ArangoDB - The Multi-purpose NoSQL DB


ArangoDB is a multi-purpose open-source database with a flexible data model for documents, graphs, and key-values. Build high-performance applications using a convenient SQL-like query language or JavaScript/Ruby extensions. Its key features include a schema-free model, convenient querying using AQL, extensibility through JS, space efficiency, support for modern storage hardware such as SSDs and large caches, and a lot more.

OrientDB - The NoSQL Graph-Document DBMS


OrientDB has the flexibility of the Document databases and the power of the Graph databases to manage relationships. It can work in schema-less mode, schema-full or a mix of both. It can store up to 150,000 records per second on common hardware. OrientDB has been designed to be very fast. It inherits the best features and concepts from the Object Databases, Graph DBMS and the modern NoSQL engines.

LokiJS - A fast, in-memory document-oriented datastore for node.js, browser and cordova


LokiJS is a document-oriented database. Its purpose is to store JavaScript objects as documents in a NoSQL fashion and retrieve them with a similar mechanism. It runs in node (including cordova/phonegap and node-webkit) and the browser. It is ideal where a client-side in-memory db is needed (e.g., a session store), for data sets loaded into a browser page and synchronised at the end of the work session, etc.

lucene-solr - Mirror of Apache Lucene & Solr



lucene-solr - Mirror of apache lucene-solr repo. Used for Geospatial Suggest development



siren


Efficient, large-scale handling of semi-structured data is an increasingly important issue in many web and enterprise information reuse scenarios. While Lucene has long offered these capabilities, its native capabilities are not intended for collections of semi-structured documents (e.g., documents with very different schemas, documents with arbitrary nested objects). For this reason we developed SIREn - Semantic Information Retrieval Engine - a Lucene/Solr/Elasticsearch plugin to overcome these shortcomings and efficiently index and query arbitrary JSON documents, as well as any JSON document with an arbitrary amount of metadata fields.

havalo - Non Distributed NoSQL Key Value Store


A zero configuration, non-distributed NoSQL key-value store that runs in any Servlet 3.0 compatible container. With Havalo, simply drop havalo.war into your favorite Servlet 3.0 compatible container and with almost no configuration you'll have access to a fast and lightweight K,V store backed by any local mount point for persistent storage.

TinyDB - Lightweight Document Oriented Database


TinyDB is a lightweight document-oriented database optimized for your happiness. It's written in pure Python and has no external dependencies. Its target is small apps that would be blown away by a SQL database or an external database server. TinyDB needs neither an external server nor any dependencies from PyPI. You can easily extend TinyDB by writing new storages or modify the behaviour of storages with Middlewares.

Carrot2 - Search Results Clustering Engine


Carrot2 is an open source search results clustering engine. It can cluster search results from various sources and generate small collections of documents. Carrot2 offers ready-to-use components for fetching search results from various sources including YahooAPI, GoogleAPI, Bing API, eTools Meta Search, Lucene, SOLR, Google Desktop and more.

lucene-skos - SKOS Support for Apache Lucene and Solr



lucene - lucene??web????????solr?NRTManager?SearcherManager?



octo-solr-adventure - Contains some plays with Solr, Lucene



webservice-solr - Module to interface with the Solr (Lucene) webservice

