Lucene / Solr as NoSQL database
Lucene and Solr are among the most popular and widely used search engines. They index content and return search results quickly, and they offer many of the capabilities of a NoSQL database. This article describes the pros and cons of using them that way.
A NoSQL database typically has the following capabilities:
- Schema-less
- Does not use SQL as its query language
- May not give full ACID guarantees
- Stores semi-structured data
- Stores and retrieves data quickly
- No relationship between records
- Distributed, Scalable
- Clients could be written in any programming language
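Several of these points (no SQL, semi-structured records, clients in any language) show up in how a client talks to Solr: documents are posted as JSON to the `/update` handler and queried over HTTP through `/select`. The sketch below only builds the requests and does not send them. The host, the core name `articles`, and the field names are assumptions for illustration; the `_s` and `_i` suffixes rely on the dynamic field rules shipped in Solr's default configset, which your schema may not include.

```python
import json
from urllib.parse import urlencode

# Hypothetical Solr location and core name; adjust to your installation.
SOLR_BASE = "http://localhost:8983/solr/articles"


def build_update_request(docs):
    """Return (url, body) for a JSON POST to Solr's /update handler."""
    url = SOLR_BASE + "/update?" + urlencode({"commit": "true"})
    body = json.dumps(docs)
    return url, body


def build_select_url(query, rows=10):
    """Return the URL for a query against Solr's /select handler."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return SOLR_BASE + "/select?" + urlencode(params)


# Two records with different fields: the documents are semi-structured.
# The field names are made up; the suffixes assume dynamic field rules
# such as *_s (string) and *_i (integer) exist in the schema.
docs = [
    {"id": "1", "title_s": "Lucene in Action", "pages_i": 500},
    {"id": "2", "title_s": "Solr notes", "author_s": "someone"},
]

update_url, update_body = build_update_request(docs)
select_url = build_select_url("title_s:Solr*")

print(update_url)
print(select_url)
```

Any HTTP client in any language can send these requests, which is what makes the last point in the list hold for Solr.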
A dedicated SQL or NoSQL database will certainly offer more database-related features than Lucene / Solr, and it offers better storage persistence.
Lucene / Solr is designed as a search engine, but it has some of the capabilities of a database. You can use it as a document store, provided the data is also persisted somewhere else. Lucene / Solr should not be used as the primary database or data store. Keep the data in a persistent store, which could be a file system, a database, or an archive store, and also store and index it in Lucene / Solr. The application then retrieves information from the Lucene index. In the end, the goal is to make the application faster.
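The arrangement described above can be sketched in a few lines. This is a toy illustration, not Lucene or Solr code: an in-memory SQLite table plays the persistent store, and a small hand-written inverted index (the hypothetical `ToyIndex` class) stands in for the search index. The point it shows is that the index can always be rebuilt from the persistent store, which is why the store, not the index, holds the primary copy.

```python
import sqlite3
from collections import defaultdict


class ToyIndex:
    """Stand-in for a Lucene/Solr index: term -> document ids, plus stored text."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.stored = {}

    def add(self, doc_id, text):
        # Store the original text and index each lowercased, whitespace-split term.
        self.stored[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # Return matching document ids in sorted order.
        return sorted(self.postings.get(term.lower(), set()))


def save(db, index, doc_id, text):
    """Write to the persistent store first, then to the index."""
    db.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, text))
    db.commit()
    index.add(doc_id, text)


def rebuild(db):
    """Recreate the index from the persistent store, e.g. after index loss."""
    index = ToyIndex()
    for doc_id, text in db.execute("SELECT id, body FROM docs"):
        index.add(doc_id, text)
    return index


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")

index = ToyIndex()
save(db, index, "a1", "Solr as a document store")
save(db, index, "a2", "Lucene index internals")

hits = index.search("solr")        # the application reads from the index
index = rebuild(db)                # simulate losing and rebuilding the index
hits_after = index.search("solr")  # same answer after the rebuild
print(hits, hits_after)
```

The sketch does not handle updates or deletes; a real setup would also need to keep the two copies in sync when records change.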
Reference:
Guardian is using Solr as its database
Lucene and Solr as NoSQL database
Related Products
Sql-to-mongo-importer - Solr like data import handler for mongo db to import data from SQL databases
Introduction: This project moved to http://code.google.com/p/sql-to-nosql-importer/ sql-to-mongo-importer project reads from sql databases, converts and then inserts them into mongodb. For this purpose it uses one properties file (import.properties) where mongodb related settings are listed and one xml file with sql database related settings and de-normalized schema and fields. For more info http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml But the configuration file sql
Solr
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
Orient - NoSQL document database light, portable and fast. Supports ACID Tx, Indexes, asynch queries
What is Orient? OrientDB is an Open Source NoSQL DBMS with both the features of Document and Graph DBMSs. It's written in Java and it's amazing fast: can store up to 150,000 records per second on common hardware. Even if it's Document based database the relationships are managed as in Graph Databases with direct connections among records. You can traverse entire or part of trees and graphs of records in few milliseconds. Supports schema-less, schema-full and schema-mixed modes. Has a strong secur
Alchemydatabase - A Hybrid Relational-Database/NOSQL-Datastore
Alchemy Database: A Hybrid RDBMS/NOSQL-Datastore. Alchemy Database is a low-latency high-TPS NewSQL RDBMS embedded in the NOSQL datastore redis. Extensive datastore-side-scripting is provided via deeply embedded Lua. Unstructured data, can also be stored, as there are no limits on #tables, #indexes, #columns, and sparsely populated rows use minimal memory. AlchemyDB believes OLTP traffic's needs are best served by extending SQL and has recently added the following experimental functionalities: Lua
Spatial Solr Plugin for Lucene and Solr
With the continuous efforts of adjusting search results to focused target audiences, there's an increasing demand for incorporating geographical location information into the standard search functionality. Spatial Solr Plugin (SSP) is a free, standalone plug-in which enables Geo / Location Based Search, and is built on top of the open source projects Apache Solr and Apache Lucene.
Solr/Lucene on Azure
This project hosts Solr/Lucene in Windows Azure using multi-instance replication for index-serving and single-instance for index generation with a persistent index mounted in Azure storage. Typical scenarios could be a commercial and publisher sites that need to scale the traffic with increasing query volume and need to index maximum 16 TB of data and require couple of index updates per day.
Acidhouse - NoSQL Killer Tune
IMPORTANT NOTICE: The project has been moved to GitHub and this site will be closed by July, 2012. Please follow https://github.com/eiichiro/acidhouse Acid House is a generic Java client library for NoSQL datastores developed by Eiichiro Uchiumi (Eiichiro.org) and can be performed on JDK 6 and later Java platform. Acid House's goal is to offer human-friendly API based on NoSQL application's best practices. Acid House (will) supports the following NoSQL datastores: Google App Engine Datastore Apac
Lusql - Plugable pipelined threaded extract transform load (ETL), default from JDBC to Lucene
LuSql is a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application for the construction of a Lucene index from an arbitrary SQL query of a JDBC-accessible SQL database. It allows a user to control a number of parameters, including the SQL query to use, individual indexing/storage/term-vector nature of fields, analyzer, stop word list, and other tuning parameters. In its default mode it uses threading to take advantage of multiple core
Solphr - An object oriented PHP5 library for accessing Solr search indices.
Solphr (pronounced "sulfur") is an object-oriented PHP5 library designed for ease of use and deployment for communicating with Solr. Why a new library? In the process of deploying Solr for searching video, the developers at Frameweld determined that the pre-existing solutions were less than satisfactory. While the solutions offered access to Solr, they were either missing too much functionality (the existing pure PHP offerings) or had an API that was overcomplicated and impossible to learn (the P
Lily - Content Repository
Lily offers an open source content repository. It is the first cloud-scalable repository for social content applications. It is built from ground up using Big Data and NOSQL technology. Its technology stack includes Hadoop, HBase and Solr. It could be used in document archiving, large-scale SaaS-model web content management, heritage databases, news libraries, digital asset management, content collections, structured data management.