Whats new in Lucene / Solr 4.0

  •        0
  

We aggregate and tag open source projects. We have collections of more than one million projects. Check out the projects section.



The release 4.0 is one of the important milestone for Lucene and Solr. It has lot of new features and performance important. Few important ones are highliggted in this article.

Lucene:
  • ColumnStrideFields: DocValues are stored on a per-document basis where each documents field can hold exactly one value of a given type.
  • Facet search, this feature was already available in Solr. It is now integrated to Lucene.
  • With flexible indexing it is now possible for an application to create its own postings codec, to alter how fields, terms, docs and positions are encoded into the index.
  • Added a variety of different relevance ranking systems to Lucene.
  • Added a Codec implementation that works with append-only filesystems (such as e.g. Hadoop DFS).
  • Added DirectSpellChecker, which retrieves correction candidates directly from the term dictionary using levenshtein automata.
  • Indexed terms are no longer UTF-16 char sequences, instead terms can be any binary value encoded as byte arrays. By default, text terms are now encoded as UTF-8 bytes.
  • Substantially faster performance when using a Filter during searching.
  • FuzzyQuery is 100-200 times faster than in past releases.
  • Added index statistics such as the number of tokens for a term or field, number of postings for a field, and number of documents with a posting for a field.
  • Added RegexpQuery support to contrib/queryparser.
Solr: Solr 4.0-alpha includes more NoSQL features for those using Solr as a primary data store.
  • Distributed indexing designed from the ground up for near real-time (NRT) and NoSQL features such as realtime-get, optimistic locking, and durable updates.
  • High availability with no single points of failure.
  • Apache Zookeeper integration for distributed coordination and cluster metadata and configuration storage.
  • Updates sent to any node in the cluster and are automatically forwarded to the correct shard and replicated to multiple nodes for redundancy.
  • Queries sent to any node automatically perform a full distributed search across the cluster with load balancing and fail-over.
  • A transaction log ensures that even uncommitted documents are never lost.
  • Real-time Get – The ability to quickly retrieve the latest version of a document, without the need to commit or open a new searcher.
  • Atomic updates - the ability to add, remove, change, and increment fields of an existing document without having to send in the complete document again.
  • Pivot Faceting – Multi-level or hierarchical faceting where the top constraints for one field are found for each top constraint of a different field.
  • Pseudo-Join functionality – The ability to select a set of documents based on their relationship to a second set of documents.
  • A brand new web admin interface, including support for SolrCloud.
Reference:
http://lucene.apache.org/core/4_0_0-ALPHA/changes/Changes.html
http://lucene.apache.org/solr/

   

We publish blog post about open source products. If you are interested in sharing knowledge about open source products, please visit write for us

Subscribe to our newsletter.

We will send mail once in a week about latest updates on open source tools and technologies. subscribe our newsletter



Related Articles

Lucene Vs Solr

  • searchengine lucene solr

Lucene is a search library built in Java. Solr is a web application built on top of Lucene. Certainly Solr = Lucene + Added features. Often there would a question, when to choose Solr and when to choose Lucene.

Read More


Why Elastic Search is gaining more popularity than Solr?

  • solr elastic-search search-engine

Solr and Elastic Search both are built on top of Lucene library. Both are compratively equal. Any new feature / enhancement which get introduced in Lucene, will also get added to Solr. But still Elastic Search which uses Lucene as it core gained more popularity than Solr in recent years.

Read More


Lucene / Solr as NoSQL database

  • lucene solr no-sql nosql document-store

Lucene and Solr are most popular and widely used search engine. It indexes the content and delivers the search result faster. It has all capabilities of NoSQL database. This article describes about its pros and cons.

Read More


8 Best Open Source Searchengines built on top of Lucene

  • lucene solr searchengine elasticsearch

Lucene is most powerful and widely used Search engine. Here is the list of 7 search engines which is built on top of Lucene. You could imagine how powerful they are.

Read More


LucidWorks Vs SearchBlox - Enterprise Search Solution

  • lucene solr searchblox lucidworks enterprise-search

Enterprise search software should be capable to search the data available in the entire organization or personnel desktop. The data could be in File system, Web or in Database. It should search contents of Emails, file formats like doc, xls, ppt, pdf and lot more. There are many commercial products available but LucidWorks and SearchBlox are best and free.

Read More



Solr vs Elastic Search

  • full-text-search search-engine lucene solr elastic-search

Solr and Elastic Search are built on top of Lucene. Both are open source and both have extra features which makes programmer life easy. This article explains the difference and the best situation to choose between them.

Read More


Restrict Solr Admin Access

  • solr searchengine tips

Solr is a search engine built on top of Lucene. It supports REST interface and has lot of built-in capabilities. Solr package has Admin UI interface which has support to perform query and even delete the contents of the index. If you are using Solr in production then you may need to restrict access. I saw couple of questions in the group related to this topic. Thought to write an article explaining few tips to restrict the user access to Solr admin UI.

Read More


An introduction to LucidWorks Enterprise Search

  • lucene solr search engine enterprise

Lucidworks Enterprise search solution is built on top of Apache Solr. It scales seamlessly w/sub-second response times under extreme query loads for multi-billion document collections. It has user friendly UI, which does all the job of configuration and search.

Read More


How to make money from Open Source

  • opensource how-to money

As open source getting popular day by day, many have questions like How to make money from Open Source? Lot more products are getting introduced and don't know who is making money. Certainly open source means, give the product and source for free then how to make money? Yes sell the product for free but get paid for its services.

Read More


Why require Searchengine? Why not use database for full text search in Enterprise application.

  • searchengine database

Most of the database has support of full text search, basically indexing and saarching. MySQL, Oracle and many more databases has in-built full text search. Then what is the need to go for external search engine like Lucene, Sphinx, Solr etc. Check out the advantage of using Searchengine.

Read More


Scene.js - Library to Create Timeline-Based Animation

  • scenejs css timeline javascript animation motion

Scene.js is a JavaScript timeline-based animation library for creating animation websites. As an animated timeline library, it allows you to create a chronological order of movements and positions of objects.

Read More


Exonum Blockchain Framework by the Bitfury Group

  • blockchain bitcoin hyperledger blockchain-framework

Exonum is an extensible open source blockchain framework for building private blockchains which offers outstanding performance, data security, as well as fault tolerance. The framework does not include any business logic, instead, you can develop and add the services that meet your specific needs. Exonum can be used to build various solutions from a document registry to a DevOps facilitation system.

Read More


Top 15 Open source alternative to Microsoft products

  • microsoft-alternative open-source-enterprise

Microsoft is monopoly in the commercial software. Here are 15 best alternatives to most popular and widely used Microsoft products.

Read More


Apache OpenNLP - Document Classification

  • opennlp natural-language-processing nlp document-classification

Apache OpenNLP is a library for natural language processing using machine learning. In this article, we will explore document/text classification by training with sample data and then execute to get its results. We will use plain training model as one example and then training using Navie Bayes Algorithm.

Read More


Top 10 AI development tools which you should know in 2020

  • artificial-Intelligence neural-networks frameworks

It is a fact the 2020 is not going the way we expected to be but when it comes to technology breakthrough we can say 2020 will be the heir of greatness. <br />Speaking of technical breakthroughs we have got artificial intelligence which is known to be taking over the mankind like a wildfire. Everything around us is connected through AI be it shopping travelling or even reading. Every other activity of ours is transforming into a whole new extent.

Read More


RESTEasy - A guide to implement CRUD Rest API

  • resteasy rest-api java framework

RESTEasy is a JBoss project that provides various frameworks to help you build RESTful Web Services and RESTful Java applications. It is a fully certified and portable implementation of the JAX-RS 2.1 specification, a JCP specification that provides a Java API for RESTful Web Services over the HTTP protocol. It is licensed under the Apache 2.0 license.

Read More


Open source is the backbone for Startups

  • opensource open-source startups

Many startups are entering in to the business due to open source. Open source acts as a back bone / pillar for their business. It reduces the cost of production, Generates revenue from consulting and support. This article describes about the startups which flourished because of open source. Sun acquired MySQL for $1Bn is the biggest achievement for open source startups.

Read More


MailHog - Web and API based SMTP testing

  • smtp-testing testing-tool smtp test automation email-server email

Most of the projects will have a requirement of sending and receiving mails. We have mentioned about GreenMail - Email Test Framework, in our previous article about API based SMTP testing. In this article, we discuss about MailHog - Web and API based SMTP testing. You send out a mail from your code and you can check it via web visually and also via API. Those who do API testing can check via API. Developers may want to visually verify the format of the mail. MailHog is a best bet for SMTP testing.

Read More


WebSocket implementation with Spring Boot

  • websocket web-sockets spring-boot java

Spring Boot is a microservice-based Java framework used to create web application. WebSocket API is an advanced technology that provides full-duplex communication channels over a single TCP connection. This article explains about how to implement WebSocket using Spring Boot.

Read More


Desktop Apps using Electron JS with centralized data control

  • electronjs couchdb pouchdb desktop-app

When there is a requirement for having local storage for the desktop application context and data needs to be synchronized to central database, we can think of Electron with PouchDB having CouchDB stack. Electron can be used for cross-platform desktop apps with pouch db as local storage. It can sync those data to centralized database CouchDB seamlessly so any point desktop apps can recover or persist the data. In this article, we will go through of creation of desktop apps with ElectronJS, PouchDB and show the sync happens seamlessly with remote CouchDB.

Read More







We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.