Whats new in Lucene / Solr 4.0

  •        0
  

We aggregate and tag open source projects. We have collections of more than one million projects. Check out the projects section.



The release 4.0 is one of the important milestone for Lucene and Solr. It has lot of new features and performance important. Few important ones are highliggted in this article.

Lucene:
  • ColumnStrideFields: DocValues are stored on a per-document basis where each documents field can hold exactly one value of a given type.
  • Facet search, this feature was already available in Solr. It is now integrated to Lucene.
  • With flexible indexing it is now possible for an application to create its own postings codec, to alter how fields, terms, docs and positions are encoded into the index.
  • Added a variety of different relevance ranking systems to Lucene.
  • Added a Codec implementation that works with append-only filesystems (such as e.g. Hadoop DFS).
  • Added DirectSpellChecker, which retrieves correction candidates directly from the term dictionary using levenshtein automata.
  • Indexed terms are no longer UTF-16 char sequences, instead terms can be any binary value encoded as byte arrays. By default, text terms are now encoded as UTF-8 bytes.
  • Substantially faster performance when using a Filter during searching.
  • FuzzyQuery is 100-200 times faster than in past releases.
  • Added index statistics such as the number of tokens for a term or field, number of postings for a field, and number of documents with a posting for a field.
  • Added RegexpQuery support to contrib/queryparser.
Solr: Solr 4.0-alpha includes more NoSQL features for those using Solr as a primary data store.
  • Distributed indexing designed from the ground up for near real-time (NRT) and NoSQL features such as realtime-get, optimistic locking, and durable updates.
  • High availability with no single points of failure.
  • Apache Zookeeper integration for distributed coordination and cluster metadata and configuration storage.
  • Updates sent to any node in the cluster and are automatically forwarded to the correct shard and replicated to multiple nodes for redundancy.
  • Queries sent to any node automatically perform a full distributed search across the cluster with load balancing and fail-over.
  • A transaction log ensures that even uncommitted documents are never lost.
  • Real-time Get – The ability to quickly retrieve the latest version of a document, without the need to commit or open a new searcher.
  • Atomic updates - the ability to add, remove, change, and increment fields of an existing document without having to send in the complete document again.
  • Pivot Faceting – Multi-level or hierarchical faceting where the top constraints for one field are found for each top constraint of a different field.
  • Pseudo-Join functionality – The ability to select a set of documents based on their relationship to a second set of documents.
  • A brand new web admin interface, including support for SolrCloud.
Reference:
http://lucene.apache.org/core/4_0_0-ALPHA/changes/Changes.html
http://lucene.apache.org/solr/


We aggregate and tag open source projects. We have collections of more than one million projects. Check out the projects section.


   




Related Articles

Lucene Vs Solr

  • searchengine lucene solr

Lucene is a search library built in Java. Solr is a web application built on top of Lucene. Certainly Solr = Lucene + Added features. Often there would a question, when to choose Solr and when to choose Lucene.

Read More


Why Elastic Search is gaining more popularity than Solr?

  • solr elastic-search search-engine

Solr and Elastic Search both are built on top of Lucene library. Both are compratively equal. Any new feature / enhancement which get introduced in Lucene, will also get added to Solr. But still Elastic Search which uses Lucene as it core gained more popularity than Solr in recent years.

Read More


Lucene / Solr as NoSQL database

  • lucene solr no-sql nosql document-store

Lucene and Solr are most popular and widely used search engine. It indexes the content and delivers the search result faster. It has all capabilities of NoSQL database. This article describes about its pros and cons.

Read More


8 Best Open Source Searchengines built on top of Lucene

  • lucene solr searchengine elasticsearch

Lucene is most powerful and widely used Search engine. Here is the list of 7 search engines which is built on top of Lucene. You could imagine how powerful they are.

Read More


LucidWorks Vs SearchBlox - Enterprise Search Solution

  • lucene solr searchblox lucidworks enterprise-search

Enterprise search software should be capable to search the data available in the entire organization or personnel desktop. The data could be in File system, Web or in Database. It should search contents of Emails, file formats like doc, xls, ppt, pdf and lot more. There are many commercial products available but LucidWorks and SearchBlox are best and free.

Read More


Solr vs Elastic Search

  • full-text-search search-engine lucene solr elastic-search

Solr and Elastic Search are built on top of Lucene. Both are open source and both have extra features which makes programmer life easy. This article explains the difference and the best situation to choose between them.

Read More


Restrict Solr Admin Access

  • solr searchengine tips

Solr is a search engine built on top of Lucene. It supports REST interface and has lot of built-in capabilities. Solr package has Admin UI interface which has support to perform query and even delete the contents of the index. If you are using Solr in production then you may need to restrict access. I saw couple of questions in the group related to this topic. Thought to write an article explaining few tips to restrict the user access to Solr admin UI.

Read More


An introduction to LucidWorks Enterprise Search

  • lucene solr search engine enterprise

Lucidworks Enterprise search solution is built on top of Apache Solr. It scales seamlessly w/sub-second response times under extreme query loads for multi-billion document collections. It has user friendly UI, which does all the job of configuration and search.

Read More


Why require Searchengine? Why not use database for full text search in Enterprise application.

  • searchengine database

Most of the database has support of full text search, basically indexing and saarching. MySQL, Oracle and many more databases has in-built full text search. Then what is the need to go for external search engine like Lucene, Sphinx, Solr etc. Check out the advantage of using Searchengine.

Read More


How to make money from Open Source

  • opensource how-to money

As open source getting popular day by day, many have questions like How to make money from Open Source? Lot more products are getting introduced and don't know who is making money. Certainly open source means, give the product and source for free then how to make money? Yes sell the product for free but get paid for its services.

Read More


Top 15 Open source alternative to Microsoft products

  • microsoft-alternative open-source-enterprise

Microsoft is monopoly in the commercial software. Here are 15 best alternatives to most popular and widely used Microsoft products.

Read More


Open source is the backbone for Startups

  • opensource open-source startups

Many startups are entering in to the business due to open source. Open source acts as a back bone / pillar for their business. It reduces the cost of production, Generates revenue from consulting and support. This article describes about the startups which flourished because of open source. Sun acquired MySQL for $1Bn is the biggest achievement for open source startups.

Read More


Microsoft released F# under Open Source

  • fsharp opensource

F# is a functional programming language for the .NET Framework. It combines the succinct, expressive and compositional style of functional programming with the runtime, libraries, interoperability, and object model of .NET. Microsoft recently released its source code under Apache License.

Read More


Appserver.io – The First Multithreaded Application Server for PHP written in PHP

  • appserver application-server php

What if you could reliably run PHP without Nginx or Apache, but also without relying on its internal server? What if you could do async operations in PHP with true multi threading, fully taking advantage of multi core processors without hacks or a jungle of callbacks? What if you had drag and drop installation support for your PHAR packaged web apps in an environment identical to its production counterpart? Welcome to appserver.io – the worlds first open source application server for PHP.

Read More


4 Free Code Snippets Hosting Sites

  • code-hosting code code-snippet

Most of us would be familar about the project hosting sites like sourceforge, github etc. These sites hosts the complete project, documentation and has a means to track bugs etc. Sometimes we may need to host the simple pieace of code which we may be want to share it with our friends or to post the link in the forum to discuss more about the code.

Read More


Generate Thumbnail in Java using Thumbnailator library 

  • thumbnail image-processing java

In our work there will be situation where we need to resize the image, generate thumbnails and so on. Users need to have little bit of image processing knowledge to achieve it. We have Java ImageIO APIs to achieve these functionalities. As said, we need to be aware of or spend time in learning these APIs. To help us, Thumbnailator library provides easy fluent style API and generates thumbnail in simple three lines of code.

Read More


New Open Source Web Browser Engine from Google and Mozilla

  • browser browser-engine

Web Browser engine used to render the page from HTML, CSS and Javascript. The browsers are one of the most important tool to view the web and it is must have software in Desktop and Mobile. These new browsers will take advantage of tomorrow’s faster, multi-core, heterogeneous computing desktop and mobile architectures.

Read More


How to contribute to open source

  • opensource contribute participate foss

I could see many many students posting this question in many forums, I want to contribute to open source but How to contribute? There are many ways to do that. I have listed a few and I hope it might be useful.

Read More


React Patent Clause Licensing issue. Is it something to worry?

  • react react-license facebook

React libraries from Facebook is one of the most used UI libraries. It is competitive to AngularJS. There are many open source UI components or frameworks available but mostly people narrow down to two choices Angular / React. Recently Facebook has updated React license and added a patent clause which makes companies to worry and rethink whether to use React or not.

Read More


Microweber CMS - An open source CMS with Ecommerce support

  • cms e-commerce microweber

To the user's satisfaction, there is a whole wide world of different CMS, all suitable for different needs. You can go for the giants like Wordpress or Joomla or pick one of the rising forces - Shopify, Squarespace or others. Microweber CMS fills a hole in the current technological ecosystem, aimed at delivering a light software that is perfect for all end-users lacking the technical knowledge needed for complicated website building.

Read More