How to build meta search engine

  •        0
  

We aggregate and tag open source projects. We have collections of more than one million projects. Check out the projects section.



Meta Search engine is nothing but a search engine which searches more than one search engine and combines or filters the results. Each search engine has its own proprietary ranking mechanism to rank the results. When combined the search results from all leading search engines would be more informative and useful. With less page traversals we will end up our destination.

Google, Yahoo and Bing search engines are most used and provide an api to search. Call their web service API, get the results independently. Now we have 3 set of results, we need to combine or filter them. How to do that? This is the most trickiest part and it requires better algorithm.


Carrot2 - An open source search results clustering engine could be used to cluster the results. It has two kinds of algorithm (Lingo, STC)to cluster the results. All the above APIs has support of REST interface and it is easy to code in your desired programming language.

Most of the search engines support API to search web, news, videos, images etc. You could search based on your need. Another good use case would be, most of the companies monitor the web, social networking sites to get feedback (good and bad news) about their product, about their competitor products, search on the keyword across the searchengine, cluster the results and analyze the results for research.

Reference:

   

We publish blog post about open source products. If you are interested in sharing knowledge about open source products, please visit write for us

Subscribe to our newsletter.

We will send mail once in a week about latest updates on open source tools and technologies. subscribe our newsletter



Related Articles

Why require Searchengine? Why not use database for full text search in Enterprise application.

  • searchengine database

Most of the database has support of full text search, basically indexing and saarching. MySQL, Oracle and many more databases has in-built full text search. Then what is the need to go for external search engine like Lucene, Sphinx, Solr etc. Check out the advantage of using Searchengine.

Read More


Lucene Vs Solr

  • searchengine lucene solr

Lucene is a search library built in Java. Solr is a web application built on top of Lucene. Certainly Solr = Lucene + Added features. Often there would a question, when to choose Solr and when to choose Lucene.

Read More


8 Best Open Source Searchengines built on top of Lucene

  • lucene solr searchengine elasticsearch

Lucene is most powerful and widely used Search engine. Here is the list of 7 search engines which is built on top of Lucene. You could imagine how powerful they are.

Read More


How to create SEO friendly url

  • seo url searchengine

SEO friendly URL is recommended for any website which wants to be indexed and wants its presence in search results. Searchengine mostly index the static URL. It will avoid the URL which has lot of query strings. Almost all websites generate content dynamically then how could the URL be static. That is the job of the programmer.

Read More


Restrict Solr Admin Access

  • solr searchengine tips

Solr is a search engine built on top of Lucene. It supports REST interface and has lot of built-in capabilities. Solr package has Admin UI interface which has support to perform query and even delete the contents of the index. If you are using Solr in production then you may need to restrict access. I saw couple of questions in the group related to this topic. Thought to write an article explaining few tips to restrict the user access to Solr admin UI.

Read More



An introduction to LucidWorks Enterprise Search

  • lucene solr search engine enterprise

Lucidworks Enterprise search solution is built on top of Apache Solr. It scales seamlessly w/sub-second response times under extreme query loads for multi-billion document collections. It has user friendly UI, which does all the job of configuration and search.

Read More


Introduction to Light 4J Microservices Framework

  • light4j microservice java programming framework

Light 4j is fast, lightweight, secure and cloud native microservices platform written in Java 8. It is based on pure HTTP server without Java EE platform. It is hosted by server UnderTow. Light-4j and related frameworks are released under the Apache 2.0 license.

Read More


Introduction to Apache Cassandra

  • cassandra database nosql

Apache Cassandra was designed by Facebook and was open-sourced in July 2008. It is regarded as perfect choice when the users demand scalability and high availability without any impact towards performance. Apache Cassandra is highly scalable, high-performance distributed database designed to handle large voluminous amounts of data across many commodity servers with no failure.

Read More


Advanced Programming Guide in Redis using Jedis

  • redis jedis advanced-guide cluster pipline publish-subscribe

Redis is an in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. This blog covers the advanced concepts like cluster, publish and subscribe, pipeling concepts of Redis using Jedis Java library.

Read More


How to install and setup Redis

  • redis install setup redis-cluster

Redis is an open source (BSD licensed), in-memory data structure store, used also as a database cache and message broker. It is written in ANSI C and works in all the operating systems. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. This article explains about how to install Redis.

Read More


Whats new in Lucene / Solr 4.0

  • lucene solr new-release

The release 4.0 is one of the important milestone for Lucene and Solr. It has lot of new features and performance important. Few important ones are highliggted in this article.

Read More


PySpark: Installation, RDDs, MLib, Broadcast and Accumulator

  • pyspark spark python rdd big-data

We knew that Apace Spark- the most famous parallel computing model or processing the massive data set is written in Scala programming language. The Apace foundation offered a tool to support the Python in Spark which was named PySpark. The PySpark allows us to use RDDs in Python programming language through a library called Py4j. This article provides basic introduction about PySpark, RDD, MLib, Broadcase and Accumulator.

Read More


AbanteCart - Easy to use open source e-commerce platform, helps selling online

  • e-commerce ecommerce cart

AbanteCart is a free, open source shopping cart that was built by developers with a passion for free and accessible software. Founded in 2010 (launched in 2011), the platform is coded in PHP and supports MySQL. AbanteCart’s easy to use admin and basic layout management tool make this open source solution both easy to use and customizable, depending on the skills of the user. AbanteCart is very user-friendly, it is entirely possible for a user with little to no coding experience to set up and use this cart. If the user would be limited to the themes and features available in base AbanteCart, there is a marketplace where third-party extensions or plugins come to the rescue.

Read More


Desktop Apps using Electron JS with centralized data control

  • electronjs couchdb pouchdb desktop-app

When there is a requirement for having local storage for the desktop application context and data needs to be synchronized to central database, we can think of Electron with PouchDB having CouchDB stack. Electron can be used for cross-platform desktop apps with pouch db as local storage. It can sync those data to centralized database CouchDB seamlessly so any point desktop apps can recover or persist the data. In this article, we will go through of creation of desktop apps with ElectronJS, PouchDB and show the sync happens seamlessly with remote CouchDB.

Read More


An introduction to web cache proxy server - nuster

  • web-cache proxy-server load-balancer

Nuster is a simple yet powerful web caching proxy server based on HAProxy. It is 100% compatible with HAProxy, and takes full advantage of the ACL functionality of HAProxy to provide fine-grained caching policy based on the content of request, response or server status. This article gives an overview of nuster - web cache proxy server, its installation and few examples of how to use it.

Read More


New Open Source Web Browser Engine from Google and Mozilla

  • browser browser-engine

Web Browser engine used to render the page from HTML, CSS and Javascript. The browsers are one of the most important tool to view the web and it is must have software in Desktop and Mobile. These new browsers will take advantage of tomorrow’s faster, multi-core, heterogeneous computing desktop and mobile architectures.

Read More


Thymeleaf - Text display, Iteration and Conditionals

  • thymeleaf template-engine web-programming java

Thymeleaf is a server-side Java template engine for both web and standalone environments. It is a better alternative to JavaServer Pages (JSP). Spring MVC and Thymeleaf compliment each other if chosen for web application development. In this article, we will discuss how to use Thymeleaf.

Read More


WebSocket implementation with Spring Boot

  • websocket web-sockets spring-boot java

Spring Boot is a microservice-based Java framework used to create web application. WebSocket API is an advanced technology that provides full-duplex communication channels over a single TCP connection. This article explains about how to implement WebSocket using Spring Boot.

Read More


Push Notifications using Angular

  • angular push-notifications notifications

Notifications is a message pushed to user's device passively. Browser supports notifications and push API that allows to send message asynchronously to the user. Messages are sent with the help of service workers, it runs as background tasks to receive and relay the messages to the desktop if the application is not opened. It uses web push protocol to register the server and send message to the application. Once user opt-in for the updates, it is effective way of re-engaging users with customized content.

Read More


JWT Authentication using Auth0 Library

  • java jwt authentication security

Json Web Token shortly called as JWT becomes defacto standard for authenticating REST API. In a traditional web application, once the user login credentials are validated, loggedin user object will be stored in session. Till user logs out, session will remain and user can work on the web application without any issues. Rest world is stateless, it is difficult to identify whether the user is already authenticated. One way is to use authenticate every API but that would be too expensive task as the client has to provide credentials in every API. Another approach is to use token.

Read More







We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.