How to Build a Meta Search Engine
A meta search engine is a search engine that queries more than one search engine and then combines or filters the results. Each search engine has its own proprietary mechanism for ranking results, so combining the results from all the leading search engines can be more informative and useful than relying on any one of them. With fewer page traversals, we reach our destination.
Google, Yahoo and Bing are the most widely used search engines, and each provides an API for search. Call each web service API and collect the results independently. Now we have three sets of results, and we need to combine or filter them. How do we do that? This is the trickiest part, and it requires a good algorithm.
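One common way to combine several ranked lists is reciprocal rank fusion, where each engine contributes 1 / (k + rank) to a URL's score. The sketch below is only an illustration of that idea, not the method this article prescribes. The function names, the constant k=60, the URL normalization rules and the sample result lists are all assumptions, and the real API calls are left out.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a comparison key so the same page from two engines matches.

    Assumption: scheme, a leading "www." and a trailing slash do not matter.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(result_lists, k=60):
    """Merge ranked URL lists with reciprocal rank fusion.

    result_lists maps an engine name to its URLs, best result first.
    Returns (url, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    first_seen = {}  # normalized key -> first URL spelling encountered
    for urls in result_lists.values():
        seen_here = set()
        for rank, url in enumerate(urls, start=1):
            key = normalize_url(url)
            if key in seen_here:
                continue  # count each page once per engine
            seen_here.add(key)
            scores[key] += 1.0 / (k + rank)
            first_seen.setdefault(key, url)
    # Sort by score, then by key so that ties come out in a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(first_seen[key], score) for key, score in ranked]


# Hypothetical results, standing in for three real API responses.
sample = {
    "google": ["https://example.com/a", "https://example.org/b", "https://example.net/c"],
    "yahoo": ["http://www.example.org/b/", "https://example.com/a", "https://example.edu/d"],
    "bing": ["https://example.org/b", "https://example.edu/d", "https://example.com/a"],
}
merged = merge_results(sample)
for url, score in merged:
    print(f"{score:.4f}  {url}")
```

Pages that several engines rank near the top rise to the top of the merged list, while a page returned by only one engine falls toward the bottom. Filtering can be layered on top, for example by keeping only the URLs that at least two engines returned.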
Carrot2, an open source search results clustering engine, can be used to cluster the results. It offers two clustering algorithms, Lingo and STC. All of the above APIs support a REST interface, so they are easy to call from the programming language of your choice.
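To give a feel for what clustering search results means, here is a deliberately naive stand-in that groups results by a term their titles share. It is not Lingo or STC and does not call Carrot2. The function names, the stopword list and the sample titles are assumptions made up for the illustration.

```python
import re
from collections import defaultdict

# A tiny stopword list, assumed for this example only.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokens(text):
    """Lowercase word set of a title, without stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def cluster_by_term(results, query, min_size=2):
    """Group (title, url) pairs under a term that their titles share.

    Returns a dict mapping a cluster label to a list of URLs. Results that
    fit no group of at least min_size items land under "(other)".
    """
    query_terms = tokens(query)  # the query itself cannot tell results apart
    by_term = defaultdict(list)
    for title, url in results:
        for term in tokens(title) - query_terms:
            by_term[term].append(url)

    clusters = {}
    assigned = set()
    # Largest groups first; ties broken alphabetically for a stable result.
    for term, urls in sorted(by_term.items(), key=lambda item: (-len(item[1]), item[0])):
        members = [u for u in urls if u not in assigned]
        if len(members) >= min_size:
            clusters[term] = members
            assigned.update(members)

    leftovers = [url for _, url in results if url not in assigned]
    if leftovers:
        clusters["(other)"] = leftovers
    return clusters


# Hypothetical merged results for the ambiguous query "jaguar".
sample_results = [
    ("Jaguar car review", "https://example.com/1"),
    ("Jaguar car dealers", "https://example.com/2"),
    ("Jaguar habitat and diet", "https://example.com/3"),
    ("Jaguar habitat loss", "https://example.com/4"),
    ("Jaguar guitar", "https://example.com/5"),
]
clusters = cluster_by_term(sample_results, "jaguar")
for label, urls in clusters.items():
    print(label, urls)
```

A real clustering engine works on snippets as well as titles and picks far better labels, but the shape of the output is the same: a handful of labeled groups instead of one long list.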
Most search engines offer APIs for searching the web, news, videos, images and so on, so you can search whichever type of content you need. Another good use case is monitoring. Many companies watch the web and social networking sites for feedback, both good and bad, about their own products and their competitors' products. They can search for a keyword across the search engines, cluster the results, and analyze them for research.