Carrot2 - Search Results Clustering Engine


Carrot2 is an open source search results clustering engine. It clusters search results from various sources and groups them into small collections of related documents. Carrot2 offers ready-to-use components for fetching search results from sources including the Yahoo API, Google API, Bing API, eTools Meta Search, Lucene, Solr, Google Desktop and more.
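To give a feel for what grouping search results means, here is a toy sketch in Python. It is not Carrot2's algorithm or API: the function name cluster_by_shared_term, the stopword list and the sample snippets are all made up for illustration, and the approach (one group per term shared by several snippets) is far simpler than what a real clustering engine does.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def cluster_by_shared_term(snippets, min_size=2):
    """Group snippet indices under each non-stopword term that at least
    min_size snippets share. A snippet may land in several groups."""
    term_to_docs = defaultdict(set)
    for idx, text in enumerate(snippets):
        for word in text.lower().split():
            word = word.strip(".,:;!?()")
            if word and word not in STOPWORDS:
                term_to_docs[word].add(idx)
    clusters = {
        term: sorted(docs)
        for term, docs in term_to_docs.items()
        if len(docs) >= min_size
    }
    # Largest groups first; ties are broken alphabetically by label.
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


results = [
    "Jaguar car dealership and prices",
    "Jaguar habitat in the rainforest",
    "Used car prices",
    "Rainforest animals",
]
clusters = cluster_by_shared_term(results)
for label, doc_ids in clusters:
    print(label, doc_ids)
```

With these sample snippets, each shared term (car, jaguar, prices, rainforest) labels a group of two documents, which shows how an ambiguous query such as "jaguar" can be split into separate topics.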


It is implemented in Java. A native C# implementation of the API is also available; it does not require a Java runtime, and its performance is comparable to the Java version. Carrot2 also provides a REST interface, which can be called from languages such as PHP and Ruby.

If you have search instances running on multiple nodes and a search has to be performed across them, you need a way to combine, filter and sort the results. Carrot2 helps to do this job efficiently. It is well suited to work with Lucene, Solr and Nutch.
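The combine, filter and sort step can be sketched in a few lines of Python. This is only an illustration of the general idea, not Carrot2 code: the function merge_node_results, the hit fields "id" and "score", and the sample data are assumptions, and each node is assumed to return its hits already sorted by descending score.

```python
import heapq


def merge_node_results(per_node_results, min_score=0.0, limit=10):
    """Merge per-node hit lists (each sorted by descending score) into a
    single ranked list, dropping low scores and duplicate document ids."""
    # heapq.merge expects ascending keys, so the score is negated.
    merged = heapq.merge(*per_node_results, key=lambda hit: -hit["score"])
    seen = set()
    combined = []
    for hit in merged:
        if hit["score"] < min_score or hit["id"] in seen:
            continue  # filtered out, or already kept with a higher score
        seen.add(hit["id"])
        combined.append(hit)
        if len(combined) == limit:
            break
    return combined


# Hypothetical hits returned by two search nodes.
node_a = [{"id": "d1", "score": 0.9}, {"id": "d3", "score": 0.4}]
node_b = [
    {"id": "d2", "score": 0.7},
    {"id": "d1", "score": 0.6},
    {"id": "d4", "score": 0.1},
]
top = merge_node_results([node_a, node_b], min_score=0.2)
print([hit["id"] for hit in top])
```

Because the inputs are already sorted, the merge is lazy and stops as soon as the limit is reached, so no node's full result list has to be re-sorted.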

Carrot2 could even be called a meta search engine. It has built-in functionality to fetch results from popular search engines and combine them. It also offers supporting tools, such as a command-line application and a GUI application, for experimenting with the product. Search plug-ins for Firefox and IE are also available.

Demo: http://search.carrot2.org/stable/search

http://project.carrot2.org/


   




Related Projects

OpenSearch - Open source distributed and RESTful search engine

  •    Java

OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards. OpenSearch enables people to easily ingest, secure, search, aggregate, view, and analyze data. These capabilities are popular for use cases such as application search, log analytics, and more.

wksctl - Open Source Weaveworks Kubernetes System

  •    Go

Please note that the code has recently been updated from ClusterAPI v1alpha1 to v1alpha3 and, as a result, Everything Has Changed. While this note is in the README, you may find inconsistencies in the code, and between the code, examples and documentation. Sorry about that. Feel free to still open issues and/or ask questions as below. wksctl allows simple creation of a Kubernetes cluster given a set of IP addresses and an SSH key. It can be run in a standalone environment but is best used via a GitOps approach in which cluster and machine descriptions are stored in Git and the state of the cluster tracks changes to the descriptions.

Helix - Cluster Management Framework

  •    Java

Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. It helps to perform scheduling of maintenance tasks, such as backups, garbage collection, file consolidation, index rebuilds, repartitioning of data or resources across the cluster, informing dependent systems of changes so they can react appropriately to cluster changes, throttling system tasks and changes and so on.

docker-redis-cluster - Dockerfile for Redis Cluster (redis 3.0+)

  •    Makefile

Docker image with Redis built and installed from source. The main use of this container is to test Redis Cluster code, for example in the https://github.com/Grokzen/redis-py-cluster repo.

cluster-api - Home for the Cluster Management API work, a subproject of sig-cluster-lifecycle

  •    Go

The Cluster API is a Kubernetes project to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management. It provides optional, additive functionality on top of core Kubernetes. Note that the Cluster API effort is still in the prototype stage while we get feedback on the API types themselves. All of the code here is meant to experiment with the API and demo its abilities, in order to drive more technical feedback into the API design. Because of this, all of the prototype code is changing rapidly.


Kubernetes-GPU-Guide - This guide should help fellow researchers and hobbyists to easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster

  •    Shell

This guide should help fellow researchers and hobbyists to easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster. To that end, I will explain how to easily set up a GPU cluster on multiple Ubuntu 16.04 bare metal servers and provide some useful scripts and .yaml files that do the entire setup for you. By the way: if you need a Kubernetes GPU cluster for other reasons, this guide might be helpful to you as well.

redis-py-cluster - Python cluster client for the official redis cluster. Redis 3.0+.

  •    Python

This library provides a client for Redis Cluster, which was added in Redis 3.0. This Readme contains a reduced version of the full documentation.

kube-cluster-osx - Local development multi node Kubernetes Cluster for macOS made very simple

  •    Shell

Kube-Cluster for macOS is a status bar app that makes it easy to bootstrap and control a multi-node (master + two nodes) Kubernetes cluster on three CoreOS VMs. It leverages the macOS native Hypervisor virtualisation framework via the corectl command line tool, so there is no longer any need to use VirtualBox or other virtualisation software.

Galera - Cluster for MySQL and MariaDB

  •    C

Galera Cluster provides distributed, multi-master support for MySQL and MariaDB. Its features include synchronous replication, true multi-master operation, an active-active cluster (read and write to any node at any time), automatic node provisioning, multi-threaded slave and a lot more.

turing-pi-cluster - Turing Pi cluster configuration for Raspberry Pi Compute Modules

  •    HTML

You might also be interested in another Raspberry-Pi cluster I've maintained for years, the Raspberry Pi Dramble, which is a Kubernetes Pi cluster in my basement that hosts www.pidramble.com. Other models of Raspberry Pi and Compute Modules may or may not work, but the main thing you need is a cluster with at least 7 GB of RAM and at least 12 available CPU cores (every current Pi has 4 CPU cores), otherwise not all of the software will be able to run well.

kubernetes-cluster-federation - Kubernetes cluster federation tutorial


This tutorial will walk you through setting up a Kubernetes cluster federation composed of four Kubernetes clusters across multiple GCP regions. This guide is not for people looking for a fully automated command to bring up a Kubernetes cluster federation. If that's you, then check out Setting up Cluster Federation with Kubefed.

tack - Terraform module for creating Kubernetes cluster running on Container Linux by CoreOS in an AWS VPC

  •    HCL

Opinionated Terraform module for creating a highly available Kubernetes cluster running on Container Linux by CoreOS (any channel) in an AWS Virtual Private Cloud (VPC). With the prerequisites installed, running make all will simply spin up a default cluster; and, since it is based on Terraform, customization is much easier than with CloudFormation. The default configuration includes the Kubernetes add-ons DNS, Dashboard and UI.

corvus - A fast and lightweight Redis Cluster Proxy for Redis 3.0

  •    C

Corvus is a fast and lightweight Redis Cluster proxy for Redis 3.0 with cluster mode enabled. Most Redis client implementations don't support Redis Cluster. We have a lot of services relying on Redis, which are written in Python, Java, Go, Nodejs etc. It's hard to provide Redis client libraries for multiple languages without breaking compatibility. We used twemproxy before, but it relies on sentinel for high availability, and it requires a restart to add or remove backend Redis instances, which causes service interruption. Twemproxy is also single threaded, so we have to deploy multiple twemproxy instances for a large number of clients, which causes the sa headaches.

yoke - Postgres high-availability cluster with auto-failover and automated cluster recovery.

  •    Go

Yoke is a Postgres redundancy/auto-failover solution that provides a high-availability PostgreSQL cluster that's simple to manage. Note: The ini file can be named anything and reside anywhere. All Yoke needs is the /path/to/config.ini on startup.

CCM - A script to easily create and destroy an Apache Cassandra cluster on localhost

  •    Python

A script/library to create, launch and remove an Apache Cassandra cluster on localhost. The goal of ccm and ccmlib is to make it easy to create, manage and destroy a small Cassandra cluster on a local box. It is meant for testing a Cassandra cluster.

kubeadm-ha - Kubernetes high availability deployment based on kubeadm (for v1

  •    Smarty

kube-apiserver: exposes the Kubernetes API. It is the front end for the Kubernetes control plane. It is designed to scale horizontally, that is, it scales by deploying more instances.
etcd: is used as Kubernetes’ backing store. All cluster data is stored here. Always have a backup plan for etcd’s data for your Kubernetes cluster.
kube-scheduler: watches newly created pods that have no node assigned, and selects a node for them to run on.
kube-controller-manager: runs controllers, which are the background threads that handle routine tasks in the cluster. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
kubelet: is the primary node agent. It watches for pods that have been assigned to its node (either by the apiserver or via a local configuration file).
kube-proxy: enables the Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding.
keepalived: the keepalived cluster configures a virtual IP address (192.168.20.10) that points to k8s-master01, k8s-master02 and k8s-master03.
nginx: the nginx service acts as the load balancer for the apiservers of k8s-master01, k8s-master02 and k8s-master03.
The Kubernetes services on the other nodes connect to the keepalived virtual IP address (192.168.20.10) and the port exposed by nginx (16443) to communicate with the master cluster's apiservers.

ksync - Sync files between your local system and a kubernetes cluster.

  •    Go

ksync speeds up developers who build applications for Kubernetes. It transparently updates containers running on the cluster from your local checkout. This enables developers to use their favorite IDEs, such as Atom or Sublime Text to work from inside a cluster instead of from outside it. There is no reason to wait minutes to test code changes when you can see the results in seconds. You can also download the latest release and install it yourself.

redis-go-cluster - redis cluster client implementation in Go

  •    Go

redis-go-cluster is a Go implementation of a Redis client based on Gary Burd's Redigo. It caches slot information locally and updates it automatically when the cluster changes. The client manages a connection pool for each node and uses goroutines to execute commands as concurrently as possible, which leads to high efficiency and low latency. redis-go-cluster has an interface compatible with Redigo, which uses a print-like API for all Redis commands. When executing a command, it needs a key to hash to a slot and then finds the corresponding Redis node. The Do method chooses the first argument in args as the key, so commands that are independent of keys, such as SYNC, BGSAVE and RANDOMKEY, are not supported.
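The key-to-slot hashing mentioned above follows the Redis Cluster specification: a key maps to one of 16384 slots via CRC16 (XMODEM variant) modulo 16384, and if the key contains a non-empty {hash tag}, only the tag is hashed. The Python sketch below illustrates that rule; it is not code from redis-go-cluster, and the function name key_slot is chosen here for illustration.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis Cluster


def key_slot(key: bytes) -> int:
    """Return the cluster hash slot for a key.

    If the key contains '{' followed later by '}' with at least one byte
    in between, only that substring (the hash tag) is hashed.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 computes CRC16/XMODEM.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


# Keys sharing a hash tag land in the same slot, so they live on the same node.
print(key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers"))
```

A client that caches the slot-to-node table only needs this function to pick the node for a command, which is also why commands without a key cannot be routed this way.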

sonobuoy - Heptio Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a set of Kubernetes conformance tests in an accessible and non-destructive manner

  •    Go

Heptio Sonobuoy is a diagnostic tool that makes it easier to understand the state of a Kubernetes cluster by running a set of Kubernetes conformance tests in an accessible and non-destructive manner. It is a customizable, extendable, and cluster-agnostic way to generate clear, informative reports about your cluster. Sonobuoy supports Kubernetes versions 1.9 and later.

aws-eks-base - This boilerplate contains the know-how of the Mad Devs team for the rapid deployment of a Kubernetes cluster, supporting services, and the underlying infrastructure in the Amazon cloud

  •    HCL

This repository contains the know-how of the Mad Devs team for the rapid deployment of a Kubernetes cluster, supporting services, and the underlying infrastructure in the Amazon cloud. The main development and delivery tool is terraform. In our company’s work, we have tried many infrastructure solutions and services and traveled the path from on-premise hardware to serverless. As of today, Kubernetes has become our standard platform for deploying applications, and AWS has become the main cloud.





