Qdrant - Neural Search Engine, Vector Similarity Search Engine with extended filtering support


Qdrant (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points, that is, vectors with an additional payload. Qdrant is tailored to extended filtering support, which makes it useful for all sorts of neural-network or semantic-based matching, faceted search, and other applications. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more.

Neural search uses semantic embeddings instead of keywords and works best with short texts. With Qdrant and a pre-trained neural network, you can build and deploy semantic neural search on your data in minutes.
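The idea of points (vectors with a payload) combined with filtering can be illustrated with a minimal pure-Python sketch. This is not Qdrant's API: the function names, the dictionary layout of a point, and the exact-match `must` filter are all assumptions made for illustration, and a real engine would use an index rather than a linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_search(points, query, must, limit=3):
    """Rank points by similarity, keeping only those whose payload matches `must`."""
    candidates = [
        p for p in points
        if all(p["payload"].get(key) == value for key, value in must.items())
    ]
    candidates.sort(key=lambda p: cosine_similarity(p["vector"], query), reverse=True)
    return [p["id"] for p in candidates[:limit]]

# Hypothetical toy data: each point is an id, a vector, and a payload.
points = [
    {"id": 1, "vector": [0.9, 0.1], "payload": {"city": "Berlin"}},
    {"id": 2, "vector": [0.8, 0.2], "payload": {"city": "London"}},
    {"id": 3, "vector": [0.1, 0.9], "payload": {"city": "Berlin"}},
]

# Only the Berlin points are ranked; point 2 is filtered out before scoring.
result = filtered_search(points, query=[1.0, 0.0], must={"city": "Berlin"})
print(result)
```

Filtering before ranking means the payload condition narrows the candidate set, so the similarity score is only computed for points that can actually be returned.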

https://qdrant.tech
https://github.com/qdrant/qdrant


   




Related Projects

Milvus - An open-source vector database for embedding similarity search and AI applications

  •    Go

Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility.

lopq - Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark

  •    Python

This is Python training and testing code for Locally Optimized Product Quantization (LOPQ) models, as well as Spark scripts to scale training to hundreds of millions of vectors. The resulting model can be used in Python with code provided here or deployed via a Protobuf format to, e.g., search backends for high performance approximate nearest neighbor search. Locally Optimized Product Quantization (LOPQ) [1] is a hierarchical quantization algorithm that produces codes of configurable length for data points. These codes are efficient representations of the original vector and can be used in a variety of ways depending on the application, including as hashes that preserve locality, as a compressed vector from which an approximate vector in the data space can be reconstructed, and as a representation from which to compute an approximation of the Euclidean distance between points.
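To make the three uses of a quantization code concrete, here is a minimal sketch of a single, flat product-quantization step. It does not use the lopq package and does not reproduce the hierarchical structure or the training that LOPQ adds; the codebooks are hand-written, hypothetical values, and all function names are assumptions for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vector, num_subvectors):
    """Cut a vector into equal-length contiguous subvectors."""
    size = len(vector) // num_subvectors
    return [vector[i * size:(i + 1) * size] for i in range(num_subvectors)]

def encode(vector, codebooks):
    """Replace each subvector by the index of its nearest centroid."""
    parts = split(vector, len(codebooks))
    return tuple(
        min(range(len(book)), key=lambda k: squared_distance(part, book[k]))
        for part, book in zip(parts, codebooks)
    )

def decode(code, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for index, book in zip(code, codebooks):
        approx.extend(book[index])
    return approx

def approximate_squared_distance(query, code, codebooks):
    """Sum per-subspace distances between the query and the coded centroids."""
    parts = split(query, len(codebooks))
    return sum(
        squared_distance(part, book[index])
        for part, index, book in zip(parts, code, codebooks)
    )

# Hypothetical codebooks: 2 subspaces with 2 centroids each (normally learned).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

code = encode([0.9, 1.1, 0.2, 0.8], codebooks)
print(code)                     # short code: one centroid index per subspace
print(decode(code, codebooks))  # approximate reconstruction of the vector
```

The code tuple can serve as a locality-preserving hash key, `decode` gives the compressed reconstruction, and `approximate_squared_distance` estimates the Euclidean distance without touching the original vector.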

n2 - TOROS N2 - lightweight approximate Nearest Neighbor library which runs faster even with large datasets

  •    C++

N2 is an approximate nearest neighbor algorithm library written in C++ (including Python/Go bindings). N2 provides a much faster search speed than other implementations when modeling large datasets. Also, N2 supports multi-core CPUs for index building. For more detail, see the installation instructions on how to build N2 from source.

bootcamp - Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc

  •    Python

Thanks to AI, we can embed everything: neural networks extract feature vectors from unstructured data such as images, audio, and video. The unstructured data can then be analyzed by comparing the feature vectors, for example by calculating the Euclidean or cosine distance between vectors to measure similarity. Milvus Bootcamp is designed to expose users to both the simplicity and depth of the Milvus vector database. Discover how to run benchmark tests as well as build similarity search applications like chatbots, recommender systems, reverse image search, molecular search, video search, audio search, and more.

OpenSearch - Open source distributed and RESTful search engine

  •    Java

OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards. OpenSearch enables people to easily ingest, secure, search, aggregate, view, and analyze data. These capabilities are popular for use cases such as application search, log analytics, and more.


annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

  •    C++

Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data. To install, simply do sudo pip install annoy to pull down the latest version from PyPI.
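The benefit of a read-only, memory-mapped file can be sketched with the Python standard library alone. This is not Annoy's file format or API: the fixed-size record layout, the file name, and the helper functions below are assumptions chosen only to show how a process can read vectors straight from a mapped file.

```python
import mmap
import os
import struct
import tempfile

DIM = 3                            # hypothetical vector dimensionality
RECORD = struct.Struct(f"{DIM}f")  # one vector = DIM 32-bit floats

def save_vectors(path, vectors):
    """Write vectors back to back as fixed-size binary records."""
    with open(path, "wb") as handle:
        for vector in vectors:
            handle.write(RECORD.pack(*vector))

def read_vector(mapped, index):
    """Decode one vector directly from the mapped file, without loading the rest."""
    return RECORD.unpack_from(mapped, index * RECORD.size)

path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
save_vectors(path, [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)])

with open(path, "rb") as handle:
    # ACCESS_READ maps the file read-only; the operating system can share
    # the same pages between every process that maps this file.
    with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        second = read_vector(mapped, 1)

print(second)
```

Because the records have a fixed size, any vector can be located by offset arithmetic, so a process only touches the pages it actually reads.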

Open Distro for Elasticsearch - Elasticsearch enhanced with enterprise security, alerting, SQL, and more

  •    Java

Open Distro for Elasticsearch is an Apache 2.0-licensed distribution of Elasticsearch enhanced with Enterprise Security, Alerting, SQL, Index Management, k-Nearest Neighbor Search, Performance Analyzer and more.

pysparnn - Approximate Nearest Neighbor Search for Sparse Data in Python!

  •    Python

Approximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents). Out of the box, PySparNN supports Cosine Distance (i.e. 1 - cosine_similarity).
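As a small illustration of the cosine distance mentioned above (1 - cosine_similarity) on sparse data, the sketch below stores each document as a dictionary of term weights. It does not use PySparNN; the dictionary representation and the function name are assumptions for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine_similarity for sparse vectors stored as dicts."""
    # Only keys present in both vectors contribute to the dot product,
    # so iterate over the smaller dict and look up in the larger one.
    smaller, larger = (a, b) if len(a) <= len(b) else (b, a)
    dot = sum(weight * larger.get(key, 0.0) for key, weight in smaller.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical documents as term -> weight mappings.
doc_a = {"vector": 1.0, "search": 1.0}
doc_b = {"vector": 1.0, "index": 1.0}
doc_c = {"cooking": 1.0}

print(cosine_distance(doc_a, doc_b))  # shares one of two terms: about 0.5
print(cosine_distance(doc_a, doc_c))  # shares no terms: 1.0
```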

Non-Metric Space Library (NMSLIB) - An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

  •    C++

Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The core library does not have any third-party dependencies. It has been gaining popularity recently. In particular, it has become a part of Amazon Elasticsearch Service.

ann-benchmarks - Benchmarks of approximate nearest neighbor libraries in Python

  •    Python

Fast nearest neighbor search in high-dimensional spaces is an increasingly important problem, but so far there have not been many empirical attempts at comparing approaches in an objective way. This project contains some tools to benchmark various implementations of approximate nearest neighbor (ANN) search for different metrics. We have pregenerated datasets (in HDF5 format) and we also have Docker containers for each algorithm. There's a test suite that makes sure every algorithm works.

similarity - TensorFlow Similarity is a Python package focused on making similarity learning quick and easy

  •    Python

TensorFlow Similarity is a TensorFlow library for similarity learning, also known as metric learning and contrastive learning. TensorFlow Similarity is still in beta.

Haystack - Build a natural language interface for your data

  •    Python

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want to perform question answering or semantic document search, you can use the state-of-the-art NLP models in Haystack to provide unique search experiences and allow your users to query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects like Hugging Face's Transformers, Elasticsearch, or Milvus.

TNTSearch - A fully featured full text search engine written in PHP

  •    PHP

TNTSearch is a full-text search (FTS) engine written entirely in PHP. A simple configuration allows you to add an amazing search experience in just minutes. Its features include fuzzy search, geo-search, text classification, stemming, the BM25 ranking algorithm, result highlighting, boolean search, and a lot more.

thingscoop - Search and filter videos based on objects that appear in them using convolutional neural networks

  •    Python

Thingscoop is a command-line utility for analyzing videos semantically - that means searching, filtering, and describing videos based on objects, places, and other things that appear in them. When you first run thingscoop on a video file, it uses a convolutional neural network to create an "index" of what's contained in every second of the input by repeatedly performing image classification on a frame-by-frame basis. Once an index for a video file has been created, you can search (i.e. get the start and end times of the regions in the video matching the query) and filter (i.e. create a supercut of the matching regions) the input using arbitrary queries. Thingscoop uses a very basic query language that lets you compose queries that test for the presence or absence of labels with the logical operators ! (not), || (or) and && (and). For example, to search a video for the presence of the sky and the absence of the ocean: thingscoop search 'sky && !ocean' <file>.
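How such a query could be checked against a per-second index of labels is sketched below. This is not thingscoop's implementation: the function names and the index layout are hypothetical, and the sketch assumes that && binds tighter than || and that queries contain no parentheses, neither of which is stated in the description above.

```python
def term_holds(term, labels):
    """A term is a label, optionally negated with a leading '!'."""
    term = term.strip()
    if term.startswith("!"):
        return term[1:].strip() not in labels
    return term in labels

def matches(query, labels):
    """Evaluate a query such as 'sky && !ocean' against a set of labels."""
    # Assumed precedence: '||' binds loosest, so any alternative may hold.
    for alternative in query.split("||"):
        # Within an alternative, every '&&'-joined term must hold.
        if all(term_holds(term, labels) for term in alternative.split("&&")):
            return True
    return False

# Hypothetical index: second of video -> labels detected in that second.
index = {
    0: {"sky", "tree"},
    1: {"sky", "ocean"},
    2: {"road"},
}

hits = [second for second, labels in sorted(index.items())
        if matches("sky && !ocean", labels)]
print(hits)
```

Consecutive matching seconds could then be merged into regions with start and end times, which is the shape of result a search over the index would report.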

SoftwareBotany.Sunlight - Word Aligned Hybrid Bit Vector Search Framework

  •    

The Software Botany Sunlight project is a search framework built using Word Aligned Hybrid Bit Vectors. Its sole purpose is to provide high performance in-memory searching of data using unknown combinations of indices. It is developed with .NET 4.0 using C#.

resin - 32-bit vector space search engine

  •    CSharp

A full-text search engine with an HTTP API and programmable read/write pipelines. To provide full-text search, words and phrases are extracted from documents and mapped to a 2 billion dimensional vector space that forms clusters of syntactically similar "bag-of-chars". In this language model, each character (glyph) is encoded as a 32-bit word (an int), and each word or phrase is likewise encoded as a 32-bit wide (but sparse) array.

Jina - Cloud-native neural search framework for any kind of data

  •    Python

Jina is a neural search framework that empowers anyone to build SOTA & scalable deep learning search applications in minutes. It helps to build solutions for indexing, querying, and understanding multi-/cross-modal data such as video, image, text, audio, source code, and PDF.

DeText - A Deep Neural Text Understanding Framework for Ranking and Classification Tasks

  •    Python

DeText is a Deep Text understanding framework for NLP related ranking, classification, and language generation tasks. It leverages semantic matching using deep neural networks to understand member intents in search and recommender systems. As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking, multi-class classification and query understanding tasks.

Elasticsearch - Distributed, RESTful search and analytics engine

  •    Java

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

finetuner - Finetuning any DNN for better embedding on neural search tasks

  •    Python

Finetuner allows one to tune the weights of any deep neural network for better embeddings on search tasks. It accompanies Jina to deliver the last mile of performance for domain-specific neural search applications. 🎛 Designed for finetuning: a human-in-the-loop deep learning tool for leveling up your pretrained models in domain-specific neural search applications.





