Qdrant ( quadrant ) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. Qdrant is tailored to extended filtering support. It makes it useful for all sorts of neural-network or semantic-based matching, faceted search, and other applications. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more.
The neural search uses semantic embeddings instead of keywords and works best with short texts. With Qdrant and a pre-trained neural network, you can build and deploy semantic neural search on your data in minutes.
Milvus is an open-source vector database built to power embedding similarity search and AI applications. Milvus makes unstructured data search more accessible, and provides a consistent user experience regardless of the deployment environment. Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility.
database ai vector nearest-neighbor-search cloud-native image-search approximate-nearest-neighbor-search hacktoberfest embedding similarity-search video-search faiss anns hnsw vector-search milvus vector-database embeddings-similarity artificial-intelligenceThis is Python training and testing code for Locally Optimized Product Quantization (LOPQ) models, as well as Spark scripts to scale training to hundreds of millions of vectors. The resulting model can be used in Python with code provided here or deployed via a Protobuf format to, e.g., search backends for high performance approximate nearest neighbor search.Locally Optimized Product Quantization (LOPQ) [1] is a hierarchical quantization algorithm that produces codes of configurable length for data points. These codes are efficient representations of the original vector and can be used in a variety of ways depending on the application, including as hashes that preserve locality, as a compressed vector from which an approximate vector in the data space can be reconstructed, and as a representation from which to compute an approximation of the Euclidean distance between points.
nearest-neighbor-search product-quantization lopq clustering sparkFor more detail, see the installation for instruction on how to build N2 from source. N2 is an approximate nearest neighborhoods algorithm library written in C++ (including Python/Go bindings). N2 provides a much faster search speed than other implementations when modeling large dataset. Also, N2 supports multi-core CPUs for index building.
ml knn machine-learning approximate k-nearest-neighbors nearest-neighbor-search approximate-nearest-neighbor-searchEmbed everything, thanks to AI, we can use neural networks to extract feature vectors from unstructured data, such as image, audio and vide etc. Then analyse the unstructured data by calculating the feature vectors, for example calculating the Euclidean or Cosine distance of the vectors to get the similarity. Milvus Bootcamp is designed to expose users to both the simplicity and depth of the Milvus vector database. Discover how to run benchmark tests as well as build similarity search applications like chatbots, recommender systems, reverse image search, molecular search, video search, audio search, and more.
nlp deep-learning question-answering image-classification image-recognition image-search hacktoberfest unstructured-data audio-search milvus benchmark-testingOpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards. OpenSearch enables people to easily ingest, secure, search, aggregate, view, and analyze data. These capabilities are popular for use cases such as application search, log analytics, and more.
search-engine searchengine full-text-search realtime-analytics analytics log-aggregation aggregation clickstream-analyticsAnnoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.To install, simply do sudo pip install annoy to pull down the latest version from PyPI.
c-plus-plus nearest-neighbor-search locality-sensitive-hashing approximate-nearest-neighbor-searchOpen Distro for Elasticsearch is an Apache 2.0-licensed distribution of Elasticsearch enhanced with Enterprise Security, Alerting, SQL, Index Management, k-Nearest Neighbor Search, Performance Analyzer and more.
elastic-search search-engine searchengine aggregations real-time enterprise-searchApproximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents). Out of the box, PySparNN supports Cosine Distance (i.e. 1 - cosine_similarity).
Non-Metric Space Library (NMSLIB) is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The core-library does not have any third-party dependencies. It has been gaining popularity recently. In particular, it has become a part of Amazon Elasticsearch Service.
search search-library similarity-search algorithm knn-search non-metric neighborhood-graphs k-nn-graphs proximity-graphs lsh locality-sensitive-hashingDoing fast searching of nearest neighbors in high dimensional spaces is an increasingly important problem, but so far there has not been a lot of empirical attempts at comparing approaches in an objective way. This project contains some tools to benchmark various implementations of approximate nearest neighbor (ANN) search for different metrics. We have pregenerated datasets (in HDF5) formats and we also have Docker containers for each algorithm. There's a test suite that makes sure every algorithm works.
nearest-neighbors benchmark dockerTensorFlow Similarity is a TensorFlow library for similarity learning also known as metric learning and contrastive learning. TensorFlow Similarity is still in beta.
deep-learning tensorflow nearest-neighbor-search metric-learning nearest-neighbors similarity-search similarity-learning contrastive-learningHaystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want to perform Question Answering or semantic document search, you can use the State-of-the-Art NLP models in Haystack to provide unique search experiences and allow your users to query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects like Huggingface's Transformers, Elasticsearch, or Milvus.
search nlp search-engine elasticsearch information-retrieval pytorch question-answering summarization transfer-learning ann language-model semantic-search squad bert dpr retriever neural-search natural-languageTNTSearch is a full-text search (FTS) engine written entirely in PHP. A simple configuration allows you to add an amazing search experience in just minutes. Its features include Fuzzy search, Geo-search, Text classification, Stemming, Bm25 ranking algorithm, Result highlighting, Boolean search and lot more.
fulltext search search-engine full-text-search fuzzy-search geo-search fuzzy fuzzy-matching tntsearch laravel-scoutThingscoop is a command-line utility for analyzing videos semantically - that means searching, filtering, and describing videos based on objects, places, and other things that appear in them.When you first run thingscoop on a video file, it uses a convolutional neural network to create an "index" of what's contained in the every second of the input by repeatedly performing image classification on a frame-by-frame basis. Once an index for a video file has been created, you can search (i.e. get the start and end times of the regions in the video matching the query) and filter (i.e. create a supercut of the matching regions) the input using arbitrary queries. Thingscoop uses a very basic query language that lets you to compose queries that test for the presence or absence of labels with the logical operators ! (not), || (or) and && (and). For example, to search a video the presence of the sky and the absence of the ocean: thingscoop search 'sky && !ocean' <file>.
The Software Botany Sunlight project is a search framework built using Word Aligned Hybrid Bit Vectors. Its sole purpose is to provide high performance in-memory searching of data using unknown combinations of indices. It is developed with .NET 4.0 using C#.
bit-vector bitmap-index compression search search-engineA full-text search engine with HTTP API and programmable read/write pipelines. To provide full-text search words and phrases are extracted from documents and mapped to a 2 billion dimensional vector-space that form clusters of syntactically similar "bag-of-chars". In this language model, each character (glyph) is encoded as a 32-bit word (an int), and each word or phrase alike encoded as a 32-bit wide (but sparse) array.
information-retrieval search-engine dotnet-core vector-space-model lsm-treeJina is a neural search framework that empowers anyone to build SOTA & scalable deep learning search applications in minutes. It helps to build solutions for indexing, querying, understanding multi-/cross-modal data such as video, image, text, audio, source code, PDF.
search nlp machine-learning framework computer-vision deep-learning microservice cloud-native image-search zmq semantic-search video-search multimodal-search neural-search jina search-as-a-serviceDeText is a Deep Text understanding framework for NLP related ranking, classification, and language generation tasks. It leverages semantic matching using deep neural networks to understand member intents in search and recommender systems. As a general NLP framework, DeText can be applied to many tasks, including search & recommendation ranking, multi-class classification and query understanding tasks.
nlp deep-neural-networks ranking classification text-embeddings detext-framework neural-networksElasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.
searchengine search-engine full-text-search document-oriented percolator analytics nosql time-series aggregationFinetuner allows one to tune the weights of any deep neural network for better embeddings on search tasks. It accompanies Jina to deliver the last mile of performance for domain-specific neural search applications. 🎛 Designed for finetuning: a human-in-the-loop deep learning tool for leveling up your pretrained models in domain-specific neural search applications.
tensorflow keras pytorch metric-learning transfer-learning pretrained-models triplet-loss paddlepaddle labeling-tool siamese-network fine-tuning finetuning few-shot-learning negative-sampling neural-search jina
We have large collection of open source products. Follow the tags from
Tag Cloud >>
Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
Add Projects.