Boom Filters are probabilistic data structures for processing continuous, unbounded streams. They include Stable Bloom Filters, Scalable Bloom Filters, Counting Bloom Filters, Inverse Bloom Filters, Cuckoo Filters, several variants of traditional Bloom filters, HyperLogLog, Count-Min Sketch, and MinHash. Classic Bloom filters generally require a priori knowledge of the data set in order to allocate an appropriately sized bit array. This works well for offline processing, but online processing typically involves unbounded data streams. With enough data, a traditional Bloom filter "fills up", after which it has a false-positive probability of 1.
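The "fills up" behavior can be illustrated with the standard approximation for the false-positive probability of a classic Bloom filter, (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hash functions, and n the number of distinct items inserted. The sketch below is only an illustration in Python; the function name and the chosen values of m and k are assumptions and are not taken from the Boom Filters library.

```python
import math


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Approximate false-positive probability of a classic Bloom filter
    with m bits and k hash functions after n distinct insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    m, k = 1024, 3  # illustrative sizes, chosen arbitrarily
    # As n grows far past what m was sized for, the rate approaches 1.
    for n in (100, 1_000, 10_000, 100_000):
        print(n, false_positive_rate(m, k, n))
```

With a fixed m, the rate climbs toward 1 as the stream keeps growing, which is the problem the stable and scalable variants are meant to address.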
bloom-filter stable-bloom-filters cuckoo-filter probabilistic-programming counting-bloom-filters scalable-bloom-filters count-min-sketch data-stream filter data-structure collections go-collection
Fast bloom filter in JavaScript.
bloom-filter probabilistic-data-structure
The official source code repository is at https://github.com/dib-lab/khmer and project documentation is available online at http://khmer.readthedocs.io. See http://khmer.readthedocs.io/en/stable/introduction.html for an overview of the khmer project. khmer is research software, so you should cite us when you use it in scientific publications! Please see the CITATION file for citation information.
dna k-mer bloom-filter count-min-sketch graph-traversal bioinformatics
A cuckoo filter is a Bloom filter replacement for approximate set-membership queries. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters. For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Possible use cases that depend on approximate set-membership queries include databases, caches, routers, and storage systems, where the filter is used to decide whether a given item is in a (usually large) set, with some small false positive probability. Because it is designed as a viable replacement for Bloom filters, it can also be used to reduce the space required in probabilistic routing tables, speed up longest-prefix matching for IP addresses, improve network state management and monitoring, and encode multicast forwarding information in packets, among many other applications. A cuckoo filter is based on cuckoo hashing (hence the name). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, so a cuckoo filter can use less space than conventional Bloom filters for applications that require low false positive rates (< 3%).
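The idea of "a cuckoo hash table storing each key's fingerprint" can be sketched as follows. This is a minimal illustration in Python, not the API of any project listed here: the class name, the parameters (num_buckets, bucket_size, max_kicks, seed), the 8-bit fingerprints, and the use of SHA-256 are all assumptions made for the example. The alternate bucket is derived from the current bucket and the fingerprint alone, so an entry can be relocated without access to the original key.

```python
import hashlib
import random


def _h(data: bytes) -> int:
    """64-bit hash derived from SHA-256 (an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class CuckooFilter:
    """Hypothetical minimal cuckoo filter storing 8-bit fingerprints."""

    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500, seed=0):
        if num_buckets <= 0 or num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.n = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(seed)

    def _fingerprint(self, item: bytes) -> int:
        return _h(b"fp:" + item) % 255 + 1  # value in 1..255

    def _alt(self, index: int, fp: int) -> int:
        # XOR with a hash of the fingerprint; applying it twice returns index.
        return (index ^ _h(bytes([fp]))) & (self.n - 1)

    def _candidates(self, item: bytes):
        fp = self._fingerprint(item)
        i1 = _h(item) & (self.n - 1)
        return fp, i1, self._alt(i1, fp)

    def add(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter is considered full. The last evicted fingerprint is dropped
        # here; a fuller implementation would stash it or resize.
        return False

    def __contains__(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def remove(self, item: bytes) -> bool:
        # Only safe for items that were actually added.
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Deletion works because each item leaves exactly one fingerprint in one of two known buckets, which is what a plain Bloom filter cannot offer.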
cuckoo-filter bloom-filter filter
A Bloom filter is a representation of a set of n items, where the main requirement is to make membership queries; i.e., whether an item is a member of a set. A Bloom filter has two parameters: m, a maximum size (typically a reasonably large multiple of the cardinality of the set to represent) and k, the number of hashing functions on elements of the set. (The actual hashing functions are important, too, but this is not a parameter for this implementation). A Bloom filter is backed by a BitSet; a key is represented in the filter by setting the bits at each value of the hashing functions (modulo m). Set membership is done by testing whether the bits at each value of the hashing functions (again, modulo m) are set. If so, the item is reported as being in the set. If the item is actually in the set, a Bloom filter will never fail (the true positive rate is 1.0); but it is susceptible to false positives. The art is to choose k and m correctly.
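The add and test steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation of the library described here: the class name, the bytearray standing in for a BitSet, and the derivation of the k hash values from one SHA-256 digest via double hashing are all choices made for the example.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter with m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        if m <= 0 or k <= 0:
            raise ValueError("m and k must be positive")
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # stands in for a BitSet

    def _positions(self, item: bytes):
        # Derive k hash values from two base hashes (double hashing),
        # each reduced modulo m.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = BloomFilter(m=1024, k=3)
    bf.add(b"alice")
    print(b"alice" in bf)  # True: an added item is never reported absent
```

A lookup for an item that was never added usually returns False, but it can return True when other items happen to have set all k of its bits.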
bloom bloom-filters bloom-filter data-structure collections go-collection
The goal of pybloomfiltermmap is simple: to provide a fast, simple, scalable, correct library for Bloom Filters in Python.
bloom-filter
The library's full documentation can be found here. Be sure to lint & pass the unit tests before submitting your pull request.
natural-language-processing machine-learning fuzzy-matching clustering record-linkage bayes bloom-filter canberra caverphone chebyshev cologne cosine classifier daitch-mokotoff dice fingerprint fuzzy hamming k-means jaccard jaro lancaster levenshtein lig metaphone mra ngrams nlp nysiis perceptron phonetic porter punkt schinke sorensen soundex stats tfidf tokenizer tversky vectorizer winkler
C++ Bloom Filter Library
algorithm associative bloom bloom-filter bloomier boost
A project aiming to build drag and drop replacements for .NET collections offering higher or equivalent performance and significantly lower memory requirements. The project's first deliverables will be a StringDictionary<T> which is a drag and drop replacement for Dictionary<...
collections bloom-filter collection dictionary hash
"A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not. In other words, a query returns either 'possibly in set' or 'definitely not in set'. Elements can be added to the set, but not removed," says Wikipedia. Warning: These are synthetic benchmarks in an isolated environment. Usually the difference in throughput and latency is bigger in a production system because it will stress the GC, lead to slow allocation paths and higher latencies, trigger the GC, etc.
bloom-filter probabilistic high-performance datastructures
Buckets is a complete, tested and documented collections library for Swift. Carthage is a decentralized dependency manager that automates the process of adding frameworks to your application.
swift-3 queue deque stack priority-queue matrix multiset multimap bimap graph trie bitarray circular-buffer bloom-filter swift-package-manager carthage cocoapods
A cuckoo filter is a Bloom filter replacement for approximate set-membership queries. While Bloom filters are well-known space-efficient data structures for answering queries like "is item x in a set?", they do not support deletion. Variants that enable deletion (like counting Bloom filters) usually require much more space. Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (hence the name). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, so a cuckoo filter can use less space than conventional Bloom filters for applications that require low false positive rates (< 3%).
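The space cost of deletion-capable Bloom variants can be seen in a counting Bloom filter, which replaces each bit with a counter so that removals can decrement what insertions incremented. The following Python sketch is illustrative only: the class name, the plain list of integer counters, and the SHA-256 based double hashing are assumptions for the example and are not drawn from this project.

```python
import hashlib


class CountingBloomFilter:
    """Hypothetical minimal counting Bloom filter with m counters."""

    def __init__(self, m: int, k: int) -> None:
        if m <= 0 or k <= 0:
            raise ValueError("m and k must be positive")
        self.m = m
        self.k = k
        # One counter per slot instead of one bit: this is the extra space.
        self.counters = [0] * m

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: bytes) -> bool:
        positions = self._positions(item)
        # Refuse to decrement if the item cannot be present.
        if any(self.counters[pos] == 0 for pos in positions):
            return False
        for pos in positions:
            self.counters[pos] -= 1
        return True

    def __contains__(self, item: bytes) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

Even with small fixed-width counters (4 bits is a common choice), the structure needs several times the space of a plain bit array with the same m, which is the overhead a cuckoo filter aims to avoid.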
bloom-filter data-structure collections go-collection
Proof of work based on SHA256 and Bloom filter. Timestamp MUST be equal to number of milliseconds since 1970-01-01T00:00:00.000Z in UTC time.
bloom-filter proof of work sha256 bloom
sketchy is available as a Maven artifact from Clojars. This library contains various sketching/hash-based algorithms useful for building compact summaries of large datasets.
hashing sketching bloom-filter minhash count-min-sketch hyperloglog
Flajolet is an OCaml library providing streaming data structures in the vein of the popular streamlib library for Java. Flajolet is named for INRIA professor Philippe Flajolet, inventor of the HyperLogLog data structure.
flajolet ocaml hyperloglog minhash bloom-filter
The most recent version of the Doxygen API documentation exists at http://mavam.github.io/libbf/api. Alternatively, you can build the documentation locally via make doc and then browse to doc/gh-pages/api/index.html. Each Bloom filter inherits from the abstract base class bloom_filter, which provides addition and lookup via the virtual functions add and lookup. These functions take an object as argument, which serves as a lightweight view over sequential data for hashing.
bloom-filter synopsis
A bloom filter for node backed by redis. To install, use npm and run npm install bloom-redis.
bloom filter redis bloom-filter
You can run cli.js to calculate cache digests manually. In the above example, -b option is used so that the digest would be encoded using base64url. Please refer to -h (help) option for more information.
http2 cache-digests server-push bloom-filter golomb-coding
A bloom filter implementation that is serialisable to JSON and compatible between both Ruby and JavaScript. Very useful when you need to train a bloom filter in one language and use it in the other. Bloom filters allow for space-efficient lookups in a list, without having to store all the items in the list. This is useful for looking up tags, domain names, links, or anything else that you might want to do client side.
bloom-filter json bitarray