ntCard - Estimating k-mer coverage histogram of genomics data

  •    C++

ntCard is a streaming algorithm for cardinality estimation in genomics datasets. It takes one or more input files in FASTA, FASTQ, SAM, or BAM format and estimates the total number of distinct k-mers, F0, as well as the k-mer coverage frequency histogram, f_i for i >= 1, where f_i is the number of distinct k-mers that occur exactly i times.
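
To make the two quantities concrete, here is a minimal sketch that computes F0 and f_i exactly for a single in-memory sequence. It is not ntCard's method: ntCard estimates these values in a streaming fashion, whereas this brute-force version stores every k-mer. The names `KmerStats` and `exact_kmer_stats` are hypothetical, and the sketch assumes plain forward-strand k-mers with no reverse-complement handling.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical helper type, for illustration only (not part of ntCard).
struct KmerStats {
    std::size_t f0 = 0;                       // F0: number of distinct k-mers
    std::map<std::size_t, std::size_t> hist;  // hist[i] = f_i
};

// Exact, non-streaming reference: count every k-mer in seq, then tally
// how many distinct k-mers occur exactly i times.
KmerStats exact_kmer_stats(const std::string& seq, std::size_t k) {
    KmerStats stats;
    if (k == 0 || seq.size() < k) {
        return stats;  // no k-mers to count
    }

    std::unordered_map<std::string, std::size_t> counts;
    for (std::size_t pos = 0; pos + k <= seq.size(); ++pos) {
        ++counts[seq.substr(pos, k)];
    }

    stats.f0 = counts.size();
    for (const auto& entry : counts) {
        ++stats.hist[entry.second];  // one more k-mer with this multiplicity
    }
    return stats;
}
```

For the sequence ACGTACGT with k = 3, the k-mers are ACG, CGT, GTA, TAC, ACG, CGT, so this yields F0 = 4, f_1 = 2, and f_2 = 2. Memory grows with the number of distinct k-mers, which is the cost a streaming estimator is designed to avoid.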

redis-tdigest - t-digest module for Redis

  •    C

This is a Redis module for the t-digest data structure, which can be used for accurate online accumulation of rank-based statistics such as quantiles and the cumulative distribution at a point. The implementation is based on the Merging Digest implementation by the author of t-digest. Before using it, make sure that your Redis server supports Redis modules.
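
As a point of reference for the rank-based statistics mentioned above, the following sketch computes them exactly from a fully sorted sample. It is not a t-digest and does not use this module's API: a t-digest approximates these answers from a compact summary without keeping every value. The function names `exact_cdf` and `exact_quantile` are hypothetical, and the quantile uses the nearest-rank definition, which is one convention among several.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Empirical CDF at x: the fraction of samples that are <= x.
// Assumes `sorted` is in ascending order. Returns 0 for an empty sample.
double exact_cdf(const std::vector<double>& sorted, double x) {
    if (sorted.empty()) {
        return 0.0;
    }
    auto it = std::upper_bound(sorted.begin(), sorted.end(), x);
    return static_cast<double>(it - sorted.begin()) /
           static_cast<double>(sorted.size());
}

// Nearest-rank quantile for q in [0, 1] (q is clamped to that range).
// Assumes `sorted` is in ascending order. Returns NaN for an empty sample.
double exact_quantile(const std::vector<double>& sorted, double q) {
    if (sorted.empty()) {
        return std::nan("");
    }
    q = std::min(1.0, std::max(0.0, q));
    std::size_t rank = static_cast<std::size_t>(
        std::ceil(q * static_cast<double>(sorted.size())));
    if (rank == 0) {
        rank = 1;  // q == 0 maps to the smallest sample
    }
    return sorted[rank - 1];
}
```

With the sorted sample {1, 2, 3, 4, 5}, the CDF at 3 is 0.6 and the 0.5 quantile is 3. The exact approach needs the whole sample in memory and sorted, which is the requirement an online structure such as t-digest removes.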

