CRoaring - Roaring bitmaps in C (and C++)

  •        51

Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory. To compensate, we often use compressed bitmaps.Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. They are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Metamarkets' Druid, LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, Pilosa, Microsoft Visual Studio Team Services (VSTS), and eBay's Apache Kylin.

http://roaringbitmap.org/
https://github.com/RoaringBitmap/CRoaring

Tags
Implementation
License
Platform

   




Related Projects

roaring - Roaring bitmaps in Go (golang)

  •    Go

This is a go port of the Roaring bitmap data structure.Roaring bitmaps are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Metamarkets' Druid, LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, Pilosa, Microsoft Visual Studio Team Services (VSTS), and eBay's Apache Kylin.

RoaringBitmap - A better compressed bitset in Java

  •    Java

Roaring bitmaps are compressed bitmaps (also called bitsets) which tend to outperform conventional compressed bitmaps such as WAH or Concise.

javaewah - A compressed alternative to the Java BitSet class

  •    Java

The bit array data structure is implemented in Java as the BitSet class. Unfortunately, this fails to scale without compression. JavaEWAH is a word-aligned compressed variant of the Java bitset class. It uses a 64-bit run-length encoding (RLE) compression scheme. The goal of word-aligned compression is not to achieve the best compression, but rather to improve query processing time. Hence, we try to save CPU cycles, maybe at the expense of storage. However, the EWAH scheme we implemented is always more efficient storage-wise than an uncompressed bitmap (implemented in Java as the BitSet class). Unlike some alternatives, javaewah does not rely on a patented scheme.

EWAHBoolArray - A compressed bitmap class in C++.

  •    C++

The class EWAHBoolArray is a compressed bitset data structure. It supports several word sizes by a template parameter (16-bit, 32-bit, 64-bit). You should expect the 64-bit word-size to provide better performance, but higher memory usage, while a 32-bit word-size might compress a bit better, at the expense of some performance.The library also provides a basic BoolArray class which can serve as a traditional bitmap.

bitset - Go package implementing bitsets

  •    Go

Package bitset implements bitsets, a mapping between non-negative integers and boolean values. It should be more efficient than map[uint] bool.It provides methods for setting, clearing, flipping, and testing individual integers.


bloom - Go package implementing Bloom filters

  •    Go

A Bloom filter is a representation of a set of n items, where the main requirement is to make membership queries; i.e., whether an item is a member of a set.A Bloom filter has two parameters: m, a maximum size (typically a reasonably large multiple of the cardinality of the set to represent) and k, the number of hashing functions on elements of the set. (The actual hashing functions are important, too, but this is not a parameter for this implementation). A Bloom filter is backed by a BitSet; a key is represented in the filter by setting the bits at each value of the hashing functions (modulo m). Set membership is done by testing whether the bits at each value of the hashing functions (again, modulo m) are set. If so, the item is in the set. If the item is actually in the set, a Bloom filter will never fail (the true positive rate is 1.0); but it is susceptible to false positives. The art is to choose k and m correctly.

Tarantool - Get your data in RAM. Get compute close to data. Enjoy the performance

  •    Lua

Tarantool is an in-memory database and application server. It is 100% compatible drop-in replacement for Lua 5.1, based on LuaJIT 2.1. It has full support for Lua modules and a rich set of own modules, including cooperative multitasking, non-blocking I/O, access to external databases, etc.

magic_get - std::tuple like methods for user defined types without any macro or boilerplate code

  •    C++

This C++14 library is meant for accessing structure elements by index and providing other std::tuple like methods for user defined types without any macro or boilerplate code. Distributed under the Boost Software License, Version 1.0.

Fit - C++ function utility library

  •    C++

Fit is a header-only C++11/C++14 library that provides utilities for functions and function objects, which can solve many problems with much simpler constructs than whats traditionally been done with metaprogramming. This requires a C++11 compiler. There are no third-party dependencies. This has been tested on clang 3.5-3.8, gcc 4.6-6.2, and Visual Studio 2015. Gcc 5.1 is not supported at all, however, gcc 5.4 is supported.

Redisson - Redis based In-Memory Data Grid for Java

  •    Java

Redisson - distributed Java objects and services (Set, Multimap, SortedSet, Map, List, Queue, BlockingQueue, Deque, BlockingDeque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Executor service, Tomcat Session Manager, Scheduler service, JCache API) on top of Redis server. Rich Redis client.

sol2 - Sol3 (sol2 v3

  •    C++

sol is a C++ library binding to Lua. It currently supports all Lua versions 5.1+ (LuaJIT 2.x included). sol aims to be easy to use and easy to add to a project. The library is header-only for easy integration with projects. Find it here. A run-through kind of tutorial is here! The API documentation goes over most cases (particularly, the "api/usertype" and "api/table_proxy" and "api/function" sections) that should still get you off your feet and going, and there's an examples directory here as well.

ParaSim

  •    C

Parallelized calculation of molecular similarities

hof - Higher-order functions for c++

  •    C++

HigherOrderFunctions is a header-only C++11/C++14 library that provides utilities for functions and function objects, which can solve many problems with much simpler constructs than whats traditionally been done with metaprogramming. This requires a C++11 compiler. There are no third-party dependencies. This has been tested on clang 3.5-3.8, gcc 4.6-7, and Visual Studio 2015 and 2017.

Capsule - The Capsule Hash Trie Collections Library

  •    Java

Capsule aims to become a full-fledged (immutable) collections library for Java 8+ that is solely built around persistent tries. The library is designed for standalone use and for being embedded in domain-specific languages. Capsule still has to undergo some incubation before it can ship as a well-rounded collection library. Nevertheless, the code is stable and performance is solid.

Celero - C++ Benchmark Authoring Library/Framework

  •    C++

Developing consistent and meaningful benchmark results for code is a complex task. Measurement tools exist (Intel® VTune™ Amplifier, SmartBear AQTime, Valgrind, etc.) external to applications, but they are sometimes expensive for small teams or cumbersome to utilize. This project, Celero, aims to be a small library which can be added to a C++ project and perform benchmarks on code in a way which is easy to reproduce, share, and compare among individual runs, developers, or projects. Celero uses a framework similar to that of GoogleTest to make its API easier to use and integrate into a project. Make automated benchmarking as much a part of your development process as automated testing. Celero uses CMake to provide cross-platform builds. It does require a modern compiler (Visual C++ 2012+, GCC 4.7+, Clang 2.9+) due to its use of C++11.

htslib - C library for high-throughput sequencing data formats

  •    C

HTSlib is an implementation of a unified C library for accessing common file formats, such as SAM, CRAM and VCF, used for high-throughput sequencing data, and is the core library used by samtools and bcftools. HTSlib only depends on zlib. It is known to be compatible with gcc, g++ and clang. HTSlib implements a generalized BAM index, with file extension .csi (coordinate-sorted index). The HTSlib file reader first looks for the new index and then for the old if the new index is absent.

FakeIt - C++ mocking made easy. A simple yet very expressive, headers only library for c++ mocking.

  •    C++

FakeIt is a simple mocking framework for C++. It supports GCC, Clang and MS Visual C++. FakeIt is written in C++11 and can be used for testing both C++11 and C++ projects.





We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.