alertmanager - Prometheus Alertmanager

  •    Go

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts. There are various ways of installing Alertmanager.

libpostal - A C library for parsing/normalizing street addresses around the world

  •    C

Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services, check-ins, reviews). Yet even the simplest addresses are packed with local conventions, abbreviations and context, making them difficult to index/query effectively with traditional full-text search engines. This library helps convert the free-form addresses that humans use into clean normalized forms suitable for machine comparison and full-text indexing. Though libpostal is not itself a full geocoder, it can be used as a preprocessing step to make any geocoding application smarter, simpler, and more consistent internationally. The core library is written in pure C. Language bindings for Python, Ruby, Go, Java, PHP, and NodeJS are officially supported and it's easy to write bindings in other languages.

restic - Fast, secure, efficient backup program

  •    Go

restic is a backup program that is fast, efficient and secure. Restic should be easy to configure and use, so that in the unlikely event of a data loss you can just restore it. It uses cryptography to guarantee confidentiality and integrity of your data.

Borg - Deduplicating archiver with compression and authenticated encryption

  •    C

BorgBackup (short: Borg) is a deduplicating backup program. Optionally, it supports compression and authenticated encryption. The main goal of Borg is to provide an efficient and secure way to back up data. The data deduplication technique used makes Borg suitable for daily backups since only changes are stored. The authenticated encryption technique makes it suitable for backups to targets that are not fully trusted.
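
The "only changes are stored" idea can be illustrated with a minimal Python sketch. It is not Borg's implementation: the `ChunkStore` class, the fixed 4-byte chunk size, and the use of SHA-256 are all assumptions made for illustration, and real backup tools typically split data at content-defined boundaries rather than fixed offsets.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the effect is visible


class ChunkStore:
    """Toy deduplicating store: each distinct chunk is kept only once."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def backup(self, data):
        """Store data and return a manifest (list of chunk digests)."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already present is not stored again.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def restore(self, manifest):
        """Rebuild the original bytes from a manifest."""
        return b"".join(self.chunks[digest] for digest in manifest)


store = ChunkStore()
day1 = store.backup(b"aaaabbbbcccc")   # three new chunks
day2 = store.backup(b"aaaabbbbdddd")   # only the last chunk is new
print(len(store.chunks))               # 4
print(store.restore(day2))             # b'aaaabbbbdddd'
```

The second backup shares two of its three chunks with the first, so the store grows by one chunk instead of three.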

rdedup - Data deduplication engine, supporting optional compression and public key encryption.

  •    Rust

See wiki for current project status. rdedup is a data deduplication engine and backup software.

blobstash - BlobStash is your personal database.

  •    Go

BlobStash is both a content-addressed blob store and a key-value store accessible via an HTTP API. Key-value pairs are stored as "meta" blobs, which means you can build applications on top of BlobStash without the need for another database.

file-dedupe - Fast duplicate file detection library

  •    Javascript

findup is quite fast - it is within 2x of the fastest duplicate finders written in C/C++. Based on the V8 profiler output, about 40% of the time is spent on I/O, 13% on crypto and 11% on file traversal, so any further gains in performance will need to come from I/O optimizations rather than code optimizations. BTW, you may notice that file-dedupe defaults to sync I/O. This is because the async I/O seems to have significant overhead for typical FS tasks. You can test this out by passing the --async flag on your system.

lieu - Dedupe/batch geocode addresses and venues around the world with libpostal

  •    Python

lieu is a Python library for deduping venues and addresses around the world using libpostal's international street address normalization. Note: libpostal and its Python binding are required to use this library; see the project's setup instructions.

kvdo - A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.

  •    C

A pair of kernel modules which provide pools of deduplicated and/or compressed block storage. VDO (which includes kvdo and vdo) is software that provides inline block-level deduplication, compression, and thin provisioning capabilities for primary storage. VDO installs within the Linux device mapper framework, where it takes ownership of existing physical block devices and remaps these to new, higher-level block devices with data-efficiency capabilities.

vdo - Userspace tools for managing VDO volumes.

  •    C

A set of userspace tools for managing pools of deduplicated and/or compressed block storage. VDO (which includes kvdo and vdo) is software that provides inline block-level deduplication, compression, and thin provisioning capabilities for primary storage. VDO installs within the Linux device mapper framework, where it takes ownership of existing physical block devices and remaps these to new, higher-level block devices with data-efficiency capabilities.

dedupsqlfs - Deduplicating filesystem via Python3, FUSE and SQLite

  •    C

Rewritten to use Python 3 (3.2+), with new compression methods and snapshots/subvolumes. I know about ZFS and Btrfs, but they are still complicated to use under Linux and have disadvantages such as requiring a block device, weak block hash algorithms, and very few compression methods to choose from.

blobfs - A FUSE file system built on top of BlobStash FileTree API.

  •    Go

A FUSE file system built on top of BlobStash with built-in sync and deduplication.

lafs-backup-tool - Tool to securely push incremental (think "rsync --link-dest") backups to tahoe-lafs

  •    Python

Tool to securely push incremental (think "rsync --link-dest") backups to Tahoe Least Authority File System. Note that to install stuff in system-wide PATH and site-packages, elevated privileges are often required. Use "install --user", ~/.pydistutils.cfg or virtualenv to do unprivileged installs into custom paths.

Frost - A backup program that does deduplication, compression, encryption

  •    C++

A backup program that does deduplication, compression, and encryption. It's based on the ideas of Syncany, but reimplemented in C++, using state-of-the-art compression (the BSC library) and with no dependency on anything except libcrypto. It provides a console mode that has been tested on both Linux and Mac OS X. It allows saving backups to a remote server that's considered hostile, with no modification to the remote server software required. Because of deduplication, space savings are considerable both between backups and within a single backup. Compression (using either zlib or the BSC library) then saves even more space, and the compressed data is encrypted in local files (or remote files if you mounted them beforehand).

pgdedupe - A simple command line interface to the datamade/dedupe library.

  •    Jupyter

A work-in-progress to provide a standard interface for deduplication of large databases with custom pre-processing and post-processing steps. In addition to running a database-level deduplication with dedupe, this script adds custom pre- and post-processing steps to improve the run-time and results, making this a hybrid between fuzzy matching and record linkage.

IntraArchiveDeduplicator - Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation for fuzzy image searching

  •    Python

Tool for managing data-deduplication within extant compressed archive files, with a heavy focus on Manga/Comic-book archive files. This is a rather exotic tool that is intended to allow fairly fast duplicate detection for files within compressed archives.

rabin - node native addon for rabin fingerprinting data streams

  •    C++

Node native addon module (C/C++) for Rabin fingerprinting data streams. Uses the implementation of Rabin fingerprinting from LBFS.

dedupe - easy deduplication of array values

  •    Javascript

Removes duplicates from your array. The string representation of each object is used for comparison. The mechanism is similar to JSON.stringify but a bit more efficient. That means that {} is considered equal to {}.
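
Comparing values by their string representation can be sketched roughly as follows. This is a Python illustration of the idea, not the library's JavaScript API: the `dedupe` function here is hypothetical, and canonical JSON (`json.dumps` with sorted keys) is assumed as a stand-in for the string representation the library computes.

```python
import json


def dedupe(values):
    """Return values with duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for value in values:
        # Assumption: canonical JSON text serves as the comparison key,
        # so two structurally identical objects count as duplicates.
        key = json.dumps(value, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(value)
    return unique


print(dedupe([{}, {}, {"a": 1}, {"a": 1}, 2, "2"]))
# [{}, {'a': 1}, 2, '2']
```

Because the key is a serialized form rather than object identity, two separately created empty objects collapse into one, while the number 2 and the string "2" stay distinct.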