rabin - node native addon for rabin fingerprinting data streams


Node native addon module (C/C++) for Rabin fingerprinting data streams. Uses the implementation of Rabin fingerprinting from LBFS.

https://github.com/datproject/rabin#readme
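
As a rough illustration of what a Rabin fingerprint is (not the addon's C/C++ code and not the LBFS implementation), the sketch below treats the input as a polynomial over GF(2) and returns its remainder modulo a fixed polynomial. The function name `rabin_fingerprint` and the default polynomial `0x11B` (x^8 + x^4 + x^3 + x + 1, chosen only because it is a small, well-known irreducible polynomial) are assumptions made for this example; real implementations use a much higher degree and table-driven updates.

```python
def rabin_fingerprint(data: bytes, poly: int = 0x11B) -> int:
    """Return the remainder of the message polynomial modulo `poly` over GF(2).

    `poly` is an illustrative default, not the polynomial used by the addon.
    """
    degree = poly.bit_length() - 1
    fp = 0
    for byte in data:
        # Feed the message in bit by bit, most significant bit first.
        for shift in range(7, -1, -1):
            fp = (fp << 1) | ((byte >> shift) & 1)
            # When the degree-th bit is set, subtract (XOR) the modulus.
            if (fp >> degree) & 1:
                fp ^= poly
    return fp


if __name__ == "__main__":
    print(hex(rabin_fingerprint(b"rabin fingerprinting data streams")))
```

Because reduction modulo a polynomial is linear over GF(2), the fingerprint of the XOR of two equal-length messages equals the XOR of their fingerprints, which is what makes cheap sliding-window updates possible.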

Dependencies:

bindings : ^1.2.1
bl : ^1.0.0
debug : ^2.2.0
minimist : ^1.2.0
nan : ^2.1.0
prebuild-install : ^2.1.0
readable-stream : ^2.0.4

Related Projects

dedupe - A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution

  •    Python

dedupe is a Python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data. dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even in very large databases.

casync - Content-Addressable Data Synchronization Tool

  •    C

Encoding: Let's take a large linear data stream, split it into variable-sized chunks (the size of each being a function of the chunk's contents), and store these chunks in individual, compressed files in some directory, each file named after a strong hash value of its contents, so that the hash value may be used as the key for retrieving the full chunk data. Let's call this directory a "chunk store". At the same time, generate a "chunk index" file that lists these chunk hash values plus their respective chunk sizes in a simple linear array. The chunking algorithm is supposed to create variable, but similarly sized chunks from the data stream, and do so in a way that the same data results in the same chunks even if placed at varying offsets. For more information see this blog story. Decoding: Let's take the chunk index file, and reassemble the large linear data stream by concatenating the uncompressed chunks retrieved from the chunk store, keyed by the listed chunk hash values.
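
The encode and decode steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not casync's actual algorithm or on-disk format: a plain `dict` stands in for the chunk store directory, SHA-256 and zlib stand in for the strong hash and the compression, and the boundary rule is a simple polynomial rolling hash over a 16-byte window. All names and constants (`split_chunks`, `encode`, `decode`, `WINDOW`, `MASK`, `BASE`, `MOD`) are hypothetical.

```python
import hashlib
import zlib

WINDOW = 16            # bytes covered by the rolling hash (assumed value)
MASK = (1 << 6) - 1    # boundary when the low 6 bits are zero (assumed value)
BASE = 257
MOD = (1 << 31) - 1


def split_chunks(data: bytes):
    """Split data where a rolling hash of the last WINDOW bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that slides out of the window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        # Cut only after at least WINDOW bytes, so chunks never get tiny.
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def encode(data: bytes, store: dict):
    """Fill the chunk store and return the chunk index [(hash, size), ...]."""
    index = []
    for chunk in split_chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        store[key] = zlib.compress(chunk)   # identical chunks are stored once
        index.append((key, len(chunk)))
    return index


def decode(index, store: dict) -> bytes:
    """Reassemble the stream by concatenating the chunks named in the index."""
    return b"".join(zlib.decompress(store[key]) for key, _ in index)


if __name__ == "__main__":
    payload = b"".join(hashlib.sha256(bytes([n])).digest() for n in range(200))
    store = {}
    index = encode(payload, store)
    assert decode(index, store) == payload
    print(len(index), "chunks,", len(store), "unique")
```

Because each boundary decision depends only on the last few bytes of content, data that reappears at a different offset tends to fall back into the same chunks after the first boundary, so most chunk hashes are shared between the two versions.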

Borg - Deduplicating archiver with compression and authenticated encryption

  •    C

BorgBackup (short: Borg) is a deduplicating backup program. Optionally, it supports compression and authenticated encryption. The main goal of Borg is to provide an efficient and secure way to back up data. The data deduplication technique used makes Borg suitable for daily backups since only changes are stored. The authenticated encryption technique makes it suitable for backups to targets that are not fully trusted.

imagehash - 🌄 Perceptual image hashing for PHP

  •    PHP

A perceptual hash is a fingerprint of a multimedia file derived from various features from its content. Unlike cryptographic hash functions which rely on the avalanche effect of small changes in input leading to drastic changes in the output, perceptual hashes are "close" to one another if the features are similar. Perceptual hashes are a different concept compared to cryptographic hash functions like MD5 and SHA1. With cryptographic hashes, the hash values are random. The data used to generate the hash acts like a random seed, so the same data will generate the same result, but different data will create different results. Comparing two SHA1 hash values really only tells you two things. If the hashes are different, then the data is different. And if the hashes are the same, then the data is likely the same. In contrast, perceptual hashes can be compared -- giving you a sense of similarity between the two data sets.
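
To make the "close hashes for similar content" idea concrete, here is a small sketch of an average hash, one of the simplest perceptual hashes. It is an illustration only and not necessarily the algorithm this PHP library uses. It assumes the image has already been downscaled to a tiny grayscale grid (an 8x8 list of lists here); the names `average_hash` and `hamming_distance` are made up for the example.

```python
def average_hash(pixels) -> int:
    """Build a bit string: 1 where a pixel is brighter than the mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means similar images."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # A synthetic 8x8 gradient and a uniformly brightened copy of it.
    image = [[row * 8 + col for col in range(8)] for row in range(8)]
    brighter = [[value + 10 for value in row] for row in image]
    print(hamming_distance(average_hash(image), average_hash(brighter)))
```

A uniform brightness change leaves every pixel on the same side of the mean, so the two hashes are identical, whereas a cryptographic hash of the same two pixel buffers would differ completely.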

yarbu - Yet Another Rsync Backup Utility

  •    Shell

Yet Another Rsync Backup Utility (YARBU). A robust but powerful snapshot-like rolling backup utility with email notification and straightforward configuration.


ipfspics-server - Content-addressable, peer-to-peer method of storing and sharing images on the internet

  •    PHP

ipfs.pics is an open-source and distributed image hosting website. It aims to be an alternative to non-libre image hosting websites such as imgur, flickr and others. It is based on IPFS, the InterPlanetary File System. The whole application runs on the concept of peer-to-peer connections, which means that instead of being hosted in a single location (our servers), the data is stored by everyone who wants to store it. When a picture is put on IPFS, it is given a hash, a 46-character digital fingerprint. No other file will have it, and if the same file is added twice the hashes will be exactly the same, which means the picture can still be found on the network simply by knowing the hash, even if our website is down. You can find the hash at the end of a picture URL.
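
The sketch below shows why such a content-derived identifier is 46 characters long and identical for identical input. It assumes the classic IPFS identifier layout: a SHA-256 digest prefixed with the two multihash bytes `0x12 0x20`, then base58-encoded. Real IPFS hashes the file wrapped in its own block format rather than the raw bytes, so these strings will not match what an IPFS node reports; `content_id` and `base58_encode` are hypothetical helper names.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(raw: bytes) -> str:
    """Encode bytes with the Bitcoin-style base58 alphabet."""
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    padding = len(raw) - len(raw.lstrip(b"\0"))
    return "1" * padding + encoded


def content_id(data: bytes) -> str:
    """Derive an identifier from the content alone (simplified, see lead-in)."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest   # 0x12 = sha2-256, 0x20 = 32 bytes
    return base58_encode(multihash)


if __name__ == "__main__":
    picture = b"not really a picture"
    print(content_id(picture), len(content_id(picture)))
```

Since the identifier depends only on the bytes, adding the same file twice yields the same string, and any change to the file yields a different one.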

cfilter - Cuckoo Filter implementation in Go, better than Bloom Filters (unmaintained, unfortunately)

  •    Go

A cuckoo filter is a Bloom filter replacement for approximate set-membership queries. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters. For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Possible use cases that depend on approximate set-membership queries include databases, caches, routers, and storage systems, where the filter is used to decide whether a given item is in a (usually large) set, with some small false positive probability. Because it is designed to be a viable replacement for Bloom filters, it can also be used to reduce the space required in probabilistic routing tables, speed up longest-prefix matching for IP addresses, improve network state management and monitoring, and encode multicast forwarding information in packets, among many other applications. A cuckoo filter is based on cuckoo hashing (hence the name). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, so a cuckoo filter can use less space than conventional Bloom filters for applications that require low false positive rates (< 3%).

cuckoofilter

  •    C++

A cuckoo filter is a Bloom filter replacement for approximate set-membership queries. While Bloom filters are well-known space-efficient data structures for answering queries like "is item x in the set?", they do not support deletion. Variants that enable deletion (like counting Bloom filters) usually require much more space. Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (hence the name). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, so a cuckoo filter can use less space than conventional Bloom filters for applications that require low false positive rates (< 3%).
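
A minimal sketch of the idea described above, a cuckoo hash table that stores only a short fingerprint per key, might look like this. It is not the code of either library listed here; the class name, the 8-bit fingerprint, the bucket size of 4 and the use of SHA-256 as the hash are all assumptions chosen for readability rather than speed.

```python
import hashlib
import random


class CuckooFilter:
    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two table size keeps the XOR index trick reversible.
        assert num_buckets & (num_buckets - 1) == 0
        self.buckets = [[] for _ in range(num_buckets)]
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.rng = random.Random(0)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: bytes) -> int:
        return self._hash(b"fp:" + item) % 255 + 1      # 1..255

    def _alt_index(self, index: int, fp: int) -> int:
        # The other candidate bucket depends only on the index and fingerprint.
        return (index ^ self._hash(bytes([fp]))) & self.mask

    def add(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._hash(item) & self.mask
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # table considered full (the last evicted entry is dropped)

    def contains(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._hash(item) & self.mask
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

    def remove(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._hash(item) & self.mask
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Deletion works because only the fingerprint and a bucket index are needed to find an entry again, which is exactly what a plain Bloom filter cannot offer. As with any such filter, `contains` can return a false positive when two items share a fingerprint and a bucket.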

LolliPin - A Material design Android pincode library. Supports Fingerprint.

  •    Java

The password itself is not saved, only its SHA-1 hash. The hash is stored in SharedPreferences, which makes it possible to verify that the user entered the right PIN code without making the code itself retrievable.
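
The store-the-hash-and-compare approach described above can be sketched in a few lines. This is not LolliPin's Java code: a plain `dict` stands in for Android's SharedPreferences, and `hash_pin`, `verify_pin` and the `"pin_hash"` key are hypothetical names. An unsalted hash of a short numeric PIN can be brute-forced quickly, so this only illustrates the verification flow.

```python
import hashlib
import hmac


def hash_pin(pin: str) -> str:
    """Return the hex SHA-1 digest of the PIN (the PIN itself is never stored)."""
    return hashlib.sha1(pin.encode("utf-8")).hexdigest()


def verify_pin(entered: str, stored_hash: str) -> bool:
    """Hash the entered PIN and compare it with the stored hash."""
    return hmac.compare_digest(hash_pin(entered), stored_hash)


if __name__ == "__main__":
    preferences = {}                       # stand-in for SharedPreferences
    preferences["pin_hash"] = hash_pin("4821")
    print(verify_pin("4821", preferences["pin_hash"]))
    print(verify_pin("0000", preferences["pin_hash"]))
```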

free-style - Make CSS easier and more maintainable by using JavaScript

  •    TypeScript

Free-Style is designed to make CSS easier and more maintainable by using JavaScript. There's a great presentation by Christopher Chedeau you should check out.

rdedup - Data deduplication engine, supporting optional compression and public key encryption.

  •    Rust

See the wiki for current project status. rdedup is a data deduplication engine and backup software.

FingerprintManager - A small library to handle Android fingerprint API.

  •    Kotlin

A small library to handle Android fingerprint APIs. This library offers an easy way to handle authorisation and encryption tasks using Android Fingerprint APIs. It's based on Android fingerprint dialog sample made by Google: https://github.com/googlesamples/android-FingerprintDialog.

librsync -- network-delta library

  •    C

librsync implements the rolling-checksum algorithm of remote file synchronization that was popularized by the rsync utility and is used in rproxy. This algorithm transfers the differences between two files without needing both files on the same system.
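
As a sketch of what "rolling" means here, the code below implements the classic rsync-style weak checksum, which can be updated in constant time when the window slides by one byte. It is an illustration of the general technique and not librsync's exact checksum or API; the function names `weak_checksum`, `roll` and `combine` are made up for this example.

```python
MODULUS = 1 << 16


def weak_checksum(block: bytes):
    """Compute the two 16-bit sums (a, b) for a whole block from scratch."""
    a = sum(block) % MODULUS
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MODULUS
    return a, b


def roll(a: int, b: int, old_byte: int, new_byte: int, length: int):
    """Slide the window one byte: drop old_byte on the left, add new_byte."""
    a = (a - old_byte + new_byte) % MODULUS
    b = (b - length * old_byte + a) % MODULUS
    return a, b


def combine(a: int, b: int) -> int:
    """Pack both sums into one 32-bit value used to look up candidate blocks."""
    return a | (b << 16)


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog"
    size = 8
    a, b = weak_checksum(data[:size])
    for offset in range(1, len(data) - size + 1):
        a, b = roll(a, b, data[offset - 1], data[offset + size - 1], size)
        assert (a, b) == weak_checksum(data[offset:offset + size])
    print(hex(combine(a, b)))
```

The receiver's side computes checksums for fixed blocks of its old file, and the sender slides this window over the new file to find matching blocks cheaply, confirming candidates with a stronger hash before sending only the unmatched data.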

RxFingerprint - Android Fingerprint authentication and encryption with RxJava

  •    Java

Learn more about the Android Fingerprint APIs at developer.android.com. This library has a minSdkVersion of 15, but will only really work on API level 23. Below that it will provide no functionality due to the missing APIs.

csvdedupe - Command line tool for deduplicating CSV files

  •    Python

Command line tools for using the dedupe Python library to deduplicate CSV files. csvdedupe takes a messy input file or STDIN pipe and identifies duplicates.

Level 3 Fingerprint Image Toolkit

  •    Java

L3TK is a Java-based software toolkit for analysis of Level 3 fingerprint features in high resolution fingerprint images. Level 3 fingerprint features are the sweat pores, ridge contours, and edgeoscopic points along the contours.

reprint - A unified fingerprint library for android.

  •    Java

A simple, unified fingerprint authentication library for Android with RxJava extensions. See the sample app for a complete example.

CRFChunker: CRF English Phrase Chunker

  •    Java

CRFChunker: Conditional Random Fields Phrase Chunker (Phrase Chunking Tool) for English. The model was trained on sections 01..24 of the WSJ corpus, with section 00 used as the development test set (F1-score of 95.77). Chunking speed: 700 sentences/s.

phpchunkit - PHPChunkit - PHPUnit test runner with test chunking capabilities.

  •    PHP

PHPChunkit is a library that sits on top of PHPUnit and adds functionality to make it easier to work with large unit and functional test suites. The primary features are test chunking and database sandboxing, which give you the ability to run your tests in parallel chunks on the same server or across multiple servers. In order to run functional tests in parallel on the same server, you need to have a concept of database sandboxing. You are responsible for implementing the sandbox preparation, database creation, and sandbox cleanup. PHPChunkit provides a framework for you to hook into so you can prepare your application environment sandbox.

flair - A very simple framework for state-of-the-art NLP

  •    Python

A very simple framework for state-of-the-art NLP. Developed by Zalando Research. A powerful syntactic-semantic tagger / classifier. Flair allows you to apply our state-of-the-art models for named entity recognition (NER), part-of-speech tagging (PoS), frame sense disambiguation, chunking and classification to your text.