
NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are popular in many NLP tasks, such as named entity recognition (NER), part-of-speech (POS) tagging, and word segmentation. State-of-the-art sequence labeling models mostly combine a CRF output structure with input word features. An LSTM (or bidirectional LSTM) is a popular deep-learning-based feature extractor for sequence labeling, and a CNN can also be used for faster computation. In addition, sub-word features are useful for representing a word; these can be captured by a character LSTM, a character CNN, or hand-crafted neural features. NCRF++ is a PyTorch-based framework with flexible choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file and requires no code changes. NCRF++ is a neural version of CRF++, a well-known statistical CRF framework.
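The core inference step a CRF output layer performs at prediction time is Viterbi decoding over emission and transition scores. A minimal illustrative sketch (not NCRF++'s actual implementation; the dict-based representation is chosen for clarity):

```python
# Minimal Viterbi decoding over per-position emission scores and
# tag-to-tag transition scores, the core inference step of a CRF layer.
# Illustrative sketch only; NCRF++ implements this with PyTorch tensors.

def viterbi(emissions, transitions):
    """emissions: list of per-position score dicts {tag: score};
    transitions: dict {(prev_tag, tag): score}.
    Returns the highest-scoring tag sequence."""
    tags = list(emissions[0])
    # best[t] = (score of best path ending in tag t, that path)
    best = {t: (emissions[0][t], [t]) for t in tags}
    for emit in emissions[1:]:
        new_best = {}
        for t in tags:
            # pick the predecessor tag maximizing path score + transition
            prev, (score, path) = max(
                ((p, best[p]) for p in tags),
                key=lambda kv: kv[1][0] + transitions[(kv[0], t)],
            )
            new_best[t] = (score + transitions[(prev, t)] + emit[t], path + [t])
        best = new_best
    return max(best.values(), key=lambda sp: sp[0])[1]
```

With zero transition scores this reduces to per-position argmax; non-zero transitions are what let the CRF forbid invalid tag bigrams such as O followed by I-PER.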

fine-uploader - Multiple file upload plugin with image previews, drag and drop, progress bars

  •    Javascript

FineUploader is simple to use: in the simplest case, you only need to include one JavaScript file, and there are no other required external dependencies. For more information, see the documentation. If you'd like to help keep this project strong and relevant, you have several options.

casync - Content-Addressable Data Synchronization Tool

  •    C

Encoding: take a large linear data stream, split it into variable-sized chunks (the size of each being a function of the chunk's contents), and store these chunks in individual, compressed files in some directory, each file named after a strong hash of its contents, so that the hash value can be used as the key for retrieving the full chunk data. Call this directory a "chunk store". At the same time, generate a "chunk index" file that lists these chunk hashes plus their respective chunk sizes in a simple linear array. The chunking algorithm is supposed to create variably but similarly sized chunks from the data stream, and to do so in a way that the same data results in the same chunks even when placed at varying offsets. For more information, see this blog post.

Decoding: take the chunk index file and reassemble the large linear data stream by concatenating the uncompressed chunks retrieved from the chunk store, keyed by the listed chunk hash values.
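The encode/store/decode pipeline described above can be sketched end to end. This is a toy: a byte-wise rolling sum stands in for casync's real rolling hash (which is buzhash-based), and the window, divisor, and minimum-chunk constants are arbitrary assumptions, not casync's parameters:

```python
import hashlib

# Toy content-defined chunker: a sliding-window byte sum plays the role
# of the rolling hash. A boundary is declared when the hash modulo a
# divisor hits zero, so boundaries depend only on nearby content and
# tend to survive insertions elsewhere in the stream. All constants are
# illustrative; casync's actual chunker differs.

WINDOW = 16
DIVISOR = 64
MIN_CHUNK = 8

def chunk(data):
    chunks, start, rolling = [], 0, 0
    for i, b in enumerate(data):
        rolling += b
        if i - start >= WINDOW:
            rolling -= data[i - WINDOW]   # slide the window forward
        if i - start + 1 >= MIN_CHUNK and rolling % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])       # final partial chunk
    return chunks

def store(chunks):
    # "chunk store": strong hash -> chunk data (deduplicates identical chunks)
    cstore = {hashlib.sha256(c).hexdigest(): c for c in chunks}
    # "chunk index": ordered list of (hash, size) pairs
    index = [(hashlib.sha256(c).hexdigest(), len(c)) for c in chunks]
    return cstore, index

def reassemble(cstore, index):
    # decoding: concatenate chunks looked up by their listed hashes
    return b"".join(cstore[h] for h, _ in index)
```

Note that the index alone fixes the reassembled stream: any chunk store containing the listed hashes (local, remote, shared between many indexes) can serve the lookups.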

flair - A very simple framework for state-of-the-art NLP

  •    Python

A very simple framework for state-of-the-art NLP. Developed by Zalando Research. A powerful syntactic-semantic tagger / classifier. Flair allows you to apply our state-of-the-art models for named entity recognition (NER), part-of-speech tagging (PoS), frame sense disambiguation, chunking and classification to your text.




nacl-stream-js - Streaming encryption based on TweetNaCl.js

  •    Javascript

Inputs: a 32-byte key, a 16-byte nonce, and a stream (or a file). The stream is split into chunks of the specified length.
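The splitting half of such a scheme can be sketched as follows. The encryption itself (TweetNaCl's secretbox in the real library) is omitted, and the nonce layout shown here, a 16-byte stream nonce extended with an 8-byte chunk counter, is an illustrative assumption rather than nacl-stream's exact wire format:

```python
import struct

# Sketch of the chunk-framing side of streaming encryption: split the
# input into fixed-size chunks and derive a distinct per-chunk nonce
# from the stream nonce plus a counter, so no nonce is ever reused
# under the same key. Actual encryption of each chunk is omitted.

def split_stream(data, chunk_len, nonce16):
    assert len(nonce16) == 16
    frames = []
    for i in range(0, len(data), chunk_len):
        counter = i // chunk_len
        # hypothetical 24-byte per-chunk nonce: stream nonce + LE counter
        chunk_nonce = nonce16 + struct.pack("<Q", counter)
        frames.append((chunk_nonce, data[i:i + chunk_len]))
    return frames
```

Deriving the nonce deterministically from the chunk position also lets a decryptor detect reordered or dropped chunks: a chunk that doesn't authenticate under its expected counter is rejected.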

node-chunking-streams - A set of NodeJS streams aimed on chunking data

  •    Javascript

A simple TransformStream which counts lines (\n is the separator) and emits data chunks containing exactly the specified number of lines. If some tail data does not fill a whole chunk, it can be emitted at the end when the flushTail flag is set.
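The described behavior, groups of exactly n lines with an optional short tail, can be sketched in a few lines. The function name and the flush_tail flag mirror the description above, not the module's actual JavaScript API:

```python
# Group input lines into chunks of exactly `n` lines each; a leftover
# tail shorter than `n` is emitted only when flush_tail is set.
# Illustrative sketch of node-chunking-streams' behavior, not its API.

def chunk_lines(text, n, flush_tail=False):
    lines = text.split("\n")
    if lines and lines[-1] == "":        # a trailing \n leaves an empty item
        lines.pop()
    full = len(lines) - len(lines) % n   # number of lines in complete chunks
    chunks = ["\n".join(lines[i:i + n]) + "\n" for i in range(0, full, n)]
    if full < len(lines) and flush_tail:
        chunks.append("\n".join(lines[full:]) + "\n")
    return chunks
```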

desync - Alternative casync implementation

  •    Go

This project re-implements many features of upstream casync in Go. It seeks to maintain compatibility with casync's data structures, protocols, and types, such as chunk stores (castr), index files (caibx/caidx), and archives (catar), in order to function as a drop-in replacement in many use cases. It also tries to maintain support for platforms other than Linux and to simplify build/installation. It consists of a library implementing these features, available for integration into any third-party product, as well as a command-line tool. For support and discussion, see . Feature requests should be discussed there before filing, unless you're interested in doing the work to implement them yourself.

cryptor - Privacy, Anonymity, Freedom

  •    Go

Cryptor is a P2P network designed for sharing data without revealing one's true identity or the nature of the shared information. Data security is achieved with symmetric encryption (AES-256) and public/private-key cryptography (RSA) for safe peer-to-peer communication and data-integrity validation. All local data stored on your machine is encrypted using your master password. The chunked design of the file-sharing protocol further improves security.


SAPO - SAPO

  •    CSharp

This is an implementation of the paper "Towards Easier and Faster Sequence Labeling: A Search-based Probabilistic Online Learning Framework (SAPO)". The code is also used for the paper "Towards Shockingly Easy Structured Classification: A Search-based Probabilistic Online Learning Framework".

rabin - node native addon for rabin fingerprinting data streams

  •    C++

Node native addon module (C/C++) for Rabin fingerprinting data streams. Uses the implementation of Rabin fingerprinting from LBFS.
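The key property of a Rabin-style fingerprint is that the hash of a sliding window can be updated in O(1) as the window advances one byte. LBFS does this with polynomial arithmetic over GF(2) modulo an irreducible polynomial; the sketch below uses a simpler integer polynomial hash (base and modulus chosen arbitrarily) to show the same rolling-update idea:

```python
# Simplified rolling fingerprint in the spirit of Rabin fingerprinting.
# Each window hash is sum(data[j] * BASE^(w-1-j)) mod MOD; sliding the
# window removes the outgoing byte's contribution and adds the new byte
# in O(1). Constants are illustrative, not LBFS's GF(2) polynomial.

MOD = (1 << 61) - 1   # large prime modulus (arbitrary choice)
BASE = 257

def fingerprints(data, window):
    """Yield the fingerprint of every `window`-byte substring of data."""
    if len(data) < window:
        return
    top = pow(BASE, window - 1, MOD)   # weight of the outgoing byte
    h = 0
    for b in data[:window]:
        h = (h * BASE + b) % MOD
    yield h
    for i in range(window, len(data)):
        # drop data[i - window], shift, append data[i]
        h = ((h - data[i - window] * top) * BASE + data[i]) % MOD
        yield h
```

Chunkers built on such fingerprints declare a chunk boundary whenever the current fingerprint matches a pattern (e.g. low bits all zero), which is what makes chunk boundaries content-defined.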

split - Split large files into smaller ones using deterministic Content Defined Chunking

  •    Go

Split large files into smaller ones using the same Content Defined Chunking algorithm the restic backup program uses. If you're interested in the mathematical foundation for Content Defined Chunking with Rabin Fingerprints, head over to the restic blog which has an introductory article.

jmem - Break up huge JSON arrays into manageable sizes.

  •    PHP

Iterate through large JSON arrays without eating up all your memory. To start using jmem, add the following line to your composer.json file.
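The underlying idea, decoding one array element at a time instead of the whole document, can be sketched in Python (jmem itself is a PHP library with its own API; this is just the concept, using the standard library's incremental raw_decode):

```python
import json

# Walk a JSON array element by element: repeatedly skip separators,
# decode exactly one value, and yield it, so only one element is fully
# materialized at a time. Concept sketch of jmem's approach, not its API.

def iter_json_array(text):
    dec = json.JSONDecoder()
    i = text.index("[") + 1
    while True:
        while text[i] in " \t\r\n,":   # skip whitespace and commas
            i += 1
        if text[i] == "]":             # end of the array
            return
        value, i = dec.raw_decode(text, i)
        yield value
```

(A production version reads the file in buffered pieces rather than holding the whole string, which is where the real memory savings come from.)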

cafs - Content-Addressable File System (used by BitWrk)

  •    Go

Content-Addressable File System. This is the data caching back-end used by the BitWrk distributed computing software. See https://bitwrk.net/ for more info.
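The defining property of a content-addressable store is that a blob's key is a strong hash of its contents, so identical data deduplicates automatically and every key verifies the data it names. A minimal in-memory sketch of the concept (not cafs's on-disk layout or API):

```python
import hashlib

# Minimal content-addressable store: blobs are keyed by the SHA-256 of
# their contents. Storing the same bytes twice yields the same key and
# occupies one slot; reads can be verified against the key for free.

class CAStore:
    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # integrity check: the key must equal the hash of the data
        assert hashlib.sha256(data).hexdigest() == key
        return data
```

This is the same principle the chunk stores above rely on, which is why it suits a cache for distributed computing: any peer can supply a blob, and the requester can verify it locally.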





