pyahocorasick - Python module (C extension and plain Python) implementing the Aho-Corasick algorithm


pyahocorasick is a fast and memory-efficient library for exact or approximate multi-pattern string search: it finds occurrences of multiple key strings in an input text in a single pass. The library provides an ahocorasick Python module that you can use as a plain dict-like Trie, or you can convert a Trie into an automaton for efficient Aho-Corasick search. It is implemented in C and tested on Python 2.7 and 3.4+. It works on Linux, Mac, and Windows.
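
To make the trie-to-automaton idea concrete, here is a minimal pure-Python sketch of Aho-Corasick search. It is an illustration only and is not the pyahocorasick API or its C implementation; the names build_automaton and find_all are hypothetical, and the sketch assumes non-empty patterns.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links, and output lists (hypothetical helper)."""
    goto = [{}]   # goto[state][char] -> next state (the trie)
    fail = [0]    # fail[state] -> state for the longest proper suffix that is in the trie
    out = [[]]    # out[state] -> patterns that end when this state is reached
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Breadth-first pass: failure links of shallower states are ready
    # before deeper states need them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every occurrence, scanning text once."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until some state accepts ch (or we hit the root).
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
print(find_all("ushers", automaton))
```

The automaton is built once and can then be reused for any number of input texts, which is where the approach pays off compared with searching for each key string separately.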

https://github.com/WojciechMula/pyahocorasick

Related Projects

ac - Aho-Corasick Automaton with Double Array Trie (Multi-pattern substitute in go)

  •    Go

Aho-Corasick Automaton with Double Array Trie (Multi-pattern substitute in go)

ahocorasick - A Golang implementation of the Aho-Corasick string matching algorithm

  •    Go

A Golang implementation of the Aho-Corasick string matching algorithm

aho-corasick - Java implementation of the Aho-Corasick algorithm for efficient string matching

  •    Java

Java library for efficient string matching against a large set of keywords

AHO Corasick .net

  •    C#

Aho-Corasick search algorithm implementation in C# for .NET, with path compression.


Tandem Repeat Occurrence Locator

  •    C++

The Tandem Repeat Occurrence Locator -- TROLL -- is a light weight SSR finder based on a slight modification of the Aho-Corasick algorithm.

Twine - String manipulation, leveled up!

  •    PHP

Twine is a simple string manipulation library with an expressive, fluent syntax.

strman - 🏗 A JavaScript string manipulation library.

  •    Javascript

A JavaScript string manipulation library.

strman-java - A Java 8 string manipulation library.

  •    Java

A Java 8 library for working with Strings. You can learn about all the String utility functions implemented in the strman library by reading its documentation. To use strman in your application, add it to your classpath. strman is available on Maven Central, so you only need to add the dependency in your favorite build tool.

underscore.string - String manipulation helpers for javascript

  •    Javascript

JavaScript lacks complete string manipulation operations. This is an attempt to fill that gap. A list of built-in methods can be found, for example, in Dive Into JavaScript. It originally started as an Underscore.js extension but is now a full standalone library. Upgrading from 2.x to 3.x? Please read the changelog.

Guitar - A Cross-Platform String and Regular Expression Library written in Swift.

  •    Swift

This library seeks to add common string manipulation functions, including common regular expression capabilities, that are needed in both mobile and server-side development, but are missing in Swift's Standard Library. The full documentation can be found at http://www.sabintsev.com/Guitar/.

stringr - A fresh approach to string manipulation in R

  •    R

Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparation tasks. The stringr package provides a cohesive set of functions designed to make working with strings as easy as possible. If you’re not familiar with strings, the best place to start is the chapter on strings in R for Data Science. stringr is built on top of stringi, which uses the ICU C library to provide fast, correct implementations of common string manipulations. stringr focusses on the most important and commonly used string manipulation functions, whereas stringi provides a comprehensive set covering almost anything you can imagine. If you find that stringr is missing a function that you need, try looking in stringi. Both packages share similar conventions, so once you’ve mastered stringr, you should find stringi similarly easy to use.

hat-trie - C++ implementation of a fast and memory efficient HAT-trie

  •    C++

Trie implementation based on the paper "HAT-trie: A Cache-conscious Trie-based Data Structure for Strings" (Askitis Nikolas and Sinha Ranjan, 2007). For now, only the pure HAT-trie has been implemented; the hybrid version may arrive later. The library provides an efficient and compact way to store a set or a map of strings by compressing the common prefixes. It also allows searching for keys that match a prefix. Note, though, that the default parameters of the structure are geared toward optimizing exact searches. If you do a lot of prefix searches, you may want to reduce the burst threshold through the burst_threshold method.

underscore.string - String manipulation extensions for Underscore.js javascript library.

  •    Javascript

String manipulation extensions for Underscore.js javascript library.

Stringy - A PHP string manipulation library with multibyte support

  •    PHP

A PHP string manipulation library with multibyte support. Compatible with PHP 5.4+, PHP 7+, and HHVM. Refer to the 1.x branch or 2.x branch for older documentation.

s.el - The long lost Emacs string manipulation library.

  •    Emacs

The long lost Emacs string manipulation library.

ExtraL

  •    Perl

Extral is a generally useful Tcl extension that provides, among other things: extra list manipulation commands, extra string manipulation commands, array manipulation, map, atexit, and filing commands.

magic-string - Manipulate strings like a wizard

  •    Javascript

Suppose you have some source code. You want to make some light modifications to it - replacing a few characters here and there, wrapping it with a header and footer, etc - and ideally you'd like to generate a source map at the end of it. You've thought about using something like recast (which allows you to generate an AST from some JavaScript, manipulate it, and reprint it with a sourcemap without losing your comments and formatting), but it seems like overkill for your needs (or maybe the source code isn't JavaScript). Your requirements are, frankly, rather niche. But they're requirements that I also have, and for which I made magic-string. It's a small, fast utility for manipulating strings and generating sourcemaps.

left-pad - :arrow_left: String left pad

  •    Javascript

NOTE: The third argument should be a single character. However, the module doesn't throw an error if you supply more than one character. See #28. NOTE: Characters with code points outside the BMP (Basic Multilingual Plane) are counted as two distinct characters. See #58.
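
The second note follows from JavaScript measuring string length in UTF-16 code units, so a character outside the BMP occupies two units. The Python sketch below mimics that counting to show the effect. It is not the left-pad module's code: the function names are hypothetical, and unlike the module described above, this sketch rejects padding arguments longer than one character.

```python
def utf16_length(s):
    """Count UTF-16 code units in s, the unit JavaScript's String.length uses."""
    return len(s.encode("utf-16-le")) // 2


def left_pad(s, length, ch=" "):
    """Prepend ch once for each UTF-16 code unit that s falls short of length."""
    if len(ch) != 1:
        # Assumption of this sketch: fail loudly instead of accepting longer padding.
        raise ValueError("ch must be a single character")
    missing = length - utf16_length(s)
    return ch * max(missing, 0) + s


print(left_pad("foo", 5))                  # two spaces, then "foo"
print(utf16_length("\U0001F600"))          # one emoji, two code units
print(left_pad("\U0001F600", 3, "0"))      # only one "0" is added
```

Padding the single emoji to a width of 3 adds just one character, because the emoji already counts as two.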