closestmatch - Golang library for fuzzy matching within a set of strings :page_with_curl:

closestmatch is a simple and fast Go library for fuzzy matching an input string to a list of target strings. It is useful for handling user input that may be misspelled or out of order but needs to match a key in a database. closestmatch uses a bag-of-words approach: it precomputes character n-grams to represent each possible target string, and the closest matches are those with the highest overlap between the sets of n-grams. The precomputation scales well, and for long strings (like those in the test corpus) closestmatch is much faster and more accurate than Levenshtein.
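
The n-gram overlap idea can be sketched in plain Go as follows. This is a minimal illustration and not the closestmatch implementation: the names `ngrams`, `newIndex`, and `closest` are hypothetical, and a single n-gram size with a simple shared-n-gram count is an assumption made to keep the sketch short.

```go
package main

import (
	"fmt"
	"strings"
)

// ngrams returns the set of character n-grams of s, lowercased.
func ngrams(s string, n int) map[string]struct{} {
	set := make(map[string]struct{})
	r := []rune(strings.ToLower(s))
	for i := 0; i+n <= len(r); i++ {
		set[string(r[i:i+n])] = struct{}{}
	}
	return set
}

// index is a hypothetical precomputed lookup from n-gram to target indices.
type index struct {
	n       int
	targets []string
	grams   map[string][]int
}

// newIndex precomputes the n-grams of every target string once.
func newIndex(targets []string, n int) *index {
	idx := &index{n: n, targets: targets, grams: make(map[string][]int)}
	for i, t := range targets {
		for g := range ngrams(t, n) {
			idx.grams[g] = append(idx.grams[g], i)
		}
	}
	return idx
}

// closest returns the target sharing the most n-grams with query.
// Ties go to the earlier target; "" means no n-gram was shared.
func (idx *index) closest(query string) string {
	counts := make([]int, len(idx.targets))
	for g := range ngrams(query, idx.n) {
		for _, i := range idx.grams[g] {
			counts[i]++
		}
	}
	best, bestCount := "", 0
	for i, c := range counts {
		if c > bestCount {
			best, bestCount = idx.targets[i], c
		}
	}
	return best
}

func main() {
	targets := []string{"King Gizzard", "The Lizard Wizard", "Lizzard Wizzard"}
	idx := newIndex(targets, 2)
	fmt.Println(idx.closest("kind gizard")) // prints "King Gizzard"
}
```

Because the index is built once, each query only touches the targets that share at least one n-gram with it, which is why this kind of precomputation tends to pay off on large target lists.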

https://github.com/schollz/closestmatch

Related Projects

SymSpell - SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm

  •    CSharp

The Symmetric Delete spelling correction algorithm reduces the complexity of edit candidate generation and dictionary lookup for a given Damerau-Levenshtein distance. It is six orders of magnitude faster (than the standard approach with deletes + transposes + replaces + inserts) and language independent. Lookup provides a very fast spelling correction of single words.

fuzzysearch - :pig: Tiny and fast fuzzy search in Go

  •    Go

Inspired by bevacqua/fuzzysearch, a fuzzy matching library written in JavaScript. But contains some extras like ranking using Levenshtein distance (see RankMatch()) and finding matches in a list of words (see Find()). Fuzzy searching allows for flexibly matching a string with partial input, useful for filtering data very quickly based on lightweight user input.

fuzzywuzzy - Fuzzy String Matching in Python

  •    Python

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Fuzzy string matching algorithm in C# and LINQ

  •    

Fuzzy string matching. Find out how similar two strings are, and find the best fuzzy match for a string in a string table. Given a string (strA) and a large string table, it finds the likeness or similarity of strA to the strings in the table, using C# and LINQ.


StringScore - StringScore is an Objective-C library which provides super fast fuzzy string matching/scoring

  •    Objective-C

StringScore is an Objective-C library which provides super fast fuzzy string matching/scoring. Based on the JavaScript library of the same name, by Joshaven Potter. All three methods return a CGFloat representing how closely the string matched the otherString parameter.

jellyfish - 🎐 a python library for doing approximate and phonetic matching of strings.

  •    Python

Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk <james.p.turk@gmail.com> and Michael Stephens.

tre - The approximate regex matching library and agrep command line tool.

  •    C

TRE is a lightweight, robust, and efficient POSIX compliant regexp matching library with some exciting features such as approximate (fuzzy) matching. The matching algorithm used in TRE uses linear worst-case time in the length of the text being searched, and quadratic worst-case time in the length of the used regular expression.

ahocorasick - A Golang implementation of the Aho-Corasick string matching algorithm

  •    Go

A Golang implementation of the Aho-Corasick string matching algorithm

dedupe - :id: A python library for accurate and scaleable fuzzy matching, record deduplication and entity-resolution

  •    Python

dedupe is a python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data. dedupe takes in human training data and comes up with the best rules for your dataset to quickly and automatically find similar records, even with very large databases.

fuzzyset.js - fuzzyset.js - A fuzzy string set for javascript

  •    Javascript

fuzzyset is a data structure that performs something akin to fulltext search against data to determine likely misspellings and approximate string matching. Note that this is a javascript port of a python library. The result will be an array of [score, matched_value] arrays. The score is between 0 and 1, with 1 being a perfect match.

fuse-swift - A lightweight fuzzy-search library, with zero dependencies

  •    Swift

Fuse is a super lightweight library which provides a simple way to do fuzzy searching. To run the example project, clone the repo, and run pod install from the Example directory first.

amatch - Approximate String Matching library

  •    C

Approximate String Matching library

fuzzysearch - :crystal_ball: Tiny and blazing-fast fuzzy search in JavaScript

  •    Javascript

Fuzzy searching allows for flexibly matching a string with partial input, useful for filtering data very quickly based on lightweight user input. To see fuzzysearch in action, head over to bevacqua.github.io/horsey, which is a demo of an autocomplete component that uses fuzzysearch to filter out results based on user input.

barefoot - Java library for integrating the map into software and services with state-of-the-art online and offline map matching that can be used stand-alone and in the cloud

  •    Java

An open source Java library for online and offline map matching with OpenStreetMap. Together with its extensive set of geometric and spatial functions, an in-memory map data structure and basic machine learning functions, it is a versatile basis for scalable location-based services and spatio-temporal data analysis on the map. It is designed for use in parallel and distributed systems and, hence, includes a stand-alone map matching server and can be used in distributed systems for map matching services in the cloud. Barefoot consists of a software library and a (Docker-based) map server that provides access to street map data from OpenStreetMap and is flexible to be used in distributed cloud infrastructures as map data server or side-by-side with Barefoot's stand-alone servers for offline (matcher server) and online map matching (tracker server), or other applications built with Barefoot library. Access to map data is provided with a fast and flexible in-memory map data structure. Together with GeographicLib [1] and ESRI's geometry API [2], it provides an extensive set of geographic and geometric operations for spatial data analysis on the map.

url-pattern - easier than regex string matching patterns for urls and other strings

  •    CoffeeScript

Easier than regex string matching patterns for urls and other strings. Turn strings into data or data into strings. A pattern is immutable after construction: none of its methods changes its state, which makes it easier to reason about.

liquidmetal - :sweat_drops::metal: A mimetic poly-alloy of the Quicksilver scoring algorithm, essentially LiquidMetal

  •    Javascript

Flex matching short abbreviations against longer strings is a boon in productivity for typists. Applications like Quicksilver, Alfred, LaunchBar, and Launchy have made this method of keyboard entry a popular one. It's time to bring this same functionality to web controls. LiquidMetal makes scoring long strings against abbreviations easy.

FuzzyFinder - buffer/file/command/tag/etc explorer with fuzzy matching

  •    VimL

buffer/file/command/tag/etc explorer with fuzzy matching