
jellyfish - 🎐 a Python library for doing approximate and phonetic matching of strings.

  •    Python

Jellyfish is a Python library for doing approximate and phonetic matching of strings. Written by James Turk <james.p.turk@gmail.com> and Michael Stephens.

natural - general natural language facilities for node

  •    Javascript

"Natural" is a general natural language facility for Node.js. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported.




jaro_winkler - Ruby & C implementation of the Jaro-Winkler distance algorithm, with support for UTF-8 strings

  •    Ruby

jaro_winkler is an implementation of the Jaro-Winkler distance algorithm written as a C extension. It falls back to a pure Ruby version on platforms other than MRI/KRI, such as JRuby or Rubinius. Both the C and Ruby implementations support any string encoding, such as UTF-8, EUC-JP, Big5, etc. There is no JaroWinkler.jaro_winkler_distance method, because the name is tediously long.

jaro-winkler - The Jaro-Winkler distance metric for node and browser.

  •    Javascript

A string similarity function using the Jaro-Winkler distance metric. Returns a number between 0 and 1, where 0 means no similarity and 1 means an exact match. Read more about it on Wikipedia.
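To make the 0-to-1 scale concrete, here is a minimal Python sketch of the textbook Jaro-Winkler similarity. It is not the code of any package listed on this page. The function names and the parameters prefix_scale (0.1) and max_prefix (4) are assumptions chosen for illustration, using the commonly cited default values.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and not too far apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

The prefix boost is why "MARTHA" and "MARHTA" score higher under Jaro-Winkler than under plain Jaro: the shared "MAR" prefix closes part of the remaining gap to 1.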

clj-fuzzy - A handy collection of algorithms dealing with fuzzy strings and phonetics.

  •    Clojure

clj-fuzzy is a native Clojure library providing a collection of famous algorithms dealing with fuzzy strings and phonetics. It can be used in Clojure, ClojureScript, client-side JavaScript and Node.js.


simetric - String similarity metrics for Elixir

  •    Elixir

Simetric provides facilities to perform approximate string matching and measurement of string similarity/distance. The library focuses on speed and completeness. After adding it to your project's dependencies, run mix deps.get in your shell to fetch the new dependency.

stringosim - String similarity and string distance functions: Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS (Longest Common Subsequence), Cosine similarity

  •    Go

The plan for this package is to provide Go implementations of different string distance/similarity functions, like Levenshtein (normalized, weighted, Damerau), Jaro-Winkler, Jaccard index, Euclidean distance, Hamming distance... Work in progress...

spark-stringmetric - Spark functions to run popular phonetic and string matching algorithms

  •    Scala

Makes similarity functions and phonetic algorithms readily available for fuzzy matching analyses in Spark. Update your build.sbt file to import the libraries.

strsim-rs - :abc: Rust implementations of string similarity metrics

  •    Rust

You can change the version in the URL to see the documentation for an older version in the changelog. If you don't want to install Rust itself, you can run $ ./dev for a development CLI if you have Docker installed.

strutil - Golang metrics for calculating string similarity and other string utility functions

  •    Go

strutil provides string metrics for calculating string similarity as well as other string utility functions. Full documentation can be found at: https://pkg.go.dev/github.com/adrg/strutil. The package defines the StringMetric interface, which is implemented by all the string metrics. The interface is used with the Similarity function, which calculates the similarity between the specified strings, using the provided string metric.
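The metric-interface pattern described above can be sketched in Python as follows. This is an illustrative analog, not strutil's Go API: the names StringMetric, compare, similarity, Levenshtein and Hamming are hypothetical here, and both metrics are normalized to the range 0 to 1 by assumption.

```python
from typing import Protocol


class StringMetric(Protocol):
    """Hypothetical interface: any object with compare(a, b) -> float."""

    def compare(self, a: str, b: str) -> float: ...


class Levenshtein:
    """Normalized edit-distance similarity: 1 - distance / longest length."""

    def compare(self, a: str, b: str) -> float:
        if not a and not b:
            return 1.0
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return 1.0 - prev[-1] / max(len(a), len(b))


class Hamming:
    """Position-wise similarity; extra length counts as mismatches."""

    def compare(self, a: str, b: str) -> float:
        longest = max(len(a), len(b))
        if longest == 0:
            return 1.0
        mismatches = sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
        return 1.0 - mismatches / longest


def similarity(a: str, b: str, metric: StringMetric) -> float:
    """Single entry point: the caller chooses the metric to apply."""
    return metric.compare(a, b)


print(round(similarity("kitten", "sitting", Levenshtein()), 3))  # 0.571
print(round(similarity("karolin", "kathrin", Hamming()), 3))     # 0.571
```

Keeping one entry point and passing the metric as an argument lets calling code switch algorithms without changing its own structure.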

StringComparison - String Comparison in C#.NET

  •    CSharp

StringComparison is a library developed for reconciling naming conventions between different models of the electric grid. I have stripped off the power-system-specific code and put together what can effectively be used as a string extension for determining approximate equality between two strings. All of the algorithms used here have been pulled from online resources, translated into C#, and compiled into this library. I found several similar open-source implementations, but nothing for .NET/C#. Adding the *.dll to your project will give you access to this extension and the individual extensions under the hood of the IsSimilarity() extension. All of the algorithms are exposed and can return their raw results, and they have also been combined so that they can be used selectively to judge the approximate equality of two strings. This is done through the IsSimilar extension and by setting the desired StringComparisonOptions and StringComparisonTolerance.

TySug - A project around helping to prevent typing typos

  •    Go

TySug is a collection of packages that together form a keyboard-layout-aware alternative word suggester. It can be used as both a library and a webservice. The primary supported use case is to help with spelling mistakes against short popular word lists (e.g. domain names). This is useful for preventing typos in e.g. e-mail addresses, and for detecting spam, phishing (typosquatting), etc.

ceja - PySpark phonetic and string matching algorithms

  •    Python

Run pip install ceja to install the library. Import the functions with import ceja. After importing the code you can run functions like ceja.nysiis, ceja.jaro_winkler_similarity, etc.





