
dssim - Image similarity comparison simulating human perception (multiscale SSIM in Rust)

  •    Rust

This tool computes (dis)similarity between two or more PNG images using an algorithm approximating human vision. Comparison is done using the SSIM algorithm at multiple weighted resolutions.

dssim - Image similarity comparison simulating human perception (multiscale SSIM in C)

  •    C

This tool computes (dis)similarity between two or more PNG images using an algorithm approximating human vision. Comparison is done using the SSIM algorithm (based on Rabah Mehdi's implementation) at multiple weighted resolutions.

levenshtein.c - Levenshtein algorithm in C

  •    C

Vladimir Levenshtein’s edit distance algorithm as a C library. There’s also a CLI, levenshtein(1), and a JavaScript version.
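
For readers unfamiliar with edit distance, the idea can be sketched in a few lines of Python. This is an illustrative sketch of the standard dynamic-programming formulation, not the code of the C library above; the function name `levenshtein` is chosen here for illustration and the sketch assumes unit cost for insertions, deletions, and substitutions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, with unit cost per edit (assumed)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]
```

Only two rows of the distance table are kept at a time, so memory grows with the length of the shorter string rather than with the product of both lengths.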




middleman-blog-similar - An extension for middleman-blog that adds a method to look up similar articles.

  •    Ruby

middleman-blog-similar is an extension for middleman-blog that adds a method to look up similar articles. Middleman::Blog::BlogArticle#similar_articles returns an array of Middleman::Blog::BlogArticle instances.

mongodb-chemistry - Ideas for chemical similarity searches in MongoDB.

  •    Python

Chemical similarity search implementation in MongoDB, with performance analysis. See this blog post for more information.

html-similarity - Compare html similarity using structural and style metrics

  •    Python

This package provides a set of functions to measure the similarity between web pages. Uses sequence comparison of the html tags to compute the similarity.
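
As a rough illustration of tag-sequence comparison, the sketch below uses only the Python standard library. It is not the html-similarity package's API: `TagCollector`, `tag_sequence`, and `structural_similarity` are hypothetical names, and the use of `difflib.SequenceMatcher` as the comparison step is an assumption.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser
from typing import List


class TagCollector(HTMLParser):
    """Collects opening tag names in document order (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html: str) -> List[str]:
    """Return the list of opening tags found in an HTML string."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def structural_similarity(html_a: str, html_b: str) -> float:
    """Ratio in [0, 1] from comparing the two tag sequences with difflib."""
    matcher = SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b))
    return matcher.ratio()
```

Because only tag names are compared, two pages with the same markup skeleton but different text score as identical under this sketch.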


ReactionDecoder - Reaction Decoder Tool (RDT)

  •    HTML

You can download the latest RDT release from GitHub (https://github.com/asad/ReactionDecoder/releases). RDT is released under the GNU General Public License version 3.

wikimark - get a sense of it

  •    Python

wikimark's goal is to give you an idea of what a text is about. You can also use your own corpus.

synt - Find similar functions and classes in your JavaScript/TypeScript code

  •    TypeScript

Find similar functions and classes in your JavaScript/TypeScript code. For more info on support for ECMAScript Stage-3 and below proposals, see issue #94.

img-ssim - Get the structural similarity between two images.

  •    JavaScript

Get the structural similarity between two images. Please post questions on Stack Overflow. You can open issues with questions, as long as you add a link to your Stack Overflow question.

string-similarity - Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance

  •    JavaScript

Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance. Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-insensitive.
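
A minimal sketch of a Dice coefficient over character bigrams is shown below, in Python rather than the library's own JavaScript. It is not the string-similarity implementation: the names `bigrams` and `dice_similarity` are hypothetical, and the handling of strings shorter than two characters is an assumption made here so the function always returns a value.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Multiset of adjacent character pairs, lowercased for case-insensitivity."""
    lowered = text.lower()
    return Counter(lowered[i:i + 2] for i in range(len(lowered) - 1))


def dice_similarity(first: str, second: str) -> float:
    """Dice coefficient: 2 * |shared bigrams| / (|bigrams A| + |bigrams B|)."""
    a, b = bigrams(first), bigrams(second)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        # Assumption: with no bigrams on either side, fall back to plain equality.
        return 1.0 if first.lower() == second.lower() else 0.0
    shared = sum((a & b).values())  # multiset intersection counts repeats correctly
    return 2.0 * shared / total
```

For example, "night" and "nacht" share one bigram ("ht") out of four each, which gives 2 * 1 / 8 = 0.25.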

apollo - Advanced similarity and duplicate source code proof of concept for our research efforts.

  •    Python

Advanced code deduplicator from hell. Powered by source{d} ML, source{d} engine and minhashcuda. Agnostic to the analysed language thanks to Babelfish. Python 3, PySpark, CUDA inside. source{d}'s effort to research and solve the code deduplication problem. At scale, as usual. A code clone is several snippets of code with few differences. For now this project focuses on finding near-duplicate projects and files; it will eventually support functions and snippets.

consimilo - A Clojure library for querying large data-sets on similarity

  •    Clojure

consimilo is a library that uses locality sensitive hashing (implemented as lsh-forest) and minhashing to support top-k similar item queries. Finding similar items across expansive data sets is a common problem in many real-world applications (e.g. finding articles from the same source, plagiarism detection, collaborative filtering, context filtering, document similarity, etc.). A naive search for top-k similar items compares every pair of items (n choose 2 comparisons), which becomes unwieldy even at relatively small corpus sizes. LSH reduces the search space by "hashing" items in such a way that collisions occur as a result of similarity. Once the items are hashed and indexed, the lsh-forest supports a top-k most similar items query in roughly O(log n) time. There is an accuracy trade-off that comes with the enormous increase in query speed. More information can be found in chapter 3 of Mining Massive Datasets. You can continue to add to this forest by passing it as the first argument to add-all-to-forest. The forest data structure is stored in an atom, so the existing forest is modified in place.
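
The minhashing step can be sketched in Python as follows. This is not consimilo's Clojure API and it omits the lsh-forest index entirely; `minhash_signature` and `estimated_jaccard` are hypothetical names, and the choice of salted BLAKE2b as the family of hash functions is an assumption made for the sake of a self-contained example. Item sets are assumed to be non-empty.

```python
import hashlib


def minhash_signature(items, num_hashes=128):
    """For each of num_hashes salted hash functions, keep the minimum hash value.

    Assumes `items` is a non-empty collection of strings.
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

Two sets with high Jaccard similarity tend to agree in many signature positions, which is the property an LSH index exploits to make similar items collide without comparing every pair.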

investment-advisor - The Investment Advisor Demo combines IBM Watson Personality Insights and IBM Watson Tradeoff Analytics services to recommend suitable funds and agents for clients

  •    JavaScript

The Investment Advisor Demo combines IBM Watson Personality Insights and IBM Watson Tradeoff Analytics services to recommend suitable funds and agents for clients. Fund recommendation is based on a client's risk propensity. Agent recommendation focuses on building long-term relationships with clients. Sign up in Bluemix, or use an existing account.

rltk - Record Linkage ToolKit (Find and link entities)

  •    Python

The Record Linkage ToolKit (RLTK) is a general-purpose open-source record linkage platform that allows users to build powerful Python programs that link records referring to the same underlying entity. Record linkage is an extremely important problem that shows up in domains extending from social networks to bibliographic data and biomedicine. Current open platforms for record linkage have problems scaling even to moderately sized datasets, or are just not easy to use (even by experts). RLTK attempts to address all of these issues. RLTK supports a full, scalable record linkage pipeline, including multi-core algorithms for blocking, profiling data, computing a wide variety of features, and training and applying machine learning classifiers based on Python’s sklearn library. An end-to-end RLTK pipeline can be jump-started with only a few lines of code. However, RLTK is also designed to be extensible and customizable, allowing users arbitrary degrees of control over many of the individual components. You can add new features to RLTK (e.g. a custom string similarity) very easily.





