markdown-js - A Markdown parser for javascript

  •    Javascript

To use it from the browser, go to the releases page on GitHub and download the version you want (minified or not). We only officially support node >= 0.10, as the libraries we use for building and testing don't work on older versions of node. That said, since this module is so simple and doesn't use any parts of the node API, if you use the pre-built version and find a bug, let us know and we'll try to fix it.

Command-line-text-processing - :zap: From finding text to search and replace, from sorting to beautifying text and more :art:

  •    Shell

Learn about various commands available for common and exotic text processing needs. Examples have been tested on GNU/Linux; there may be syntax/feature variations on other distributions, so consult their respective man pages for details. ⚠️ 🚧 Work in progress, stay tuned...

unified - interface for parsing, inspecting, transforming, and serializing content through syntax trees

  •    Javascript

unified is an interface for processing text using syntax trees. It’s what powers remark (Markdown), retext (natural language), and rehype (HTML), and allows for processing between formats. unified enables new exciting projects like Gatsby to pull in Markdown, MDX to embed JSX, and Prettier to format it. It’s used in about 500k projects on GitHub and has about 25m downloads each month on npm: you’re probably using it. Some notable users are Node.js, Vercel, Netlify, GitHub, Mozilla, WordPress, Adobe, Facebook, Google, and many more.

Gate - General Architecture for Text Engineering

  •    Java

GATE excels at text analysis of all shapes and sizes. It supports diverse language processing tasks, with parsers, morphology, tagging, Information Retrieval tools, Information Extraction components for various languages, and many others. It provides support to measure, evaluate, model and persist the data structure. It can analyze text or speech. It has built-in support for machine learning and also supports other machine learning implementations via plugins.




OpenPipe - Document Pipeline

  •    Java

OpenPipe is an open source scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps / operations performed on a document to convert from its raw form to something ready to be put into the index. The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction or submitting the document to a search engine.
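
The idea of a pipeline as an ordered set of steps applied to each document can be sketched in a few lines. This is only an illustration in Python, not OpenPipe's Java API: the `Document` type, the two example steps, and `run_pipeline` are all hypothetical names chosen for this sketch.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical document model: a flat mapping of field name to text.
Document = Dict[str, str]
Step = Callable[[Document], Document]


def lowercase_title(doc: Document) -> Document:
    """Example field-manipulation step: normalise the title field."""
    out = dict(doc)  # copy so the caller's document is not mutated
    out["title"] = out.get("title", "").lower()
    return out


def add_length(doc: Document) -> Document:
    """Example enrichment step: record the body length as a new field."""
    out = dict(doc)
    out["length"] = str(len(out.get("body", "")))
    return out


def run_pipeline(docs: Iterable[Document], steps: List[Step]) -> Iterator[Document]:
    """Apply every step, in order, to each document in the stream."""
    for doc in docs:
        for step in steps:
            doc = step(doc)
        yield doc  # a real pipeline would hand this to the indexer
```

Because `run_pipeline` is a generator, documents are processed one at a time as the stream is consumed, which is one simple way to keep memory use flat on large inputs.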

TextTeaser - Automatic Summarization Algorithm

  •    Scala

TextTeaser is an automatic summarization algorithm that combines the power of natural language processing and machine learning to produce good results. It can provide the gist of an article or better previews in news readers.

pyparsing

  •    Python

The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the traditional lex/yacc approach, or the use of regular expressions. The pyparsing module provides a library of classes that client code uses to construct the grammar directly in Python code. The Python representation of the grammar is quite readable, owing to the self-explanatory class names, and the use of '+', '|' and '^' operator definitions.

hck - A sharp cut(1) clone.

  •    Rust

A sharp cut(1) clone. hck is a shortening of hack, a rougher form of cut.


CAML.NET

  •    CSharp

A set of .NET language-based tools for creating dynamic, reusable CAML query components. CAML.NET leverages the power and flexibility of the .NET Common Language Runtime (CLR) to build CAML queries dynamically in code while preserving the syntactic structure of the native CAML...

Conversor de textos formatados para OpenXML (formatted-text to OpenXML converter)

  •    C++

A project for a converter to the OpenXML format, more precisely to formatted text files written in the WordprocessingML language, from a variety of other formats. The converter basically consists of two parts: a parser/interpreter for the original format and a...

RazorEngine

A templating engine built upon Microsoft's Razor parsing technology. The RazorEngine allows you to use Razor syntax to build robust templates. Currently we have integrated the vanilla Html + Code support, but we hope to support other markup languages in future.

Flat File Parser

  •    CSharp

A flat file parser capable of loading in complete or partial flat text files. It will convert each row in the file into a standard CLR object.

TextGenerator

A simple tool for quick, polymorphic text generation based on a variable input pattern.

Regex Batch Replacer (Multi-File)

Regex Batch Replacer uses regular expressions to find and replace text in multiple files.
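
A regex find-and-replace across several files can be sketched with the Python standard library. This is not the tool's own code; `batch_replace` and its parameters are hypothetical names for illustration, and the sketch assumes the files are small enough to read into memory.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, Union


def batch_replace(
    paths: Iterable[Union[str, Path]],
    pattern: str,
    replacement: str,
    encoding: str = "utf-8",
) -> Dict[str, int]:
    """Replace every match of `pattern` in each file; return match counts per file."""
    regex = re.compile(pattern)  # compile once, reuse for every file
    counts: Dict[str, int] = {}
    for path in paths:
        p = Path(path)
        text = p.read_text(encoding=encoding)
        new_text, n = regex.subn(replacement, text)
        if n:  # only rewrite files that actually changed
            p.write_text(new_text, encoding=encoding)
        counts[str(p)] = n
    return counts
```

For example, `batch_replace(Path("docs").glob("*.txt"), r"colou?r", "color")` would normalise both spellings in every `.txt` file under a `docs` directory (the directory name is just an example).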

fotelo: A formatted text loader library

fotelo (foe-tell-o): A formatted text loader library. Fotelo will allow you to import text files of various formats into a strongly-typed .NET DataTable for use within your applications.

OpenTextSummarizer C# Port

This is a C# port of the fantastic Open Text Summarizer (http://libots.sourceforge.net/). It uses the same dictionary files and algorithms as the original OTS, though all of the code was rewritten.

pynlpl - PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing

  •    Python

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation). The library is divided into several packages and modules. It works on Python 2.7, as well as Python 3.
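
To make "n-grams and frequency lists" concrete, here is a minimal standard-library sketch. It does not use PyNLPl's own classes; `ngrams` and `frequency_list` are hypothetical helper names for this illustration, and tokenisation is assumed to be plain whitespace splitting.

```python
from collections import Counter
from typing import Iterator, List, Tuple


def ngrams(tokens: List[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield each run of n consecutive tokens as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def frequency_list(tokens: List[str], n: int) -> Counter:
    """Count how often each n-gram occurs in the token sequence."""
    return Counter(ngrams(tokens, n))


tokens = "to be or not to be".split()
bigram_counts = frequency_list(tokens, 2)
# bigram_counts[("to", "be")] is 2; every other bigram occurs once.
```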

python-nameparser - A simple Python module for parsing human names into their individual components

  •    Python

A simple Python (3.2+ & 2.6+) module for parsing human names into their individual components. The supported name structure is generally "Title First Middle Last Suffix", where all pieces are optional. Comma-separated format like "Last, First" is also supported.
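
The "Title First Middle Last Suffix" structure and the "Last, First" variant can be illustrated with a toy parser. This is a deliberately naive sketch and not python-nameparser's implementation or API: `parse_name`, `TITLES`, and `SUFFIXES` are hypothetical, and the word lists are far too short for real data.

```python
from typing import Dict

# Tiny, hypothetical lookup sets; a real parser needs much larger ones.
TITLES = {"dr", "mr", "mrs", "ms", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd"}


def _norm(word: str) -> str:
    """Lower-case a token and drop a trailing period for set lookups."""
    return word.rstrip(".").lower()


def parse_name(full_name: str) -> Dict[str, str]:
    """Split a name into title/first/middle/last/suffix; all parts optional."""
    parts = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}
    if "," in full_name:
        # "Last, First ..." form: everything before the comma is the last name.
        last, _, rest = full_name.partition(",")
        parts["last"] = last.strip()
        tokens = rest.split()
    else:
        tokens = full_name.split()
    if tokens and _norm(tokens[0]) in TITLES:
        parts["title"] = tokens.pop(0)
    if tokens and _norm(tokens[-1]) in SUFFIXES:
        parts["suffix"] = tokens.pop()
    if not parts["last"] and len(tokens) > 1:
        parts["last"] = tokens.pop()
    if tokens:
        parts["first"] = tokens.pop(0)
    parts["middle"] = " ".join(tokens)  # whatever remains in between
    return parts
```

With these assumptions, `parse_name("Dr. John Quincy Adams Jr.")` fills all five fields, while `parse_name("Adams, John")` fills only `first` and `last`.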

whatlanggo - Natural language detection library for Go

  •    Go

Natural language detection for Go. Thanks to greyblake (Potapov Sergey) for creating whatlang-rs, from which I got the idea and logic.





