unified is an interface for processing text using syntax trees. It’s what powers remark (Markdown), retext (natural language), and rehype (HTML), and allows for processing between formats. unified enables new exciting projects like Gatsby to pull in Markdown, MDX to embed JSX, and Prettier to format it. It’s used in about 500k projects on GitHub and has about 25m downloads each month on npm: you’re probably using it. Some notable users are Node.js, Vercel, Netlify, GitHub, Mozilla, WordPress, Adobe, Facebook, Google, and many more.
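
The idea of processing text through a syntax tree can be sketched in a few lines. The following is a minimal illustration in Python, not unified's actual API (unified is a JavaScript project with its own plugin interface). The names `Node`, `parse`, `uppercase_headings`, `to_html`, and `process` are all hypothetical, and the parser only understands a tiny Markdown subset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A generic syntax-tree node: a type, an optional text value, and children."""
    type: str
    value: str = ""
    children: list[Node] = field(default_factory=list)


def parse(text: str) -> Node:
    """Parse a tiny Markdown subset: '# ' lines become headings,
    any other non-blank line becomes a paragraph."""
    root = Node("root")
    for line in text.splitlines():
        if line.startswith("# "):
            root.children.append(Node("heading", line[2:].strip()))
        elif line.strip():
            root.children.append(Node("paragraph", line.strip()))
    return root


def uppercase_headings(tree: Node) -> Node:
    """Transform step: modify the tree instead of the raw text."""
    for node in tree.children:
        if node.type == "heading":
            node.value = node.value.upper()
    return tree


def to_html(tree: Node) -> str:
    """Serialize the tree to HTML (no escaping; illustration only)."""
    tags = {"heading": "h1", "paragraph": "p"}
    return "\n".join(
        f"<{tags[n.type]}>{n.value}</{tags[n.type]}>" for n in tree.children
    )


def process(text: str) -> str:
    """Parse, transform, and serialize in one pass."""
    return to_html(uppercase_headings(parse(text)))
```

Because the transform works on the tree and not on the source text, the same step could sit in front of a different serializer, which is the sense in which a tree-based interface allows processing between formats.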

unified - interface for parsing, inspecting, transforming, and serializing content through syntax trees

GATE excels at text analysis of all shapes and sizes. It supports a wide range of language processing tasks, including parsing, morphological analysis, tagging, Information Retrieval, and Information Extraction for various languages. It also provides tools to measure, evaluate, model, and persist the resulting data structures. It can analyze text or speech. It has built-in support for machine learning and supports other machine learning implementations via plugins.
It has a family of products:
GATE Developer: An integrated development environment for language processing components, bundled with a very widely used Information Extraction system and a comprehensive set of other plugins.

GATE Teamware: A collaborative annotation environment for factory-style semantic annotation projects, built around a workflow engine and a heavily optimized backend service infrastructure.

GATE Embedded: An object library optimized for inclusion in diverse applications, giving access to all the services used by GATE Developer and more.


Gate - General Architecture for Text Engineering

OpenPipe is an open source, scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps (operations) performed on a document to convert it from its raw form into something ready to be put into the index.
The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction, and submitting the document to a search engine. OpenPipe can pull content from databases and file systems, and can extract content or metadata from any file format.
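
The notion of a pipeline as an ordered set of steps applied to each document in a stream can be sketched as follows. This is a minimal Python illustration, not OpenPipe's own API or configuration format. The step functions (`normalize_whitespace`, `add_word_count`, `rename_field`) and `run_pipeline` are hypothetical names, and a document is assumed to be a plain dict of fields.

```python
from typing import Callable, Dict, Iterable, Iterator, Sequence

Document = Dict[str, object]           # assumption: a document is a dict of named fields
Step = Callable[[Document], Document]  # a step takes a document and returns it, modified


def normalize_whitespace(doc: Document) -> Document:
    """Collapse runs of whitespace in the 'body' field."""
    doc["body"] = " ".join(str(doc["body"]).split())
    return doc


def add_word_count(doc: Document) -> Document:
    """Field manipulation: derive a new field from an existing one."""
    doc["word_count"] = len(str(doc["body"]).split())
    return doc


def rename_field(old: str, new: str) -> Step:
    """Build a step that renames a field, if present."""
    def step(doc: Document) -> Document:
        if old in doc:
            doc[new] = doc.pop(old)
        return doc
    return step


def run_pipeline(docs: Iterable[Document], steps: Sequence[Step]) -> Iterator[Document]:
    """Apply every step, in order, to each document in the stream."""
    for doc in docs:
        for step in steps:
            doc = step(doc)
        yield doc


if __name__ == "__main__":
    raw = [{"body": "  raw   text  here ", "name": "a.txt"}]
    steps = [normalize_whitespace, add_word_count, rename_field("name", "title")]
    for doc in run_pipeline(raw, steps):
        print(doc)
```

Since `run_pipeline` is a generator, documents are processed one at a time as they arrive. A final step that submits each document to a search engine would slot in as one more entry in `steps`.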


OpenPipe - Document Pipeline

TextTeaser is an automatic summarization algorithm that combines natural language processing and machine learning. It can provide the gist of an article and better previews in news readers.

TextTeaser - Automatic Summarization Algorithm
