TextTeaser - Automatic Summarization Algorithm


TextTeaser is an automatic summarization algorithm that combines natural language processing and machine learning. It can provide the gist of an article or produce better previews in news readers.

https://github.com/MojoJolo/textteaser
http://www.textteaser.com/


   




Related Projects

prose - A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction


prose is a Go library for text processing (primarily English at the moment) that supports tokenization, part-of-speech tagging, named-entity extraction, and more. The library's functionality is split into subpackages designed for modular use. See the GoDoc documentation for more information.

Gate - General Architecture for Text Engineering


GATE excels at text analysis of all shapes and sizes. It supports diverse language processing tasks, with components such as parsers, morphology, tagging, information retrieval tools, and information extraction components for various languages, among many others. It also provides support to measure, evaluate, model and persist the data structure. It can analyze text or speech. It has built-in support for machine learning and supports other machine learning implementations via plugins.

nlp - Selected Machine Learning algorithms for basic natural language processing in Golang


An implementation of selected machine learning algorithms for basic natural language processing in Golang. The initial focus of this project is Latent Semantic Analysis, to allow retrieval/searching, clustering and classification of text documents based on semantic content. Built upon the gonum/gonum matrix library, with some inspiration taken from Python's scikit-learn.

nlp - Natural language processing tools for text generation, search and analysis.



SWING


The Summarizer from the Web IR / NLP Group (WING), hence SWING, is a modular, state-of-the-art automatic extractive text summarization system. It is used as the basis for summarization research at the National University of Singapore. It ranks among the leading automatic summarization systems in the international TAC competition, scoring highly on the ROUGE evaluation measure.



node-summary - Node module that summarizes text using a naive summarization algorithm



divijvaidya-iIntelli


We developed a generic interactive framework based on human cognition, in which the system learns continuously from the Internet and from its interaction with users. To demonstrate the framework, iIntelli, an agent-based application for multiple text document summarization, was developed and compared with MEAD on the Cran Data Set. MEAD is a natural language processing based summarizer that produces a summary by extracting sentences from a cluster of related documents and Cra

OpenPipe - Document Pipeline


OpenPipe is an open source, scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps (operations) performed on a document to convert it from its raw form into something ready to be put into the index. The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction, and submitting the document to a search engine.
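
The idea of a pipeline as an ordered set of steps applied to each document in a stream can be sketched as follows. This is a minimal illustration in Python and not OpenPipe's actual API: the function names (detect_language, normalize_title, run_pipeline, process_stream) and the dictionary-based document are assumptions made for the example, and the language check is a toy heuristic.

```python
def detect_language(doc):
    # Hypothetical step: a toy heuristic, not real language detection.
    padded = f" {doc['text'].lower()} "
    doc["lang"] = "en" if " the " in padded else "unknown"
    return doc


def normalize_title(doc):
    # Hypothetical field-manipulation step: trim and title-case the title field.
    doc["title"] = doc.get("title", "").strip().title()
    return doc


def run_pipeline(doc, steps):
    # Apply each step in order; every step takes a document and returns one.
    for step in steps:
        doc = step(doc)
    return doc


def process_stream(docs, steps):
    # Lazily process a stream of documents, copying each so inputs stay untouched.
    for doc in docs:
        yield run_pipeline(dict(doc), steps)


if __name__ == "__main__":
    raw = [{"title": "  hello world ", "text": "The cat sat on the mat"}]
    for processed in process_stream(raw, [detect_language, normalize_title]):
        print(processed)
```

Because every step shares the same document-in, document-out shape, steps can be reordered or swapped without changing the runner.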

japanese-nlptools - Tools for NLP-related analysis of Japanese text



ArabicNLP - Collection of various Arabic NLP and Text Processing Scripts and Utilities



nlp - Extract values from strings and fill your structs with nlp.


You always begin by creating an NL type by calling nlp.New(). The NL type is a natural language processor that owns three funcs: RegisterModel(), Learn() and P(). RegisterModel takes three parameters: an empty struct, a set of samples, and some options for the model.

tif - Text Interchange Formats


This package describes and validates formats for storing common objects arising in text analysis as native R objects. Representations of a text corpus, document term matrix, and tokenized text are included. The tokenized text format is extensible to include other annotations. There are two versions of the corpus and tokens objects; packages should accept both and return or coerce to at least one of these.

corpus (data frame) - A valid corpus data frame object is a data frame with at least two columns. The first column is called doc_id and is a character vector with UTF-8 encoding. Document ids must be unique. The second column is called text and must also be a character vector in UTF-8 encoding. Each individual document is represented by a single row in the data frame. Additional document-level metadata columns and corpus-level attributes are allowed but not required.
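
The corpus rules above can be checked mechanically. The sketch below is a Python analogue written only for illustration; it is not the tif package's R API. It assumes the corpus is a list of row dictionaries whose key order stands in for column order, and the function name validate_corpus is hypothetical. Python strings are already Unicode, so the UTF-8 requirement is reduced here to a type check.

```python
def validate_corpus(rows):
    """Return a list of problems; an empty list means every row passes."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        columns = list(row)  # dicts preserve insertion order (Python 3.7+)
        # The first two columns must be doc_id and text, in that order.
        if columns[:2] != ["doc_id", "text"]:
            problems.append(f"row {i}: first two columns must be doc_id and text")
            continue
        doc_id, text = row["doc_id"], row["text"]
        # Both columns must hold character data.
        if not isinstance(doc_id, str) or not isinstance(text, str):
            problems.append(f"row {i}: doc_id and text must both be strings")
            continue
        # Document ids must be unique across the corpus.
        if doc_id in seen_ids:
            problems.append(f"row {i}: duplicate doc_id {doc_id!r}")
        seen_ids.add(doc_id)
    return problems


if __name__ == "__main__":
    corpus = [
        {"doc_id": "a", "text": "First document."},
        {"doc_id": "b", "text": "Second document.", "year": 2017},  # extra metadata is allowed
    ]
    print(validate_corpus(corpus))
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid row in a single pass.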

Utils - Common routines in Java for basic data processing in network or text analysis



Text-NLP - Release history of Text-NLP



text-analysis-toolkit - Collection of tools for specific text processing that I needed.



text_summarization - Python text summarization project for computer science M.Sc.



suzgec - Text Summarization System for Turkish Language, Senior project at Atilim University



posthoc - Functions for ML/NLP experiment post-processing and analysis.

