nltk_data - NLTK Data

  •        26

NLTK Data

https://github.com/nltk/nltk_data

Tags
Implementation
License
Platform

   




Related Projects

TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more

  •    Python

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.

text-analytics-with-python - Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer

  •    Python

Derive useful insights from your data using Python. Learn the techniques related to natural language processing and text analytics, and gain the skills to know which technique is best suited to solve a particular problem. A structured and comprehensive approach is followed in this book so that readers with little or no experience do not find themselves overwhelmed. You will start with the basics of natural language and Python and move on to advanced analytical and machine learning concepts. You will look at each technique and algorithm with both a bird's eye view to understand how it can be used as well as with a microscopic view to understand the mathematical concepts and to implement them to solve your own problems.

pynlpl - PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing

  •    Python

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotatation). The library is a divided into several packages and modules. It works on Python 2.7, as well as Python 3.

Natural Language Toolkit

  •    Python

The Natural Language Toolkit is a suite of Python packages and data for natural language processing; it comes with extensive API documentation and tutorials. NLTK-Lite is the version under active development.

nlp-with-ruby - Practical Natural Language Processing done in Ruby.

  •    Ruby

This curated list comprises awesome resources, libraries, information sources about computational processing of texts in human languages with the Ruby programming language. That field is often referred to as NLP, Computational Linguistics, HLT (Human Language Technology) and can be brought in conjunction with Artificial Intelligence, Machine Learning, Information Retrieval, Text Mining, Knowledge Extraction and other related disciplines. This list comes from our day to day work on Language Models and NLP Tools. Read why this list is awesome. Our FAQ describes the important decisions and useful answers you may be interested in.


practical-machine-learning-with-python - Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system

  •    Jupyter

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

rake-nltk - Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.

  •    Python

RAKE short for Rapid Automatic Keyword Extraction algorithm, is a domain independent keyword extraction algorithm which tries to determine key phrases in a body of text by analyzing the frequency of word appearance and its co-occurance with other words in the text. If you see a stopwords error, it means that you do not have the corpus stopwords downloaded from NLTK. You can download it using command below.

gensim - Topic Modelling for Humans

  •    Python

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community. If this feature list left you scratching your head, you can first read more about the Vector Space Model and unsupervised document analysis on Wikipedia.

nltk-trainer - Train NLTK objects with zero code

  •    Python

Train NLTK objects with zero code

nltk - NLTK Source

  •    Python

NLTK Source

cogcomp-nlp - CogComp's Natural Language Processing libraries and Demos:

  •    Java

This project collects a number of core libraries for Natural Language Processing (NLP) developed by Cognitive Computation Group. Each library contains detailed readme and instructions on how to use it. In addition the javadoc of the whole project is available here.

spaCy - 💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython

  •    Python

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 20+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license. 💫 Version 2.0 out now! Check out the new features here.

treat - Natural language processing framework for Ruby.

  •    Ruby

Treat is a toolkit for natural language processing and computational linguistics in Ruby. The Treat project aims to build a language- and algorithm- agnostic NLP framework for Ruby with support for tasks such as document retrieval, text chunking, segmentation and tokenization, natural language parsing, part-of-speech tagging, keyword extraction and named entity recognition. Learn more by taking a quick tour or by reading the manual. I am actively seeking developers that can help maintain and expand this project. You can find a list of ideas for contributing to the project here.

BlingFire - A lightning fast Finite State machine and REgular expression manipulation library.

  •    C++

Hi, we are a team at Microsoft called Bling (Beyond Language Understanding), we help Bing be smarter. Here we wanted to share with all of you our FInite State machine and REgular expression manipulation library (FIRE). We use Fire for many linguistic operations inside Bing such as Tokenization, Multi-word expression matching, Unknown word-guessing, Stemming / Lemmatization just to mention a few. Bling Fire Tokenizer is a tokenizer designed for fast-speed and quality tokenization of Natural Language text. It mostly follows the tokenization logic of NLTK, except hyphenated words are split and a few errors are fixed.

PyTorch-NLP - Supporting Rapid Prototyping with a Toolkit (incl. Datasets and Neural Network Layers)

  •    Python

PyTorch-NLP, or torchnlp for short, is a library of neural network layers, text processing modules and datasets designed to accelerate Natural Language Processing (NLP) research. Join our community, add datasets and neural network layers! Chat with us on Gitter and join the Google Group, we're eager to collaborate with you.

cltk - The Classical Language Toolkit

  •    Python

The docs are at docs.cltk.org. CLTK supports Python version 3.6. The software only runs on POSIX–compliant operating systems (Linux, Mac OS X, FreeBSD, etc.).

OpenNLP - Machine learning based toolkit for the processing of natural language text

  •    Java

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.

spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment. This library has been uploaded to the spark-packages repository https://spark-packages.org/package/JohnSnowLabs/spark-nlp .

nlp-architect - NLP Architect by Intel AI Lab: Python library for exploring the state-of-the-art deep learning topologies and techniques for natural language processing and natural language understanding

  •    Python

NLP Architect is an open-source Python library for exploring state-of-the-art deep learning topologies and techniques for natural language processing and natural language understanding. It is intended to be a platform for future research and collaboration. Framework documentation on NLP models, algorithms, and modules, and instructions on how to contribute can be found at our main documentation site.

nltk_book - NLTK Book

  •    Python

NLTK Book