opencog - A framework for integrated Artificial Intelligence & Artificial General Intelligence (AGI)

  •        137

OpenCog is a framework for developing AI systems, especially appropriate for integrative multi-algorithm systems, and artificial general intelligence systems. Though much work remains to be done, it currently contains a functional core framework, and a number of cognitive agents at varying levels of completion, some already displaying interesting and useful functionalities alone and in combination. With the exception of MOSES and the CogServer, all of the above are in active development, are half-baked, poorly documented, mis-designed, subject to experimentation, and generally in need of love an attention. This is where experimentation and integration are taking place, and, like any laboratory, things are a bit fluid and chaotic.



Related Projects

BotSharp - The Open Source AI Chatbot Platform Builder in 100% C# Running in

  •    CSharp

BotSharp is an open source machine learning framework for AI Bot platform builder. This project involves natural language understanding, computer vision and audio processing technologies, and aims to promote the development and application of intelligent robot assistants in information systems. Out-of-the-box machine learning algorithms allow ordinary programmers to develop artificial intelligence applications faster and easier. It's witten in C# running on .Net Core that is full cross-platform framework. C# is a enterprise grade programming language which is widely used to code business logic in information management related system. More friendly to corporate developers. BotSharp adopts machine learning algrithm in C# directly. That will facilitate the feature of the typed language C#, and be more easier when refactoring code in system scope.

decaNLP - The Natural Language Decathlon: A Multitask Challenge for NLP

  •    Python

The Natural Language Decathlon is a multitask challenge that spans ten tasks: question answering (SQuAD), machine translation (IWSLT), summarization (CNN/DM), natural language inference (MNLI), sentiment analysis (SST), semantic role labeling(QA‑SRL), zero-shot relation extraction (QA‑ZRE), goal-oriented dialogue (WOZ, semantic parsing (WikiSQL), and commonsense reasoning (MWSC). Each task is cast as question answering, which makes it possible to use our new Multitask Question Answering Network (MQAN). This model jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. For a more thorough introduction to decaNLP and the tasks, see the main website, our blog post, or the paper. While the research direction associated with this repository focused on multitask learning, the framework itself is designed in a way that should make single-task training, transfer learning, and zero-shot evaluation simple. Similarly, the paper focused on multitask learning as a form of question answering, but this framework can be easily adapted for different approached to single-task or multitask learning.

ludwig - Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code

  •    Python

Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code. All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs, Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict on new data.

lectures - Oxford Deep NLP 2017 course


This repository contains the lecture slides and course description for the Deep Natural Language Processing course offered in Hilary Term 2017 at the University of Oxford. This is an applied course focussing on recent advances in analysing and generating speech and text using recurrent neural networks. We introduce the mathematical definitions of the relevant machine learning models and derive their associated optimisation algorithms. The course covers a range of applications of neural networks in NLP including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. These topics are organised into three high level themes forming a progression from understanding the use of neural networks for sequential language modelling, to understanding their use as conditional language models for transduction tasks, and finally to approaches employing these techniques in combination with other mechanisms for advanced applications. Throughout the course the practical implementation of such models on CPU and GPU hardware is also discussed.

OpenNLP - Machine learning based toolkit for the processing of natural language text

  •    Java

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.

spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment. This library has been uploaded to the spark-packages repository .

models - Model configurations

  •    Python

PaddlePaddle provides a rich set of computational units to enable users to adopt a modular approach to solving various learning problems. In this repo, we demonstrate how to use PaddlePaddle to solve common machine learning tasks, providing several different neural network model that anyone can easily learn and use. The word embedding expresses words with a real vector. Each dimension of the vector represents some of the latent grammatical or semantic features of the text and is one of the most successful concepts in the field of natural language processing. The generalized word vector can also be applied to discrete features. The study of word vector is usually an unsupervised learning. Therefore, it is possible to take full advantage of massive unmarked data to capture the relationship between features and to solve the problem of sparse features, missing tag data, and data noise. However, in the common word vector learning method, the last layer of the model often encounters a large-scale classification problem, which is the bottleneck of computing performance.

WikiSQL - A large annotated semantic parsing corpus for developing natural language interfaces.

  •    HTML

A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning.

nlp-architect - NLP Architect by Intel AI Lab: Python library for exploring the state-of-the-art deep learning topologies and techniques for natural language processing and natural language understanding

  •    Python

NLP Architect is an open-source Python library for exploring state-of-the-art deep learning topologies and techniques for natural language processing and natural language understanding. It is intended to be a platform for future research and collaboration. Framework documentation on NLP models, algorithms, and modules, and instructions on how to contribute can be found at our main documentation site.

gensim - Topic Modelling for Humans

  •    Python

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community. If this feature list left you scratching your head, you can first read more about the Vector Space Model and unsupervised document analysis on Wikipedia.

limdu - Machine-learning for Node.js

  •    Javascript

Limdu is a machine-learning framework for Node.js. It supports multi-label classification, online learning, and real-time classification. Therefore, it is especially suited for natural language understanding in dialog systems and chat-bots.Limdu is in an "alpha" state - some parts are working (see this readme), but some parts are missing or not tested. Contributions are welcome.

ladder - Ladder network is a deep learning algorithm that combines supervised and unsupervised learning

  •    Python

This is an implementation of Ladder Network in TensorFlow. Ladder network is a deep learning algorithm that combines supervised and unsupervised learning. It was introduced in the paper Semi-Supervised Learning with Ladder Network by A Rasmus, H Valpola, M Honkala, M Berglund, and T Raiko.

delta - DELTA is a deep learning based natural language and speech processing platform.

  •    Python

DELTA is a deep learning based end-to-end natural language and speech processing platform. DELTA aims to provide easy and fast experiences for using, deploying, and developing natural language processing and speech models for both academia and industry use cases. DELTA is mainly implemented using TensorFlow and Python 3. For details of DELTA, please refer to this paper.

spaCy - 💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython

  •    Python

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 20+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license. 💫 Version 2.0 out now! Check out the new features here.

magnitude - A fast, efficient universal vector embedding utility package.

  •    Python

A feature-packed Python package and vector storage file format for utilizing vector embeddings in machine learning models in a fast, efficient, and simple manner developed by Plasticity. It is primarily intended to be a simpler / faster alternative to Gensim, but can be used as a generic key-vector store for domains outside NLP. Vector space embedding models have become increasingly common in machine learning and traditionally have been popular for natural language processing applications. A fast, lightweight tool to consume these large vector space embedding models efficiently is lacking.

OpenCog - Framework to build Artificial Intelligence Programs

  •    C++

The OpenCog Framework is a platform to build and share artificial intelligence programs. It includes components for procedural and declarative knowledge representation (AtomSpace), task scheduling (CogServer), AI algorithm containers (MindAgents), connectors to instant messaging and virtual world systems, and other components. MindAgents and other add-ons explore a wide variety of AI techniques including evolutionary program learning (MOSES), natural language processing, and others.

Smile - Statistical Machine Intelligence & Learning Engine

  •    Java

Smile (Statistical Machine Intelligence and Learning Engine) is a fast and comprehensive machine learning, NLP, linear algebra, graph, interpolation, and visualization system in Java and Scala. With advanced data structures and algorithms, Smile delivers state-of-art performance.Smile covers every aspect of machine learning, including classification, regression, clustering, association rule mining, feature selection, manifold learning, multidimensional scaling, genetic algorithms, missing value imputation, efficient nearest neighbor search, etc.

moviebox - Machine learning movie recommending system

  •    Python

Moviebox is a content based machine learning recommending system build with the powers of tf-idf and cosine similarities. Initially, a natural number, that corresponds to the ID of a unique movie title, is accepted as input from the user. Through tf-idf the plot summaries of 5000 different movies that reside in the dataset, are analyzed and vectorized. Next, a number of movies is chosen as recommendations based on their cosine similarity with the vectorized input movie. Specifically, the cosine value of the angle between any two non-zero vectors, resulting from their inner product, is used as the primary measure of similarity. Thus, only movies whose story and meaning are as close as possible to the initial one, are displayed to the user as recommendations.

We have large collection of open source products. Follow the tags from Tag Cloud >>

Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.