We have collection of more than 1 Million open source products ranging from Enterprise product to
small libraries in all platforms. We aggregate information from all open source repositories.
Search and find the best for your needs. Check out projects section.
Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code. All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs, Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict on new data.
These tutorials have been merged into the official PyTorch tutorials. Please go there for better maintained versions of these tutorials compatible with newer versions of PyTorch. Learn PyTorch with project-based tutorials. These tutorials demonstrate modern techniques with readable code and use regular data from the internet.
🤗 Transformers provides thousands of pretrained models to perform tasks on texts such as classification, information extraction, question answering, summarization, translation, text generation and more in over 100 languages. Its aim is to make cutting-edge NLP easier to use for everyone. 🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.
Question generation is the task of automatically generating questions from a text paragraph. The most straight-forward way for this is answer aware question generation. In answer aware question generation the model is presented with the answer and the passage and asked to generate a question for that answer by considering the passage context. While there are many papers available for QG task, it's still not as mainstream as QA. One of the reasons is most of the earlier papers use complicated models/processing pipelines and have no pre-trained models available. Few recent papers, specifically UniLM and ProphetNet have SOTA pre-trained weights availble for QG but the usage seems quite complicated. This project is aimed as an open source study on question generation with pre-trained transformers (specifically seq-2-seq models) using straight-forward end-to-end methods without much complicated pipelines. The goal is to provide simplified data processing and training scripts and easy to use pipelines for inference.
You can click the following links for detailed installation instructions. PubMed Paper Reading Dataset This dataset gathers 14,857 entities, 133 relations, and entities corresponding tokenized text from PubMed. It contains 875,698 training pairs, 109,462 development pairs, and 109,462 test pairs.
Evaluation code for various unsupervised automated metrics for NLG (Natural Language Generation). It takes as input a hypothesis file, and one or more references files and outputs values of metrics. Rows across these files should correspond to the same example. where each line in the hypothesis file is a generated sentence and the corresponding lines across the reference files are ground truth reference sentences for the corresponding hypothesis.
Both algoritms can be trained from pairs of source meaning representations (dialogue acts) and target sentences. The newer seq2seq approach is preferrable: it yields higher performance in terms of both speed and quality. Both algorithms support generating sentence plans (deep syntax trees), which are subsequently converted to text using the existing the surface realizer from Treex NLP toolkit. The seq2seq algorithm also supports direct string generation.
A natural language generation language, intended for creating training data for intent parsing systems. Nalgene generates pairs of sentences and grammar trees by a random (or guided) walk through a grammar file.
You can click the following links for detailed installation instructions. ReviewRobot dataset This dataset contains 8,110 paper and review pairs and background KG from 174,165 papers. It also contains information extraction results from SciIE, various knowledge graphs built on the IE results, and human annotation for paper-review pairs. The detailed information can be found here.