Displaying 1 to 20 from 33 results

transformers - 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX

  •    Python

🤗 Transformers provides thousands of pretrained models to perform tasks on texts such as classification, information extraction, question answering, summarization, translation, text generation and more in over 100 languages. Its aim is to make cutting-edge NLP easier to use for everyone. 🤗 Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.

spago - Self-contained Machine Learning and Natural Language Processing library in Go

  •    Go

A Machine Learning library written in pure Go designed to support relevant neural architectures in Natural Language Processing. spaGO is self-contained, in that it uses its own lightweight computational graph framework for both training and inference, easy to understand from start to finish.




scibert - A BERT model for scientific text.

  •    Python

SciBERT is a BERT model trained on scientific text. SciBERT is trained on papers from the corpus of semanticscholar.org. Corpus size is 1.14M papers, 3.1B tokens. We use the full text of the papers in training, not just abstracts.

ERNIE - Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond

  •    Python

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.


Haystack - Build a natural language interface for your data

  •    Python

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want to perform Question Answering or semantic document search, you can use the State-of-the-Art NLP models in Haystack to provide unique search experiences and allow your users to query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects like Huggingface's Transformers, Elasticsearch, or Milvus.

TurboTransformers - a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU

  •    C++

The WeChat AI open-sourced TurboTransformers with the following characteristics. TurboTransformers has been applied to multiple online BERT service scenarios in Tencent. For example, It brings 1.88x acceleration to the WeChat FAQ service, 2.11x acceleration to the public cloud sentiment analysis service, and 13.6x acceleration to the QQ recommendation system. Moreover, it has already been applied to build services such as Chitchating, Searching, and Recommendation.

EasyTransfer - EasyTransfer is designed to make the development of transfer learning in NLP applications easier

  •    Python

EasyTransfer is designed to make the development of transfer learning in NLP applications easier. The literature has witnessed the success of applying deep Transfer Learning (TL) for many real-world NLP applications, yet it is not easy to build an easy-to-use TL toolkit to achieve such a goal. To bridge this gap, EasyTransfer is designed to facilitate users leveraging deep TL for NLP applications at ease. It was developed in Alibaba in early 2017, and has been used in the major BUs in Alibaba group and achieved very good results in 20+ business scenarios. It supports the mainstream pre-trained ModelZoo, including pre-trained language models (PLMs) and multi-modal models on the PAI platform, integrates the SOTA models for the mainstream NLP applications in AppZoo, and supports knowledge distillation for PLMs. EasyTransfer is very convenient for users to quickly start model training, evaluation, offline prediction, and online deployment. It also provides rich APIs to make the development of NLP and transfer learning easier.

AliceMind - ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

  •    Python

This repository provides pre-trained encoder-decoder models and its related optimization techniques developed by Alibaba's MinD (Machine IntelligeNce of Damo) Lab. StructVBERT (March 15, 2021): pre-trained models for vision-language understanding. We propose a new single-stream visual-linguistic pre-training scheme by leveraging multi-stage progressive pre-training and multi-task learning. StructVBERT obtained the 2020 VQA Challenge Runner-up award, and SOTA result on VQA 2020 public Test-standard benchmark (June 2020). "Talk Slides" (CVPR 2020 VQA Challenge Runner-up).

CodeSearchNet - Datasets, tools, and benchmarks for representation learning of code.

  •    Jupyter

We would like to thank all participants for their submissions and we hope that this challenge provided insights to practitioners and researchers about the challenges in semantic code search and motivated new research. We would like to encourage everyone to continue using the dataset and the human evaluations, which we now provide publicly. Please, see below for details, specifically the Evaluation section. No new submissions to the challenge will be accepted.

spacy-transformers - 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

  •    Python

This package provides spaCy components and architectures to use transformer models via Hugging Face's transformers in spaCy. The result is convenient access to state-of-the-art transformer architectures, such as BERT, GPT-2, XLNet, etc. This release requires spaCy v3. For the previous version of this library, see the v0.6.x branch.

nlp-architect - A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

  •    Python

NLP Architect is an open source Python library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing and Natural Language Understanding Neural Networks. NLP Architect is an NLP library designed to be flexible, easy to extend, allow for easy and rapid integration of NLP models in applications and to showcase optimized models.

bigbird - Transformers for Longer Sequences

  •    Python

Not an official Google product. BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences. Moreover, BigBird comes along with a theoretical understanding of the capabilities of a complete transformer that the sparse model can handle.

bluebert - BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).

  •    Python

We uploaded the preprocessed PubMed texts that were used to pre-train the BlueBERT models. This repository provides codes and models of BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III). Please refer to our paper Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets for more details.

node-question-answering - Fast and production-ready question answering in Node.js

  •    TypeScript

It can run models in SavedModel and TFJS formats locally, as well as remote models thanks to TensorFlow Serving. The following example will automatically download the default DistilBERT model in SavedModel format if not already present, along with the required vocabulary / tokenizer files. It will then run the model and return the answer to the question.

rust-erl-ext - Erlang external term format codec for Rust language.

  •    Rust

Erlang external term format parser/serializer for Rust. More examples are in examples directory.

node-bertrpc - BERT-RPC client and server for node.js

  •    Javascript

BERT-RPC is a schemaless binary remote procedure call protocol






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.