
transliteration - Transliteration data and models


Transliteration related data files and/or models. The transliteration models provided are recurrent neural networks trained with a CTC loss. For a detailed description of the models, see the paper.

site-kit-wp - Site Kit is a one-stop solution for WordPress users to use everything Google has to offer to make them successful on the web

  •    Javascript

Site Kit is a one-stop solution for WordPress users to use everything Google has to offer to make them successful on the web. Contributions of any kind to Site Kit by Google are welcome. Please read the contributing guidelines to get started.


  •    C++

This project contains an implementation of the "Private Join and Compute" functionality. This functionality allows two users, each holding an input file, to privately compute the sum of associated values for records that have common identifiers. Then the Private Join and Compute functionality would allow the Client to learn that the input files had 2 identifiers in common, and that the associated values summed to 40. It does this without revealing which specific identifiers were in common (Ada and Ruby in the example above), or revealing anything additional about the other identifiers in the two parties' data set.
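As a rough illustration of what the functionality outputs, the sketch below computes the same two numbers (the count of shared identifiers and the sum of their associated values) directly on plaintext data. It is not the project's protocol: a real run would never expose the identifiers to the other party, and the function name and the sample inputs are hypothetical, chosen only so that the result matches the figures quoted above.

```python
def intersection_sum(server_ids, client_values):
    """Return (number of shared identifiers, sum of the client's values for them).

    Plaintext stand-in for the private computation; hypothetical helper,
    not part of the project's code.
    """
    shared = set(server_ids) & set(client_values)
    return len(shared), sum(client_values[i] for i in shared)


# Hypothetical inputs: one party holds identifiers only,
# the other holds identifiers with associated integer values.
server_ids = ["Sam", "Ada", "Ruby", "Brendan"]
client_values = {"Ruby": 10, "Ada": 30, "Alexander": 5, "Mika": 35}

count, total = intersection_sum(server_ids, client_values)
print(count, total)  # 2 40
```

The private version is meant to reveal only `count` and `total`, never the contents of `shared`.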

pix-image-viewer - Desktop image viewer. View thousands of images in a zoomable, pannable grid.

  •    Rust

Explore tens of thousands of images in a grid. Use mouse or keyboard to zoom or pan around the image grid. Heavily inspired by Galapix.

myanmar-tools - Detect and convert the Zawgyi-One font encoding in C++, Java, JavaScript, PHP, and Ruby

  •    Java

This project includes tools for processing font encodings used in Myanmar, currently with support for the widespread Zawgyi-One font encoding. For more information on font encodings in Myanmar, read the Unicode Myanmar FAQ. Conversion is also available via ICU in languages without support via Myanmar Tools; see "Zawgyi-to-Unicode Conversion" below.

minijail - sandboxing and containment tool used in Chrome OS and Android


The Minijail homepage and main repo is https://android.googlesource.com/platform/external/minijail/. Minijail is a sandboxing and containment tool used in Chrome OS and Android. It provides an executable that can be used to launch and sandbox other programs, and a library that can be used by code to sandbox itself.

mediapipe - MediaPipe is a cross-platform framework for building multimodal applied machine learning pipelines

  •    C++

MediaPipe is a framework for building multimodal (e.g., video, audio, any time series data) applied ML pipelines. With MediaPipe, a perception pipeline can be built as a graph of modular components, including, for instance, inference models (e.g., TensorFlow, TFLite) and media processing functions.

libaddressinput - Google’s postal address library, powering Android and Chromium

  •    C++

The libaddressinput project consists of two different libraries (one implemented in C++, one implemented in Java for Android) that use address metadata from Google's Address Data Service to assist application developers in collecting and handling postal addresses from all over the world. These libraries can provide information about what input fields are required for a correct address input form for any country in the world and can validate an address to highlight input errors like missing required fields or invalid values.

language-resources - Datasets and tools for basic natural language processing.

  •    Python

Datasets and scripts for basic natural language and speech processing. This is not an official Google product.


  •    Java

i18n-sanitychecker is a library that allows developers to write unit tests for locale-sensitive functions without relying on "golden data". i18n-sanitychecker provides only one public method, which matches actual strings against expected patterns. Patterns may contain placeholders for dates, times, numbers, and lists. The method behaves similarly to the other assert methods of JUnit.
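The idea of asserting against a pattern with placeholders, rather than against an exact "golden" string, can be sketched in a few lines. This is only an illustration of the technique in Python, not the library's Java API: the placeholder names (`{number}`, `{time}`), the regular expressions behind them, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical placeholder syntax; the real library defines its own.
PLACEHOLDERS = {
    "{number}": r"\d+(?:[.,]\d+)*",          # e.g. 42, 1,234, 3.14
    "{time}": r"\d{1,2}:\d{2}(?::\d{2})?",   # e.g. 9:30, 18:05:59
}


def pattern_to_regex(pattern):
    """Turn a pattern with placeholders into a regular expression string."""
    token_re = "|".join(re.escape(p) for p in PLACEHOLDERS)
    # The capturing group makes re.split keep the placeholders in the result.
    parts = re.split(f"({token_re})", pattern)
    return "".join(
        PLACEHOLDERS[p] if p in PLACEHOLDERS else re.escape(p) for p in parts
    )


def assert_matches(pattern, actual):
    """Raise AssertionError unless `actual` matches `pattern` in full."""
    if re.fullmatch(pattern_to_regex(pattern), actual) is None:
        raise AssertionError(f"{actual!r} does not match pattern {pattern!r}")


# Passes for any number formatting the placeholder accepts.
assert_matches("Total: {number} items", "Total: 1,234 items")
```

A test written this way keeps passing when locale data changes the exact digits or separators, while still failing if the surrounding text is wrong.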


  •    Javascript

This repo contains various input tools. Under "client" is an open-source Chinese Pinyin IME for Windows.

emoji4unicode - Automatically exported from code.google.com/p/emoji4unicode

  •    Python

Automatically exported from code.google.com/p/emoji4unicode

emoji-segmenter - Emoji Segmenter

  •    C

This repository contains a Ragel grammar and generated C code for segmenting runs of text into text-presentation and emoji-presentation runs. It is currently used in projects such as Chromium and Pango to decide which presentation, color or text, a run of text should have. This API call will scan emoji_text_iter_t p for the next grammar token and return an iterator that points to the end of that token. An end iterator needs to be specified as pe so that the scanner can compare against it and know where to stop. In the reference parameter is_emoji, it returns whether this token has emoji presentation or text presentation.

data-driven-discretization-1d - Code for "Data-driven discretization"

  •    Python

This is not an official Google product. Note that Python 3 is required. Dependencies for the core library (including TensorFlow) are specified in setup.py and should be installed automatically as required.

cpp-async-rpc - Library for Asynchronicity, Serialization and Remoting

  •    C++

This is cpp-async-rpc, a C++17 library supporting template meta-programming, asynchronous network programming, binary serialization and RPC. Disclaimer: This is not a Google supported product.

corpuscrawler - Crawler for linguistic corpora

  •    Python

Corpus Crawler is a tool for Corpus Linguistics. Modern linguistic research works on language corpora, which are large samples of “real world” text. This crawler helps to build such corpora: it follows links to publicly accessible web pages known to be written in a certain language; it removes boilerplate and HTML markup; finally, it writes its output into plaintext files. The crawler implements the Robots Exclusion Standard, and it is intentionally slow so it does not cause much load on the crawled web sites.
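The two politeness measures mentioned above, honoring the Robots Exclusion Standard and pacing requests, can be sketched with Python's standard `urllib.robotparser` module. This is a minimal illustration rather than the crawler's own code: the robots.txt content, the user agent string, the URLs, and the helper names are hypothetical, and the file is parsed from a string so that the example makes no network requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs that the robots.txt rules let this agent fetch."""
    return [u for u in urls if parser.can_fetch(user_agent, u)]


def polite_delay(parser, user_agent, default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


parser = make_parser(ROBOTS_TXT)
urls = ["https://example.org/news/1", "https://example.org/private/x"]
print(allowed_urls(parser, "ExampleCrawler", urls))
print(polite_delay(parser, "ExampleCrawler"))
```

A crawl loop would then sleep for `polite_delay(...)` seconds between fetches of the allowed URLs, which is one simple way to keep the load on a site low.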

certificate-transparency-java - Auditing for TLS certificates, Java code.

  •    Java

An application used to communicate with certificate transparency log servers.


  •    Python

This project converts Bazel BUILD and WORKSPACE files to CMakeLists.txt. The Bazel BUILD file serves as your source of truth, and this tool generates an idiomatic CMakeLists.txt so users can do a CMake-based build. This is not an official Google product.


  •    Shell

Recipes for using open-source ASR corpora with Kaldi. This is not an official Google product.