JapaneseTokenizers - aims to make Japanese tokenizers as easy to use as possible


This project aims to make calling tokenizers and splitting a sentence into tokens as easy as possible. It also gives various tokenization tools a common interface, so it is easy to compare the output of different tokenizers.
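The idea of a common interface can be sketched as follows. This is only an illustration, not the project's actual API: the `Tokenizer` protocol, the two toy tokenizers, and the `compare` helper are all hypothetical names, and the toy tokenizers stand in for real morphological analyzers.

```python
from typing import Dict, List, Protocol


class Tokenizer(Protocol):
    """Hypothetical common interface: every wrapper exposes tokenize()."""

    def tokenize(self, sentence: str) -> List[str]: ...


class CharTokenizer:
    """Toy stand-in for a real analyzer: one token per non-space character."""

    def tokenize(self, sentence: str) -> List[str]:
        return [ch for ch in sentence if not ch.isspace()]


class WhitespaceTokenizer:
    """Toy stand-in for a real analyzer: splits on whitespace only."""

    def tokenize(self, sentence: str) -> List[str]:
        return sentence.split()


def compare(sentence: str, tokenizers: Dict[str, Tokenizer]) -> Dict[str, List[str]]:
    """Run every tokenizer on the same sentence and collect results by name."""
    return {name: tok.tokenize(sentence) for name, tok in tokenizers.items()}


if __name__ == "__main__":
    results = compare(
        "私は 学生 です",
        {"char": CharTokenizer(), "space": WhitespaceTokenizer()},
    )
    for name, tokens in results.items():
        print(name, tokens)
```

Because every wrapper answers the same `tokenize()` call, swapping one tokenizer for another or comparing several side by side needs no tokenizer-specific code.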

https://github.com/Kensuke-Mitsuzawa/JapaneseTokenizers

Related Projects

kagome - Self-contained Japanese Morphological Analyzer written in pure Go

  •    Go

Kagome is an open source Japanese morphological analyzer written in pure Go. The MeCab-IPADIC and UniDic (unidic-mecab) dictionaries and statistical models are packaged in the Kagome binary. Like Kuromoji, Kagome has a segmentation mode for search.

MeCab

  •    C++

MeCab is a fast and customizable Japanese morphological analyzer. MeCab is designed for general-purpose use and is applied to a variety of NLP tasks, such as Kana-Kanji conversion. MeCab provides parameter estimation functionality based on CRFs and HMMs.

Dictionary Lookup Tool

  •    CSharp

Dictionary tool to assist Chinese and Japanese language learners when viewing web pages or other text documents. Automatically looks up and translates text on the clipboard. Requires .NET Framework and language support for Chinese/Japanese

Makoto Chan Dictionary


Makoto Chan is a simple Japanese (EN<>JP) dictionary and kanji lookup tool. You can search for the following: kanji with detailed information (currently unavailable), English words, Japanese words (kana and kanji), and names. Developed in C#.

gse - Go efficient text segmentation; supports English, Chinese, Japanese, and other languages. High-performance word segmentation in Go

  •    Go

Go efficient text segmentation; supports English, Chinese, Japanese, and other languages. The dictionary is implemented with a double-array trie, and the segmenter algorithm finds the shortest path based on word frequency using dynamic programming.
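The frequency-based shortest-path idea can be sketched in a few lines of Python. This is not gse's implementation: a plain dict stands in for the double-array trie, the function name `segment`, the `max_len` parameter, and the unknown-character penalty are assumptions made for illustration, and the word frequencies in the usage example are made up.

```python
import math
from typing import Dict, List


def segment(text: str, freq: Dict[str, int], max_len: int = 4) -> List[str]:
    """Split text into the word sequence with the lowest total cost.

    Each dictionary word costs -log(freq / total), so frequent words are
    cheap. Frequencies are assumed to be positive integers.
    """
    total = sum(freq.values())
    # Assumed penalty for a character that is not in the dictionary:
    # higher than the cost of any dictionary word (at most log(total)).
    unknown_cost = math.log(total) + 1.0

    n = len(text)
    best = [0.0] + [math.inf] * n  # best[i]: lowest cost to segment text[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in freq:
                cost = -math.log(freq[word] / total)
            elif end - start == 1:
                cost = unknown_cost  # fall back to a single unknown character
            else:
                continue
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = start

    # Walk the back-pointers from the end to recover the words.
    tokens: List[str] = []
    pos = n
    while pos > 0:
        tokens.append(text[back[pos]:pos])
        pos = back[pos]
    return tokens[::-1]


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    toy_freq = {"東京": 50, "東京都": 30, "京都": 40, "都": 10, "に": 100, "住む": 20}
    print(segment("東京都に住む", toy_freq))
```

Treating each candidate word as a weighted edge between character positions turns segmentation into a shortest-path problem over a directed acyclic graph, which the left-to-right dynamic program solves in one pass.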


GJITEN, Japanese dictionary for GNOME

  •    C

Gjiten is a Japanese dictionary for GNOME with advanced word and kanji lookup features. Requires dictionary files (edict, kanjidic) to function.

zkanji - Japanese Language Study Suite

  •    C++

zkanji is a feature-rich Japanese language study suite and dictionary for Windows. It has several kanji look-up methods, optional example sentences for many Japanese words, vocabulary printing, JLPT levels indicated for words and kanji for all N levels, a spaced-repetition system for studying, and more. Visit http://zkanji.sourceforge.net for details.

spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment. This library has been uploaded to the spark-packages repository https://spark-packages.org/package/JohnSnowLabs/spark-nlp .

Suiteki

  •    Java

Suiteki is a set of Java J2ME Japanese-related tools for your mobile phone: word dictionary, kanji dictionary, word lists, flashcards and even a manga viewer! Kana/kanji input and display for non-Japanese phones! (Please see http://suiteki.sourceforge.net)

slowechko


Troll's Qt-based dictionary shell. Supports (tested) Russian and Japanese. Uses a simple, widely available dictionary format, HTML formatting, incremental search, and romaji-to-kana conversion on input. English-Russian, Russian-English and Japanese

gWaei, Japanese Dictionary for GNOME

  •    C

gWaei is an easy-to-use yet powerful dictionary program for Japanese to English translation. It organizes results by relevance and supports regex searches, tabs, spell checking, kanji handwriting recognition and a console interface.

DictionaryForMIDs

  •    Java

DictionaryForMIDs is a dictionary application for cell phones, tablets and PCs. The dictionary is completely installed on the device ("offline dictionary"), i.e. after installation there is no need for an internet connection. DictionaryForMIDs can be set up for any dictionary, for any language, or for any other lookup purpose. The DfM-Creator tool is used to set up a dictionary for use with DictionaryForMIDs.

style-dictionary - A build system for creating cross-platform styles.

  •    Javascript

Style once, use everywhere. A Style Dictionary is a system that allows you to define styles once, in a way for any platform or language to consume. A single place to create and edit your styles, and a single command exports these rules to all the places you need them - iOS, Android, CSS, JS, HTML, sketch files, style documentation, etc. It is available as a CLI through npm, but can also be used like any normal node module if you want to extend its functionality.

The Dictionary System

  •    Javascript

The Dictionary System (DS) is a web application for creating one-way bilingual dictionaries or encyclopaedias. It offers a working environment for building a dictionary and a web page where the general public can search it. It is a so-called DWS (Dictionary Writing System) or DPS (Dictionary Production / Publishing System) application.

WaJEi


WaJEi is a Japanese-English dictionary, using Jim Breen's edict and kanjidic dictionary files, written in Java and designed for the Sharp Zaurus SL-5500. It offers lookups phonetically (by kana) and reverse lookups of jukugo by kanji using

JMDict.NET

  •    CSharp

JMDict.NET stands for Japanese Multi-lingual Dictionary for the .NET platform, based on the JMdict dictionary base by Jim Breen (http://www.csse.monash.edu.au/~jwb/j_jmdict.html).

Dict protocol J2SE implementation

  •    Java

J2SE implementation of the Dictionary Server Protocol (DICT) that allows a client to access dictionary definitions from a set of natural language dictionary databases.

Atlantida Multilingual Dictionary

  •    Java

Atlantida is an open source multilingual dictionary written in Java. It can translate words from one language to another and pronounce them. As of version alpha 0.15, Atlantida uses XDXF dictionary format.

define - A command-line dictionary (thesaurus) app, with access to multiple sources, written in Go.

  •    Go

A command-line dictionary (thesaurus) app, with access to multiple sources, written in Go. Pre-compiled binaries are available on the releases page.