Displaying 1 to 20 from 102 results

pangu.js - 為什麼你們就是不能加個空格呢?

  •    Javascript

Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols).

OpenCC - A project for conversion between Traditional and Simplified Chinese

  •    C++

Open Chinese Convert (OpenCC, 開放中文轉換) is an opensource project for conversion between Traditional Chinese and Simplified Chinese, supporting character-level conversion, phrase-level conversion, variant conversion and regional idioms among Mainland China, Taiwan and Hong kong.




learn-with-open-source - 借助开源项目,学习软件开发

  •    Shell

This book uses GitBook to build. Welcome! Join us to finish this book.

chinese-poetry - 最全中华古诗词数据库, 唐宋两朝近一万四千古诗人, 接近5.5万首唐诗加26万宋诗. 两宋时期1564位词人,21050首词。

  •    Python

最全的中华古典文集数据库, 包含5.5万首唐诗、26万首宋诗和2.1万首宋词. 唐宋两朝近1.4万古诗人, 和两宋时期1.5K词人. 数据来源于互联网. 为什么要做这个仓库? 古诗是中华民族乃至全世界的瑰宝, 我们应该传承下去, 虽然有古典文集, 但大多数人并没有拥有这些书籍. 从某种意义上来说, 这些庞大的文集离我们是有一定距离的。而电子版方便拷贝, 所以此开源数据库诞生了. 你可以用此数据做任何有益的事情, 甚至我也可以帮助你.


node-segment - 基于Node.js的中文分词模块

  •    Javascript

Chinese word segmentation 中文分词模块

raft-zh_cn - Raft一致性算法论文的中文翻译

  •    

Raft一致性算法论文的中文翻译

weibo_terminater - Final Weibo Crawler Scrap Anything From Weibo, comments, weibo contents, followers, anything

  •    Python

Final Weibo Crawler Scrap Anything From Weibo, comments, weibo contents, followers, anything. The Terminator

Chinese-Word-Vectors - 100+ Chinese Word Vectors 上百种预训练中文词向量

  •    Python

This project provides 100+ Chinese Word Vectors (embeddings) trained with different representations (dense and sparse), context features (word, ngram, character, and more), and corpora. One can easily obtain pre-trained vectors with different properties and use them for downstream tasks. Moreover, we provide a Chinese analogical reasoning dataset CA8 and an evaluation toolkit for users to evaluate the quality of their word vectors.

gse - Go efficient text segmentation; support english, chinese, japanese and other. Go 语言高性能分词

  •    Go

Go efficient text segmentation; support english, chinese, japanese and other. Dictionary with double array trie (Double-Array Trie) to achieve, Sender algorithm is the shortest path based on word frequency plus dynamic programming.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.