WordSegment


This project segments text into tokens according to its context and semantics. The segmenter uses forward maximum matching and CRF (Conditional Random Fields) algorithms to split text.

http://wordseg.codeplex.com/
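
The forward maximum matching step mentioned in the description can be illustrated with a minimal sketch. This is not the project's code: the function name `forward_max_match`, the `max_len` parameter, and the toy lexicon are all assumptions made for illustration. At each position the sketch takes the longest lexicon entry that matches, and falls back to a single character when nothing matches.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation (hypothetical helper, not WordSegment's API).

    At each position, emit the longest substring found in `lexicon`;
    if no entry matches, emit a single character.
    """
    if max_len is None:
        # Longest lexicon entry bounds the search window (1 if the lexicon is empty).
        max_len = max((len(word) for word in lexicon), default=1)
    max_len = max(1, max_len)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy lexicon, assumed purely for this example.
lexicon = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", lexicon))
# prints ['研究生', '命', '起源']
```

The greedy match picks 研究生 first and strands 命, whereas the intended reading is 研究 / 生命 / 起源. Dictionary matching alone cannot resolve this kind of ambiguity, which is presumably why a segmenter would pair it with a statistical sequence model such as a CRF.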

Related Projects

windoze-wordseg


A simple word segmentation library for Chinese

CRFSharp


CRFSharp is an implementation of Conditional Random Fields in .NET (C#), a machine learning algorithm for learning from labeled sequences of examples.

MMSEGO - Chinese word splitting algorithm MMSEG in Go


This is a Go implementation of MMSEG, a Chinese word splitting algorithm.

jieba - 结巴中文分词


"Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best Python Chinese word segmentation module.

WordBreaker


WordBreaker is a fun iPhone game where you attempt to guess the computer's secret word before the computer guesses yours, using logic and deduction. The game is based on Jotto and similar to MasterMind.



stanford-segment - Stanford Chinese segment, train segment


Stanford Chinese segment, train segment

CRF++


CRF++ is a simple, customizable, open source implementation of Conditional Random Fields (CRFs) for segmenting and labeling sequential data. CRF++ is designed for general-purpose use and can be applied to a variety of NLP tasks.

chinese-segment - A Chinese Segment Program.


A Chinese Segment Program.

cl-chinese-segment - A Chinese segment package in Common Lisp


A Chinese segment package in Common Lisp

gkseg


Yet another Chinese word segmentation package, based on character-based tagging heuristics and the CRF algorithm

segment - A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29


You can use a bufio.Scanner with the SplitWords implementation of SplitFunc. The SplitWords function will identify the appropriate word boundaries in the input text, and the Scanner will return tokens at the appropriate place. Sometimes you would also like information returned about the type of token. To do this we have introduced a new type named Segmenter. It works just like Scanner, but additionally returns a token type.

NLP-WSD - NLP Word-Sense Disambiguation


NLP Word-Sense Disambiguation

wordseg - Finnish compound word segmentation web service


Finnish compound word segmentation web service

esa-wordseg - An implementation of the ESA unsupervised word segmentation algorithm in Clojure.


An implementation of the ESA unsupervised word segmentation algorithm in Clojure.

ChineseWordSegmentation - Segment Chinese sentences into separate words.


Segment Chinese sentences into separate words.

scws - Chinese segment


Chinese segment

finalseg - Chinese word segmentation library based on an HMM model


Chinese word segmentation library based on an HMM model