jchardet - Charset detection algorithm in Java
jchardet is a Java port of Mozilla's automatic charset detection algorithm, originally written by Frank Tang. The original C++ source can be found at http://lxr.mozilla.org/mozilla/source/intl/chardet/ and more information is available at http://www.mozilla.org/projects/intl/chardet.html
http://jchardet.sourceforge.net/
Related Products
TextCat
TextCat, written in Perl, helps identify 69 natural languages.
UIMA - Unstructured information management architecture
UIMA analyzes large volumes of unstructured information in order to discover knowledge that is relevant to an end user. It is a framework built from a set of different components, including language identification, language-specific segmentation, sentence boundary detection, and entity detection (person and place names). The framework manages these components and the data flows between them.
Telosys - Global framework on AJAX and JavaEE
Telosys is a lightweight global framework based on AJAX and standard JavaEE technologies. Its pragmatic approach makes it easy to build business web applications. Telosys is a self-sufficient solution that seamlessly covers all the application layers (presentation, persistence, services, navigation, internationalization, authentication, etc.).
Django - Python Web framework
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Its features include an admin site, authentication, internationalization, Jython support, pagination, session management, sitemaps, feeds, caching, signals, comments, and a lot more.
Monotone
Monotone is a free distributed version control system. It provides a simple, single-file transactional version store, with fully disconnected operation and an efficient peer-to-peer synchronization protocol. It understands history-sensitive merging, lightweight branches, integrated code review and 3rd party testing. It uses cryptographic version naming and client-side RSA certificates.
Jasper Reports
JasperReports is the world's most popular open source reporting engine. It is written entirely in Java and can use data from any kind of data source to produce pixel-perfect documents that can be viewed, printed, or exported in a variety of document formats, including HTML, PDF, Excel, OpenOffice, and Word.
Wymeditor
WYMeditor is a web-based WYSIWYM (What You See Is What You Mean) XHTML editor (not WYSIWYG). WYMeditor has been created to generate perfectly structured XHTML strict code, to conform to the W3C XHTML specifications and to facilitate further processing by modern applications.
Dokuwiki - simple to use Wiki
DokuWiki is a standards compliant, simple to use Wiki, mainly aimed at creating documentation of any kind. It is targeted at developer teams, workgroups and small companies. It has a simple but powerful syntax which makes sure the datafiles remain readable outside the Wiki and eases the creation of structured texts. All data is stored in plain text files – no database is required.
PEAR Framework - reusable PHP components
PEAR is a framework and distribution system for reusable PHP components. It covers all categories of components, from DB access and security to XML parsing and encryption.
VosaoCMS - simple CMS for Google App Engine
Vosao (vo-za) is a content management system (CMS) that enables you to build web sites and online applications on the Google App Engine platform for Java.