UTF8-CPP - UTF-8 with C++ in a Portable Way

UTF8-CPP is a small generic library to handle UTF-8 encoded Unicode strings.

http://utfcpp.sourceforge.net/
https://github.com//ledger/utfcpp


Related Projects

utf8

  •    Javascript

utf8.js is a well-tested UTF-8 encoder/decoder written in JavaScript. Unlike many other JavaScript solutions, it is designed to be a proper UTF-8 encoder/decoder: it can encode/decode any scalar Unicode code point values, as per the Encoding Standard.
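
To illustrate what encoding "any scalar Unicode code point value" involves, here is a minimal Python sketch of the UTF-8 byte layout. It is not the utf8.js API; the function name `encode_scalar` is hypothetical and chosen only for this example.

```python
def encode_scalar(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (illustrative sketch only)."""
    # Scalar values are U+0000..U+10FFFF, excluding the surrogate range.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

The result can be checked against Python's built-in codec, for example `encode_scalar(0x20AC) == "€".encode("utf-8")`. Lone surrogates are rejected because they are code points but not scalar values.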

Awesome-Unicode - A curated list of delightful Unicode tidbits, packages and resources

  •    Javascript

A curated list of delightful Unicode tidbits, packages and resources. Please read the contribution guidelines before contributing. Key Unicode terminology is defined in the glossary.

Utf8Json - Definitely Fastest and Zero Allocation JSON Serializer for C#(NET,

  •    CSharp

Definitely the fastest, zero-allocation JSON serializer for C# (.NET, .NET Core, Unity and Xamarin). This serializer reads and writes UTF-8 binary directly, which boosts performance. I adopted the same architecture as the fastest binary serializer, MessagePack for C#, which I also developed. The benchmark converts an object to UTF-8 and UTF-8 back to an object. It does not go through a string (.NET UTF-16), so Jil, NetJSON and Json.NET incur an additional UTF8.GetBytes/UTF8.GetString call. "Definitely" means that there is no encoding/decoding cost. The benchmark code is in sandbox/PerfBenchmark and uses BenchmarkDotNet.

utf8.h - single header utf8 string functions for C and C++

  •    C

A simple single-header solution for supporting UTF-8 strings in C and C++. The currently supported compilers are GCC, Clang and MSVC.


Voidspace Python Guestbook

  •    Javascript

The Voidspace Python Guestbook. A guestbook script for websites, written in Python. Features anti-spam protection, smilies, JavaScript form validation, email notification, full customisation with (X)HTML templates, entries in UTF-8 (Unicode), and more.

string - Provides an object-oriented API to strings and deals with bytes, UTF-8 code points and grapheme clusters in a unified way

  •    PHP

The String component provides an object-oriented API to strings and deals with bytes, UTF-8 code points and grapheme clusters in a unified way.

Dictionaries - Hunspell UTF8 dictionaries. These work with Sublime Text. [Spell check]

  •    Python

The following repository contains some UTF8-ready dictionaries for the spell checker feature of Sublime Text. Most of them were downloaded from the Open Office list. Credits to the people working on these! Read every LANG.txt for details.

forceutf8 - PHP Class Encoding featuring popular Encoding::toUTF8() function --formerly known as forceUTF8()-- that fixes mixed encoded strings

  •    PHP

PHP Class Encoding featuring popular \ForceUTF8\Encoding::toUTF8() function --formerly known as forceUTF8()-- that fixes mixed encoded strings. If you apply the PHP function utf8_encode() to an already-UTF8 string it will return a garbled UTF8 string.
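
The double-encoding problem described above can be reproduced in a few lines. The following Python sketch is only an illustration of the effect, not the forceutf8 implementation; the helper names `latin1_to_utf8` and `undo_double_encoding` are hypothetical. It assumes, as PHP documents for utf8_encode(), that every input byte is treated as an ISO-8859-1 character.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes as UTF-8, assuming each byte is one ISO-8859-1 character."""
    return data.decode("latin-1").encode("utf-8")


def undo_double_encoding(data: bytes) -> bytes:
    """Reverse one accidental round of latin1_to_utf8 (sketch, no error handling)."""
    return data.decode("utf-8").encode("latin-1")


original = "caf\u00e9".encode("utf-8")   # already valid UTF-8: b'caf\xc3\xa9'
garbled = latin1_to_utf8(original)       # encoded a second time by mistake
repaired = undo_double_encoding(garbled)
```

Here `garbled` decodes to "cafÃ©", the familiar mojibake, while `repaired` equals `original`. A real fixer has to decide per character whether the input is already UTF-8, which this sketch does not attempt.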

multidiff - Binary data diffing for multiple objects or streams of data

  •    Python

Its purpose is to make machine-friendly data easier to understand by humans that are looking at it. Specifically, multidiff helps in viewing the differences within a large set of objects by doing diffs between relevant objects and displaying them in a sensible manner. This kind of visualization is handy when looking for patterns and structure in proprietary protocols or weird file formats. The obvious use cases are reverse engineering and binary data analysis. At the core of multidiff is the Python difflib library, and multidiff wraps it in data-providing mechanisms and visualization code. The visualization is the most important part of the project, and everything else is just utilities to make it easier to feed data to the visualizer. At this time the tool can do basic format parsing such as hex decoding, hexdumping, and handling data as UTF-8 strings, as well as read from files, stdin, and sockets. Any preprocessing such as cropping, indenting, decompression, etc. will have to be done by the user before the objects are provided to multidiff.

Tantivy - Full-text search engine library inspired by Lucene and written in Rust

  •    Rust

Tantivy is a full-text search engine library written in Rust. It is closer to Lucene than to Elasticsearch and Solr in the sense that it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine.

Khmer Unicode Converter

  •    CSharp

Khmer Unicode converter is a .NET library that converts Khmer text from legacy fonts to Unicode fonts and vice versa. This library was developed based on Khmer Converter from KhmerOS (http://www.khmeros.info). All the code in this library is converted from the Python version of Khmer Co...

globalize - A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data

  •    Javascript

A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data. The library works both in the browser and as a Node.js module. Each language, and the countries that speak that language, have different expectations when it comes to how numbers (including currency and percentages) and dates should appear. Obviously, each language has different names for the days of the week and the months of the year. But they also have different expectations for the structure of dates, such as what order the day, month and year are in. In number formatting, not only does the character used to delineate number groupings and the decimal portion differ, but the placement of those characters differs as well.

utf8proc - a clean C library for processing UTF-8 Unicode data

  •    C

utf8proc is a small, clean C library that provides Unicode normalization, case-folding, and other operations for data in the UTF-8 encoding. It was initially developed by Jan Behrens and the rest of the Public Software Group, who deserve nearly all of the credit for this package. With the blessing of the Public Software Group, the Julia developers have taken over development of utf8proc, since the original developers have moved to other projects. The utf8proc package is licensed under the free/open-source MIT "expat" license (plus certain Unicode data governed by the similarly permissive Unicode data license); please see the included LICENSE.md file for more detailed information.
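
As a rough illustration of why normalization and case-folding matter for UTF-8 data, the sketch below uses Python's standard `unicodedata` module. It does not use or mirror the utf8proc API, and the helper name `normalize_for_comparison` is hypothetical. It is also a simplification: fully conformant caseless matching under the Unicode Standard involves additional normalization steps.

```python
import unicodedata


def normalize_for_comparison(text: str) -> str:
    """Case-fold, then NFC-normalize, so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text.casefold())


decomposed = "Cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = "CAF\u00c9"      # precomposed U+00C9 LATIN CAPITAL LETTER E WITH ACUTE

same_text = normalize_for_comparison(decomposed) == normalize_for_comparison(composed)
```

The two strings differ both as code point sequences and as UTF-8 bytes, yet `same_text` is true after case-folding and normalization.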

Open Layer for Unicode

  •    C++

An open-source-friendly replacement library for the Microsoft Layer for Unicode. This library allows a Unicode Windows application to run unchanged on all versions of Windows, including Windows 95, 98 and ME.

PDFJet - PDF library for Java and .NET

  •    Java

PDFjet is a high-performance PDF library for Java and .NET. It supports drawing points, lines, boxes, polygons, etc. It also supports Unicode text, embedded images, embedded hyperlinks and a lot more. Its simple-to-use table class helps generate flexible reports.

Lexical Analyzer Generator Quex

  •    C++

Generator of extremely fast lexical analysers. Sophisticated input/buffer management. Many character encodings (incl. ASCII, UTF8, UTF16, RUSCII, ...) are directly supported. Regular expressions are specified in the lex/flex style.

diycms

  •    PHP

This is an effort to produce an easy-to-use CMS in PHP5 PostgreSQL Native. The interface is in Italian but, because the database is UTF-8, multilanguage support is on the to-do list.

simple swing database

  •    Java

Simple swing database is a small platform-independent database which stores records in CSV file format. The program was written for my friend, who wanted access to the data in CSV (UTF-8) format. It can search, edit, remove and add records.





