Stringy - A PHP string manipulation library with multibyte support

  •    PHP

A PHP string manipulation library with multibyte support. Compatible with PHP 5.4+, PHP 7+, and HHVM. Refer to the 1.x branch or 2.x branch for older documentation.

Awesome-Unicode - :joy: :ok_hand: A curated list of delightful Unicode tidbits, packages and resources

  •    Javascript

A curated list of delightful Unicode tidbits, packages and resources. Please read the contribution guidelines before contributing. Key Unicode terminology is defined in the glossary.

jurl - Fast and simple URL parsing for Java, with UTF-8 and path resolving support

  •    Java

Fast and simple URL parsing for Java, with UTF-8 and path resolving support. The recommended way to report and track issues is to open one on GitHub.




gbk - Convert gbk to utf-8 made easy

  •    Javascript

Convert GBK to UTF-8 made easy.
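
As a rough illustration of what such a conversion involves (this is not the gbk package's own API), the sketch below decodes GBK bytes with the standard TextDecoder and re-encodes them as UTF-8. It assumes a runtime whose TextDecoder accepts the "gbk" label, such as a browser or a Node.js build with full ICU; the function name gbkToUtf8 is hypothetical.

```javascript
// Hypothetical helper: convert GBK-encoded bytes to UTF-8-encoded bytes.
// Assumes TextDecoder supports the "gbk" label (browser or full-ICU Node.js).
function gbkToUtf8(gbkBytes) {
  // Decode the GBK bytes into a JavaScript string.
  const text = new TextDecoder("gbk").decode(gbkBytes);
  // Re-encode the string; TextEncoder always produces UTF-8.
  return new TextEncoder().encode(text);
}

// "中文" is D6 D0 CE C4 in GBK.
const utf8 = gbkToUtf8(new Uint8Array([0xd6, 0xd0, 0xce, 0xc4]));
console.log(Array.from(utf8, (b) => b.toString(16)).join(" "));
// prints "e4 b8 ad e6 96 87"
```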

ustring - Simple C library for safely handling utf8 strings

  •    C

Simple C library to provide safer UTF-8 string functions similar to those found in the stdlib. All functions in ustring.c are documented. Note that C99 is required.

strip-bom - Strip UTF-8 byte order mark (BOM) from a string

  •    Javascript

The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8.
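
A minimal sketch of the idea, not necessarily the package's actual source: once UTF-8 text has been decoded into a JavaScript string, a leading BOM shows up as the single character U+FEFF, so stripping it is a one-character check. The function name stripBom is chosen here for illustration.

```javascript
// Illustrative helper: drop a leading byte order mark from a decoded string.
function stripBom(text) {
  // The UTF-8 BOM bytes EF BB BF decode to the single code unit U+FEFF.
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

console.log(stripBom("\uFEFFunicorn")); // prints "unicorn"
```

Only the first character is inspected, so a U+FEFF appearing later in the text is left untouched.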

strip-bom-buf - Strip UTF-8 byte order mark (BOM) from a buffer

  •    Javascript

The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8.
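
At the byte level, the UTF-8 BOM is the three-byte sequence EF BB BF. The following is a hedged sketch of stripping it from a Uint8Array (Node.js Buffers are Uint8Arrays, so they work too); the name stripBomBytes is hypothetical and this is not claimed to be the package's implementation.

```javascript
// Illustrative helper: drop a leading UTF-8 BOM (EF BB BF) from raw bytes.
function stripBomBytes(bytes) {
  const hasBom =
    bytes.length >= 3 &&
    bytes[0] === 0xef &&
    bytes[1] === 0xbb &&
    bytes[2] === 0xbf;
  // subarray returns a view on the same memory, so nothing is copied.
  return hasBom ? bytes.subarray(3) : bytes;
}

const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x68, 0x69]);
console.log(Array.from(stripBomBytes(withBom))); // prints [ 104, 105 ]
```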


strip-bom-cli - Strip UTF-8 byte order mark (BOM)

  •    Javascript

The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8.

strip-bom-stream - Strip UTF-8 byte order mark (BOM) from a stream

  •    Javascript

The Unicode Standard permits the BOM in UTF-8, but does not require nor recommend its use. Byte order has no meaning in UTF-8. It's a Transform stream.

characteristics - Character info under different encodings

  •    Ruby

Extra data is available for Unicode characters (see below). The unibits and uniscribe gems make use of this data to visualize it accordingly.

utf8-bytes - return an array of bytes from a unicode string

  •    Javascript

This module is like Buffer(str).toJSON(), but without using Buffer. It returns an array of integers from 0 through 255, inclusive, representing the bytes in the Unicode string str.
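
The same result can be approximated with the standard TextEncoder, which always encodes to UTF-8. This is a sketch using a different technique from the module (whose internals are not shown here), and the name utf8Bytes is used only for illustration.

```javascript
// Illustrative helper: return the UTF-8 bytes of a string as a plain array.
function utf8Bytes(str) {
  // TextEncoder yields a Uint8Array; Array.from converts it to integers 0..255.
  return Array.from(new TextEncoder().encode(str));
}

console.log(utf8Bytes("h\u00e9")); // prints [ 104, 195, 169 ]
```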

stringz - :100: Zero dependency unicode-aware string tools for NodeJS

  •    Javascript

A really small, performant, zero-dependency, Unicode-aware library for working with strings in Node.js. JavaScript has a serious problem with Unicode. Even ES6 can’t solve the problem entirely, since some characters, such as the newer skin-toned emojis, take up more than one or two UTF-16 code units. "👍🏽".length returns 4, which is totally wrong (hint: it should be 1!). ES6's Array.from tried to solve this, but even that fails: Array.from("👍🏽") returns ["👍", "🏽"], which is incorrect. This library tries to tackle all these problems with a mega RegExp.
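
The mismatch is easy to reproduce. The sketch below counts the same emoji three ways; the last count uses the built-in Intl.Segmenter rather than the library's RegExp approach, and assumes a runtime that provides it (recent Node.js versions and browsers). The helper name graphemeLength is hypothetical.

```javascript
// Thumbs-up (U+1F44D) followed by a medium skin tone modifier (U+1F3FD).
const thumbsUp = "\u{1F44D}\u{1F3FD}";

console.log(thumbsUp.length);             // prints 4 (UTF-16 code units)
console.log(Array.from(thumbsUp).length); // prints 2 (code points)

// Hypothetical helper: count user-perceived characters (grapheme clusters).
function graphemeLength(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text)).length;
}

console.log(graphemeLength(thumbsUp)); // prints 1
```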

string_theory - Flexible C++11 string library with type-safe formatting

  •    C++

String Theory is a flexible C++11 library for string manipulation and storage. It stores data internally as UTF-8, for ease of use with existing C/C++ APIs. It can also handle conversion to and from UTF-16 and UTF-32, and has a variety of methods for easier text manipulation. In addition, if your compiler supports it, String Theory includes a powerful type-safe string formatter (ST::format), which can be extended with custom type formatters by end-user code.

utf8-encoding - utf8 encoder/decoder of whatwg Encoding Living Standard https://encoding

  •    TypeScript

utf8 encoder/decoder of whatwg Encoding Living Standard https://encoding.spec.whatwg.org/
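
For orientation, the Encoding Standard defines the TextEncoder and TextDecoder interfaces. The sketch below uses the built-ins that ship with browsers and Node.js, not this package, and shows the standard's two error modes for malformed input; decodeStrict is a hypothetical helper name.

```javascript
// Encoding always produces UTF-8: the euro sign becomes E2 82 AC.
const euroBytes = new TextEncoder().encode("\u20ac");
const roundTrip = new TextDecoder("utf-8").decode(euroBytes);

// Default mode: malformed bytes are replaced with U+FFFD.
const lenient = new TextDecoder("utf-8").decode(new Uint8Array([0xff]));

// Fatal mode: malformed bytes make decode() throw a TypeError.
function decodeStrict(bytes) {
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(Array.from(euroBytes));  // prints [ 226, 130, 172 ]
console.log(lenient === "\ufffd");   // prints true
```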

babel-plugin-utf-8-regex - transforms regexes like /\p{Letter}/ into working js regexes

  •    Javascript

Transforms a regular expression like /^\p{Cyrillic}+$/ to /^[Ѐ-҄҇-ԧᴫᵸⷠ-ⷿꙀ-ꚗꚟ]+$/. This plugin was inspired by the Unicode addon of xregexp. The list of possible types can be found in the code. The most useful is probably \p{L} or its alias \p{Letter}, which matches any letter in any alphabet.





