tldextract - Accurately separate the TLD from the registered domain and subdomains of a URL, using the Public Suffix List

tldextract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and subdomains of a URL. For example, say you want just the 'google' part of 'http://www.google.com'. Everybody gets this wrong. Splitting on '.' and taking the last two elements works only for simple domains such as .com. Consider parsing http://forums.bbc.co.uk: the naive splitting method gives you 'co' as the domain and 'uk' as the TLD, instead of 'bbc' and 'co.uk' respectively.
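
The failure described above can be reproduced in a few lines. The following is only a minimal sketch, not tldextract's own implementation: `KNOWN_SUFFIXES` is a hypothetical three-entry stand-in for the real Public Suffix List, and the function names `naive_split` and `suffix_aware_split` are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the Public Suffix List (the real list is far larger).
KNOWN_SUFFIXES = {"com", "uk", "co.uk"}


def naive_split(url):
    """Take the last two dot-separated labels as (domain, tld)."""
    labels = urlsplit(url).hostname.split(".")
    return labels[-2], labels[-1]


def suffix_aware_split(url):
    """Return (subdomain, domain, suffix) using the longest matching known suffix."""
    labels = urlsplit(url).hostname.split(".")
    # Walk from the longest candidate suffix to the shortest, so 'co.uk' wins over 'uk'.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_SUFFIXES:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: max(i - 1, 0)])
            return subdomain, domain, candidate
    # No known suffix: treat the whole host as the domain.
    return "", ".".join(labels), ""


print(naive_split("http://forums.bbc.co.uk"))         # ('co', 'uk')
print(suffix_aware_split("http://forums.bbc.co.uk"))  # ('forums', 'bbc', 'co.uk')
```

Matching the longest known suffix first is what lets 'co.uk' take precedence over 'uk'. A real implementation would also need the full list and its wildcard and exception rules, which this sketch ignores.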

https://github.com/john-kurkowski/tldextract

Related Projects

tld.js - JavaScript API to work easily with complex domain names, subdomains and well-known TLDs.

  •    Javascript

tld.js is a Node.js module written in JavaScript for working with complex domain names, subdomains and well-known TLDs. It accurately answers questions such as "what is mail.google.com's domain?", "what is a.b.ide.kyoto.jp's subdomain?" and "is https://big.data's TLD a well-known one?".

php-domain-parser - Public Suffix List based domain parsing implemented in PHP

  •    PHP

PHP Domain Parser is a Public Suffix List based domain parser implemented in PHP. While there are plenty of excellent URL parsers and builders available, there are very few projects that can accurately parse a URL into its component subdomain, registrable domain, and public suffix parts.

publicsuffix-ruby - Domain Name parser based on the Public Suffix List.

  •    Ruby

Domain Name parser based on the Public Suffix List.

publicsuffix-ruby - Domain name parser for Ruby based on the Public Suffix List.

  •    Ruby

PublicSuffix is a Ruby domain name parser based on the Public Suffix List. For older versions of Ruby, use a previous release.

domainatrix - A cruel mistress that uses the public suffix domain list to dominate URLs by canonicalizing, finding the public suffix, and breaking them into their domain parts

  •    Ruby

A cruel mistress that uses the public suffix domain list to dominate URLs by canonicalizing, finding public suffixes, and breaking them into their domain parts. This simple library can parse a URL into its canonical form. It uses the list of domains from http://publicsuffix.org to break the domain into its public suffix, domain, and subdomain.


Countries - Countries, Languages & Continents data (capital and currency, native name, calling codes)

  •    Javascript

Continents & countries: ISO 3166-1 alpha-2 code, name, ISO 639-1 languages, capital and currency, native name, calling codes. Lists are available in JSON, CSV and SQL formats. Also contains separate JSON files with additional country Emoji flags data. This version substantially changes the data structures and the placement of the files. If your projects depend on the old structure, specify a previous version (<2.0.0).

ISO-3166-Countries-with-Regional-Codes - ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets

  •    Ruby

These lists are the result of merging data from two sources: the Wikipedia ISO 3166-1 article for alpha and numeric country codes, and the UN Statistics site for countries' regional and sub-regional codes. In addition to countries, it includes dependent territories. The International Organization for Standardization (ISO) site provides partial data (capitalised and sometimes stripped of non-latin ornamentation), but sells the complete data set as a Microsoft Access 2003 database. Other sites give you the numeric and character codes, but there appeared to be no sites that included the associated UN-maintained regional codes in their data sets. I scraped data from the two websites above, all of which is already publicly available, to produce some ready-to-use complete data sets that will hopefully save time for anyone with similar needs.

TLD.js - Kickstarter project for applying for a JavaScript top level domain (.js TLD)

  •    Javascript

Kickstarter project for applying for a JavaScript top level domain (.js TLD)

mcc-mnc-table - Mobile Country Codes (MCC) and Mobile Network Codes (MNC) table in CSV, JSON and XML

  •    Python

Mobile Country Codes (MCC) and Mobile Network Codes (MNC) table in CSV, JSON and XML. Updated monthly. The table is pulled from http://mcc-mnc.com/.

django-countries - A Django application that provides country choices for use with forms, flag icons static files, and a country field for models

  •    Python

A Django application that provides country choices for use with forms, flag icons static files, and a country field for models. For more accurate sorting of translated country names, install the optional pyuca package.

CountryPicker - CountryPicker is a custom UIPickerView subclass that provides an iOS control allowing a user to select a country from a list

  •    Objective-C

CountryPicker is a custom UIPickerView subclass that provides an iOS control allowing a user to select a country from a list. It can optionally display a flag next to each country name, and the library includes a set of 249 public domain flag images from https://github.com/koppi/iso-country-flags-svg-collection that have been renamed to work with the library. Note that the list of countries is based on the ISO 3166 country code standard (http://en.wikipedia.org/wiki/ISO_3166-1). This list excludes certain smaller countries, regarding them as part of a larger state. For example, England, Scotland, Wales and Northern Ireland are lumped together as Great Britain. For most purposes this is fine as it matches the convention used for locales, but if you need to specify additional countries, you can subclass and modify the countries list as described under "Subclassing" below.

CountryPicker

  •    Objective-C

CountryPicker is a custom UIPickerView subclass that provides an iOS control allowing a user to select a country from a list. It can optionally display a flag next to each country name, and the library includes a set of 249 high-quality, public domain flag images from FAMFAMFAM (http://www.famfamfam.com/lab/icons/flags/) that have been painstakingly re-named by country code to work with the library.

OpenUnReID - PyTorch open-source toolbox for unsupervised or domain adaptive object re-ID.

  •    Python

OpenUnReID is an open-source PyTorch-based codebase for both unsupervised learning (USL) and unsupervised domain adaptation (UDA) in the context of object re-ID tasks. It provides strong baselines and multiple state-of-the-art methods with highly refactored code for both pseudo-label-based and domain-translation-based frameworks. It works with Python >=3.5 and PyTorch >=1.1. We are actively updating this repo, and more methods will be supported soon. Contributions are welcome.

purl - Purl is a simple Object Oriented URL manipulation library for PHP 5.3+

  •    PHP

A Fragment is made of a path and a query and comes after the hashmark (#). Purl can parse a URL into its parts and its canonical form. It uses the list of domains from http://publicsuffix.org to break the domain into its public suffix, registrable domain, subdomain and canonical form.
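
To illustrate the idea of a fragment that itself carries a path and a query, here is a small sketch using only Python's standard library. It is not Purl's API (Purl is a PHP library); the helper name `split_fragment` and the example URL are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def split_fragment(url):
    """Return (path, query_dict) parsed from the part of the URL after '#'."""
    fragment = urlsplit(url).fragment
    # Treat the fragment as a small relative URL: a path, then '?', then a query.
    inner = urlsplit(fragment)
    return inner.path, parse_qs(inner.query)


path, query = split_fragment("http://example.com/page#/section/intro?tab=2&lang=en")
print(path)   # /section/intro
print(query)  # {'tab': ['2'], 'lang': ['en']}
```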

Java IP (InetAddress) Locator

  •    Java

Java and ColdFusion libraries to look up country code and language from an IP address. It uses a local copy of the WHOIS database to perform fast, accurate lookups of country codes. Useful for log analysis, internationalization, geolocation, etc.

country-flags - SVG and PNG renders of all countries' flags.

  •    Javascript

This repository contains renders of all the world's flags in SVG and PNG format. The source files were taken from Wikipedia and are not under copyright protection, since flags are effectively in the public domain (there may be other restrictions on how a flag can be used, though).

country-list - :globe_with_meridians: List of all countries with names and ISO 3166-1 codes in all languages and data formats

  •    HTML

List of all countries with names and ISO 3166-1 codes in all languages and all data formats. All formats are also available in multiple languages.

TLD.js - Kickstarter project for applying for a JavaScript top level domain (.js TLD)

  •    HTML

Update: Closing signing for now, since we need to work through the two blocker issues, before this could move forward as a kickstarter or by other means of funding. Note: This kick-starter hasn't been submitted yet. Please read and help solve open bugs.

markdown - A Python implementation of John Gruber’s Markdown with Extension support.

  •    Python

This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though there are a few known issues. See Features for information on what exactly is supported and what is not. Additional features are supported by the Available Extensions. Installation and usage documentation is available in the docs/ directory of the distribution and on the project website at https://Python-Markdown.github.io/.

commonmark - Markdown parser for PHP based on the CommonMark spec.

  •    PHP

league/commonmark is a PHP-based Markdown parser created by Colin O'Dell which supports the full CommonMark spec. It is based on the CommonMark JS reference implementation by John MacFarlane (@jgm). Note: See Versioning for important information on which version constraints you should use.





