These lists are the result of merging data from two sources: the Wikipedia ISO 3166-1 article for alpha and numeric country codes, and the UN Statistics site for countries' regional and sub-regional codes. In addition to countries, they include dependent territories. The International Organization for Standardization (ISO) site provides partial data (capitalised and sometimes stripped of non-Latin ornamentation), but sells the complete data set as a Microsoft Access 2003 database. Other sites give you the numeric and character codes, but there appeared to be no sites that included the associated UN-maintained regional codes in their data sets. I scraped data from the above two websites, all of which is already publicly available, to produce ready-to-use complete data sets that will hopefully save time for anyone with similar needs.
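The merge described above can be pictured as a join on the numeric country code. The following is only a minimal sketch under stated assumptions: the field names, the sample rows, and the `merge_country_data` helper are hypothetical and are not taken from the project's actual scripts or schema.

```python
# Hypothetical sample rows; the field names are assumptions for illustration,
# not the schema of the published data sets.
iso_rows = [
    {"name": "France", "alpha-2": "FR", "alpha-3": "FRA", "country-code": "250"},
    {"name": "Japan", "alpha-2": "JP", "alpha-3": "JPN", "country-code": "392"},
    {"name": "Antarctica", "alpha-2": "AQ", "alpha-3": "ATA", "country-code": "010"},
]
un_rows = [
    {"country-code": "250", "region": "Europe", "sub-region": "Western Europe"},
    {"country-code": "392", "region": "Asia", "sub-region": "Eastern Asia"},
]


def merge_country_data(iso_rows, un_rows):
    """Left-join ISO rows with UN region rows on the numeric country code."""
    # Index the UN rows by numeric code for constant-time lookup.
    regions = {row["country-code"]: row for row in un_rows}
    merged = []
    for row in iso_rows:
        # Entries without a UN region keep empty region fields.
        region = regions.get(row["country-code"], {})
        merged.append({
            **row,
            "region": region.get("region", ""),
            "sub-region": region.get("sub-region", ""),
        })
    return merged


merged = merge_country_data(iso_rows, un_rows)
```

A left join keeps every ISO entry, so territories with no matching regional row still appear in the output with blank region fields instead of being dropped.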
region-codes countries iso csv json xml dataset data
A curated list of awesome JSON datasets that don't require authentication. Pro Tip: Check out Blockchain Data API for more options.
json-dataset json awesome awesome-list list data dataset datasets
This repository contains compatibility data for Web technologies. Browser compatibility data describes which platforms (where "platforms" are usually, but not always, web browsers) support particular Web APIs. This data can be used in documentation, to build compatibility tables listing browser support for APIs. For example: Browser support for WebExtension APIs.
compatibility compat data dataset json browser-compat-data browser mdn mozilla
Dataset is a JavaScript library that makes managing the data behind client-side visualisations easy, including realtime data. It takes care of the loading, parsing, sorting, filtering and querying of datasets as well as the creation of derivative datasets. The following builds do not have any of the dependencies built in. It is your own responsibility to include them as appropriate script elements in your page.
data dataset
Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas
deep-learning forecasting machine-learning classification series-forecasting image-classification awesome-list awesome dataset multi-label-classification
The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data. We're sharing them here for developers, researchers, and artists to explore, study, and learn from. If you create something with this dataset, please let us know by e-mail or at A.I. Experiments.
dataset
This is the official documentation for Quandl's Python package. The package can be used to interact with the latest version of the Quandl RESTful API. This package is compatible with Python v2.7.x and v3.x+. Setting quandl.ApiConfig.api_version is optional; however, it is strongly recommended to avoid issues with rate-limiting. For premium databases, datasets and datatables, quandl.ApiConfig.api_key will need to be set to identify you to our API. Please see API Documentation for more detail.
quandl api-client retrieve-data dataset data-frame
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
mnist deep-learning benchmark machine-learning dataset computer-vision fashion fashion-mnist gan zalando convolutional-neural-networks
Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you. Faker is heavily inspired by PHP Faker, Perl Faker, and by Ruby Faker.
fake testing dataset fake-data test-data test-data-generator
This is a dataset that I collected to train my own Raccoon detector with TensorFlow's Object Detection API. Images are from Google and Pixabay. In total, there are 200 images (160 are used for training and 40 for validation). See LICENSE for details. Copyright (c) 2017 Dat Tran.
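The 160/40 division corresponds to an 80/20 train/validation split. The source does not say how the images were actually assigned, so the following is only a sketch of one common approach, a seeded shuffle followed by a cut; the `split_dataset` helper and the file names are hypothetical.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, validation)."""
    shuffled = list(items)
    # A fixed seed makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 200 images.
images = [f"raccoon-{i}.jpg" for i in range(1, 201)]
train, validation = split_dataset(images)
```

With 200 items and a fraction of 0.8, this yields 160 training items and 40 validation items, with no item in both sets.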
dataset tensorflow-experiments
Objectron is a dataset of short object-centric video clips with pose annotations. The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists of 15K annotated video clips supplemented with over 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes. In addition, to ensure geo-diversity, our dataset is collected from 10 countries across five continents. Along with the dataset, we are also sharing a 3D object detection solution for four categories of objects: shoes, chairs, mugs, and cameras. These models are trained using this dataset, and are released in MediaPipe, Google's open source framework for cross-platform customizable ML solutions for live and streaming media.
machine-learning ai computer-vision deep-learning neural-network tensorflow augmented-reality pytorch dataset 3d 3d-reconstruction 3d-vision
The Waymo Open Dataset was first launched in August 2019 with a perception dataset comprising high resolution sensor data and labels for 1,950 segments. We have released the Waymo Open Dataset publicly to aid the research community in making advancements in machine perception and autonomous driving technology. We expanded the Waymo Open Dataset to also include a motion dataset comprising object trajectories and corresponding 3D maps for over 100,000 segments. We have updated this repository to add support for this new dataset. Please refer to the Quick Start.
dataset autonomous-driving
Try it out at udt.dev, download the desktop app or run on-premise. The Universal Data Tool is a web/desktop app for editing and annotating images, text, audio, and documents, and for viewing and editing any data defined in the extensible .udt.json and .udt.csv standard.
machine-learning csv computer-vision deep-learning image-annotation desktop dataset named-entity-recognition classification labeling image-segmentation hacktoberfest semantic-segmentation annotation-tool text-annotation labeling-tool entity-recognition annotate-images image-labeling-tool text-labeling
An NLP library with awesome pre-trained Transformer models and an easy-to-use interface, supporting a wide range of NLP tasks from research to industrial applications.
nlp dataset transformer seq2seq pretrained-models embedding bert ernie paddlenlp
N.B.: Mapping has been done to the level of ATT&CK technique (not procedure).
dfir dataset threat-hunting winlogbeat mitre-attack evtx windows-security detection-engineering
TensorFlow Datasets provides many public datasets as tf.data.Datasets. To install and use TFDS, we strongly encourage you to start with our getting started guide. Try it interactively in a Colab notebook.
data machine-learning tensorflow numpy dataset datasets jax
doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence to sequence tasks. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours. You can try the annotation demo.
machine-learning natural-language-processing vuejs vue nuxt dataset datasets nuxtjs annotation-tool text-annotation data-labeling
Project Summary: To build a public open dataset of chest X-ray and CT images of patients who are positive for, or suspected of having, COVID-19 or other viral and bacterial pneumonias (MERS, SARS, and ARDS). Data will be collected from public sources as well as through indirect collection from hospitals and physicians. All images and data will be released publicly in this GitHub repo. Lung Bounding Boxes and Chest X-ray Segmentation (license: CC BY 4.0) contributed by General Blockchain, Inc.
computer-vision deep-learning dataset xray computed-tomography covid-19 covid coronavirus
This is the data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL). DATA SOURCES: This list includes a complete list of all sources ever used in the data set, since January 21, 2010. Some sources listed here (e.g. ECDC, US CDC, BNO News) are not currently relied upon as a source of data.
engineering johns-hopkins-university jhu csse 2019-ncov coronavirus covid-19 systems-science dataset covid
Mobius provides a C# language binding to Apache Spark, enabling the implementation of Spark driver programs and data processing operations in the languages supported in the .NET framework, like C# or F#. For more code samples, refer to Mobius\examples directory or Mobius\csharp\Samples directory.
spark apache-spark rdd dataframe dstream dataset streaming mobius kafka-streaming spark-streaming fsharp bigdata mapreduce eventhubs near-real-time