Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON. With Miller, you get to use named fields without needing to count positional indices, using familiar formats such as CSV, TSV, JSON, and positionally-indexed data.
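The difference between named fields and positional indices can be illustrated with a minimal sketch. This is not Miller itself; it uses only Python's standard csv module, and the sample data, column names, and the helper sort_and_cut are hypothetical, chosen for illustration.

```python
import csv
import io

# Hypothetical sample data; the column names are made up for illustration.
SAMPLE = "name,dept,salary\nAda,eng,120\nLin,ops,95\nSam,eng,110\n"


def sort_and_cut(text, sort_key, keep):
    """Sort records by a named numeric field, then keep only the named columns."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Fields are addressed by header name, so column order in the file is irrelevant.
    rows.sort(key=lambda r: float(r[sort_key]))
    return [{k: r[k] for k in keep} for r in rows]


if __name__ == "__main__":
    for record in sort_and_cut(SAMPLE, "salary", ["name", "salary"]):
        print(record)
```

Because every field is looked up by its header, reordering the columns in the input would not change the result, which is the property the description above attributes to name-indexed tools.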



dasel - Query, update and convert data structures from the command line

  •    Go

Dasel (short for data-selector) allows you to query and modify data structures using selector strings. Comparable to jq / yq, but supports JSON, YAML, TOML, XML and CSV with zero runtime dependencies.

CleverCSV - CleverCSV is a Python package for handling messy CSV files

  •    Python

CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or generate Python code to import it.
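To show what dialect detection means in practice, here is a minimal sketch using the standard library's csv.Sniffer, which is the baseline that CleverCSV describes itself as improving on. It does not use CleverCSV; the sample input and the helper read_with_detected_dialect are hypothetical.

```python
import csv
import io

# Hypothetical "messy" input: semicolon-delimited rather than comma-delimited.
MESSY = "name;age;city\nAda;36;London\nLin;29;Taipei\n"


def read_with_detected_dialect(text, candidates=";,\t|"):
    """Guess the delimiter with the standard library's Sniffer, then parse."""
    # Restricting the candidate delimiters keeps the guess predictable.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


if __name__ == "__main__":
    delimiter, rows = read_with_detected_dialect(MESSY)
    print(repr(delimiter))
    for row in rows:
        print(row)
```

The standard Sniffer can fail or guess wrongly on irregular files, which is the gap a dedicated detection package aims to close.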

Jackson - Best JSON parser for Java

  •    Java

Jackson is one of the best JSON parsers for Java. More than that, Jackson is a suite of data-processing tools for Java (and the JVM platform), including the flagship streaming JSON parser / generator library, a matching data-binding library (POJOs to and from JSON), and additional data format modules to process data encoded in Avro, BSON, CBOR, CSV, Smile, (Java) Properties, Protobuf, XML or YAML; and even a large set of data type modules to support widely used data types such as Guava, Joda, PCollections and many, many more.

esri2open - this repo is an ESRI toolbox and tool(s) that export ESRI Feature Classes to open data formats, CSV, JSON, and GeoJSON

  •    Python

This repo is an ESRI toolbox and tool(s) that export ESRI Feature Classes to open data formats: CSV, JSON, SQLite, and GeoJSON. Much of the data in government coffers is contained in spatial databases. A large percentage of government spatial data is created and managed using ESRI software. While the common interchange format, the ESRI Shapefile, is easily exported and imported by many other software packages, this data file format (the Shapefile) is not intrinsically part of the web ecosystem. Moreover, ESRI software does not provide an export of its generic 'feature class' (shapefile, file geodatabase, and personal geodatabase) to the most common open data file formats: CSV, JSON, and/or GeoJSON. Finally, while open source tools easily transform ESRI shapefiles to open data, most government geospatial infrastructures only have ESRI tools. Without the basic export feature presented here, the lion's share of government spatial data users cannot export their data to the most common open data formats.

php-export-data - PHP class to export data in CSV, TSV, or Excel XML (aka SpreadsheetML) format to a file or directly to the browser

  •    PHP

A simple library for exporting tabular data to Excel-friendly XML, CSV, or TSV. It supports streaming exported data to a file or directly to the browser as a download so it is suitable for exporting large datasets (you won't run out of memory). See the test/ directory for more examples.

sq - swiss-army knife for data

  •    Go

sq is a command line tool that provides jq-style access to structured data sources such as SQL databases, or document formats like CSV or Excel. sq can perform cross-source joins, execute database-native SQL, and output to a multitude of formats including JSON, Excel, CSV, HTML, Markdown and XML, or insert directly to a SQL database. sq can also inspect sources to view metadata about the source structure (tables, columns, size) and has commands for common database operations such as copying or dropping tables.

omniparser - omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc

  •    Go

Omniparser is a native Golang ETL parser that ingests input data of various formats (CSV, txt, fixed length/width, XML, EDI/X12/EDIFACT, JSON, and custom formats) in streaming fashion and transforms data into desired JSON output based on a schema written in JSON. In the example folders above you will find pairs of input files and their schema files. Then in the .snapshots sub directory, you'll find their corresponding output files.

csv-parser - Streaming csv parser inspired by binary-csv that aims to be faster than everyone else

  •    Javascript

csv-parser can convert CSV into JSON at a rate of around 90,000 rows per second (perf varies with data, try bench.js with your data). The data emitted is a normalized JSON object. Each header is used as the property name of the object.
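The header-to-property mapping described above can be sketched as follows. This is not csv-parser (a JavaScript library); it is a Python illustration of the same idea using the standard library, and the generator name rows_as_json and the sample input are hypothetical.

```python
import csv
import io
import json


def rows_as_json(lines):
    """Yield one JSON object per CSV row, keyed by the header row."""
    # DictReader consumes the input lazily, so rows are emitted one at a
    # time rather than after loading the whole file.
    for row in csv.DictReader(lines):
        yield json.dumps(row)


if __name__ == "__main__":
    source = io.StringIO("id,name\n1,Ada\n2,Lin\n")
    for line in rows_as_json(source):
        print(line)
```

Any iterable of text lines works as input, including an open file, so the same generator can be used on files too large to hold in memory.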

Memgraph - Build modern, graph-based applications on top of your streaming data in minutes

  •    C++

Memgraph is a streaming graph application platform that helps you wrangle your streaming data, build sophisticated models that you can query in real-time, and develop graph applications.

csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang

  •    Go

Similar to the FASTA/Q format in the field of bioinformatics, CSV/TSV are basic and ubiquitous file formats in both bioinformatics and data science. People usually use spreadsheet software like MS Excel to process table data. However, it's all done by clicking and typing, which is not automated and is time-consuming to repeat, especially when we want to apply similar operations to different datasets or for different purposes.

reckon - Flexibly import bank account CSV files into Ledger for command-line accounting

  •    Ruby

Reckon automagically converts CSV files for use with the command-line accounting tool Ledger. It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning. First, log in to your bank and export your transaction data as a CSV file.

WCF Data Service Format Extensions for CSV, TXT


This project adds support for legacy formats like CSV, TXT (CSV Export) to the data service output and allows the $format=txt query. By default WCF Data Services support Atom and JSON responses; however, legacy systems do not understand Atom or JSON but they understand CSV, TXT f...

active_importer - Define importers that load tabular data from spreadsheets or CSV files into any ActiveRecord-like ORM

  •    Ruby

Define importers that load tabular data from spreadsheets or CSV files into any ActiveRecord-like ORM. Define classes that you instruct on how to import data into data models.

FSharp.Data - F# Data: Library for Data Access

  •    HTML

The F# Data library (FSharp.Data.dll) implements everything you need to access data in your F# applications and scripts. It implements F# type providers for working with structured file formats (CSV, HTML, JSON and XML) and for accessing the WorldBank data. It also includes helpers for parsing CSV, HTML and JSON files and for sending HTTP requests. We're open to contributions from anyone. If you want to help out but don't know where to start, you can take one of the Up-For-Grabs issues, or help to improve the documentation.

MapShaper - Tools for editing Shapefile, GeoJSON, TopoJSON and CSV files

  •    MPL

Mapshaper is software for editing Shapefile, GeoJSON, TopoJSON, CSV and several other data formats, written in JavaScript. The mapshaper command line program supports essential map making tasks like simplifying shapes, editing attribute data, clipping, erasing, dissolving, filtering and more.

A Clojure high performance data processing system

  •    Clojure

This is a Clojure library for data processing and machine learning. Datasets are currently in-memory columnwise databases and we support parsing from file or input-stream. We support these formats: raw/gzipped csv/tsv, xls, xlsx, json, and sequences of maps as input sources. SQL bindings are provided as a separate library. Data size in memory is minimized (primitive arrays), datetime types are often converted to an integer representation and strings are loaded into string tables. These features together dramatically decrease the working set size in memory. Because data is stored in columnar fashion, columnwise operations on the dataset are very fast.
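The memory-saving ideas mentioned above (string tables, integer-encoded dates, primitive arrays per column) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the Clojure library's implementation; the class StringColumn and the function encode_dates are hypothetical names.

```python
from array import array
from datetime import date


class StringColumn:
    """Stores each distinct string once and keeps only integer codes per row."""

    def __init__(self):
        self.table = []          # distinct strings, in first-seen order
        self.index = {}          # string -> position in self.table
        self.codes = array("l")  # one integer code per row

    def append(self, value):
        code = self.index.get(value)
        if code is None:
            code = len(self.table)
            self.index[value] = code
            self.table.append(value)
        self.codes.append(code)

    def __getitem__(self, row):
        return self.table[self.codes[row]]


def encode_dates(values):
    """Represent dates as proleptic Gregorian ordinals in a primitive array."""
    return array("l", (d.toordinal() for d in values))


if __name__ == "__main__":
    dept = StringColumn()
    for value in ["eng", "ops", "eng", "eng"]:
        dept.append(value)
    print(dept.table, list(dept.codes))
    print(list(encode_dates([date(2020, 1, 1), date(2020, 1, 2)])))
```

Repeated strings cost one small integer per row instead of a full string object, and each column is a contiguous array, which is the kind of layout that makes columnwise scans cheap.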

ISO-3166-Countries-with-Regional-Codes - ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets

  •    Ruby

These lists are the result of merging data from two sources: the Wikipedia ISO 3166-1 article for alpha and numeric country codes, and the UN Statistics site for countries' regional and sub-regional codes. In addition to countries, it includes dependent territories. The International Organization for Standardization (ISO) site provides partial data (capitalised and sometimes stripped of non-latin ornamentation), but sells the complete data set as a Microsoft Access 2003 database. Other sites give you the numeric and character codes, but there appeared to be no sites that included the associated UN-maintained regional codes in their data sets. I scraped data from the above two websites, all of which is already publicly available, to produce some ready-to-use complete data sets that will hopefully save time for anyone with similar needs.

