structured-text-tools - A list of command line tools for manipulating structured text data

The following is a list of text-based file formats and command line tools for manipulating each. Delimiter-separated values, including CSV, TSV, etc.



Related Projects

markor - Text editor - Notes & ToDo (for Android) - Markdown, todo.txt, plaintext, math, ..

  •    Java

Markor is a TextEditor for Android. This project aims to make an editor that is versatile, flexible, and lightweight. Markor utilizes simple markup formats like Markdown and todo.txt for note-taking and list management. It is versatile at working with text; it can also be used for keeping bookmarks, copying to clipboard, fast opening a link from text and lots more. Created files are interoperable with any other plaintext software on any platform. Markor is openly developed free software that accepts community contributions.

📝 Create notes and manage your to-do list using simple markup formats
🌲 Work completely offline -- whenever, wherever
👌 Compatible with any other plaintext software on any platform -- edit with notepad or vim, filter with grep, convert to PDF or create a zip archive
🖍 Syntax Highlighting and format related actions -- quickly insert pictures and to-dos
👀 Convert, preview, and share documents as HTML and PDF

📚 Notebook: Store all documents on a common filesystem folder
📓 QuickNote: Fast accessible for keeping notes
☑️ To-Do: Write down your to-do
🖍 Formats: Markdown, todo.txt, csv, ics, ini, json, toml, txt, vcf, yaml
📋 Copy to clipboard: Copy any text, including text shared into Markor
💡 Notebook is the root folder of documents and can be changed to any location on the filesystem. QuickNote and To-Do are textfiles

🎨 Highly customizable, dark theme available
💾 Auto-Save with options for undo/redo
👌 No ads or unnecessary permissions
🌎 Language selection -- use other language than on the system

🔃 Markor is an offline app. It works with sync apps, but they have to do syncing respectively. Sync clients known to work in combination include BitTorrent Sync, Dropbox, FolderSync, OwnCloud, NextCloud, Seafile, Syncthing, Syncopoli
🔒 Can encrypt your textfiles with AES256. You need to set a password at the settings and use Android device with version Marshmallow or newer. You can use jpencconverter to encrypt/decrypt easily on desktop. Be aware that only the text is encrypted not pictures or attachments.

CherryTree - A hierarchical note taking application

  •    C++

A hierarchical note taking application, featuring rich text and syntax highlighting, storing data in a single xml or sqlite file.

Tika - A content analysis toolkit

  •    Java

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

TextQL - Execute SQL against structured text like CSV or TSV

  •    Go

TextQL allows you to easily execute SQL against structured text like CSV or TSV.
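
The general idea of running SQL over delimited text can be sketched in a few lines. This is not TextQL's implementation or its command-line interface; it is a minimal Python illustration that assumes the CSV has a header row and loads it into an in-memory SQLite table. The function name `query_csv`, the table name `data`, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one SQL query."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    # Quote column names so headers containing spaces still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), body
    )
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Hypothetical sample data, used only for illustration.
SAMPLE = "name,city\nAda,London\nLinus,Helsinki\nGrace,New York\n"

if __name__ == "__main__":
    print(query_csv(SAMPLE, "SELECT name FROM data WHERE city = 'London'"))
```

Every value is stored as text in this sketch, so numeric comparisons would need an explicit `CAST` in the query.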


  •    Java

Editor for Fixed Width, Csv and Existing Xml files.

FSharp.Data - F# Data: Library for Data Access

  •    HTML

The F# Data library (FSharp.Data.dll) implements everything you need to access data in your F# applications and scripts. It implements F# type providers for working with structured file formats (CSV, HTML, JSON and XML) and for accessing the WorldBank data. It also includes helpers for parsing CSV, HTML and JSON files and for sending HTTP requests. We're open to contributions from anyone. If you want to help out but don't know where to start, you can take one of the Up-For-Grabs issues, or help to improve the documentation.

dasel - Query, update and convert data structures from the command line

  •    Go

Dasel (short for data-selector) allows you to query and modify data structures using selector strings. Comparable to jq / yq, but supports JSON, YAML, TOML, XML and CSV with zero runtime dependencies.
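
To make the notion of a selector string concrete, here is a small Python sketch that walks a nested structure using a dot-separated path. The dot syntax and the `select` function are assumptions made for illustration; they are not Dasel's actual selector grammar or API, which should be taken from its own documentation.

```python
import json


def select(data, selector):
    """Walk nested dicts and lists following a dot-separated selector."""
    current = data
    for part in selector.split("."):
        if part == "":
            continue  # tolerate a leading dot such as ".servers"
        if isinstance(current, list):
            current = current[int(part)]  # numeric parts index into lists
        else:
            current = current[part]  # other parts are treated as dict keys
    return current


# Hypothetical document, used only for illustration.
DOC = json.loads(
    '{"servers": [{"name": "a", "port": 80}, {"name": "b", "port": 443}]}'
)

if __name__ == "__main__":
    print(select(DOC, "servers.1.port"))
```

Because the walk only relies on dicts and lists, the same function would work on data parsed from YAML or TOML once it is loaded into those Python types.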

vroom - Fast reading of delimited files

  •    C++

The fastest delimited reader for R, 1.27 GB/sec. vroom doesn’t stop to actually read all of your data, it simply indexes where each record is located so it can be read later. The vectors returned use the Altrep framework to lazily load the data on-demand when it is accessed, so you only pay for what you use. This lazy access is done automatically, so no changes to your R data-manipulation code are needed.
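
The index-first, parse-later idea described above can be illustrated outside R. The following Python sketch is not how vroom or Altrep is implemented; it assumes one record per line and a plain comma delimiter with no quoting. The class `LazyLines` and the helper `write_sample` are hypothetical names.

```python
import os
import tempfile


class LazyLines:
    """Record the byte offset of each line up front; parse a line only on access."""

    def __init__(self, path):
        self.path = path
        self.offsets = []
        position = 0
        with open(path, "rb") as handle:
            for line in handle:
                self.offsets.append(position)  # remember where the record starts
                position += len(line)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, index):
        # Seek straight to the requested record and split only that line.
        with open(self.path, "rb") as handle:
            handle.seek(self.offsets[index])
            return handle.readline().decode("utf-8").rstrip("\r\n").split(",")


def write_sample():
    """Write a tiny CSV file to a temporary location and return its path."""
    handle = tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, newline=""
    )
    handle.write("id,name\n1,Ada\n2,Grace\n")
    handle.close()
    return handle.name


if __name__ == "__main__":
    path = write_sample()
    rows = LazyLines(path)
    print(len(rows), rows[2])
    os.remove(path)
```

The indexing pass still scans the file once, but it stores only integer offsets; the cost of decoding and splitting is paid per record, and only for records that are actually requested.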

rq - Record Query - A tool for doing record analysis and transformation

  •    Javascript

This is the home of the tool called rq (record query). It's a tool that's used for performing queries on streams of records in various formats. The goal is to make ad-hoc exploration of data sets easy without having to use more heavy-weight tools like SQL/MapReduce/custom programs. rq fills a similar niche as tools like awk or sed, but works with structured (record) data instead of text.

config - Config is a lightweight configuration file loader that supports PHP, INI, XML, JSON, and YAML files

  •    PHP

Config is a file configuration loader that supports PHP, INI, XML, JSON, and YML files. Config requires PHP 5.5.9+.

Swiss File Knife

  •    C++

Multi-function command line tool that belongs on every USB stick.

sq - swiss-army knife for data

  •    Go

sq is a command line tool that provides jq-style access to structured data sources such as SQL databases, or document formats like CSV or Excel. sq can perform cross-source joins, execute database-native SQL, and output to a multitude of formats including JSON, Excel, CSV, HTML, Markdown and XML, or insert directly to a SQL database. sq can also inspect sources to view metadata about the source structure (tables, columns, size) and has commands for common database operations such as copying or dropping tables.


  •    Java

HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.

omniparser - a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc

  •    Go

Omniparser is a native Golang ETL parser that ingests input data of various formats (CSV, txt, fixed length/width, XML, EDI/X12/EDIFACT, JSON, and custom formats) in streaming fashion and transforms data into desired JSON output based on a schema written in JSON. In the example folders above you will find pairs of input files and their schema files. Then in the .snapshots sub directory, you'll find their corresponding output files.

PEAR Framework - reusable PHP components

  •    PHP

PEAR is a framework and distribution system for reusable PHP components. Its components span many categories, including DB access, security, XML parsing, and encryption.

TWiki - Wiki and Web 2.0 Application Platform

  •    Perl

TWiki is a flexible, powerful, and easy to use enterprise wiki, enterprise collaboration platform, and web application platform. It is a Structured Wiki, typically used to run a project development space, a document management system, a knowledge base, or any other groupware tool, on an intranet, extranet or the Internet. TWiki is a cgi-bin script written in Perl. It reads a text file, hyperlinks it and converts it to HTML on the fly.

imgmin - Lossy image optimization

  •    C

Image files constitute a majority of static web traffic.[17] Unlike text-based web file formats, binary image files do not benefit from built-in webserver-based HTTP gzip compression. imgmin offers an automated means for enforcing image quality as a standalone tool and as a webserver module. imgmin determines the optimal balance of image quality and filesize, often greatly reducing image size while retaining quality for casual use, which translates into more efficient use of storage and network bandwidth, which saves money and improves user experience. Websites are composed of several standard components. Most (HTML, CSS, Javascript, JSON, XML, etc) are text-based. They can be efficiently compressed for transfer via gzip, supported by all mainstream webservers and browsers. But image and video files are binary, non-text files, and generally are not worth auto-compressing in the webserver.

ServiceStack text - .NET's fastest JSON, JSV and CSV Text Serializers

  •    CSharp

ServiceStack.Text is an independent, dependency-free serialization library that contains ServiceStack's text processing functionality, including: JsonSerializer, TypeSerializer (JSV-Format), CsvSerializer, T.Dump extension method, StringExtensions - Xml/Json/Csv/Url encoding, BaseConvert, Rot13, Hex escape, etc., Stream, Reflection, List, DateTime, etc extensions and utils.
