xidel - A command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3


Xidel is a command line tool to download and extract data from HTML/XML pages using CSS selectors, XPath/XQuery 3.0, as well as querying JSON files or APIs (e.g. REST) using JSONiq. There are dependency-free binaries for Windows, Linux and Mac.

http://www.videlibri.de/xidel.html
https://github.com/benibela/xidel
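
To illustrate the kind of XPath-based extraction described above, here is a minimal sketch using only Python's standard library. It is not Xidel and does not use Xidel's command-line syntax. The sample markup, the `extract_links` helper, and the `ext` class name are all assumptions made up for this example. Python's `xml.etree.ElementTree` supports only a small XPath subset and needs well-formed markup, so it is far less capable than the XPath/XQuery 3.0 support mentioned in the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from any real site).
SAMPLE = """<html><body>
  <a class="ext" href="https://example.org/a">First</a>
  <a href="/local">Second</a>
  <a class="ext" href="https://example.org/b">Third</a>
</body></html>"""


def extract_links(markup, css_class):
    """Return (href, text) pairs for every <a> element with the given class."""
    root = ET.fromstring(markup)
    # ".//a[@class='...']" is part of the XPath subset ElementTree understands.
    matches = root.findall(f".//a[@class='{css_class}']")
    return [(a.get("href"), a.text) for a in matches]


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE, "ext"):
        print(href, text)
```

A dedicated tool adds what this sketch lacks: downloading the page, tolerating malformed HTML, and a full query language.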


   




Related Projects

HTTPie - a CLI, cURL-like tool for humans

  •    Python

HTTPie (pronounced aitch-tee-tee-pie) is a command line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible. It provides a simple http command that allows for sending arbitrary HTTP requests using a simple and natural syntax, and displays colorized output. HTTPie can be used for testing, debugging, and generally interacting with HTTP servers.

httpie - As easy as /aitch-tee-tee-pie/ 🥧 Modern, user-friendly command-line HTTP client for the API era

  •    Python

HTTPie (pronounced aitch-tee-tee-pie) is a command-line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible. HTTPie is designed for testing, debugging, and generally interacting with APIs & HTTP servers. The http & https commands allow for creating and sending arbitrary HTTP requests. They use simple and natural syntax and provide formatted and colorized output. This documentation is best viewed at httpie.org/docs.

karate - Test Automation Made Simple

  •    Java

Karate is the only open-source tool to combine API test-automation, mocks, performance-testing and even UI automation into a single, unified framework. The BDD syntax popularized by Cucumber is language-neutral, and easy for even non-programmers. Assertions and HTML reports are built-in, and you can run tests in parallel for speed.


HTTPie - cURL for humans

  •    Python

HTTPie is a CLI HTTP utility. Its goal is to make CLI interaction with HTTP-based services as human-friendly as possible. It does so by providing an http command that allows for issuing arbitrary HTTP requests using a simple and natural syntax and displaying colorized responses.

dasel - Query, update and convert data structures from the command line

  •    Go

Dasel (short for data-selector) allows you to query and modify data structures using selector strings. Comparable to jq / yq, but supports JSON, YAML, TOML, XML and CSV with zero runtime dependencies.
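
The idea of querying a data structure with a selector string can be sketched in a few lines of Python. This is only an illustration: the dotted selector syntax and the `select` function below are hypothetical and are not Dasel's actual selector grammar or API.

```python
def select(data, selector):
    """Walk nested dicts/lists using a dotted selector such as "users.1.name".

    The selector syntax here is an assumption made for this sketch only.
    """
    current = data
    for part in selector.split("."):
        if not part:
            continue  # tolerate a leading dot or an empty selector
        if isinstance(current, list):
            current = current[int(part)]  # list items are addressed by index
        else:
            current = current[part]  # mapping items are addressed by key
    return current


if __name__ == "__main__":
    doc = {"users": [{"name": "Ann"}, {"name": "Bob"}]}
    print(select(doc, "users.1.name"))
```

Because the walk only depends on nested mappings and sequences, the same function would work on data decoded from JSON, YAML, or TOML.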

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

  •    Ruby

Nokogiri (鋸) is an HTML, XML, SAX, and DOM parser. Its many features include searching documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.

wttr.in - The right way to check the weather

  •    Python

wttr.in: the right way to check the weather. wttr.in is a console-oriented weather forecast service that supports various information representation methods, such as terminal-oriented ANSI sequences for console HTTP clients (curl, httpie, or wget), HTML for web browsers, or PNG for graphical viewers. wttr.in uses wego for visualization and various data sources for weather forecast information.

wring - Extract content from webpages using CSS Selectors, XPath, and JS expressions

  •    PureScript

Wring utilizes PhantomJS for some of its commands. To use these, install it using your system package manager by running something like brew install phantomjs on OS X, or apt-get install phantomjs on Ubuntu. You can make sure it's on your PATH by running phantomjs -v.

graphtage - A semantic diff utility and library for tree-like files such as JSON, JSON5, XML, HTML, YAML, and CSV

  •    Python

Graphtage is a command-line utility and underlying library for semantically comparing and merging tree-like structures, such as JSON, XML, HTML, YAML, plist, and CSS files. Its name is a portmanteau of “graph” and “graftage”—the latter being the horticultural practice of joining two trees together such that they grow as one. Graphtage performs an analysis on an intermediate representation of the trees that is divorced from the filetypes of the input files. This means, for example, that you can diff a JSON file against a YAML file. Also, the output format can be different from the input format(s). By default, Graphtage will format the output diff in the same file format as the first input file. But one could, for example, diff two JSON files and format the output in YAML. There are several command-line arguments to specify these transformations; please check the --help output for more information.

jsonpipe - Convert JSON to a UNIX-friendly line-based format.

  •    Python

Everyone I know prefers to work with JSON over XML, but sadly there is a sore lack of utilities of the quality or depth of html-xml-utils and XMLStarlet for actually processing JSON data in an automated fashion, short of writing an ad hoc processor in your favourite programming language. jsonpipe is a step towards a solution: it traverses a JSON object and produces a simple, line-based textual format which can be processed by all your UNIX favourites like grep, sed, awk, cut and diff. It may also be valuable within programming languages---in fact, it was originally conceived as a way of writing simple test assertions against JSON output without coupling the tests too closely to the specific structure used.
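
The traversal described above can be sketched as follows. This is a minimal illustration of turning JSON into one line per node, not jsonpipe's own implementation: the `flatten` and `to_lines` names, the `/`-separated paths, and the tab separator are assumptions chosen for this example, and the real tool's output format may differ.

```python
import json


def flatten(value, path=""):
    """Yield (path, json_text) pairs for every node of a decoded JSON value."""
    if isinstance(value, dict):
        yield path or "/", "{}"  # mark the object itself, then descend
        for key, child in value.items():
            yield from flatten(child, f"{path}/{key}")
    elif isinstance(value, list):
        yield path or "/", "[]"  # mark the array itself, then descend
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}/{index}")
    else:
        yield path or "/", json.dumps(value)  # scalars are printed as JSON


def to_lines(text):
    """Parse JSON text and return one "path<TAB>value" string per node."""
    return [f"{path}\t{rendered}" for path, rendered in flatten(json.loads(text))]


if __name__ == "__main__":
    for line in to_lines('{"a": 1, "b": ["x", true]}'):
        print(line)
```

With one node per line, tools such as grep, cut, and diff can filter or compare the output without understanding JSON nesting.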

fast-xml-parser - Validate XML, Parse XML to JS/JSON and vice versa, or parse XML to Nimn rapidly without C/C++ based libraries and no callback

  •    Javascript

This project welcomes contributors. If you have a feature you'd like to see implemented or a bug you'd like fixed, the best and fastest way to make that happen is to implement it and submit a PR. Basic knowledge of JS is sufficient. Feel free to ask for any guidance. To use it from the CLI, install it globally with the -g option.

Soup - Web Scraper in Go, similar to BeautifulSoup

  •    Go

soup is a small web scraper package for Go, with its interface highly similar to that of BeautifulSoup.

curl - A command line tool and library for transferring data with URL syntax, supporting HTTP, HTTPS, FTP, FTPS, GOPHER, TFTP, SCP, SFTP, SMB, TELNET, DICT, LDAP, LDAPS, FILE, IMAP, SMTP, POP3, RTSP and RTMP

  •    C

curl is used in command lines or scripts to transfer data. It is also used in cars, television sets, routers, printers, audio equipment, mobile phones, tablets, set-top boxes, and media players, and is the internet transfer backbone for thousands of software applications affecting billions of humans daily.

Lux - XML Search engine

  •    Java

Lux is an open source XML search engine using Lucene/Solr and the Saxon XQuery/XSLT processor. Lux provides XML-aware indexing, an XQuery 1.0 optimizer that rewrites queries to use the indexes, and a function library for interacting with Lucene via XQuery. These capabilities are tightly integrated with Solr, and leverage its application framework in order to deliver a REST service, application server, and supporting tools.

XMLStarlet command line XML toolkit

  •    C

XMLStarlet is a set of command line utilities (tools) to transform, query, validate, and edit XML documents and files using a simple set of shell commands, in a similar way to how it is done for text files with the UNIX grep, sed, awk, diff, patch, join, etc. utilities.

RakNet - RakNet is a cross platform, open source, C++ networking engine for game programmers.

  •    C

See Help\swigtutorial.html. Upgrading from version 3: see 3.x_to_4.x_upgrade.txt. Windows users (Visual Studio 2008 and 2010): load RakNet_VS2008.sln and convert if necessary. After the project conversion, if you encounter error MSB4006, follow the steps below to fix it: 1. Open project properties. 2. Click on "Common Properties". 3. Click on "Framework and References". 4. Look

Phantomjs - Headless WebKit with JavaScript API

  •    Javascript

PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG. It is an optimal solution for headless website testing. It runs functional tests with frameworks such as Jasmine, QUnit, Mocha, Capybara, WebDriver, and many others.

emacs-request - Request.el -- Easy HTTP request for Emacs Lisp

  •    Emacs

Request.el is an HTTP request library with multiple backends. It supports url.el, which is shipped with Emacs, and the curl command line program. Users can use curl when they have it, as curl is more reliable than url.el. Library authors can use request.el to avoid imposing external dependencies such as curl on users, while giving a richer experience to users who have curl. As request.el is implemented in an extensible manner, it is possible to implement other backends such as wget. Also, if a future version of Emacs supports linking with libcurl, it is possible to implement a backend using it. Libraries using request.el can automatically use these backends without modifying their code.





