
artoo - artoo.js - the client-side scraping companion.

  •    JavaScript

artoo.js is a piece of JavaScript code meant to be run in your browser's console to provide you with some scraping utilities. The library's full documentation is available on GitHub Pages.

Soup - Web Scraper in Go, similar to BeautifulSoup

  •    Go

soup is a small web scraper package for Go, with an interface closely modeled on that of BeautifulSoup.

ScrapeMeAgain - Yet another Python web scraping application

  •    Python

ScrapeMeAgain is a Python 3 powered web scraper. It uses multiprocessing to get the work done quicker and stores collected data in an SQLite database. You have to provide your own database table description and an actual scraper class which must follow the BaseScraper interface. See scrapemeagain/scrapers/examplescraper for more details.
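
The general pattern described above (parse pages in parallel worker processes, then store the rows in SQLite) can be sketched as follows. This is a minimal illustration using only the Python standard library, not ScrapeMeAgain's actual API: `parse_page`, `scrape_all`, `store_rows`, and the `items` table layout are all hypothetical names chosen for the example.

```python
import sqlite3
from multiprocessing import Pool


def parse_page(page):
    """Hypothetical worker: turn one (url, html) pair into a (url, title) row."""
    url, html = page
    start = html.find("<title>")
    end = html.find("</title>")
    # Take the text between the title tags, or an empty string if there is none.
    has_title = start != -1 and end > start
    title = html[start + len("<title>"):end].strip() if has_title else ""
    return (url, title)


def scrape_all(pages, workers=2):
    """Parse every page, spreading the work over `workers` processes."""
    if workers <= 1:
        # Serial fallback, handy for debugging or very small jobs.
        return [parse_page(page) for page in pages]
    with Pool(processes=workers) as pool:
        return pool.map(parse_page, pages)


def store_rows(conn, rows):
    """Write the collected rows into an assumed `items` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (url TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO items (url, title) VALUES (?, ?)", rows
    )
    conn.commit()
```

When calling `scrape_all` with more than one worker from a script, wrap the call in an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Keeping the database writes in the parent process avoids sharing one SQLite connection across processes.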

Rcrawler - An R web crawler and scraper

  •    R

Rcrawler is an R package for crawling websites and extracting structured data, which can be used for a wide range of applications such as web mining, text mining, web content mining, and web structure mining. How does Rcrawler differ from rvest? rvest extracts data from one specific page by navigating through selectors, whereas Rcrawler automatically traverses and parses all pages of a website and extracts all the data you need from them at once with a single command. For example, it can collect all published posts on a blog, extract all products on a shopping website, or gather comments and reviews for opinion mining studies. Rcrawler can also help you study a website's structure by building a network representation of its internal and external hyperlinks (nodes and edges). Help us improve Rcrawler by asking questions, reporting issues, and suggesting new features. If you have a blog, write about it, or just share it with your colleagues.
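
To make the "network representation" idea concrete, the sketch below extracts the hyperlinks from one page and labels each as an internal or external edge. It is written in Python with the standard library only and does not reflect Rcrawler's own functions; `LinkCollector` and `link_edges` are hypothetical names, and a real crawler would repeat this step for every page it visits.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_edges(page_url, html):
    """Return (source, target, kind) edges for the links found on one page."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    edges = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        target = urljoin(page_url, href)
        same_host = urlparse(target).netloc == site
        edges.append((page_url, target, "internal" if same_host else "external"))
    return edges
```

Here each page URL is a node and each tuple is a directed edge; comparing host names is one simple way to separate internal from external links.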




shutterscrape - Speedy, lightweight web scraper.

  •    Python

ShutterScrape is a web scraper for bulk downloading images or videos from Shutterstock with blinding speed. ⚡ It uses Selenium for browser automation and Beautiful Soup for parsing. If you like this repo, feel free to star ⭐ it! For more information, contact https://davidlin.io/.

crypto - Cryptocurrency Historical Market Data R Package

  •    R

Retrieves the open, high, low, and close values for all cryptocurrencies. The data comes from CoinMarketCap's historical prices, exchange details, and current prices API. Below are the high-level dependencies for the package to install correctly.

xidel - A command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3

  •    Pascal

Xidel is a command line tool to download and extract data from HTML/XML pages using CSS selectors, XPath/XQuery 3.0, as well as querying JSON files or APIs (e.g. REST) using JSONiq. There are dependency-free binaries for Windows, Linux and Mac.

keeper-core-api - Nunux Keeper core API

  •    JavaScript

Your personal content curation service. This project is the core system of Nunux Keeper. It's an API that allows you to collect, organize, and retrieve online documents.