get-image-urls - Scrape image URLs from an HTML website, including CSS background images.

  •    JavaScript

Scrapes image URLs from an HTML website. It uses PhantomJS in the background to collect all images, including CSS backgrounds.
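
As a rough illustration of the idea, the sketch below collects image URLs from a static HTML string, covering both img tags and CSS url(...) references. It is not the get-image-urls API: the function name extractImageUrls and its parameters are hypothetical, and unlike the PhantomJS-based tool it does not render the page, so images added by scripts or external stylesheets are missed.

```javascript
// Hypothetical helper, not part of get-image-urls. Static scan only, no rendering.
function extractImageUrls(html, baseUrl) {
  const found = new Set();
  // src attributes on <img> tags (double or single quoted).
  const imgSrc = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  // url(...) references in inline styles or <style> blocks.
  // Note: this may also catch non-image assets such as fonts.
  const cssUrl = /url\(\s*["']?([^"')]+)["']?\s*\)/gi;
  for (const re of [imgSrc, cssUrl]) {
    for (const match of html.matchAll(re)) {
      // Resolve relative paths against the page URL.
      found.add(new URL(match[1].trim(), baseUrl).href);
    }
  }
  return [...found];
}

const sampleHtml =
  '<img src="/a.png"><div style="background-image: url(\'img/b.jpg\')"></div>';
console.log(extractImageUrls(sampleHtml, 'https://example.com/page/'));
```

A regex scan is enough for a quick check; a headless browser is the more reliable choice when backgrounds are set by external CSS or JavaScript.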

algolia-webcrawler - Simple Node worker that crawls sitemaps to keep an Algolia index up to date

  •    JavaScript

Simple Node worker that crawls sitemaps to keep an Algolia index up to date. It uses simple CSS selectors to find the actual text content to index.
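
The sitemap-reading step might look roughly like the sketch below, which pulls loc and lastmod values out of a sitemap XML string. This is an assumption about the general approach, not algolia-webcrawler's code: parseSitemap is a hypothetical name, XML entities such as &amp; are not decoded, and the selector-based text extraction and the Algolia upload are left out.

```javascript
// Hypothetical helper, not part of algolia-webcrawler.
function parseSitemap(xml) {
  const entries = [];
  // Each <url> element holds one page entry.
  const urlBlock = /<url>([\s\S]*?)<\/url>/gi;
  for (const [, body] of xml.matchAll(urlBlock)) {
    const loc = /<loc>\s*([^<]+?)\s*<\/loc>/i.exec(body);
    const lastmod = /<lastmod>\s*([^<]+?)\s*<\/lastmod>/i.exec(body);
    // Skip entries without a location; lastmod is optional.
    if (loc) {
      entries.push({ loc: loc[1], lastmod: lastmod ? lastmod[1] : null });
    }
  }
  return entries;
}

const sampleXml =
  '<urlset><url><loc>https://example.com/a</loc>' +
  '<lastmod>2020-01-01</lastmod></url></urlset>';
console.log(parseSitemap(sampleXml));
```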

supercrawler - A web crawler

  •    JavaScript

Supercrawler is a Node.js web crawler. It is designed to be highly configurable and easy to use. When Supercrawler successfully crawls a page (which could be an image, a text document or any other file), it will fire your custom content-type handlers. Define your own custom handlers to parse pages, save data and do anything else you need.
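
The content-type handler idea can be sketched as a small registry that runs every handler registered for a page's content type. This is a generic illustration, not Supercrawler's actual API: HandlerRegistry, add, dispatch and the page object shape are all hypothetical names chosen for the example.

```javascript
// Hypothetical registry, not Supercrawler's API.
class HandlerRegistry {
  constructor() {
    this.handlers = [];
  }

  // Register a handler for a type such as "text/html"; "*" matches everything.
  add(contentType, fn) {
    this.handlers.push({ contentType, fn });
  }

  // Run every matching handler and collect the results.
  dispatch(page) {
    // Drop parameters such as "; charset=utf-8" before comparing.
    const type = (page.contentType || '').split(';')[0].trim().toLowerCase();
    return this.handlers
      .filter((h) => h.contentType === '*' || h.contentType === type)
      .map((h) => h.fn(page));
  }
}

const registry = new HandlerRegistry();
registry.add('text/html', (page) => 'parse ' + page.url);
registry.add('image/png', (page) => 'store ' + page.url);
console.log(
  registry.dispatch({ url: 'https://example.com/', contentType: 'text/html; charset=utf-8' })
);
```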

crawlerr - A simple and fully customizable web crawler/spider for Node

  •    JavaScript

crawlerr is a simple yet powerful web crawler for Node.js, based on Promises. It lets you crawl only specific URLs, selected by wildcards, and uses a Bloom filter for caching. It aims for a browser-like feel. A new Crawlerr instance is created for a specific website with custom options, and all routes are resolved relative to the base URL.
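
The two techniques named here, wildcard URL matching and a Bloom filter for remembering visited URLs, can be sketched as follows. This is not crawlerr's code or API: wildcardToRegExp and BloomFilter are hypothetical names, and the filter size, hash count and seeded FNV-1a style hash are assumptions chosen for brevity.

```javascript
// Hypothetical helpers, not part of crawlerr.

// Turn a pattern such as "/blog/*" into an anchored regular expression.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Minimal Bloom filter: may report false positives, never false negatives.
class BloomFilter {
  constructor(bits = 1024, hashes = 3) {
    this.bits = bits;
    this.hashes = hashes;
    this.data = new Uint8Array(Math.ceil(bits / 8));
  }

  // FNV-1a style hash; the seed stands in for several hash functions.
  indexFor(value, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bits;
  }

  add(value) {
    for (let s = 0; s < this.hashes; s++) {
      const i = this.indexFor(value, s);
      this.data[i >> 3] |= 1 << (i & 7);
    }
  }

  has(value) {
    for (let s = 0; s < this.hashes; s++) {
      const i = this.indexFor(value, s);
      if ((this.data[i >> 3] & (1 << (i & 7))) === 0) return false;
    }
    return true;
  }
}

const route = wildcardToRegExp('/blog/*');
const seen = new BloomFilter();
for (const path of ['/blog/post-1', '/about', '/blog/post-1']) {
  if (route.test(path) && !seen.has(path)) {
    seen.add(path);
    console.log('crawl', path);
  }
}
```

The trade-off of a Bloom filter is memory for certainty: a URL reported as seen may occasionally be a false positive, so a small fraction of pages could be skipped.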