spider - Unsurprising JavaScript - No longer active

The Next-Gen Programming Language for the Web. Note: This project is no longer active.

http://spiderlang.org
https://github.com/alongubkin/spider

Dependencies:

escodegen : estools/escodegen#21a9331
pegjs : ~0.8.0
nomnom : ~1.8.1
chalk : ~0.5.1
traceur : 0.0.74
multi-stage-sourcemap : ~0.2.1

Related Projects

Spider Compiler

Spider Compiler parses a Spider source file and compiles it to an .exe file with the help of csc.exe, the C# compiler. The project is developed in C#.

SPIDER on Rails

  •    Java

SPIDER on Rails (the new name of J2EE Spider) is an open-source tool for rapidly developing form-based web applications. See more: http://www.infoq.com/news/2008/03/J2EE-Spider

node-rolling-spider - A library for controlling a Parrot Rolling Spider drone via BLE.

  •    Javascript

There are a few steps you should take when getting started with this. We're going to learn how to get there by building out a simple script that will take off, move forward a little, then land. To connect, you need to create a new Drone instance.

node-readability - Scrape/Crawl article from any site automatically

  •    Javascript

In my case, the spider processes about 1500k documents per day; the maximum crawling speed is 1.2k documents per minute and the average is 1k per minute. Memory use is about 200 MB per spider kernel, and accuracy is about 90%; the remaining 10% can be fixed by customizing Score Rules or Selectors. It's better than any other readability module.

Monkey-Spider

  •    Python

The Monkey-Spider is a crawler-based, low-interaction honeyclient project. It is not restricted to this use, but it is developed as such. The Monkey-Spider crawls Web sites to expose their threats to Web clients.


dhtspider - Bittorrent dht network spider

  •    Javascript

BitTorrent DHT network infohash spider for engiy.com (a BitTorrent resource search engine).

scrapy-examples - Multifarious Scrapy examples

  •    Python

Multifarious Scrapy examples with integrated proxies and agents, which make it comfortable to write a spider. The spider works through several depths and gets real data from depth 2.

php-spider - A configurable and extensible PHP web spider

  •    PHP

The easiest way to install PHP-Spider is with composer. Find it on Packagist. This is a very simple example. This code can be found in example/example_simple.php. For a more complete example with some logging, caching and filters, see example/example_complex.php. That file contains a more real-world example.

tarantula - a big hairy fuzzy spider that crawls your site, wreaking havoc

  •    Ruby

a big hairy fuzzy spider that crawls your site, wreaking havoc

node-crawler - Web Crawler/Spider for NodeJS + server-side jQuery ;-)

  •    Javascript

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

spidermonkey - DEFUNCT: PHP Web Spider I started in 2011, accepts regex, css selectors

  •    PHP

DEFUNCT: PHP Web Spider I started in 2011, accepts regex, css selectors

anemone - Anemone web-spider framework

  •    Ruby

Anemone web-spider framework

Arachnid Web Spider Framework

  •    Java

Arachnid is a Java-based web spider framework. It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page

VeryCD WebSpider - A plugin for InfoVista.NET

VeryCD WebSpider is a web-spider application that fetches eMule information from www.verycd.com and stores the result in Access (mdb) format. It is developed with VS2005 and is also a plugin for InfoVista.NET, acting as a content provider.

Tyrannt Micro

This is an RPG game that runs on the .NET Micro Framework Gadgeteer platform. It uses the FEZ Spider hardware.

Norconex HTTP Collector - Enterprise Web Crawler

  •    Java

Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate and store collected data into a repository of your choice (e.g. a search engine). It is very flexible, powerful, easy to extend, and portable.

dht - BitTorrent DHT Protocol && DHT Spider.

  •    Go

See the video on YouTube. It contains two modes, the standard mode and the crawling mode. The standard mode follows the BEPs, and you can use it as a standard DHT server. The crawling mode aims to crawl as much metadata info as possible. It doesn't follow the standard BEPs protocol. With the crawling mode, you can build another BTDigg.

spidr - A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely

  •    Ruby

Spidr is a versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Simple Web Spider

  •    Java

Other spiders have a limited link depth, follow links in a non-random order, or are combined with heavy indexing machinery. This spider has no link depth limit and picks the next URL at random; that URL is then checked for new URLs.
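
The idea of an unbounded-depth spider with randomized URL selection can be illustrated with a minimal Python sketch. This is not Simple Web Spider's actual code (that project is written in Java); the names `random_crawl`, `get_links`, `max_pages`, and `LINK_GRAPH` are hypothetical, and an in-memory link graph stands in for real HTTP fetching. The `max_pages` cap exists only so the sketch terminates; it is not a depth limit.

```python
import random
from typing import Callable, Iterable, List, Optional, Set


def random_crawl(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Visit pages in random order without tracking or limiting link depth."""
    rng = rng or random.Random()
    frontier: List[str] = [start_url]  # discovered but not yet visited
    seen: Set[str] = {start_url}       # every URL discovered so far
    visited: List[str] = []

    while frontier and len(visited) < max_pages:
        # Pick the next URL at random rather than in FIFO (breadth-first)
        # or LIFO (depth-first) order.
        index = rng.randrange(len(frontier))
        frontier[index], frontier[-1] = frontier[-1], frontier[index]
        url = frontier.pop()
        visited.append(url)

        # Check the page for new URLs and queue the ones not seen before.
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical link graph used in place of real network requests.
LINK_GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "a"],
    "d": ["e"],
    "e": [],
}

if __name__ == "__main__":
    order = random_crawl("a", lambda url: LINK_GRAPH.get(url, []), max_pages=10)
    print(order)  # visiting order varies from run to run
```

Swapping the random pick for `frontier.pop(0)` would give a breadth-first crawl, which is one way to see what the randomization changes: the set of pages reached is the same, only the visiting order differs.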