Rarawel


Crawl website with custom URIs and grab content

http://rarawel.codeplex.com/


   




Related Projects

simple-crawler - A simple Python web crawler (bot)



TrustRankBot - Web Crawler in PHP: provides Bot + Web interface



arachnode.net


http://arachnode.net 2.6 release +lucene.net

toastie-bot - A web crawler programmed in C# for a small scale search engine.





dezi-bot - Dezi web crawler



novel-crawler - The bot crawls a novel chapter by chapter and packs it into RTF/TXT format



goredis-crawler - Cross-platform persistent and distributed web crawler


A cross-platform, persistent, and distributed web crawler. goredis-crawler is persistent because the queue is stored in a remote database that is automatically re-initialized if interrupted. It is distributed because multiple instances of goredis-crawler work on the same remotely stored queue, so you can start as many crawlers as you want on separate machines to speed up the process. It is also fast because it is threaded and uses connection pools.
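The shared-queue idea described above can be sketched in a few lines of Python. This is only an illustration and not goredis-crawler's code: `queue.Queue` stands in for the remotely stored queue, threads stand in for separate crawler instances, and `FAKE_SITE`, `fetch_links`, and `crawl` are hypothetical names invented for the example.

```python
import queue
import threading

# Hypothetical in-memory "site": maps each URL to the links found on it.
FAKE_SITE = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}


def fetch_links(url):
    """Stand-in for an HTTP fetch plus link extraction (assumption)."""
    return FAKE_SITE.get(url, [])


def crawl(start_url, num_workers=4):
    """Crawl from start_url with several workers sharing one queue."""
    todo = queue.Queue()          # stand-in for the remotely stored queue
    seen = {start_url}            # URLs already queued, shared by all workers
    seen_lock = threading.Lock()
    todo.put(start_url)

    def worker():
        while True:
            url = todo.get()
            if url is None:       # sentinel: no more work for this worker
                todo.task_done()
                return
            for link in fetch_links(url):
                with seen_lock:
                    if link in seen:
                        continue
                    seen.add(link)
                todo.put(link)    # new work becomes visible to every worker
            todo.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    todo.join()                   # returns once every queued URL is processed
    for _ in threads:
        todo.put(None)            # tell each worker to stop
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(sorted(crawl("/")))
```

Because every worker pulls from the same queue, adding workers does not require partitioning the URL space up front; a real distributed setup would replace the in-process queue and `seen` set with a shared store.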

llazzaro-cplusplusbot


A crawler/bot for webpages written in C

Crawler-Web - Web component for Crawler which will use the results of the crawler



simple-crawler - Simple crawler app in Python for a class presentation on crawlers.



crawler - Hacker news crawler & Start up News crawler



crawler-commons - crawler-commons (fork of https://code.google.com/p/crawler-commons/)



Crawler - It's a simple web crawler that includes crawler, tokenizer, stemmer and classifier.



fess-crawler - Web/FileSystem Crawler Library


Fess Crawler is a crawler framework for web and file system content.

webleech - A web crawler framework, with a sample crawler for PCC (???????)



Norconex HTTP Collector - A Web Crawler in Java


Norconex HTTP Collector is a web spider, or crawler, that aims to make life easier for Enterprise Search integrators and developers. It is portable, extensible, and reusable, supports robots.txt, can obtain and manipulate document metadata, is resumable upon failure, and a lot more.

gocrawl - Polite, slim and concurrent web crawler.


gocrawl is a polite, slim and concurrent web crawler written in Go. For a simpler yet more flexible web crawler written in a more idiomatic Go style, you may want to take a look at fetchbot, a package that builds on the experience of gocrawl.

Ex-Crawler


Ex-Crawler is divided into 3 subprojects (Crawler Daemon, distributed GUI client, and (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net

Squzer - Distributed Web Crawler


Squzer is Declum's open-source, extensible, scalable, multithreaded web crawler project, written entirely in Python.