dht - BitTorrent DHT Protocol && DHT Spider.

  •    Go

See the video on YouTube. It contains two modes, the standard mode and the crawling mode. The standard mode follows the BEPs, so you can use it as a standard DHT server. The crawling mode aims to crawl as much metadata info as possible and does not follow the standard BEPs. With the crawling mode, you can build another BTDigg.

colly - Fast and Elegant Scraping Framework for Gophers

  •    Go

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

go_spider - [Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework

  •    Go

A crawler for vertical communities, implemented in Go. Latest stable release: version 1.2 (Sep 23, 2014).

Scrapy - Web crawling & scraping framework for Python

  •    Python

Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.




colly - Elegant Scraper and Crawler Framework for Golang

  •    Go

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

Yioop - Open Source Search Engine Software

  •    PHP

Yioop is an open source PHP search engine capable of crawling, indexing, and providing search results for hundreds of millions of pages on relatively low-end hardware. It can index a variety of text formats (HTML, RSS, PDF, RTF, DOC) and image formats (GIF, JPEG, PNG, etc.). It can import data from ARC, WARC, MediaWiki, and Open Directory RDF. It is easily localized to many languages. It has built-in support for news feeds, discussion groups, blogs, and wikis. It also supports mixing indexes to create mashups.

gopa-abandoned - GOPA, a spider written in Go

  •    Go

[狗爬], a spider written in Go. It is safe to press Ctrl+C to stop a running Gopa: Gopa will handle the rest, saving a checkpoint so that you can restore the job later. The world is still in your hand.
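
The checkpoint-on-interrupt behaviour described above can be illustrated with a minimal Python sketch. This is not Gopa's code (Gopa is written in Go); the function names `crawl` and `restore`, the JSON checkpoint layout, and the `fetch` callback are all assumptions made for illustration.

```python
import json
from collections import deque


def crawl(frontier, fetch, checkpoint_path):
    """Process URLs until the queue is empty or the user presses Ctrl+C.

    Returns True when the crawl finished and False when it was interrupted.
    `fetch` is a hypothetical callback that downloads one URL.
    """
    queue = deque(frontier)
    done = []
    try:
        while queue:
            url = queue[0]      # peek first, so an interrupted URL stays queued
            fetch(url)
            queue.popleft()     # only drop the URL once fetch() has returned
            done.append(url)
    except KeyboardInterrupt:
        # In Python, Ctrl+C surfaces as KeyboardInterrupt; persist what is left.
        with open(checkpoint_path, "w", encoding="utf-8") as fh:
            json.dump({"pending": list(queue), "done": done}, fh)
        return False
    return True


def restore(checkpoint_path):
    """Return the list of URLs that were still pending at the last interrupt."""
    with open(checkpoint_path, encoding="utf-8") as fh:
        return json.load(fh)["pending"]
```

A later run could call `crawl(restore("checkpoint.json"), fetch, "checkpoint.json")` to continue. Because the URL is removed from the queue only after `fetch` returns, an interrupt may cause one URL to be fetched again after a restore, but none is lost.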


gopa - [WIP] GOPA, a spider written in Golang, for Elasticsearch

  •    Go

GOPA, a spider written in Go. To get it, you have two options: download the pre-built package or compile it yourself.

input-field-finder - Spiders given URLs for input fields.

  •    Go

Spiders the domain of a single URL or a set of URLs and prints out all <input> elements found on the given domain and scheme (http/https). Input fields are the most common vector/sink for web application vulnerabilities. I wrote this tool to help automate the reconnaissance phase when testing web applications for security vulnerabilities.
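
As a rough illustration of the idea (not the tool's own Go code), the sketch below uses only the Python standard library to collect <input> elements from one page and to check whether a link stays on the same scheme and host. The names `InputFieldParser`, `find_inputs`, and `same_site`, and the sample markup, are hypothetical; fetching pages and following links are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class InputFieldParser(HTMLParser):
    """Collects the attributes of every <input> tag and every link target."""

    def __init__(self):
        super().__init__()
        self.inputs = []   # one dict of attributes per <input> element
        self.links = []    # raw href values of <a> elements

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <input ... />.
        if tag == "input":
            self.inputs.append(dict(attrs))
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_inputs(html_text):
    """Return (input attribute dicts, link hrefs) found in one HTML document."""
    parser = InputFieldParser()
    parser.feed(html_text)
    parser.close()
    return parser.inputs, parser.links


def same_site(url, start_url):
    """True when url has the same scheme and host as start_url."""
    a, b = urlparse(url), urlparse(start_url)
    return (a.scheme, a.netloc) == (b.scheme, b.netloc)


SAMPLE = """
<form action="/login" method="post">
  <input type="text" name="user">
  <input type="password" name="pass" />
  <a href="/help">Help</a>
</form>
"""

inputs, links = find_inputs(SAMPLE)
for field in inputs:
    print(field.get("type"), field.get("name"))
```

A full spider would resolve each collected link against the page URL, keep only those for which `same_site` is true, and repeat the extraction on every page it has not yet visited.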

marmot - 💐Marmot | Web Crawler/HTTP protocol Download Package 🐭

  •    Go

If go get fails, you can move the files under GOPATH in this project into your Go environment's GOPATH. An HTTP download helper that supports many features, such as cookie persistence and HTTP(S) and SOCKS5 proxies....