Storm Crawler - Web crawler SDK based on Apache Storm

  •    Java

StormCrawler is an open source library and collection of resources for building low-latency, scalable web crawlers on Apache Storm. Building your own crawler with it can be fairly straightforward. Often, all you need to do is declare StormCrawler as a Maven dependency, write your own Topology class (tip: you can extend ConfigurableTopology), reuse the components provided by the project, and perhaps write a couple of custom ones for your own secret sauce.

Soup - Web Scraper in Go, similar to BeautifulSoup

  •    Go

soup is a small web scraper package for Go, with an interface closely modeled on that of BeautifulSoup.

ChiChew - A live web crawler for 《重編國語辭典修訂本》 (Revised Mandarin Chinese Dictionary), the Chinese-Chinese dictionary published by the Ministry of Education in Taiwan

  •    Python

A live web crawler that queries data in real time from 《重編國語辭典修訂本》 (Revised Mandarin Chinese Dictionary), the Chinese-Chinese dictionary published by the Ministry of Education in Taiwan.

gopa - [WIP] GOPA, a spider written in Golang, for Elasticsearch

  •    Go

GOPA is a spider written in Go. To get it, you have two options: download the pre-built package or compile it yourself.