Scrapy - Web crawling & scraping framework for Python


Scrapy is a fast, high-level web crawling and web scraping framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

https://scrapy.org/
https://github.com/scrapy/scrapy


   




Related Projects

ElectionCrawler - my first web crawler using the Python Scrapy module

pystoruman - Scrapy web crawler for storumanenergy.se



scrapy-poc - Evaluation of the Scrapy project, a crawler framework implemented in pure Python.

scrapy-useragents - A middleware for using random user agents in a Scrapy crawler.

blog-crawler - Basic crawler written in Python using the Scrapy framework



jira-crawler - Basic crawler for a very specific purpose, built with Scrapy

django-scrappy - Scrapy crawler integrated into a Django project

CercAziendeCrawler - Crawler for CercAziende company information, built with Scrapy

fpspider - Simple crawler based on scrapy.



yeeyan-spider - A Python spider based on Scrapy

scrapy-kaskus-crawler - Scraping kaskus.co.id, the biggest Indonesian forum

Scrapy - flexible threaded web crawler based on hpricot and anemone



Norconex HTTP Collector - Enterprise Web Crawler


Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate and store collected data in a repository of your choice (e.g. a search engine). It is very flexible, powerful, easy to extend, and portable.

scrapy-indeed-spider - Scrapy spider for pulling job listings from Indeed



pharmdnepr - Scrapy-based spider that crawls the websites of Dnipropetrovs’k pharmacies.

Arachnode.net


An open source .NET web crawler written in C# using SQL 2005/2008. Arachnode.net is a complete and comprehensive .NET web crawler for downloading, indexing and storing Internet content including e-mail addresses, files, hyperlinks, images, and Web pages.

Norconex HTTP Collector - A Web Crawler in Java


Norconex HTTP Collector is a web spider, or crawler, that aims to make life easier for Enterprise Search integrators and developers. It is portable, extensible, and reusable, supports robots.txt, can obtain and manipulate document metadata, is resumable upon failure, and a lot more.

scrapy-medica - A scraping test using the Scrapy framework, written in Python.

google-play-crawler - A Scrapy script to crawl all Google Play applications