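Anti-Anti-Spider - More and more websites have anti-crawler features: some hide key data in images, others use user-hostile CAPTCHAs. This repository collects anti-anti-crawler code, with the aim of improving technique by taking on websites with different characteristics (no malicious intent). (Submissions of hard-to-scrape websites are welcome.) (The project is paused for work reasons.)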


The project now includes sample code for a number of techniques.

https://www.urlteam.org
https://github.com/luyishisi/Anti-Anti-Spider


   




Related Projects

al-khaser - Public malware techniques used in the wild: Virtual Machine, Emulation, Debuggers, Sandbox detection

  •    C++

al-khaser is a PoC "malware" application with good intentions that aims to stress your anti-malware system. It performs a bunch of common malware tricks with the goal of seeing if you stay under the radar. You can download the latest release here: x86 | x64.

Scrollout F1 - An easy-to-use anti-spam email gateway

  •    C++

Scrollout F1 is an easy to use, already adjusted email firewall (gateway) offering free anti-spam and anti-virus protection aiming to secure existing email servers, old or new, such as Microsoft Exchange, Lotus Domino, Postfix, Exim, Sendmail, Qmail and others.

Spider Compiler


Spider Compiler parses a spider programming source file and compiles it (with the help of csc.exe, the C# compiler) to an exe file. This project is developed in C#.

SPIDER on Rails

  •    Java

SPIDER on Rails (the new name of J2EE Spider) is an open source tool for rapidly developing form-based web applications. See more: http://www.infoq.com/news/2008/03/J2EE-Spider

Super Av Anti Virus


Super Av Anti Virus is an open source antivirus with full source code.


sqlcheck - Automatically identify anti-patterns in SQL queries

  •    C++

sqlcheck automatically detects common SQL anti-patterns. Such anti-patterns often slow down queries, so addressing them helps accelerate queries. sqlcheck targets all major SQL dialects.
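
To make the idea of anti-pattern detection concrete, here is a minimal Python sketch that flags two well-known SQL anti-patterns with regular expressions. It is an illustration only: the rule names, the `RULES` table, and `find_anti_patterns` are hypothetical and do not reflect sqlcheck's actual rule set, interface, or implementation.

```python
import re

# Hypothetical rules for illustration; these are NOT sqlcheck's actual rules.
RULES = [
    ("select-star", re.compile(r"\bselect\s+\*", re.IGNORECASE),
     "SELECT * fetches every column; list the columns you need."),
    ("leading-wildcard", re.compile(r"\blike\s+'%", re.IGNORECASE),
     "A LIKE pattern starting with % usually prevents index use."),
]


def find_anti_patterns(query):
    """Return the names of all rules whose pattern occurs in the query."""
    return [name for name, pattern, _hint in RULES if pattern.search(query)]


if __name__ == "__main__":
    print(find_anti_patterns("SELECT * FROM users WHERE name LIKE '%smith'"))
```

A regex scan like this is only a rough approximation; a real checker would typically need to parse the query to avoid false matches inside string literals or comments.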

cloudflare-scrape - A Python module to bypass Cloudflare's anti-bot page.

  •    Python

A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests. Cloudflare changes its techniques periodically, so I will update this repo frequently. This can be useful if you wish to scrape or crawl a website protected with Cloudflare. Cloudflare's anti-bot page currently just checks whether the client supports JavaScript, though additional techniques may be added in the future.

ScyllaHide - Fork of ScyllaHide: https://bitbucket.org/NtQuery/scyllahide, Releases:

  •    C++

ScyllaHide is an advanced open-source x64/x86 usermode Anti-Anti-Debug library. It hooks various functions in usermode to hide debugging. This tool is intended to stay in usermode (ring3). If you need kernelmode (ring0) Anti-Anti-Debug please see TitanHide https://github.com/mrexodia/titanhide. PE x64 debugging is fully supported with plugins for x64dbg and IDA.

node-rolling-spider - A library for controlling a Parrot Rolling Spider drone via BLE.

  •    Javascript

There are a few steps you should take when getting started with this. We're going to learn how to get there by building out a simple script that will take off, move forward a little, then land. To connect you need to create a new Drone instance.

node-readability - Scrape/Crawl article from any site automatically

  •    Javascript

In my case, the spider processes about 1500k documents per day; the maximum crawling speed is 1.2k/minute (average 1k/minute), memory use is about 200 MB per spider kernel, and accuracy is about 90%. The remaining 10% can be fixed by customizing Score Rules or Selectors. It's better than any other readability module.

Monkey-Spider

  •    Python

The Monkey-Spider is a crawler-based low-interaction honeyclient project. It is not restricted to this use, but it is developed as such. The Monkey-Spider crawls Web sites to expose their threats to Web clients.

dhtspider - Bittorrent dht network spider

  •    Javascript

BitTorrent DHT network infohash spider for engiy.com (a BitTorrent resource search engine).

scrapy-examples - Multifarious Scrapy examples

  •    Python

Multifarious Scrapy examples with integrated proxies and agents, which make it comfortable to write a spider. There are several depths in the spider, and the spider gets real data from depth2.

php-spider - A configurable and extensible PHP web spider

  •    PHP

The easiest way to install PHP-Spider is with composer. Find it on Packagist. This is a very simple example. This code can be found in example/example_simple.php. For a more complete example with some logging, caching and filters, see example/example_complex.php. That file contains a more real-world example.

ClamAV for OS X

  •    Objective-C

A Macintosh OS X anti-virus software that uses the ClamAV anti-virus library. The project's focus is on usability. Its purpose is to develop native GUI-based binary distributions of a ClamAV-based anti-virus software that behaves as OS X users expect.

Anti Inference Hub

  •    Java

Anti Inference Hub is the first dynamic query processing engine that defends against the Inference Problem in Multilevel Databases by integrating smoothly with common DBMSs (Oracle, PostgreSQL, and MySQL), and monitoring queries submitted by users. Please post your questions to Anti Inference Hub mailing list at: https://lists.sourceforge.net/lists/listinfo/aih-list.

Haze Anti-Virus

  •    CSharp

Haze Anti-Virus is an antivirus written in native C++; it uses signature and heuristic scanning. This antivirus is aimed at providing all users with a secure computer environment, by being simple to use while still packing more features than other, more complex antivirus so...

jASEN - java Anti Spam ENgine

  •    Java

jASEN is a pure Java Anti Spam ENgine combining Bayesian-like scanning with intelligent email inspection and classification. jASEN is best suited to developers wishing to integrate anti-spam services into an existing server-based Java email application.

gdkxft: anti-aliased fonts for gtk+-1.2

  •    C

Gdkxft transparently adds anti-aliased font support to gtk+-1.2. Once you have installed it, you can run any (well, nearly any) existing gtk+ binary and see anti-aliased fonts in the gtk widgets. You don't need to recompile gtk+ or your applications.




