Humanoid - Node.js package to bypass CloudFlare's anti-bot JavaScript challenges

  •        62

Humanoid is a Node.js package that solves and bypasses CloudFlare's JavaScript anti-bot challenges (and, hopefully, those of other WAFs in the future). Anti-bot pages can be solved with headless browsers, but those are heavy and usually considered overkill for scraping. Humanoid solves these challenges in the Node.js runtime and returns the protected HTML page. The session cookies can also be handed to other bots so that they continue scraping and avoid the JS challenges altogether.

https://github.com/evyatarmeged/Humanoid

Dependencies:

cheerio : ^1.0.0-rc.2
iltorb : ^2.4.0
request : ^2.88.0
request-promise-native : ^1.0.5
safe-eval : ^0.4.1
url-parse : ^1.4.3

Related Projects

cloudflare-scrape - A Python module to bypass Cloudflare's anti-bot page.

  •    Python

A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests. Cloudflare changes their techniques periodically, so I will update this repo frequently. This can be useful if you wish to scrape or crawl a website protected with Cloudflare. Cloudflare's anti-bot page currently just checks if the client supports Javascript, though they may add additional techniques in the future.

NAXSI - High performance, low rules maintenance WAF for NGINX

  •    C

NAXSI means Nginx Anti XSS & SQL Injection. NAXSI is an open-source, high performance, low rules maintenance WAF for NGINX. Technically, it is a third party nginx module, available as a package for many UNIX-like platforms. This module, by default, reads a small subset of simple (and readable) rules containing 99% of known patterns involved in website vulnerabilities. For example, <, | or drop are not supposed to be part of a URI.
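The idea of matching a URI against a small set of simple patterns can be sketched roughly as follows. This is only an illustration in Python, not NAXSI's rule syntax or implementation: the `RULES` table, the per-pattern scores, `BLOCK_THRESHOLD`, and the function names are all assumptions made up for the example, using the `<`, `|` and `drop` patterns mentioned above.

```python
from urllib.parse import unquote

# Hypothetical rules: (substring to look for, score it adds).
# These values are invented for illustration and are not NAXSI's real rule set.
RULES = [
    ("<", 8),
    ("|", 8),
    ("drop", 4),
]

# Assumed threshold: a request is blocked once its score reaches this value.
BLOCK_THRESHOLD = 8


def score_uri(uri: str) -> int:
    """Sum the scores of every rule whose pattern occurs in the decoded URI."""
    decoded = unquote(uri).lower()
    return sum(score for pattern, score in RULES if pattern in decoded)


def should_block(uri: str) -> bool:
    """Return True when the accumulated score reaches the threshold."""
    return score_uri(uri) >= BLOCK_THRESHOLD


if __name__ == "__main__":
    for candidate in ["/index.html", "/search?q=%3Cscript%3E", "/items?sort=drop"]:
        print(candidate, score_uri(candidate), should_block(candidate))
```

Scoring each pattern separately, rather than blocking on the first match, lets a weak signal such as the word `drop` pass on its own while a combination of signals still trips the threshold.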

GoBot2 - Second version of the GoBot botnet, but more advanced.

  •    Go

After seeing another user's Go-based botnet, I wanted to do more work on my GoBot, but I ended up building something a bit more. There are issues with this, but it is more of an advanced PoC. I am not a good coder, but I was able to make this by doing some basic reading online. There was more I wanted to do with this project, but I stopped; I am getting out of making malware and viruses and am going to move on to more legitimate things. I will still be posting some of my old projects on my GitHub, and although most of them are malevolent, I am putting them here to make it simpler for the 'good guys' to fight them and their kin. The C&C is a program that you can compile for Windows, Linux, and Mac systems. It is a self-running web server that handles all connections on the port selected in the settings. It will serve the HTML C&C to a connector if you allow it, and it saves data about accounts, bots, and commands as a SQL database and bot files (screenshots, keylogs, etc.) as files under each bot's own "Profile". You can control the botnet from the program (more secure) or from the HTML C&C. The C&C program is extremely stable; Go-based servers are known for handling millions of requests at once without fail, just make sure you have a good connection. The C&C has a built-in hard-coded login (kind of like a backdoor) that you can use if you 'forgot' the account login. The C&C can have any number of accounts. Because it is a self-contained program, this removes the issue of SQLi attacks on the C&C, so it is more secure. The C&C can also run inside a Tor hidden service if configured right, and the client (bot) can connect to it using an onion.to or onion.cab forwarder if needed. Tor can also be used by the bot via a SOCKS proxy. This is simple to do; Google it.

al-khaser - Public malware techniques used in the wild: Virtual Machine, Emulation, Debuggers, Sandbox detection

  •    C++

al-khaser is a PoC "malware" application with good intentions that aims to stress your anti-malware system. It performs a bunch of common malware tricks with the goal of seeing if you stay under the radar. You can download the latest release here: x86 | x64.


GroupButler - This bot can help you in managing your group with rules, anti-flood, description, custom triggers, and much more!

  •    Lua

This bot has been created to help people administrate their groups, and includes many useful tools. Group Butler was born as an otouto v3.1 (@mokubot), but it has been turned into an administration bot.

ferret - Declarative web scraping

  •    Go

ferret is a web scraping system aiming to simplify data extraction from the web for things such as UI testing, machine learning and analytics. Having its own declarative language, ferret abstracts away the technical details and complexity of the underlying technologies, helping you focus on the data itself. It's extremely portable, extensible and fast. The following example demonstrates the use of dynamic pages. First of all, we load the main Google Search page, type search criteria into an input box and then click a search button. The click action triggers a redirect, so we wait until it ends. Once the page is loaded, we iterate over all elements in the search results and assign the output to a variable. The final for loop filters out empty elements that may result from inaccurate use of selectors.

aws-waf-sample - This repository contains example scripts and sets of rules for the AWS WAF service

  •    Python

Examples of sets of rules for the AWS WAF service and scripts to automate the management and configuration of AWS WAF rule sets. These examples include SDK usage, AWS CloudFormation templates and automations using AWS Lambda functions. This example AWS CloudFormation template contains an AWS WAF web access control list (ACL) and condition types and rules that illustrate various mitigations against application flaws described in the OWASP Top 10. However, note that this template is designed only as a starting point and may not provide sufficient protection to every workload. You should customize the template’s rules for each workload. For more information, please review the Use AWS WAF to Mitigate OWASP's Top 10 Web Application Vulnerabilities whitepaper.

rvest - Simple web scraping for R

  •    R

rvest helps you scrape information from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like beautiful soup. Create an html document from a url, a file on disk or a string containing html with read_html().

portia - Visual scraping for Scrapy

  •    Python

Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web page to identify the data you wish to extract, and Portia will understand based on these annotations how to scrape data from similar pages. For more detailed instructions, and alternatives to using Docker, see the Installation docs.

pjscrape - A web-scraping framework written in Javascript, using PhantomJS and jQuery

  •    Javascript

pjscrape is a framework for anyone who's ever wanted a command-line tool for web scraping using Javascript and jQuery. Built for PhantomJS, it allows you to scrape pages in a fully rendered, Javascript-enabled context from the command line, no browser required. Please see http://nrabinowitz.github.io/pjscrape/ for usage, examples, and documentation.

nginx-ultimate-bad-bot-blocker - Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders

  •    Shell

Please make sure you are subscribed to Github Notifications to be notified when the blocker is updated or when any important or mission critical (potentially breaking) changes may take place. This is our new preferred method of installation which is now done through a set of shell scripts contributed to this repo and maintained by Stuart Cardall @itoffshore who is one of the Alpine Linux package maintainers.

ModSecurity - Cross platform Web Application Firewall (WAF)

  •    C

ModSecurity is an open source, cross platform web application firewall (WAF) engine for Apache, IIS and Nginx that is developed by Trustwave's SpiderLabs. It has a robust event-based programming language which provides protection from a range of attacks against web applications and allows for HTTP traffic monitoring, logging and real-time analysis. With over 10,000 deployments world-wide, ModSecurity is the most widely deployed WAF in existence.

dryscrape - [not actively maintained] A lightweight Python library that uses Webkit to enable easy scraping of dynamic, Javascript-heavy web pages

  •    Python

NOTE: This package is not actively maintained. It uses QtWebkit, which is end-of-life and probably doesn't get security fixes backported. Consider using a similar package like Spynner instead. dryscrape is a lightweight web scraping library for Python. It uses a headless Webkit instance to evaluate Javascript on the visited pages. This enables painless scraping of plain web pages as well as Javascript-heavy “Web 2.0” applications like Facebook.

OWASP-Xenotix-XSS-Exploit-Framework - OWASP Xenotix XSS Exploit Framework is an advanced Cross Site Scripting (XSS) vulnerability detection and exploitation framework

  •    Python

OWASP Xenotix XSS Exploit Framework is an advanced Cross Site Scripting (XSS) vulnerability detection and exploitation framework. It provides Zero False Positive scan results with its unique Triple Browser Engine (Trident, WebKit, and Gecko) embedded scanner. It is claimed to have the world’s second-largest collection of XSS payloads, with 1500+ distinctive payloads for effective XSS vulnerability detection and WAF bypass. It incorporates a feature-rich Information Gathering module for target reconnaissance. The Exploit Framework includes highly offensive XSS exploitation modules for penetration testing and proof-of-concept creation. Antivirus solutions may detect it as a threat; however, this is due to the features in the exploitation framework.

janusec - Janusec Application Gateway, a Golang based application security solution which provides WAF (Web Application Firewall), CC attack defense, unified web administration portal, private key protection, web routing and scalable load balancing

  •    Go

Janusec Application Gateway, an application security solution which provides WAF (Web Application Firewall), CC attack defense, unified web administration portal, private key protection, web routing and scalable load balancing. With Janusec, you can build secure and scalable applications. Detailed documentation is available at Janusec Application Gateway Documentation.

scrape - A simple, higher level interface for Go web scraping.

  •    Go

A simple, higher level interface for Go web scraping. When scraping with Go, I find myself redefining tree traversal and other utility functions.
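The kind of tree traversal helper that tends to get rewritten for every scraper might look roughly like the sketch below. It is written in Python with the standard library's `xml.etree.ElementTree` rather than in Go, and it is not the scrape package's API: `walk`, `find_all`, `by_class`, and the sample markup in `DOC` are hypothetical names and data chosen for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List

Matcher = Callable[[ET.Element], bool]


def walk(node: ET.Element) -> Iterator[ET.Element]:
    """Yield the node and all of its descendants in depth-first, pre-order order."""
    yield node
    for child in node:
        yield from walk(child)


def find_all(root: ET.Element, matcher: Matcher) -> List[ET.Element]:
    """Collect every node under root (root included) that satisfies the matcher."""
    return [node for node in walk(root) if matcher(node)]


def by_class(name: str) -> Matcher:
    """Build a matcher that accepts nodes whose class attribute contains name."""
    def matcher(node: ET.Element) -> bool:
        return name in node.get("class", "").split()
    return matcher


# Hypothetical, well-formed sample markup used only to exercise the helpers.
DOC = ET.fromstring(
    '<div><p class="title main">Hello</p>'
    '<ul><li class="item">a</li><li class="item">b</li></ul></div>'
)

if __name__ == "__main__":
    print([node.text for node in find_all(DOC, by_class("item"))])
```

Separating the traversal (`walk`) from the predicate (`by_class`) means new matchers, for example by tag name or by attribute, can be added without touching the traversal code.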