broken-link-checker - Find broken links, missing images, etc. in your HTML.

Find broken links, missing images, etc. in your HTML. Requires Node.js >= 0.10; versions below 4.0 also need Promise and Object.assign polyfills.
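Whether a runtime needs those two polyfills can be checked with simple feature detection. The sketch below is only an illustration: `missingFeatures` is a hypothetical helper, not part of broken-link-checker, and it is written in ES5 so that it could also run on the old Node.js versions in question.

```javascript
// Hypothetical helper (not part of broken-link-checker): report which of the
// two features named in the requirements are missing from a global object.
function missingFeatures(globalObject) {
  var missing = [];
  if (typeof globalObject.Promise !== 'function') {
    missing.push('Promise');
  }
  if (!globalObject.Object || typeof globalObject.Object.assign !== 'function') {
    missing.push('Object.assign');
  }
  return missing;
}
```

Calling `missingFeatures(global)` at startup gives a list of polyfills to load before requiring the package; on a current Node.js release the list should be empty.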

https://github.com/stevenvachon/broken-link-checker

Dependencies:

bhttp : ^1.2.1
calmcard : ~0.1.1
chalk : ^1.1.3
char-spinner : ^1.0.1
condense-whitespace : ^1.0.0
default-user-agent : ^1.0.0
errno : ~0.1.4
extend : ^3.0.0
humanize-duration : ^3.9.1
http-equiv-refresh : ^1.0.0
is-stream : ^1.0.1
is-string : ^1.0.4
limited-request-queue : ^2.0.0
link-types : ^1.1.0
maybe-callback : ^2.1.0
nopter : ~0.3.0
parse5 : ^3.0.2
robot-directives : ~0.3.0
robots-txt-guard : ~0.1.0
robots-txt-parse : ~0.0.4
urlcache : ~0.7.0
urlobj : 0.0.11

Related Projects

sitecheck

  •    Python

Modular web site spider for web developers.

check-links - Robustly checks an array of URLs for liveness. Extremely fast ⚡

  •    Javascript

Robustly checks an array of URLs for liveness. For each URL, it first attempts an HTTP HEAD request; if that fails, it attempts an HTTP GET request, retrying several times by default with exponential backoff.
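The HEAD-then-GET strategy with retries can be sketched roughly as follows. This is a minimal illustration, not check-links' actual implementation: the names `checkUrl`, `attempt`, `backoffDelays`, the `fetchImpl` option, and all default values are assumptions made for the example. It uses the global `fetch` found in Node.js 18 and later unless a replacement is passed in.

```javascript
// Hypothetical sketch: HEAD first, GET as a fallback, retried with
// exponential backoff. None of these names or defaults come from check-links.

// Delay before each retry: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelays(retries, baseMs) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// One attempt: try HEAD, then fall back to GET if HEAD throws or is not OK.
async function attempt(url, fetchImpl) {
  for (const method of ['HEAD', 'GET']) {
    try {
      const res = await fetchImpl(url, { method });
      if (res.ok) return true;
    } catch (err) {
      // Network error: fall through to the next method.
    }
  }
  return false;
}

// Resolves to 'alive' or 'dead' after at most retries + 1 attempts.
async function checkUrl(url, options = {}) {
  const { fetchImpl = fetch, retries = 2, baseMs = 100 } = options;
  if (await attempt(url, fetchImpl)) return 'alive';
  for (const delay of backoffDelays(retries, baseMs)) {
    await sleep(delay);
    if (await attempt(url, fetchImpl)) return 'alive';
  }
  return 'dead';
}
```

With these assumed defaults, `backoffDelays(3, 100)` yields `[100, 200, 400]`, so each failed round waits twice as long as the previous one before trying again. Passing a stub as `fetchImpl` makes the logic testable without network access.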

SharePoint Link Checker

SharePoint Link Checker can be used by administrators to schedule scans of site collections and report on broken links that are found in publishing content, link fields, rich text fields, summary link fields/web parts and content editor web parts.

Linkrot checker

Linkrot scans a site for inaccessible links (HTTP errors 404, 500, etc.) and saves a log of bad links that you can open in Excel. It's a Windows console application developed in C# (.NET 2.0 stack). Simple, single-threaded crawling for dead links, broken links, and dangling links.


html-proofer - Test your rendered HTML files to make sure they're accurate.

  •    Ruby

If you generate HTML files, then this tool might be for you. HTMLProofer is a set of tests to validate your HTML output. These tests check if your image references are legitimate, if they have alt tags, if your internal links are working, and so on. It's intended to be an all-in-one checker for your output.

PHPSPELLBOOK

  •    PHP

PHPSPELLBOOK is a suite of tools for website promotion, diagnosis and improvement. It provides webmasters with tools such as advertisement submission, a mass mailer, a broken link checker, a link exchange checker, fake click generators, anonymizer tools, etc.

Jenu -- The Java URL link checker

  •    Java

Jenu is a multi-threaded, Java-based graphical WWW link checker. It should run with any JDK 1.3 runtime.

Dead Link Check (DLC)

  •    Perl

DLC is an HTTP link checker written in Perl. It can generate HTML output for easy checking of results and process a link cache file to speed up repeated requests. Initially created as an extension to Public Bookmark Generator (PBM), it can also be used alone.

ht://Check

  •    PHP

ht://Check is more than a link checker. It's particularly suitable for checking broken links, anchors and web accessibility barriers, but retrieved data can also be used for Web structure mining. Uses a MySQL backend. Derived from ht://Dig.

Link-Master

Link-Checker is a Java-based application which provides further information about external links on a local website.

parse5 - HTML parsing/serialization toolset for Node

  •    Javascript

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant. parse5 provides nearly everything you may need when dealing with HTML. It's the fastest spec-compliant HTML parser for Node to date. It parses HTML the way the latest version of your browser does. It has proven itself reliable in such projects as jsdom, Angular2, Polymer and many more.

dcrawl - Simple, but smart, multi-threaded web crawler for randomly gathering huge lists of unique domain names

  •    Go

dcrawl is a simple, but smart, multi-threaded web crawler for randomly gathering huge lists of unique domain names. dcrawl takes one site URL as input and detects all <a href=...> links in the site's body. Each found link is put into the queue. Successively, each queued link is crawled in the same way, branching out to more URLs found in links on each site's body.
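The queue-driven crawl described here can be sketched in a few lines. This is not dcrawl's code (dcrawl is written in Go and is multi-threaded); it is a single-threaded illustration in which `extractLinks`, `crawlDomains`, the `getBody` callback, and the `maxPages` limit are all hypothetical names chosen for the example. A regular expression stands in for a real HTML parser.

```javascript
// Hypothetical sketch of queue-based link crawling; not dcrawl's implementation.

// Pull href values out of <a ...> tags. A regex is enough for a sketch;
// a real crawler should use an HTML parser instead.
function extractLinks(html) {
  const links = [];
  const pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Breadth-first crawl. getBody(url) is an assumed callback that returns the
// page HTML, or null when the page cannot be fetched.
// Returns the unique hostnames encountered.
function crawlDomains(startUrl, getBody, maxPages = 100) {
  const queue = [startUrl];
  const visited = new Set();
  const domains = new Set();
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    domains.add(new URL(url).hostname);
    const body = getBody(url);
    if (body == null) continue;
    for (const href of extractLinks(body)) {
      let next;
      try {
        next = new URL(href, url); // resolve relative links against the page
      } catch (err) {
        continue; // skip malformed URLs
      }
      if (next.protocol === 'http:' || next.protocol === 'https:') {
        queue.push(next.href);
      }
    }
  }
  return [...domains];
}
```

Keeping page retrieval behind `getBody` separates the queue logic from networking, so the traversal can be exercised against an in-memory map of pages.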

muffet - Fast website link checker in Go

  •    Go

Muffet is a website link checker which scrapes and inspects all pages in a website recursively. For more information, see muffet --help.

WebFix

  •    CSharp

C# library and application to help maintain large websites. Goals for this project right now include: Site Crawler, Link Checker, (X)HTML / CSS compliance checker, missing images and files report, Metrics and Statistics, Fancy Reporting - Intuitive UI

bootlint - HTML linter for Bootstrap projects

  •    Javascript

Bootlint is a tool that checks for several common HTML mistakes in webpages that are using Bootstrap in a fairly "vanilla" way. Vanilla Bootstrap's components/widgets require their parts of the DOM to conform to certain structures. Bootlint checks that instances of Bootstrap components have correctly-structured HTML. Optimal usage of Bootstrap also requires that your pages include certain <meta> tags, an HTML5 doctype declaration, etc.; Bootlint checks that these are present. Bootlint assumes that your webpage is already valid HTML5. If you need to check HTML5 validity, we recommend tools like vnu.jar, grunt-html, or grunt-html-validation.

Seaside - Web framework for Smalltalk platforms

  •    Pharo

Seaside provides a layered set of abstractions over HTTP and HTML that let you build highly interactive web applications quickly, reusably and maintainably. It is based on Smalltalk, a proven and robust language that is implemented by different vendors.

Toplinker

A set of two scripts. The first runs for as long as you allow it, finding all the links it can reach from a given URL, and saves them to a .txt file. The second sorts the links in order of increasing popularity, measured by the number of occurrences of each URL, and saves the list to a different .txt file with duplicates removed and the occurrence count next to each link. Both scripts need to be in the same directory to work.
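The counting and sorting step of the second script might look something like the sketch below. It is an illustration only, not Toplinker's code: the function name `rankLinks` and the "URL count" output format are assumptions.

```javascript
// Hypothetical sketch of the counting step; not Toplinker's actual code.
// Takes a list of URLs (with repeats) and returns one "url count" line per
// unique URL, least popular first.
function rankLinks(links) {
  const counts = new Map();
  for (const link of links) {
    counts.set(link, (counts.get(link) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[1] - b[1]) // ascending by number of occurrences
    .map(([link, count]) => `${link} ${count}`);
}
```

For example, `rankLinks(['a', 'b', 'a', 'c', 'a', 'b'])` returns `['c 1', 'b 2', 'a 3']`. Reading the first script's .txt file line by line and writing the result back out would complete the pipeline.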

Simple Web Spider

  •    Java

Other spiders have a limited link depth, do not follow links in random order, or are combined with heavy indexing machinery. This spider has no link depth limit and picks the next URL to check for new URLs at random.