HtmlFilter - Asp.Net and SharePoint Regex Response.Filter


HtmlFilter lets you intercept the HTML that the server sends to the browser and rewrite it with regular expressions before it is returned. It works through the Page.Response.Filter property.
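The general idea of a regex-based response filter can be sketched outside ASP.NET. The Python below is only an illustration of the pattern, not HtmlFilter's code or API: the names `RegexFilter`, `render`, and the rule format are hypothetical. It assumes the filter sits between the page and the client, receives output in chunks, and buffers everything until the end so that a pattern spanning two chunks still matches.

```python
import io
import re


class RegexFilter:
    """Hypothetical output filter: buffers written text, then applies regex rules."""

    def __init__(self, downstream, rules):
        # downstream: any object with a write(str) method (the "real" output).
        # rules: list of (pattern, replacement) pairs, applied in order.
        self.downstream = downstream
        self.rules = [
            (re.compile(pattern, re.IGNORECASE | re.DOTALL), replacement)
            for pattern, replacement in rules
        ]
        self.chunks = []

    def write(self, text):
        # Do not filter yet: a tag may be split across two chunks.
        self.chunks.append(text)
        return len(text)

    def close(self):
        # Join the buffered output, run every rule, then pass the result on.
        html = "".join(self.chunks)
        self.chunks = []
        for pattern, replacement in self.rules:
            html = pattern.sub(replacement, html)
        self.downstream.write(html)


def render(rules, chunks):
    """Push chunks through a RegexFilter and return what reaches the client."""
    out = io.StringIO()
    response_filter = RegexFilter(out, rules)
    for chunk in chunks:
        response_filter.write(chunk)
    response_filter.close()
    return out.getvalue()


if __name__ == "__main__":
    rules = [
        (r"<!--.*?-->", ""),                                # strip HTML comments
        (r"<title>.*?</title>", "<title>Filtered</title>"),  # rewrite the title
    ]
    print(render(rules, ["<html><head><ti", "tle>Old</title></head></html>"]))
```

Buffering the whole response trades memory for correctness; a filter that rewrote each chunk as it arrived would miss the split `<title>` tag in the usage example.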



Related Projects

taggie - The tiniest little HTML/XML parser...using regex

The tiniest little HTML/XML parser...using regex

Python-Dom-Parser - parses html/xml DOM only regex method

Parses an HTML/XML DOM using only regular expressions.

CEOL - Add Flash, Iframes, etc. to MOSS HTML Content Editor

The MOSS HTML Content Editor Object Link allows editors of MOSS Publishing sites to embed HTML object elements directly into content areas. For example, it can be used to embed YouTube videos directly in content areas, rather than only through a Web Part.

Neko HTML Parser - simple HTML scanner

NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces. The parser can scan HTML files and fix up many common mistakes that human (and computer) authors make in writing HTML documents. NekoHTML adds missing parent elements, automatically closes elements with optional end tags, and can handle mismatched inline element tags.

TagSoup - HTML/XML parser for Haskell

TagSoup is a library for parsing HTML/XML. It supports the HTML 5 specification, and can be used to parse either well-formed XML, or unstructured and malformed HTML from the web. The library also provides useful functions to extract information from an HTML document, making it ideal for screen-scraping.


An example RegEx for lexing title tags in valid html docs. HOW DARE I USE REGEX AND HTML IN THE SAME SENTENCE!

Hpricot - HTML parser for Ruby

Hpricot is a fast, flexible HTML parser. Hpricot can be handy for reading broken XML files, since many of the same techniques can be used. If a quote is missing, Hpricot tries to figure it out. If tags overlap, Hpricot works on sorting them out.

filter-reply-mails - Filter and trim plain text and html parts of mails fetched from an IMAP folder

The Perl script connects to an IMAP server and fetches (and by default deletes) all mails from a specific IMAP folder to a temporary file system folder. All fetched mails are then trimmed using regexes from two files plus a file of CSS selectors that query the DOM of HTML mails. One regex file holds regexes for the plain text parts of the mails, while the other holds regexes for the HTML parts. Each line in these files defines one regex. CSS selectors are used to remove the matching elements. HTML parts can reference images, so all images referenced by trimmed content are removed from the mail. After a mail has been trimmed, it is moved to the destination file system folder. I tried to use self-explanatory options and arguments for the script. If you need further details, please execute the script with the --help argument or have a look at the example below.

Html Agility Pack

This is an HTML parser that builds a read/write DOM from "real world" HTML files. It supports XPath or XSLT and is tolerant of "real world" malformed HTML.

HtmlCleaner - HTML parser in Java

HtmlCleaner is an HTML parser written in Java. HTML found on the Web is usually dirty, ill-formed and unsuitable for further processing. HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those most web browsers use to create the Document Object Model. However, the user may provide a custom tag and rule set for tag filtering and balancing.

JTidy - HTML parser and pretty printer in Java

JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In addition, JTidy provides a DOM interface to the document that is being processed, which effectively lets you use JTidy as a DOM parser for real-world HTML.

perl-HTML-StripScripts-Parser - HTML::StripScripts::Parser - XSS filter using HTML::Parser

HTML::StripScripts::Parser - XSS filter using HTML::Parser


Concatenates the files specified in blocks in HTML and replaces the references with a reference to the new file. Uses an HTML parser rather than regex.

regex-parser - Play with ideas from

Play with ideas from

Nokogiri - HTML, XML, SAX, and Reader parser with XPath and CSS selector support

Nokogiri (鋸) is an HTML, XML, SAX, DOM parser. Among Nokogiri's many features are the ability to search documents via XPath or CSS3 selectors, an XML/HTML builder, and an XSLT transformer. Nokogiri parses and searches XML/HTML using native libraries (either C or Java, depending on your Ruby), which means it's fast and standards-compliant.


An HTML parser that turns badly formatted HTML into XPath-queryable XML. Similar to HTML Tidy and Html Agility Pack; I suppose you can call it "Just Another Html Parser". Written in C# and does not require anything that isn't found in the .NET Framework.