GrabberX

GrabberX is a site-mirroring tool. It is designed to handle websites protected by forms or cookies, JavaScript-generated links, and similar obstacles. The goal is not performance but convenience: a handy tool that helps other enterprise search engines crawl such sites.

http://grabberx.codeplex.com/


   




Related Projects

FAST ESP Web Parts for SharePoint Server 2007


This project provides a set of installable Web parts for integrating FAST ESP search capabilities with SharePoint Server 2007. With these Web parts SharePoint administrators can quickly build ESP-based search sites in SharePoint Server 2007 by simply dropping in and configuring...

Heritrix


Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. Heritrix is designed to respect the robots.txt exclusion directives and META robots tags, and collect material at a measured, adaptive pace unlikely to disrupt normal website activity.

Wildcard Search Web Part for SharePoint 2010


The Wildcard Search web part for MOSS 2007 was wildly successful. Although SharePoint 2010 has built-in wildcard search functionality, the out-of-the-box web part requires the user to add an asterisk to the search query. This web part resolves that issue.

SharePoint Search Service Tool


The SharePoint Search Service Tool is a rich web service client that allows a developer to explore the scopes and managed properties of a given SharePoint Search SSP, build queries in either Keyword or SQL Syntax, submit those queries and examine the raw web service results. ...

Data Extracting SDK


Data Extracting SDK can help you to extract information from the web resources in a simple way.



Add web part page quick search extension for SharePoint Server 2007


This small solution provides a quick search feature on the Add Web Part page in SharePoint Server 2007. You can now quickly find the web part you need by typing only the first letters of its name, without scrolling through a long page.

MiniCrawler - Super-tiny crawler script that will grab links or images from a web page


Super-tiny crawler script that will grab links or images from a web page

podcast - web crawler to grab itunes podcast info


web crawler to grab itunes podcast info

SharePoint Search XSL Samples


This project is a place to share examples of XSL that can be applied to SharePoint search web parts. Products include SharePoint Server 2010, Microsoft Office SharePoint Server 2007, Microsoft Search Server 2008, and Microsoft Search Server 2008 Express.

Norconex HTTP Collector - A Web Crawler in Java


Norconex HTTP Collector is a web spider, or crawler, that aims to make life easier for Enterprise Search integrators and developers. It is portable, extensible, and reusable; it supports robots.txt, can obtain and manipulate document metadata, is resumable upon failure, and a lot more.

Search-Engine-Web-Crawler - Search engine, web crawler, and index maker in Java.


Search engine, web crawler, and index maker in Java.

Nutch - Highly extensible, highly scalable Web crawler


Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.

SharePoint 2007 Wildcard Search


A Microsoft Office SharePoint Server Search web part that allows for WildCard Searches and a second web part for the presentation of the search data using an XSL Transform document.

Gigablast - Web and Enterprise search engine in C++


Gigablast is one of the remaining four search engines in the United States that maintain their own searchable index of over a billion pages. It is scalable to thousands of servers and has scaled to over 12 billion web pages on over 200 servers. It supports distributed web crawling, document conversion, automated data corruption detection and repair, clustering of results from the same site, synonym search, spell checking, and a lot more.

FAST Search for Sharepoint MOSS 2010 Query Tool


Tool to query FAST for SharePoint and SharePoint 2010 Enterprise Search. It uses the search web services to run your queries, so you can test them remotely from your local machine. It shows your results, allows you to refine your query (FAST), and lets you page through your results.

Open Search Server


Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built using the best open source technologies, such as Lucene, zkoss, Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software.

Business Data - web information retrieval


We are trying to develop an open-source website crawler that retrieves business and marketing data from websites or search engines.

Norconex HTTP Collector - Enterprise Web Crawler


Norconex HTTP Collector is a full-featured web crawler (or spider) that can manipulate and store collected data in a repository of your choice (e.g. a search engine). It is very flexible, powerful, easy to extend, and portable.

SharePoint Column Filtered Search Web Part


SharePoint Column Filtered search provides a filtered view of a SharePoint full-text search. Results are filtered by column values selected at runtime. The web part is configured for one or more libraries and associated columns. The user selects column values for results to ma...

.Net helpers for the SharePoint Server 2007 Search Query Web Service.


This project was created for an MSDN article. The code and article demonstrate a number of helper classes that can be used to easily inject queries to the SharePoint Server 2007 Search Query Web Service.