paperwork - Personal document manager (Linux/Windows)

  •    Python

Paperwork is a personal document manager. It manages scanned documents and PDFs. It's designed to be easy and fast to use. The idea behind Paperwork is "scan & forget": you can just scan a new document and forget about it until the day you need it again.

FileMasta - Search servers for video, music, books, software, games, subtitles and much more

  •    CSharp

FileMasta is a search engine that lets you find a file among millions of files located on FTP servers. The search engine database contains regularly updated information on the contents of thousands of FTP servers worldwide. We don't search the contents of the files. We host no content; we provide only access to already available files, in the same way Google and other search engines do.

Index Analysis


Index Analysis allows DBAs and developers to analyze the indexes in a database to determine the use, characteristics, and statistics of each index. The project contains the query, written in T-SQL, and a report that can be opened as a custom report in SSMS.
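The project ships its own T-SQL query; as a minimal sketch of the kind of index-usage analysis involved, the following writes a query against SQL Server's documented `sys.dm_db_index_usage_stats` DMV (the table and column choices here are illustrative, not the project's actual query):

```shell
# Write a minimal index-usage query of the kind the project ships.
# This sketch uses only the standard sys.dm_db_index_usage_stats DMV.
cat > index_usage.sql <<'SQL'
SELECT  OBJECT_NAME(s.object_id)  AS table_name,
        i.name                    AS index_name,
        s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM    sys.dm_db_index_usage_stats AS s
JOIN    sys.indexes AS i
        ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE   s.database_id = DB_ID()
ORDER BY s.user_seeks + s.user_scans + s.user_lookups DESC;
SQL

# Hypothetical invocation against a local SQL Server instance:
# sqlcmd -S localhost -d MyDatabase -i index_usage.sql
```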

SharePoint OCR image files indexing


IFilter plugin for the Microsoft Indexing Service (and SharePoint in particular) to index and search image files (including TIFF, PDF, JPEG, BMP...) using OCR technology.

hypopg - Hypothetical Indexes for PostgreSQL

  •    C

HypoPG is a PostgreSQL extension adding support for hypothetical indexes. A hypothetical, or virtual, index is an index that doesn't really exist, and thus costs no CPU, disk, or other resources to create. Hypothetical indexes are useful for finding out whether a specific index would improve performance for problematic queries: you can see whether PostgreSQL would use the index without having to spend resources to create it. For more information on using this extension, see this blog post.
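The workflow above can be sketched with HypoPG's documented functions, `hypopg_create_index()` and `hypopg_reset()`. The table name `orders` is only an example, and running the script requires a PostgreSQL server with the extension installed, so the `psql` call is left commented out:

```shell
# Demo of hypopg's workflow: create a hypothetical index, then EXPLAIN a
# query to see whether the planner would use it.
cat > hypo_demo.sql <<'SQL'
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Create a hypothetical index; nothing is written to disk.
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

-- Plain EXPLAIN (without ANALYZE) sees hypothetical indexes, so the plan
-- shows whether PostgreSQL would use the index if it existed.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Drop all hypothetical indexes for this session.
SELECT hypopg_reset();
SQL

# Hypothetical invocation; "mydb" and the "orders" table are assumptions.
# psql -d mydb -f hypo_demo.sql
```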

spotweb - Decentralized community

  •    PHP

Spotweb is a decentralized usenet community based on the Spotnet protocol. Spotweb requires an operational web server with PHP 5 installed, and uses either a MySQL or a PostgreSQL database to store its contents.

ethql - A GraphQL interface to Ethereum :fire:

  •    TypeScript

EthQL is a server that exposes a GraphQL endpoint to the public Ethereum ledger. It works against the standard JSON-RPC APIs offered by all Ethereum clients. It is built in TypeScript, and thus leverages the vast ecosystem of GraphQL tooling while preserving compile-time type safety.
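As a sketch of what querying such an endpoint looks like, the following builds a GraphQL payload and posts it with `curl`. The block fields and the `localhost:4000/graphql` address are illustrative assumptions, not a confirmed EthQL schema; the server-dependent command is left commented out:

```shell
# Build a GraphQL query payload. The field names (block, hash,
# transactionCount) are assumptions for illustration only.
cat > query.json <<'JSON'
{"query": "{ block(number: 5000000) { hash transactionCount } }"}
JSON

# Hypothetical invocation, assuming an EthQL server on localhost:4000:
# curl -s -X POST -H 'Content-Type: application/json' \
#      -d @query.json http://localhost:4000/graphql
```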

Foundatio.Repositories - Generic repositories

  •    CSharp

Generic repository contract and implementations. Currently only implemented for Elasticsearch, but there are plans for other implementations.

changes-index - create indexes from a leveldb changes feed

  •    Javascript

This package provides a way to create a materialized view on top of an append-only log. To update an index, just change the index code and delete the indexed data.


  •    Javascript

A simple and flexible indexer for LevelDB, built on LevelUP and map-reduce, allowing asynchronous index calculation. After initialising Mapped Index, your LevelUP instance will have some new methods that let you register new indexes and fetch values from them.

hyperdrive-index - index changes to a hyperdrive feed

  •    Javascript

Use this package to generate indexes to quickly answer questions about files written to hyperdrive. For example, you could create an index that parses exif headers and generates thumbnails for a p2p photo album served over a hyperdrive. This example indexes the number of lines in each file written to hyperdrive.

Crawlme - Ajax crawling for your web application

  •    Javascript

A Connect/Express middleware that makes your Node.js web application indexable by search engines. Crawlme generates static HTML snapshots of your JavaScript web application on the fly and has a built-in, periodically refreshing in-memory cache, so even though snapshot generation may take a second or two, search engines get the snapshots very fast. This is beneficial for SEO, since response time is one of the factors used in the page-rank algorithm.

Making Ajax applications crawlable has always been tricky, since search engines don't execute the JavaScript on the sites they crawl. The solution is to provide search engines with pre-rendered HTML versions of each page on your site, but creating those HTML versions has until now been a tedious and error-prone process with many manual steps. Crawlme fixes this by rendering HTML snapshots of your web application on the fly whenever the Googlebot crawls your site. Apart from making the process of more or less manually creating indexable HTML versions of your site obsolete, this also means Google will always index the latest version of your site and not some old pre-rendered version.

esbulk - elasticsearch bulk indexing for newline delimited JSON.

  •    Go

Please note that if indexing fails partway through, some documents are indexed and some are not: your index will be in an inconsistent state, since there is no transactional bracket around the indexing process. However, the defaults (parallelism: number of cores) on a single-node setup will just work. For larger clusters, increase the number of workers until you see full CPU utilization; after that, more workers won't buy any more speed.
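A minimal sketch of the input format and an invocation: one JSON document per line. The `-index` and `-w` (workers) flags follow esbulk's usual command-line usage, but check `esbulk -h` for your version; the Elasticsearch-dependent command is left commented out:

```shell
# Generate a small newline-delimited JSON file: one document per line.
printf '%s\n' \
  '{"id": "1", "title": "first"}' \
  '{"id": "2", "title": "second"}' > docs.ldj

# Hypothetical invocation against a local Elasticsearch node:
# esbulk -index myindex -w 8 docs.ldj
```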

solrbulk - SOLR bulk indexing utility for the command line.

  •    Go

solrbulk expects as input a file with line-delimited JSON, where each line represents a single document. solrbulk takes care of reformatting the documents into the bulk JSON format that SOLR understands, and sends documents in batches and in parallel. The number of documents per batch can be set via -size, the number of workers with -w.
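A quick sketch of that workflow: `-size` and `-w` are the batch-size and worker flags described above, while the `-server` flag and URL are assumptions for a local Solr core, so the Solr-dependent command is left commented out:

```shell
# One JSON document per line, as solrbulk expects.
printf '%s\n' \
  '{"id": "doc1", "title": "first"}' \
  '{"id": "doc2", "title": "second"}' > docs.ldj

# Hypothetical invocation: batches of 1000 documents, 4 parallel workers.
# solrbulk -server http://localhost:8983/solr/mycore -size 1000 -w 4 docs.ldj
```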