ambar - Ambar: Document Search System

  •        96

Ambar is an open-source document search and management system with automated crawling, OCR, tagging and instant full-text search. There are two editions available: Community and Enterprise. Enterprise Edition is a full-featured document search and management system that can handle terabytes of data.

https://ambar.cloud/
https://github.com/RD17/ambar

Related Projects

Solr - Blazing-fast, open source enterprise search platform

  •    Java

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Xapian - Search Engine Library

  •    C++

Xapian is an Open Source Search Engine Library. It is written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby. Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators.

magnetico - Autonomous (self-hosted) BitTorrent DHT search engine suite.

  •    Python

Autonomous (self-hosted) BitTorrent DHT search engine suite. Its two programs, combined, allow anyone with a decent Internet connection to access the vast number of torrents waiting to be discovered within the BitTorrent DHT space, without relying on any central entity.

Sourcegraph - Code search and intelligence, self-hosted and scalable

  •    Go

Sourcegraph is a fast, open-source, fully-featured code search and navigation engine. It provides fast global code search with a hybrid backend that combines a trigram index with in-memory streaming, and code intelligence for many languages via the Language Server Protocol.
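To make the trigram index idea concrete, here is a minimal Python sketch. It is not Sourcegraph's implementation; the function names (`trigrams`, `build_index`, `search`) and the sample files are hypothetical, and it only illustrates the general technique: index every 3-character substring, intersect the posting sets for the query's trigrams, then confirm each candidate with a real substring check.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document names that contain it."""
    index = defaultdict(set)
    for name, content in docs.items():
        for gram in trigrams(content):
            index[gram].add(name)
    return index


def search(index, docs, query):
    """Return the sorted names of documents containing query as a substring."""
    grams = trigrams(query)
    if not grams:
        # Queries shorter than 3 characters cannot use the index.
        candidates = set(docs)
    else:
        # Only documents containing every query trigram can match.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # The final check removes documents that hold all the trigrams
    # but not as one contiguous string.
    return sorted(name for name in candidates if query in docs[name])


# Hypothetical sample corpus.
docs = {
    "a.go": "func main() { fmt.Println(x) }",
    "b.py": "def main(): print(x)",
    "c.txt": "remain calm",
}
index = build_index(docs)
print(search(index, docs, "main("))  # ['a.go', 'b.py']
```

The index only narrows the candidate set; the substring check at the end is what guarantees correctness.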

PDFBox - Java PDF library

  •    Java

Apache PDFBox is an open source Java PDF library for working with PDF documents. This library allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. It provides support for adding bookmarks, fonts, text extraction, encryption, PDF printing and a lot more.


Vespa - Yahoo's big data serving engine

  •    Java

Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time. Vespa is the serving platform for Yahoo.com, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, and Flickr.

Manticore Search - High performance full-text search engine with SQL and JSON support

  •    C++

Manticore Search is an open source, high-performance, full-text search engine. It is a fork of Sphinx Search. Because Manticore Search is written in C++, it is fast and light on resources, and you don't have to worry about a garbage collector suddenly causing trouble.

Open Search Server

  •    Java

Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built using open source technologies such as Lucene, ZKoss, Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software.

algoliasearch-client-php - Algolia Search API Client for PHP

  •    PHP

Algolia Search is a hosted full-text, numerical, and faceted search engine capable of delivering realtime results from the first keystroke. The Algolia Search API Client for PHP lets you easily use the Algolia Search REST API from your PHP code.

Yioop - Open Source Search Engine Software

  •    PHP

Yioop is an open source PHP search engine capable of crawling, indexing, and providing search results for hundreds of millions of pages on relatively low-end hardware. It can index a variety of text formats (HTML, RSS, PDF, RTF, DOC) and images (GIF, JPEG, PNG, etc.). It can import data from ARC, WARC, MediaWiki, and Open Directory RDF. It is easily localized to many languages. It has built-in support for news feeds, discussion groups, blogs, and wikis. It also supports mixing indexes to create mashups.

DropboxBrowser - A simple iOS Dropbox PDF Document Browser - list Dropbox, browse directory, download PDF Documents

  •    Objective-C

Dropbox Browser provides a simple and effective way to browse, search, and download files using Dropbox's API and SDK. In a few minutes you'll have a working Dropbox file browser in your app that lets users browse and download their files. Dropbox Browser has a great interface built for iOS 7, solid file handling features, notification integration, background support, and file search capability.

Constellio - Enterprise Search engine

  •    Java

Constellio Open Source Enterprise Search is based on Apache Solr and uses the Google Search Appliance connector architecture. With a single click, it lets you find all relevant content in your organization (Web, email, ECM, CRM, etc.).

Sphinx - Search server

  •    C++

Sphinx is a free, open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.

tantivy - Tantivy is a full-text search engine library inspired by Lucene and written in Rust

  •    Rust

Tantivy is a full-text search engine library written in Rust. It is closer to Lucene than to Elasticsearch and Solr in the sense that it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine.

Jumper - Collaborative search engine in PHP

  •    PHP

Jumper 2.0 is a collaborative community search platform that revolutionizes search by crowdsourcing knowledge management powered by a shared bookmarking engine. It is easily and quickly deployed into a community of practice that benefits users with complex and specialized search requirements. Jumper delivers universal search of any databases, flat files, fileshares, content systems, web pages, blogs and wikis, even people - through one simple search box.

SharePoint OCR image files indexing

  •    

IFilter plugin for the Microsoft Indexing Service (and SharePoint in particular) to index and search image files (including TIFF, PDF, JPEG, BMP...) using OCR technology.

JSSindex: JavaScript Search Engine

  •    JavaScript

JSSindex (The JavaScript Search Engine) provides full-text search for collections of documents in HTML, PS, PDF, and DjVu. The index and query engine are entirely contained in JavaScript/HTML files. Therefore, searching merely requires a Web browser.

algoliasearch-client-javascript - 🔎 Algolia Search API Client for JavaScript platforms

  •    JavaScript

Algolia Search is a hosted full-text, numerical, and faceted search engine capable of delivering realtime results from the first keystroke. The Algolia Search API Client for JavaScript lets you easily use the Algolia Search REST API from your JavaScript code. The JavaScript client works both on the frontend (browsers) and on the backend (Node.js) with the same API.

tntsearch - A fully featured full text search engine written in PHP

  •    PHP

We also created some demo pages that show tolerant retrieval with n-grams in action. The package has a bunch of helper functions, such as Jaro-Winkler and cosine similarity, for distance calculations. It supports stemming for English, Croatian, Arabic, Italian, Russian, Portuguese and Ukrainian. If the built-in stemmers aren't enough, the engine lets you easily plug in any compatible Snowball stemmer. Some forks of the package even support Chinese. Unlike many other engines, the index can be easily updated without doing a reindex or using deltas.
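As a rough illustration of tolerant retrieval with n-grams and cosine similarity, here is a short Python sketch. It does not use the tntsearch API (which is PHP); the helper names `ngrams`, `cosine` and `suggest`, the `$` padding character, and the 0.4 threshold are all assumptions chosen for the example.

```python
import math
from collections import Counter


def ngrams(word, n=3):
    """Count the character n-grams of word, padded so word edges are represented."""
    pad = "$" * (n - 1)
    padded = f"{pad}{word.lower()}{pad}"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest(term, vocabulary, threshold=0.4):
    """Return vocabulary words similar to term, best match first."""
    query = ngrams(term)
    scored = [(cosine(query, ngrams(word)), word) for word in vocabulary]
    return [word for score, word in sorted(scored, reverse=True)
            if score >= threshold]


# A misspelled query still finds the intended term.
print(suggest("serach", ["search", "server", "index"]))  # ['search']
```

Because a typo disturbs only a few of a word's n-grams, the misspelled query keeps a high overlap with the intended term while unrelated words score near zero.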

Strus - Full text Search Engine in C++

  •    C++

The open source project strus provides a collection of C++ (C++98) libraries and command line tools for building a full-text search engine. The strus search engine can be built using any key-value store database that provides an upper bound seek function for the stored key/value pairs. Currently there exists an implementation based on the LevelDB library.