pytrec_eval - pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval

  •        121

pytrec_eval is a Python interface to TREC's evaluation tool, trec_eval. It is an attempt to stop the cultivation of custom implementations of Information Retrieval evaluation measures for the Python programming language. The module was developed using Python 3.5. You need a Python distribution that comes with development headers. In addition to the default Python modules, numpy and scipy are required.
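To make concrete what kind of measure an evaluation tool such as trec_eval reports, the following is a minimal plain-Python sketch of average precision and its mean over queries. It is not pytrec_eval's API. The helper names, the nested-dictionary layout for relevance judgments and retrieval scores, and the tie-breaking rule are all assumptions chosen for illustration.

```python
from typing import Dict

Qrels = Dict[str, Dict[str, int]]    # query id -> {doc id: relevance grade}
Run = Dict[str, Dict[str, float]]    # query id -> {doc id: retrieval score}


def average_precision(judged: Dict[str, int], scored: Dict[str, float]) -> float:
    """Average precision for one query (hypothetical helper, not a library call)."""
    relevant = {doc for doc, grade in judged.items() if grade > 0}
    if not relevant:
        return 0.0
    # Rank by descending score. Ties are broken by doc id, which is an
    # arbitrary choice for this sketch.
    ranking = sorted(scored, key=lambda doc: (-scored[doc], doc))
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(qrels: Qrels, run: Run) -> float:
    """Mean of per-query average precision over the queries in qrels (an assumption)."""
    per_query = [average_precision(qrels[q], run.get(q, {})) for q in qrels]
    return sum(per_query) / len(per_query) if per_query else 0.0


# Toy data, invented for this example.
qrels = {"q1": {"d1": 0, "d2": 1, "d3": 0}, "q2": {"d2": 1, "d3": 1}}
run = {
    "q1": {"d1": 1.0, "d2": 0.0, "d3": 1.5},
    "q2": {"d1": 1.5, "d2": 0.2, "d3": 0.5},
}

ap_q1 = average_precision(qrels["q1"], run["q1"])
ap_q2 = average_precision(qrels["q2"], run["q2"])
map_score = mean_average_precision(qrels, run)
```

Small details such as tie-breaking and the handling of unjudged or missing queries are exactly where hand-written versions tend to diverge, which is the motivation for delegating to a shared tool.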



Related Projects

Terrier - Information Retrieval Platform

  •    Java

Terrier is a highly flexible, efficient, and effective open source search engine, readily deployable on large-scale collections of documents. Terrier implements state-of-the-art indexing and retrieval functionalities, and provides an ideal platform for the rapid development and evaluation of large-scale retrieval applications. Terrier can index large corpora of documents, and provides multiple indexing strategies, such as multi-pass, single-pass and large-scale MapReduce indexing.

Information Retrieval Toolkit

  •    C++

High-performance software for information retrieval research. Emphasis on semi-structured text retrieval, especially for HTML and XML. The goal is to facilitate information retrieval research by providing an interchangeable toolkit of functions.



The EvalJ project develops Java source code for the evaluation of information retrieval experiments.

stanford-mir - Instructional notebooks on music information retrieval.

  •    Jupyter

This repository contains instructional Jupyter notebooks related to music information retrieval (MIR). Inside these notebooks are Python code snippets that illustrate basic MIR systems.

gensim - Topic Modelling for Humans

  •    Python

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community. If this feature list left you scratching your head, you can first read more about the Vector Space Model and unsupervised document analysis on Wikipedia.

ferret - Ferret: the extensible information retrieval library for ruby.

  •    C

Ferret: the extensible information retrieval library for ruby.

iir - Machine Learning / Natural Language Processing / Information Retrieval

  •    Python

Machine Learning / Natural Language Processing / Information Retrieval

FSSearchIndexFX - A cross platform information retrieval API framework


FSSearchIndexFX is a cross-platform Information Retrieval (IR) framework written in C# that supports both Windows and Mac OS X. It is aimed at developers who are writing, or looking for, the basic infrastructure API needed to perform IR tasks such as searching and indexing text content.


MIREX - MapReduce Information Retrieval Experiments

  •    Java

MIREX (MapReduce Information Retrieval Experiments) provides solutions to easily and quickly run large-scale information retrieval experiments on a cluster of machines using Hadoop. Version 0.3 has tools for the TREC ClueWeb09 and ClueWeb12 collections.

Lemur - Search Engine

  •    Java

The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri search engine, Lemur Toolbar, and ClueWeb09 dataset.

Ferret - The extensible information retrieval library for ruby.

  •    Ruby

Ferret is an information retrieval library in the same vein as Apache Lucene. Originally it was a full port of Lucene, but it now uses its own file format and indexing algorithm, although it is still very similar to Lucene in many ways. Everything you can do in Lucene you should be able to do in Ferret.

httpry - HTTP logging and information retrieval tool

  •    Perl

HTTP logging and information retrieval tool

madmom - Python audio and music signal processing library

  •    Python

Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks. The library is used internally by the Department of Computational Perception, Johannes Kepler University, Linz, Austria, and the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria.

Sequence-Semantic-Embedding - Tools and recipes to train deep learning models and build services for NLP tasks such as text classification, semantic search ranking and recall fetching, cross-lingual information retrieval, and question answering

  •    Python

SSE (Sequence Semantic Embedding) is an encoder framework toolkit for natural language processing tasks. It is implemented in TensorFlow, leveraging TF's deep learning building blocks such as DNN, CNN, and LSTM. What counts as similar semantic meaning depends on the task. In category classification, it means that for each correct (listing-title, category) pair, the SSE of the listing title is close to the SSE of the corresponding category. In information retrieval, it means that for each relevant (query, document) pair, the SSE of the query is close to the SSE of the relevant document. In question answering, the SSE of the question is close to the SSE of the correct answers.
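The idea that a query embedding should be "close" to the embedding of a relevant document can be sketched with a simple similarity ranking. The snippet below is a minimal illustration, not SSE's code: cosine similarity is an assumed choice of closeness measure, the vectors are made up, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )


# Toy embeddings, invented for this example. A trained encoder would
# produce these vectors from the query and document text.
query_embedding = [1.0, 0.0, 1.0]
doc_embeddings = {
    "unrelated_doc": [0.0, 1.0, 0.0],
    "relevant_doc": [0.9, 0.1, 0.8],
}

ranking = rank_documents(query_embedding, doc_embeddings)
```

In this toy setup the document whose vector points in nearly the same direction as the query is ranked first, which is the behavior the training objective described above is meant to produce.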

meyda - Audio feature extraction for JavaScript.

  •    Javascript

Meyda is a JavaScript audio feature extraction library. Meyda supports both offline feature extraction and real-time feature extraction using the Web Audio API. We wrote a paper about it. Please see the documentation for setup and usage instructions.

essentia - C++ library for audio and music analysis, description and synthesis, including Python bindings

  •    Jupyter

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included in-the-box and signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications. If you use example extractors (located in src/examples), or your own code employing Essentia algorithms to compute descriptors, you should be aware of possible incompatibilities when using different versions of Essentia.

awesome-deep-learning-music - List of articles related to deep learning applied to music

  •    TeX

By Yann Bayle from LaBRI, Univ. Bordeaux, CNRS and SCRIME. The role of this curated list is to gather scientific articles, theses and reports that use deep learning approaches applied to music. The list is currently under construction, but feel free to contribute to the missing fields and to add other resources! To do so, please refer to the How To Contribute section. The resources provided here come from my review of the state of the art for my PhD thesis, for which an article is being written. There are already surveys on deep learning for music generation, speech separation and speaker identification. However, these surveys do not cover the music information retrieval tasks that are included in this repository.

[GW]ammu - Talk to any phone

  •    C

Gammu and Wammu provide support for talking to phones using the standard API provided by libGammu. Supported phones include popular and widely used models from Nokia, Siemens, Samsung, Motorola, LG, and Alcatel. They support sending and receiving SMS and MMS, working with the calendar and contacts, backing up SMS, and more.

qone - Next-generation web query language, extend .NET LINQ for javascript.

  •    Javascript

The author recently fixed some bugs in the Excel formula support of Tencent Docs, mainly by modifying the formula parser. Once a code string has been generated, it can be run dynamically (JIT), for example with eval in JavaScript. eval retains context information, but the executed code then has to include the compiler code, and eval is unsafe. Alternatively, the generated code string can be run directly (AOT). The drawback of that approach is its reliance on a build tool or editor plug-in to dynamically replace the source code.

piggieback - nREPL support for ClojureScript REPLs

  •    Clojure

nREPL middleware that enables the use of a ClojureScript REPL on top of an nREPL session. Piggieback provides an alternative ClojureScript REPL entry point (cemerick.piggieback/cljs-repl) that changes an nREPL session into a ClojureScript REPL for eval and load-file operations, while accepting all the same options as cljs.repl/repl. When the ClojureScript REPL is terminated (by sending :cljs/quit for evaluation), the nREPL session is restored to its original state.