Displaying 1 to 5 from 5 results

nextflow - A DSL for data-driven computational pipelines

  •    Groovy

With the rise of big data, techniques to analyse and run experiments on large datasets are increasingly necessary. Parallelization and distributed computing are the best ways to tackle this kind of problem, but the tools commonly available to the bioinformatics community traditionally lack good support for these techniques, or provide a model that fits badly with the specific requirements in the bioinformatics domain and, most of the time, require the knowledge of complex tools or low-level APIs.

batchtools - Tools for computation on batch systems

  •    R

As a successor of the packages BatchJobs and BatchExperiments, batchtools provides a parallel implementation of Map for high performance computing systems managed by schedulers like Slurm, Sun Grid Engine, OpenLava, TORQUE/OpenPBS, Load Sharing Facility (LSF) or Docker Swarm (see the setup section in the vignette). Next, you need to setup batchtools for your HPC (it will run sequentially otherwise). See the vignette for instructions.

clustermq - R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH

  •    R

Computations are done entirely on the network and without any temporary files on network-mounted storage, so there is no strain on the file system apart from starting up R once per job. This way, we can also send data and results around a lot quicker. All calculations are load-balanced, i.e. workers that get their jobs done faster will also receive more function calls to work on. This is especially useful if not all calls return after the same time, or one worker has a high load.

ensembl-hive - EnsEMBL Hive - a system for creating and running pipelines on a distributed compute resource

  •    Perl

eHive is a system for running computation pipelines on distributed computing resources - clusters, farms or grids. The name comes from the way pipelines are processed by a swarm of autonomous agents.


  •    R

The future package provides a generic API for using futures in R. A future is a simple yet powerful mechanism to evaluate an R expression and retrieve its value at some point in time. Futures can be resolved in many different ways depending on which strategy is used. There are various types of synchronous and asynchronous futures to choose from in the future package. This package, future.batchtools, provides a type of futures that utilizes the batchtools package. This means that any type of backend that the batchtools package supports can be used as a future. More specifically, future.batchtools will allow you or users of your package to leverage the compute power of high-performance computing (HPC) clusters via a simple switch in settings - without having to change any code at all.