MPJ Express - Parallel Programming in Java

MPJ Express is an open source Java message passing library that lets application developers write and execute parallel applications for multicore processors and compute clusters/clouds, using an MPI-like API.


A wrapper for NVIDIA's cuBLAS (Compute Unified Basic Linear Algebra Subprograms) for the CLR.


Activity blog of the Microsoft Innovation Center - Interop, also known as LMS, at Unicamp.


UACluster2 is a set of manuals and tools for creating and managing a high performance computing cluster based on Microsoft Hyper-V virtual machines. It requires Microsoft HPC Server 2008 (Microsoft HPC Server 2008 R2) as the basis for cluster creation.

Parallel Dwarfs

The Parallel Dwarfs project is a suite of 13 kernels (as VS projects in C++/C#/F#) parallelized using various technologies such as MPI, OpenMP, TPL, and MPI.NET. It also has a driver to run them, collect traces, and visualize the results using Vampir, Jumpshot, Xperf, and Excel.

Shared Genomics Project MPI Codebase

The Shared Genomics project has developed parallelised statistical applications (MPI/OpenMP) that can analyse large genomic data sets containing thousands of Single Nucleotide Polymorphisms (SNPs). The code is based on the popular PLINK SNP-analysis program.

Transactional Entity Framework


Interop Router

This project establishes a communication framework and job dispatcher for a mixed operating system cluster environment.

udocker - A basic user tool to execute simple docker containers in batch or interactive systems without root privileges

A basic user tool to execute simple docker containers in user space without requiring root privileges. Enables download and execution of docker containers by non-privileged users in Linux systems where docker is not available. It can be used to pull and execute docker containers in Linux batch systems and interactive clusters that are managed by other entities such as grid infrastructures or externally managed batch or interactive systems. The INDIGO udocker does not require any type of privileges nor the deployment of services by system administrators. It can be downloaded and executed entirely by the end user.

skale - High performance distributed data processing engine

High performance distributed data processing and machine learning. Skale provides a high-level API in JavaScript and an optimized parallel execution engine on top of Node.js.

batch-shipyard - Execute batch and HPC Dockerized workloads on Azure Batch with shared file system provisioning and linking support

Batch Shipyard provides the ability to provision and manage entire standalone remote file systems (storage clusters) in Azure, independent of any integrated Azure Batch functionality. Batch Shipyard is now integrated directly into Azure Cloud Shell, and you can execute any Batch Shipyard workload using your web browser or the Microsoft Azure Android and iOS app.


A Node.js wrapper for the high-performance LAPACK linear algebra library. This library requires LAPACK to be built and installed as a shared library. In time the entire build process may be unified into this project, but that's some time away.

hpc-in-a-day - a temporary fork of softwarecarpentry/hpc-novice

Novice introduction to high performance computing. This material was conceived as a sandbox project for swcarpentry/hpc-novice. Parts of it will be contributed to swcarpentry/hpc-novice in due course. The material targets future users of an HPC infrastructure from any discipline. Learners are expected to have introductory-level programming skills and to know their way around the UNIX command line at a beginner's level.

OFF - OFF, Open source Finite volume Fluid dynamics code

OFF, Open source Finite volumes Fluid dynamics code; see the documentation. It is written in standard (compliant) Fortran 2003, with high modularity as a design target.

omega_h - Simplex mesh adaptivity for HPC

Omega_h is a C++11 library that implements tetrahedron and triangle mesh adaptivity, with a focus on scalable HPC performance using (optionally) MPI, OpenMP, or CUDA. It is intended to provide adaptive functionality to existing simulation codes. Mesh adaptivity allows one to minimize both discretization error and the number of degrees of freedom live during the simulation, as well as enabling moving object and evolving geometry simulations. Omega_h will do this for you in a way that is fast, memory-efficient, and portable across many different architectures. For a bare minimum setup with no parallelism, you just need CMake, a C++11 compiler, and preferably ZLib installed.