managedCUDA makes the CUDA Driver API available in .NET applications written in C#, Visual Basic, or any other .NET language. It also includes classes for easy handling of and interop with CUDA, e.g. built-in CUDA types like float3.



Related Projects

ManagedCuda Galaxy Simulator

This project is a test of ManagedCuda and graphics interop to OpenTK to simulate a simple galaxy on the GPU.

vexcl - VexCL is a C++ vector expression template library for OpenCL/CUDA

VexCL is a vector expression template library for OpenCL/CUDA. It was created to ease GPGPU development with C++ and strives to reduce the amount of boilerplate code needed to develop GPGPU applications. The library provides convenient and intuitive notation for vector arithmetic, reduction, sparse matrix-vector products, etc. Multi-device and even multi-platform computations are supported. The source code of the library is distributed under the very permissive MIT license.


A wrapper for NVIDIA's cuBLAS (Compute Unified Basic Linear Algebra Subprograms) for the CLR.


Optix.NET is a .NET wrapper for the NVIDIA OptiX GPU ray-tracing library.

scikit-cuda - Python interface to GPU-powered libraries

scikit-cuda provides Python interfaces to many of the functions in the CUDA device/runtime, CUBLAS, CUFFT, and CUSOLVER libraries distributed as part of NVIDIA's CUDA Programming Toolkit, as well as interfaces to select functions in the CULA Dense Toolkit. Both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and SciPy are provided. Package documentation is available online. Many of the high-level functions have examples in their docstrings. More illustrations of how to use both the wrappers and high-level functions can be found in the demos/ and tests/ subdirectories.

NyuziProcessor - GPGPU microprocessor architecture

Nyuzi is an experimental GPGPU processor hardware design focused on compute intensive tasks. It is optimized for use cases like blockchain mining, deep learning, and autonomous driving. This project includes a synthesizable hardware design written in System Verilog, an instruction set emulator, an LLVM based C/C++ compiler, software libraries, and tests. It can be used to experiment with microarchitectural and instruction set design tradeoffs.

mshadow - Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning

MShadow is a lightweight CPU/GPU Matrix/Tensor Template Library in C++/CUDA. The goal of mshadow is to provide an efficient, device-invariant, and simple tensor library for machine learning projects that aim for maximum performance and control while also emphasizing simplicity. MShadow also provides an interface that allows writing multi-GPU and distributed deep learning programs in an easy and unified way.

coriander - Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices

Build applications written in NVIDIA® CUDA™ code for OpenCL™ 1.2 devices. Other systems should work too, ideally. You will need at least one OpenCL-enabled GPU, with appropriate OpenCL drivers installed for it. Both Linux and Mac systems stand a reasonable chance of working.

Image Resizer GPGPU

Make images smaller, resizing and resampling with incredible performance, scalability and ease with features such as GPGPU processing and distributed computing.


The FsGPU project contains a library and samples to assist general-purpose GPU programming in F# for CUDA-enabled devices.

Permutations with CUDA and OpenCL

Finding massive permutations on GPU with CUDA and OpenCL
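
The listing does not say how this project distributes the work. One common way to parallelize permutation generation, sketched here in plain Python on the CPU, is unranking: each worker turns its own index into the permutation at that position in lexicographic order via the factorial number system, so no worker depends on another. The function name `kth_permutation` is hypothetical and is not taken from the project.

```python
from math import factorial


def kth_permutation(items, k):
    """Return the k-th (0-based) permutation of items.

    The order is lexicographic if items is given in sorted order.
    """
    pool = list(items)
    n = len(pool)
    if not 0 <= k < factorial(n):
        raise ValueError("k must be in the range [0, n!)")
    result = []
    for remaining in range(n, 0, -1):
        # Each choice for the next element covers (remaining - 1)! permutations.
        block = factorial(remaining - 1)
        index, k = divmod(k, block)
        result.append(pool.pop(index))
    return result


if __name__ == "__main__":
    # On a GPU, each thread could plausibly call the equivalent of this
    # with its own thread index as k (an assumption, not the project's code).
    for k in range(6):
        print(k, kth_permutation("abc", k))
```

Because every index is handled independently, the index range can be split across as many workers as are available.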


A simple program that displays information about CUDA-enabled devices. It also includes a GPU performance test.

nnabla - Neural Network Libraries

Neural Network Libraries is a deep learning framework intended for research, development, and production. We aim to have it running everywhere: desktop PCs, HPC clusters, embedded devices, and production servers. The default installation provides the CPU version of Neural Network Libraries; GPU acceleration can be added by installing the CUDA extension with pip install nnabla-ext-cuda.

GPU Flame Fractal Renderer

Renderer for flam3 cosmic recursive fractal flames implemented on GPU. Requires a CUDA-capable graphics card.
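
For readers unfamiliar with fractal flames, the core of the technique is a "chaos game": a point is repeatedly pushed through randomly chosen transforms, the visited positions are accumulated in a histogram, and the histogram is displayed on a logarithmic scale. The following is a heavily simplified CPU sketch in Python under stated assumptions: it uses only three affine maps with made-up coefficients and omits the nonlinear variations and coloring that real flames use. The names `render` and `tone_map` are hypothetical and are not this renderer's API.

```python
import math
import random

# Affine maps (a, b, c, d, e, f): (x, y) -> (a*x + b*y + c, d*x + e*y + f).
# Illustrative coefficients only; they keep the point inside the unit square.
TRANSFORMS = [
    (0.5, 0.0, 0.0, 0.0, 0.5, 0.0),
    (0.5, 0.0, 0.5, 0.0, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.0, 0.5, 0.5),
]
WARM_UP = 20  # iterations discarded while the point settles onto the attractor


def render(size=64, samples=20000, seed=0):
    """Run the chaos game and return a size x size histogram of hit counts."""
    rng = random.Random(seed)
    hist = [[0] * size for _ in range(size)]
    x, y = rng.random(), rng.random()
    for i in range(samples):
        a, b, c, d, e, f = rng.choice(TRANSFORMS)
        x, y = a * x + b * y + c, d * x + e * y + f
        if i < WARM_UP:
            continue
        col, row = int(x * size), int(y * size)
        if 0 <= col < size and 0 <= row < size:
            hist[row][col] += 1
    return hist


def tone_map(hist):
    """Map hit counts to brightness in [0, 1] on a logarithmic scale."""
    peak = max(max(row) for row in hist)
    scale = math.log1p(peak)
    return [[math.log1p(v) / scale if scale else 0.0 for v in row] for row in hist]


if __name__ == "__main__":
    image = tone_map(render())
    for row in image[::4]:
        print("".join("#" if v > 0.5 else "." for v in row[::2]))
```

Each point sequence is independent of the others, which is what makes the method a natural fit for running many sequences in parallel on a GPU.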

GPUVerify: A verifier for GPU kernels

GPUVerify is a tool for verifying race- and divergence-freedom of GPU kernels written in OpenCL and CUDA.
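
To make "race-freedom" concrete: a write-write data race occurs when two threads write the same memory location without synchronization in between. The Python sketch below is not how GPUVerify works (it is a static verifier, whereas this is a brute-force check over a toy model); it only illustrates the property being verified. The function `find_write_conflict` and the idea of modelling a kernel as a thread-id-to-index function are assumptions made for illustration.

```python
def find_write_conflict(num_threads, write_index):
    """Return (first_thread, second_thread, location) for the first pair of
    threads that write the same location, or None if all writes are disjoint.

    write_index models a kernel in which thread `tid` writes exactly one
    element, out[write_index(tid)].
    """
    owner = {}  # location -> id of the thread that wrote it first
    for tid in range(num_threads):
        location = write_index(tid)
        if location in owner:
            return owner[location], tid, location
        owner[location] = tid
    return None


if __name__ == "__main__":
    # Models out[tid] = ...: every thread owns its slot, so there is no race.
    print(find_write_conflict(8, lambda tid: tid))
    # Models out[tid // 2] = ...: threads 0 and 1 both write slot 0.
    print(find_write_conflict(8, lambda tid: tid // 2))
```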


GPCompute is an older CUDA-like library, but based on DX81 (or later) for compatibility with almost any current video card. It is developed in C/C++ and offers a simple interface for array computations. Its limitations all come from the DX version it is implemented on.


An implementation of linear algebra numerical structures and methods for the CLR. NPack is unique in that it uses generics for matrix element definitions, and a set of matrix operations via an interface, allowing a CLR-based operations engine as well as the opportunity to use ...

C++ AMP LAPACK Library

Project Description: C++ AMP LAPACK Library is a library of linear algebra subroutines that C++ AMP developers can freely use in their own projects. Note that this project builds upon and is dependent upon the C++ AMP BLAS library. Prerequisite: Understanding C++ AMP is an ...


GPGPUs offer significant horsepower in our computers that is unfortunately not easily available to .NET programs. <project name> is a system capable of mapping .NET bytecode into GPU IL (e.g. NVIDIA PTX) so that you can run .NET algorithms on state-of-the-art hardware.