VexCL is a vector expression template library for OpenCL/CUDA. It has been created for ease of GPGPU development with C++. VexCL strives to reduce amount of boilerplate code needed to develop GPGPU applications. The library provides convenient and intuitive notation for vector arithmetic, reduction, sparse matrix-vector products, etc. Multi-device and even multi-platform computations are supported. The source code of the library is distributed under very permissive MIT license.
http://vexcl.readthedocs.orgTags | opencl cuda c-plus-plus gpgpu scientific-computing cpp11 |
Implementation | C++ |
License | MIT |
Platform |
Boost.Compute is a GPU/parallel-computing library for C++ based on OpenCL. The core library is a thin C++ wrapper over the OpenCL API and provides access to compute devices, contexts, command queues and memory buffers.
opencl boost c-plus-plus cpp compute gpu gpgpu performance hpcArraymancer is a tensor (N-dimensional array) project in Nim. The main focus is providing a fast and ergonomic CPU, Cuda and OpenCL ndarray library on which to build a scientific computing and in particular a deep learning ecosystem. The library is inspired by Numpy and PyTorch. The library provides ergonomics very similar to Numpy, Julia and Matlab but is fully parallel and significantly faster than those libraries. It is also faster than C-based Torch.
tensor nim multidimensional-arrays cuda deep-learning machine-learning cudnn high-performance-computing gpu-computing matrix-library neural-networks parallel-computing openmp linear-algebra ndarray opencl gpgpu iot automatic-differentiation autogradnVIDIA's Runtime API for CUDA is intended for use both in C and C++ code. As such, it uses a C-style API, the lowest common denominator (with a few notable exceptions of templated function overloads). This library of wrappers around the Runtime API is intended to allow us to embrace many of the features of C++ (including some C++11) for using the runtime API - but without reducing expressivity or increasing the level of abstraction (as in, e.g., the Thrust library). Using cuda-api-wrappers, you still have your devices, streams, events and so on - but they will be more convenient to work with in more C++-idiomatic ways.
wrapper gpu modern-cpp cuda nvidia gpgpu api-wrapper gpu-memory gpu-computing cuda-toolkit cuda-device cuda-runtime-api gpgpu-computing cuda-api-wrappersThe University of California holds the copyright on all BOINC source code. By submitting contributions to the BOINC code, you irrevocably assign all right, title, and interest, including copyright and all copyright rights, in such contributions to The Regents of the University of California, who may then use the code for any purpose that it desires. BOINC is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
boinc distributed-computing volunteer-computing high-performance-computing citizen-science scientific-computing grid-computing science high-throughput-computing c-plus-plusRecent generations of CPUs, and GPUs in particular, require data-parallel codes for full efficiency. Data parallelism requires that the same sequence of operations is applied to different input data. CPUs and GPUs can thus reduce the necessary hardware for instruction decoding and scheduling in favor of more arithmetic and logic units, which execute the same instructions synchronously. On CPU architectures this is implemented via SIMD registers and instructions. A single SIMD register can store N values and a single SIMD instruction can execute N operations on those values. On GPU architectures N threads run in perfect sync, fed by a single instruction decoder/scheduler. Each thread has local memory and a given index to calculate the offsets in memory for loads and stores. Current C++ compilers can do automatic transformation of scalar codes to SIMD instructions (auto-vectorization). However, the compiler must reconstruct an intrinsic property of the algorithm that was lost when the developer wrote a purely scalar implementation in C++. Consequently, C++ compilers cannot vectorize any given code to its most efficient data-parallel variant. Especially larger data-parallel loops, spanning over multiple functions or even translation units, will often not be transformed into efficient SIMD code.
vectorization parallel simd-vector simd-instructions simd avx c-plus-plus avx512 sse neon cpp portable cpp11 cpp14 cpp17 avx2 simd-programming data-parallel parallel-computingNeanderthal is a Clojure library for fast matrix and linear algebra computations based on the highly optimized native libraries of BLAS and LAPACK computation routines for both CPU and GPU.. Read the documentation at Neanderthal Web Site.
clojure-library matrix gpu gpu-computing gpgpu opencl cuda high-performance-computing vectorization api matrix-factorization matrix-multiplication matrix-functions matrix-calculationsThe National Library of Medicine Insight Segmentation and Registration Toolkit (ITK), or Insight Toolkit, is an open-source, cross-platform C++ toolkit for segmentation and registration. Segmentation is the process of identifying and classifying data found in a digitally sampled representation. Typically the sampled representation is an image acquired from such medical instrumentation as CT or MRI scanners. Registration is the task of aligning or developing correspondences between data. For example, in the medical environment, a CT scan may be aligned with a MRI scan in order to combine the information contained in both. The toolkit may be built from source using CMake.
itk insight-toolkit c-plus-plus image-analysis medical-imaging scientific-computing open-science open-source reproducible-researchThis quick start guide is meant as a very brief overview of some of the things that can be done with NumCpp. For a full breakdown of everything available in the NumCpp library please visit the Full Documentation. The main data structure in NumCpp is the NdArray. It is inherently a 2D array class, with 1D arrays being implemented as 1xN arrays. There is also a DataCube class that is provided as a convenience container for storing an array of 2D NdArrays, but it has limited usefulness past a simple container.
c-plus-plus algorithms cpp numpy data-structures scientific-computing mathematical-functions numerical-analysisArmadillo: fast C++ library for linear algebra & scientific computing - http://arma.sourceforge.net
linear-algebra matrix matrix-functions linear-algebra-library statistics matlab blas lapack hpc scientific-computing mkl machine-learning armadillo openmp gaussian-mixture-models cpp11 vector sparse-matrix expression-template matrix-factorization⚠ Please note that while Emu 0.2.0 is quite usable, it suffers from 2 key issues. It firstly does nothing to minimize CPU-GPU data transfer and secondly it's compiler is not well-tested. These can be reasons not to use Emu 0.2.0. A new version of Emu is in the works, however, with significant improvements in the language, compiler, and compile-time checker. This new version of Emu should be released some time in Q4 of 2019. But unlike OpenCL/CUDA/Halide/Futhark, Emu is embedded in Rust. This lets it take advantage of the ecosystem in ways...
emu gpu gpgpu gpu-computing gpu-acceleration gpu-programmingArrayFire is a high performance software library for parallel computing with an easy-to-use API. Its array based function set makes parallel programming simple. ArrayFire's multiple backends (CUDA, OpenCL and native CPU) make it platform independent and highly portable. A few lines of code in ArrayFire can replace dozens of lines of parallel computing code, saving you valuable time and lowering development costs.
parallel-computing parallel cuda libraryCatch2 stands for C++ Automated Test Cases in a Header and is a multi-paradigm test framework for C++. which also supports Objective-C (and maybe C). It is primarily distributed as a single header file, although certain extensions may require additional headers.
c-plus-plus cpp98 cpp11 testing test-framework tdd bdd header-only single-file no-dependencies framework unit-testing testFit is a header-only C++11/C++14 library that provides utilities for functions and function objects, which can solve many problems with much simpler constructs than whats traditionally been done with metaprogramming. This requires a C++11 compiler. There are no third-party dependencies. This has been tested on clang 3.5-3.8, gcc 4.6-6.2, and Visual Studio 2015. Gcc 5.1 is not supported at all, however, gcc 5.4 is supported.
modern constexpr cplusplus lambda functional c-plus-plus cplusplus-11 cplusplus-14 cpp cpp11 cpp14 functional-programmingHigherOrderFunctions is a header-only C++11/C++14 library that provides utilities for functions and function objects, which can solve many problems with much simpler constructs than whats traditionally been done with metaprogramming. This requires a C++11 compiler. There are no third-party dependencies. This has been tested on clang 3.5-3.8, gcc 4.6-7, and Visual Studio 2015 and 2017.
modern constexpr cplusplus lambda functional c-plus-plus cplusplus-11 cplusplus-14 cpp cpp11 cpp14 functional-programmingV1 replaced Boost with C++11. V2 added Lua bindings. V3 refactored to remove some warts, ease future development, and re-add serialisation. API is evolving. Ponder is a C++ multi-purpose reflection library. It provides an abstraction for most of the high-level concepts of C++: classes, enumerations, functions, properties.
c-plus-plus reflection library lua-binding serialization introspection cpp cpp11 cpp14 cpp-library reflection-library serialization-library🔥简洁易用的C++11网络库 / 支持单机千万并发连接 / a simple C++11 network server framework
c-plus-plus networking cpp11 concurrent-programming epollDLL is a library that aims to provide a C++ implementation of Restricted Boltzmann Machine (RBM) and Deep Belief Network (DBN) and their convolution versions as well. It also has support for some more standard neural networks. Note: When you clone the library, you need to clone the sub modules as well, using the --recursive option.
c-plus-plus cpp cpp11 cpp14 performance machine-learning deep-learning artificial-neural-networks gpu rbm cpu convolutional-neural-networks:books: Solutions for C++ Primer 5th exercises.
cpp-primer cpp11 c-plus-plus exercise-solutions cplusplus-11 cplusplus cppsdoctest is a new C++ testing framework but is by far the fastest both in compile times (by orders of magnitude) and runtime compared to other feature-rich alternatives. It brings the ability of compiled languages such as D / Rust / Nim to have tests written directly in the production code by providing a fast, transparent and flexible test runner with a clean interface. The framework is and will stay free but needs your support to sustain its development. There are lots of new features and maintenance to do. If you work for a company using doctest or have the means to do so, please consider financial support. Monthly donations via Patreon and one-offs via PayPal.
c-plus-plus doctest tdd testing unit-testing header-only cplusplus cpp cpp11 cpp14 cpp17 single-file test-frameworkThis repository contains all the code examples from the Boost C++ Application Development Cookbook, Second Edition (ISBN: 9781787282247), Packt Publishing, by Antony Polukhin. Compile and Run Examples Online.
boost c-plus-plus cpp book cpp11 tutorial teaching tutorials recipes cpp14 cpp17 online-learning online-compiler
We have large collection of open source products. Follow the tags from
Tag Cloud >>
Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
Add Projects.