A realtime CPU/GPU profiler hosted in a single C file, with a viewer that runs in a web browser. Windows (MSVC): add lib/Remotery.c and lib/Remotery.h to your program, and add the Remotery/lib path to your include directories. The required library ws2_32.lib should be picked up through the #pragma comment(lib, "ws2_32.lib") directive in Remotery.c.
https://github.com/Celtoys/Remotery
Tags | profiler gpu cpu d3d11 opengl cuda metal |
Implementation | C |
License | Apache |
Platform |
GPUImage 3 is the third generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac and iOS. The original GPUImage framework was written in Objective-C and targeted Mac and iOS, the second iteration was rewritten in Swift using OpenGL to target Mac, iOS, and Linux, and this third generation is redesigned to use Metal in place of OpenGL. The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. Previous iterations of this framework wrapped OpenGL (ES), hiding much of the boilerplate code required to render images on the GPU using custom vertex and fragment shaders. This version of the framework replaces OpenGL (ES) with Metal. The change is largely driven by Apple's deprecation of OpenGL (ES) on their platforms in favor of Metal, and it will allow for exploring performance optimizations over OpenGL and a tighter integration with Metal-based frameworks and operations.
by Emery Berger, Sam Stern, and Juan Altmayer Pizzorno. Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than other profilers while delivering far more detailed information.
Tags | cpu profiler gpu memory-management performance-analysis memory-allocation profiling cpu-profiling memory-consumption gpu-programming python-profilers scalene profiles-memory performance-cpu |
Visual Studio 2015 or later is required to build LLGL on Windows. Xcode 9 or later is required to build LLGL on macOS.
Tags | renderer d3d11 d3d12 opengl directx vulkan metal |
MShadow is a lightweight CPU/GPU Matrix/Tensor Template Library in C++/CUDA. The goal of mshadow is to provide an efficient, device-invariant, and simple tensor library for machine learning projects that aim for maximum performance and control while also emphasizing simplicity. MShadow also provides an interface that allows writing multi-GPU and distributed deep learning programs in an easy and unified way.
Arraymancer is a tensor (N-dimensional array) project in Nim. The main focus is providing a fast and ergonomic CPU, Cuda and OpenCL ndarray library on which to build a scientific computing and in particular a deep learning ecosystem. The library is inspired by Numpy and PyTorch. The library provides ergonomics very similar to Numpy, Julia and Matlab but is fully parallel and significantly faster than those libraries. It is also faster than C-based Torch.
Tags | tensor nim multidimensional-arrays cuda deep-learning machine-learning cudnn high-performance-computing gpu-computing matrix-library neural-networks parallel-computing openmp linear-algebra ndarray opencl gpgpu iot automatic-differentiation autograd |
A Vulkan and OpenGL overlay for monitoring FPS, temperatures, CPU/GPU load and more. Once done, proceed to the installation.
Tags | benchmarking opengl monitoring vulkan hud |
The RAPIDS cuSignal project leverages CuPy, Numba, and the RAPIDS ecosystem for GPU-accelerated signal processing. In some cases, cuSignal is a direct port of SciPy Signal to leverage GPU compute resources via CuPy, but it also contains Numba CUDA and raw CuPy CUDA kernels for additional speedups for selected functions. cuSignal achieves its best gains on large signals and compute-intensive functions but stresses online processing with zero-copy memory (pinned, mapped) between CPU and GPU. NOTE: For the latest stable README.md ensure you are on the latest branch.
ShaderConductor is a tool designed for cross-compiling HLSL to other shading languages. Note that this project is still in an early stage, and it is under active development.
Tags | spir-v hlsl glsl shader compiler metal dxil graphics vulkan d3d11 d3d12 d3d10 d3d9 opengl opengl-es |
Hic sunt dracones. This repository contains multiple implementations of the same 3D scene, using different APIs and frameworks on various platforms. The goal is to provide a comparison between multiple rendering methods. This is inherently biased due to the variety of algorithms used and available CPU/GPU configurations, but can hopefully still provide interesting insights on 3D rendering.
Tags | computer-graphics graphics-programming rendering opengl scenekit cycles blender nds unity webgl software-rendering dragon gba ps2 metal |
⚠ Please note that while Emu 0.2.0 is quite usable, it suffers from two key issues. First, it does nothing to minimize CPU-GPU data transfer; second, its compiler is not well-tested. These can be reasons not to use Emu 0.2.0. A new version of Emu is in the works, however, with significant improvements in the language, compiler, and compile-time checker. This new version of Emu should be released some time in Q4 of 2019. But unlike OpenCL/CUDA/Halide/Futhark, Emu is embedded in Rust. This lets it take advantage of the ecosystem in ways...
Tags | emu gpu gpgpu gpu-computing gpu-acceleration gpu-programming |
RustaCUDA helps you bring GPU-acceleration to your projects by providing a flexible, easy-to-use interface to the CUDA GPU computing toolkit. RustaCUDA makes it easy to manage GPU memory, transfer data to and from the GPU, and load and launch compute kernels written in any language. RustaCUDA is intended to provide a programmer-friendly library for working with the host-side CUDA Driver API. It is not intended to assist in compiling Rust code to CUDA kernels (though see rust-ptx-builder for that) or to provide device-side utilities to be used within the kernels themselves.
Tags | gpu cuda cuda-api |
Pathfinder 3 is a fast, practical, GPU-based rasterizer for fonts and vector graphics using OpenGL 3.0+, OpenGL ES 3.0+, or Metal. Please note that Pathfinder is under heavy development and is incomplete in various areas.
NVIDIA's Runtime API for CUDA is intended for use both in C and C++ code. As such, it uses a C-style API, the lowest common denominator (with a few notable exceptions of templated function overloads). This library of wrappers around the Runtime API is intended to allow us to embrace many of the features of C++ (including some C++11) for using the runtime API, but without reducing expressivity or increasing the level of abstraction (as in, e.g., the Thrust library). Using cuda-api-wrappers, you still have your devices, streams, events and so on, but they will be more convenient to work with in more C++-idiomatic ways.
Tags | wrapper gpu modern-cpp cuda nvidia gpgpu api-wrapper gpu-memory gpu-computing cuda-toolkit cuda-device cuda-runtime-api gpgpu-computing cuda-api-wrappers |
Cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library. http://airmech.com/ AirMech is a free-to-play futuristic action real-time strategy video game developed and published by Carbon Games.
Tags | engine rendering graphics directx vulkan metal opengl d3d9 d3d11 d3d12 gles webgl graphics-programming emscripten glfw sdl gamedev gamedev-library |
fgprof is a sampling Go profiler that allows you to analyze On-CPU as well as Off-CPU (e.g. I/O) time together. Go's builtin sampling CPU profiler can only show On-CPU time, but it's better than fgprof at that. Go also includes tracing profilers that can analyze I/O, but they can't be combined with the CPU profiler.
Tags | performance performance-analysis profiling profiling-library |
RenderDoc is a frame-capture based graphics debugger, currently available for Vulkan, D3D11, D3D12, OpenGL, and OpenGL ES development on Windows 7 - 10, Linux, and Android. It is completely open-source under the MIT license. If you have any questions, suggestions, or problems, you can create an issue on GitHub, email me directly, or come into IRC to discuss it.
Tags | opengl vulkan vulkan-api directx direct3d d3d11 d3d12 graphics-programming graphics debugger |
Neanderthal is a Clojure library for fast matrix and linear algebra computations based on the highly optimized native libraries of BLAS and LAPACK computation routines for both CPU and GPU. Read the documentation at the Neanderthal Web Site.
Tags | clojure-library matrix gpu gpu-computing gpgpu opencl cuda high-performance-computing vectorization api matrix-factorization matrix-multiplication matrix-functions matrix-calculations |
WIP of a GPU texture generator using dear imgui for the UI. Not production ready and a bit messy, but really fun to code. Basically, add GPU and CPU nodes in a graph to manipulate and generate images. Nodes are hardcoded for now, but a discovery system is planned. Currently nodes can be written in GLSL, C, or Python. Use CMake and Visual Studio to build it. Only Windows is supported for now.
Tags | gpu texture imgui procgen tool glsl shaders opengl |
StackImpact is a production-grade performance profiler built for both production and development environments. It gives developers a continuous and historical code-level view of application performance that is essential for locating CPU, memory allocation and I/O hot spots as well as latency bottlenecks. Included runtime metrics and error monitoring complement profiles for extensive performance analysis. Learn more at stackimpact.com. Learn more on the features page (with screenshots).
Tags | python3 profiler memory-leak-detection memory-profiler cpu-profiler performance-tuning performance-metrics monitoring hot-spot-profiles tensorflow |
Gunrock is a CUDA library for graph-processing designed specifically for the GPU. It uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. For more details, please visit our website, read Why Gunrock, our TOPC 2017 paper Gunrock: GPU Graph Analytics, look at our results, and find more details in our publications. See Release Notes to keep up with our latest changes.
Tags | gunrock cuda graph-processing graph-analytics gpu graph-primitives |