psimd - Portable 128-bit SIMD intrinsics


Portable 128-bit SIMD intrinsics

https://github.com/Maratyszcza/psimd


   




Related Projects

Vc - SIMD Vector Classes for C++

  •    C++

Recent generations of CPUs, and GPUs in particular, require data-parallel codes for full efficiency. Data parallelism requires that the same sequence of operations is applied to different input data. CPUs and GPUs can thus reduce the necessary hardware for instruction decoding and scheduling in favor of more arithmetic and logic units, which execute the same instructions synchronously. On CPU architectures this is implemented via SIMD registers and instructions. A single SIMD register can store N values and a single SIMD instruction can execute N operations on those values. On GPU architectures N threads run in perfect sync, fed by a single instruction decoder/scheduler. Each thread has local memory and a given index to calculate the offsets in memory for loads and stores. Current C++ compilers can do automatic transformation of scalar codes to SIMD instructions (auto-vectorization). However, the compiler must reconstruct an intrinsic property of the algorithm that was lost when the developer wrote a purely scalar implementation in C++. Consequently, C++ compilers cannot vectorize any given code to its most efficient data-parallel variant. Especially larger data-parallel loops, spanning over multiple functions or even translation units, will often not be transformed into efficient SIMD code.
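The difference between a scalar loop and an explicitly data-parallel one can be sketched as follows. This is not Vc code: it assumes a GCC or Clang compiler and uses their `vector_size` extension, and the names `f32x4`, `add_scalar` and `add_simd` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// GCC/Clang vector extension (assumption: one of these compilers is used).
// f32x4 packs four floats into a single 128-bit value.
typedef float f32x4 __attribute__((vector_size(16)));

// Scalar version: one addition per loop iteration. The compiler may or may
// not auto-vectorize this loop.
void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Explicitly data-parallel version: each iteration handles four elements.
// For brevity this sketch requires n to be a multiple of 4.
void add_simd(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        f32x4 va, vb;
        std::memcpy(&va, a + i, sizeof va);  // load without alignment requirements
        std::memcpy(&vb, b + i, sizeof vb);
        f32x4 vc = va + vb;                  // element-wise add of all four lanes
        std::memcpy(out + i, &vc, sizeof vc);
    }
}
```

On targets with 128-bit SIMD registers the vector addition is typically lowered to a single instruction; on other targets the compiler falls back to scalar code, so both functions give the same results.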

libsimdpp - Portable header-only zero-overhead C++ low level SIMD library

  •    C++

libsimdpp is a portable header-only zero-overhead C++ low level SIMD library. The library presents a single interface over SIMD instruction sets present in x86, ARM, PowerPC and MIPS architectures. On architectures that support different SIMD instruction sets the library allows the same source code files to be compiled for each SIMD instruction set and then hooked into an internal or third-party dynamic dispatch mechanism. This allows the capabilities of the processor to be queried at runtime and the most efficient implementation to be selected. The library sits somewhere in the middle between programming directly in SIMD intrinsics and even higher-level SIMD libraries. As much control as possible is given to the developer, so that it's possible to predict exactly what code the compiler will generate.
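A runtime dispatch mechanism of the kind described above might look roughly like this. It is a minimal sketch, not libsimdpp's own dispatcher: the function names are hypothetical, both kernels are plain scalar stand-ins, and the CPU query assumes the GCC/Clang `__builtin_cpu_supports` builtin on x86 (other targets fall back to the baseline).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernels. In a real build each would be compiled from the same
// source with different instruction-set flags; here both are scalar.
float sum_baseline(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += x[i];
    return s;
}

float sum_avx2(const float* x, std::size_t n) {
    return sum_baseline(x, n);  // stand-in for an AVX2-compiled variant
}

// Query the processor at runtime (assumption: GCC/Clang builtins on x86).
bool cpu_has_avx2() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

using SumFn = float (*)(const float*, std::size_t);

// Pick the most capable implementation the current CPU supports.
SumFn select_sum() {
    return cpu_has_avx2() ? sum_avx2 : sum_baseline;
}

// Public entry point: the choice is made once, on the first call.
float sum(const float* x, std::size_t n) {
    static const SumFn impl = select_sum();
    assert(impl != nullptr);
    return impl(x, n);
}
```

Callers only ever see `sum`, so adding a further variant means adding one kernel and one branch in `select_sum`.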

Simd - C++ image processing library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4

  •    C++

The Simd Library is a free open source image processing library, designed for C and C++ programmers. It provides many useful high performance algorithms for image processing, such as pixel format conversion, image scaling and filtration, extraction of statistical information from images, motion detection, object detection (HAAR and LBP classifier cascades) and classification, and neural networks. The algorithms are optimized using different SIMD CPU extensions. In particular, the library supports the following CPU extensions: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX-512 for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC (big-endian), and NEON for ARM.

xsimd - Modern, portable C++ wrappers for SIMD intrinsics and parallelized, optimized math implementations

  •    C++

SIMD (Single Instruction, Multiple Data) is a feature of microprocessors that has been available for many years. SIMD instructions perform a single operation on a batch of values at once, and thus provide a way to significantly accelerate code execution. However, these instructions differ between microprocessor vendors and compilers. xsimd provides a unified means for using these features for library authors. Namely, it enables manipulation of batches of numbers with the same arithmetic operators as for single values. It also provides accelerated implementation of common mathematical functions operating on batches.
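The idea of using the same arithmetic operators for batches as for single values can be illustrated with a small wrapper type. This is a sketch only: `Batch` and `axpy` are hypothetical names, not xsimd's API, and the lanes are processed with ordinary loops rather than SIMD intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-width batch of N values of type T (not xsimd's API).
template <typename T, std::size_t N>
struct Batch {
    std::array<T, N> lane;

    // Read a single lane.
    T operator[](std::size_t i) const {
        assert(i < N);
        return lane[i];
    }
};

// Lane-wise addition.
template <typename T, std::size_t N>
Batch<T, N> operator+(const Batch<T, N>& a, const Batch<T, N>& b) {
    Batch<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

// Lane-wise multiplication.
template <typename T, std::size_t N>
Batch<T, N> operator*(const Batch<T, N>& a, const Batch<T, N>& b) {
    Batch<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

// Written once with ordinary operators; works for a scalar or a batch.
template <typename V>
V axpy(V a, V x, V y) {
    return a * x + y;
}
```

Because `axpy` only relies on `*` and `+`, the same source line serves `float` and `Batch<float, 4>` alike; a production library would implement the operators with the target's SIMD instructions instead of loops.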

Cross-platform SIMD C Headers

  •    C

A cross-platform, cross-compiler, cross-CPU C header library for programming with SIMD instruction sets. X86 (MMX/SSE/SSE2) GCC and MSVC, PPC Altivec GCC, WMMX ARM GCC, and software emulated SIMD are supported.


cgmath - A linear algebra and mathematics library for computer graphics.

  •    Rust

A linear algebra and mathematics library for computer graphics. Not all of the functionality has been implemented yet, and the existing code is not fully covered by the testsuite. If you encounter any mistakes or omissions please let me know by posting an issue, or even better: send me a pull request with a fix.

SIMD Detector

  •    DotNet

This SIMD class helps developers detect the types of SIMD instructions available on the user's processor. It supports Intel and AMD CPUs. It is written in C++.

faster - SIMD for humans

  •    Rust

Easy, powerful, portable, absurdly fast numerical calculations. Includes static dispatch with inlining based on your platform and vector types, zero-allocation iteration, vectorized loading/storing, and support for uneven collections. The vector size is entirely determined by the machine you’re compiling for - it attempts to use the largest vector size supported by your machine, and works on any platform or architecture (see below for details).

ecmascript_simd - SIMD numeric type for EcmaScript

  •    Javascript

SIMD.js has been taken out of active development in TC39 and removed from Stage 3, and is not being pursued by web browsers for implementation. SIMD operations exposed to the web are under active development within WebAssembly, with operations based on the SIMD.js operations. With WebAssembly in advanced development or shipping in multiple browsers, it seems like an adequate vehicle to subsume asm.js use cases, which are judged to be the broader cases. Although some developers have expressed interest in using SIMD.js outside of asm.js, implementers have found that implementing and optimizing for this case reliably creates a lot of complexity, and have made the decision to focus instead on delivering WebAssembly and SIMD instructions in WASM.

FastPFor - The FastPFOR C++ library: Fast integer compression

  •    C++

A research library with integer compression schemes. It is broadly applicable to the compression of arrays of 32-bit integers where most integers are small. The library seeks to exploit SIMD instructions (SSE) whenever possible. This library can decode at least 4 billion compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.

stdsimd - Experiments with adding SIMD support to Rust's standard library.

  •    Rust

stdsimd is now shipped with Rust's std library: it is part of libcore and libstd. The easiest way to use it is to import it via `use std::arch`.

stdarch - Rust's standard library vendor-specific APIs and run-time feature detection

  •    HTML

std_detect implements std::detect - Rust's standard library run-time CPU feature detection. The std::simd component now lives in the packed_simd crate.

ispc - Intel SPMD Program Compiler

  •    C++

ispc is a compiler for a variant of the C programming language, with extensions for single program, multiple data programming. Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs; it frequently provides a 3x or more speedup on CPUs with 4-wide vector SSE units and 5x-6x on CPUs with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. Parallelization across multiple cores is also supported by ispc, making it possible to write programs that achieve performance improvement that scales by both number of cores and vector unit size.

ArchAssembler

  •    CSharp

ArchAssembler is a .NET (C#) library providing the functionality of an assembler. The target architecture is x86/x64 with Streaming SIMD Extensions (SSE). The target executable file format is Windows Portable Executable (PE).

simdjson - Parsing gigabytes of JSON per second

  •    C++

JSON documents are everywhere on the Internet. Servers spend a lot of time parsing these documents. We want to accelerate the parsing of JSON per se using commonly available SIMD instructions as much as possible while doing full validation (including character encoding). A description of the design and implementation of simdjson appears at https://arxiv.org/abs/1902.08318 and an informal blog post providing some background and context is at https://branchfree.org/2019/02/25/paper-parsing-gigabytes-of-json-per-second/.

Unity

  •    CSharp

A prototype of a C# math library providing vector types and math functions with a shader-like syntax. Used by the Burst compiler to compile C#/IL to highly efficient native code. The main goal of this library is to provide a friendly math API familiar to SIMD and graphics/shader developers, using the well-known float4 and float3 types, etc., with all intrinsic functions provided by a static class `math` that can be imported easily into your C# program with `using static Unity.Mathematics.math`.

GLM - OpenGL Mathematics (GLM)

  •    C++

OpenGL Mathematics (GLM) is a header-only C++ mathematics library for graphics software based on the OpenGL Shading Language (GLSL) specifications. GLM provides classes and functions designed and implemented with the same naming conventions and functionality as GLSL, so that anyone who knows GLSL can use GLM as well in C++.

turbo.js - perform massive parallel computations in your browser with GPGPU.

  •    Javascript

turbo.js is a small library that makes it easier to perform complex calculations that can be done in parallel. The actual calculation performed (the kernel executed) uses the GPU for execution. This enables you to work on an array of values all at once. turbo.js is compatible with all browsers (even IE when not using ES6 template strings) and most desktop and mobile GPUs.





