
ArchAssembler is a .NET (C#) library that provides the functionality of an assembler. The target architecture is x86/x64 with Streaming SIMD Extensions (SSE). The target executable file format is Windows Portable Executable (PE).



Related Projects

x86-ray-tracer - Ray tracer written in x86 assembler (with SSE/SSE2 extensions).


SIMD Array

Simple array class to use SSE and SSE2.

Cross-platform SIMD C Headers

A cross-platform, cross-compiler, cross-CPU C header library for programming with SIMD instruction sets. X86 (MMX/SSE/SSE2) GCC and MSVC, PPC Altivec GCC, WMMX ARM GCC, and software emulated SIMD are supported.


This library is meant for high performance calculations for science or 3D games/rasterizers using SIMD instructions of x86 processors to allow an unparalleled level of optimization. This takes advantage of MMX, 3DNow!, 3DNow!+/MMX+, and SSE/SSE2/SSE3/SSSE3.

simd - SIMD abstraction layer for sse, altivec and spu



A JIT assembler for x86 (IA-32) / x64 (AMD64, x86-64) with MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2 support, provided as a C++ header.


A JIT assembler for x86 (IA-32) / x64 (AMD64, x86-64) with MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX support, provided as a C++ header.

FluidSSE - SIMD using SSE 4


ssela - SIMD (SSE) Linear Algebra package.


SIMD Detector

This SIMD class helps developers detect which types of SIMD instructions are available on the user's processor. It supports Intel and AMD CPUs. It is written in C++.


A performance comparison between different implementations of a SIMD vector using SSE2.

despacer - C library to remove white space from strings as fast as possible

We want to remove spaces (' ') and line-feed characters ('\n', '\r') from a string as fast as possible. To avoid unnecessary allocations, we wish to do the processing in place. Note that clang seems to give better results than gcc.
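The in-place idea described above can be sketched with a plain scalar loop. This is only a baseline illustration, not the despacer library's code or API: the function name `despace_scalar` is hypothetical, and no SIMD is used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of despacer): removes ' ', '\n' and '\r'
   from the first len bytes of buf, in place, without allocating.
   Returns the new length; bytes past that length are left as they were. */
size_t despace_scalar(char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t out = 0; /* write position, always <= read position i */
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        if (c != ' ' && c != '\n' && c != '\r') {
            buf[out++] = c; /* keep the character, compacting to the left */
        }
    }
    return out;
}
```

Because the write position never overtakes the read position, a single buffer is enough. The caller is responsible for writing a terminating '\0' at the returned length if a C string is needed.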

FastPFor - The FastPFOR C++ library: Fast integer compression

A research library with integer compression schemes. It is broadly applicable to the compression of arrays of 32-bit integers where most integers are small. The library seeks to exploit SIMD instructions (SSE) whenever possible. This library can decode at least 4 billion compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.
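To see why arrays of small 32-bit integers compress well, here is a minimal scalar bit-packing sketch: if every value in a block fits in b bits, each value can be stored in b bits instead of 32. This is an illustration under stated assumptions, not FastPFor's API or its actual schemes; the names `bits_needed`, `pack_bits` and `unpack_bits` are hypothetical, and the sketch uses no SIMD and has no handling for occasional large values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 for v == 0). */
unsigned bits_needed(uint32_t v)
{
    unsigned b = 0;
    while (v != 0) { b++; v >>= 1; }
    return b;
}

/* Packs n values, each assumed to fit in `bits` bits (1..32), into out.
   out must hold at least (n * bits + 31) / 32 words.
   Returns the number of 32-bit words written. */
size_t pack_bits(const uint32_t *in, size_t n, unsigned bits, uint32_t *out)
{
    assert(bits >= 1 && bits <= 32);
    uint64_t acc = 0;    /* bit accumulator */
    unsigned filled = 0; /* number of valid bits in acc, always < 32 here */
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        acc |= (uint64_t)in[i] << filled;
        filled += bits;
        if (filled >= 32) {          /* a full word is ready: flush it */
            out[w++] = (uint32_t)acc;
            acc >>= 32;
            filled -= 32;
        }
    }
    if (filled > 0) out[w++] = (uint32_t)acc; /* flush the partial word */
    return w;
}

/* Inverse of pack_bits: reads n values of `bits` bits each from in. */
void unpack_bits(const uint32_t *in, size_t n, unsigned bits, uint32_t *out)
{
    assert(bits >= 1 && bits <= 32);
    const uint64_t mask = (1ull << bits) - 1;
    uint64_t acc = 0;
    unsigned filled = 0;
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        if (filled < bits) {         /* refill from the next packed word */
            acc |= (uint64_t)in[w++] << filled;
            filled += 32;
        }
        out[i] = (uint32_t)(acc & mask);
        acc >>= bits;
        filled -= bits;
    }
}
```

With values no larger than 7, `bits_needed` gives 3, so eleven integers occupy 33 bits (two words) instead of eleven words.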

PE-Header - Displays the Microsoft Portable Executable headers from an executable file

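As a rough idea of what reading PE headers involves, the sketch below locates the PE signature and reads two fields of the COFF file header from a byte buffer. It is not the PE-Header project's code: `pe_summary` and `pe_parse` are hypothetical names, and only the layout facts from the PE format are relied on (the "MZ" magic, the 32-bit `e_lfanew` offset at 0x3C, the "PE\0\0" signature, and the little-endian Machine and NumberOfSections fields that follow it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical summary type for this sketch. */
typedef struct {
    uint16_t machine;            /* e.g. 0x014c = i386, 0x8664 = AMD64 */
    uint16_t number_of_sections;
} pe_summary;

/* Returns 0 and fills *out if data looks like a PE image, -1 otherwise. */
int pe_parse(const uint8_t *data, size_t len, pe_summary *out)
{
    assert(data != NULL && out != NULL);
    if (len < 0x40) return -1;                        /* DOS header too short */
    if (data[0] != 'M' || data[1] != 'Z') return -1;  /* DOS magic */

    uint32_t pe_off = rd32(data + 0x3C);              /* e_lfanew */
    /* Need the 4-byte signature plus the 20-byte COFF file header. */
    if (pe_off > len || len - pe_off < 24) return -1;

    const uint8_t *sig = data + pe_off;
    if (sig[0] != 'P' || sig[1] != 'E' || sig[2] != 0 || sig[3] != 0)
        return -1;

    out->machine = rd16(sig + 4);
    out->number_of_sections = rd16(sig + 6);
    return 0;
}
```

A real header dump would go on to the optional header and the section table; this sketch stops at the file header to stay short.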


Dependency library libjpeg_turbo for WebRTC engine (as used in Open Peer C++ library). libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON) to accelerate baseline JPEG compression and decompression on x86, x86-64, and ARM systems.


LFMat is an open source template fast C++ linear algebra library with storage compatible and asm specializations for 3DNow!, SSE, SSE2 and Altivec (e.g. for solvers) taking cache into account. There's a wide variety of structure and storage styles...

FFFF - Fast Floating Fractal Fun

FFFF is the fastest Win32/OSX/Linux/IRIX Mandelbrot generator. Features OpenGL, realtime zoom, SSE/AltiVec QuadPixel, SSE2/3DNow! DualPixel calc, FPU per pixel calc, GPU asm (Fragment/Vertex) calc, multiprocessor support, and benchmarking. Opt asm code!


A JPEG decoder written entirely in assembly with SSE and SSE2 optimizations. Uses floating point internally for maximum precision.

flat assembler

Fast and efficient self-assembling 80x86 assembler for DOS/Win32/Linux; with 8086-80486/Pentium/MMX/SSE/AVX/XOP instructions support, 16-bit/32-bit/64-bit code, binary/MZ/PE/COFF/ELF output formats.