Multicore-TSNE - Parallel t-SNE implementation with Python and Torch wrappers.

  •        128

This is a multicore modification of Barnes-Hut t-SNE by L. Van der Maaten with Python and Torch CFFI-based wrappers. The code also runs faster than sklearn.TSNE on a single core. Barnes-Hut t-SNE is done in two steps.



Related Projects

JSAT - Java Statistical Analysis Tool, a Java library for Machine Learning

  •    Java

JSAT is a library for quickly getting started with Machine Learning problems. It is developed in my free time, and made available for use under the GPL 3. Part of the library is for self-education; as such, all code is self-contained. JSAT has no external dependencies, and is pure Java. I also aim to make the library suitably fast for small to medium size problems. As such, much of the code supports parallel execution. If you want to use the bleeding edge, but don't want to bother building yourself, I recommend you look at It can build a POM repo for you for any specific commit version. Click on "Commits" in the link and then click "get it" for the commit version you want.

Multicore SWARM

  •    C

Multicore SWARM (Software and Algorithms for Running on Multicore Processors) is an open source library for developing efficient and portable implementations that make use of multi-core processors. David A. Bader (Georgia Tech) began SWARM in 1994.


  •    CSharp

Here you can find the resources required to start building with these new systems today. We have also provided a new forum where you can find more information and share your experiences with these new systems.

ocaml-multicore - Multicore OCaml

  •    OCaml

OCaml is an implementation of the ML language, based on the Caml Light dialect extended with a complete class-based object system and a powerful module system in the style of Standard ML. OCaml comprises two compilers. One generates bytecode which is then interpreted by a C program. This compiler runs quickly, generates compact code with moderate memory requirements, and is portable to essentially any 32- or 64-bit Unix platform. Performance of generated programs is quite good for a bytecoded implementation. This compiler can be used either as a standalone, batch-oriented compiler that produces standalone programs, or as an interactive, toplevel-based system.



MultiCore is a compute cloud wrapper written in C# that supports a SimpleDB role, membership, and profile provider. It also offers support for easier SimpleDB access and includes the latest Amazon libraries. Azure support coming soon.

scalloc - A Fast, Multicore-Scalable, Low-Fragmentation Memory Allocator

  •    C++

scalloc is a general-purpose memory allocator that offers high performance, multicore scalability, and low memory consumption when many threads allocate on many cores. The main ideas behind the design of scalloc are uniform treatment of small and big objects through so-called virtual spans, and efficient and effective reclamation of free memory through fast and scalable global data structures.


  •    C

SkyEye is a very fast full-system simulator that uses LLVM as the IR of its dynamic compilation framework. It can simulate a range of ARM, ColdFire, MIPS, PowerPC, SPARC, x86, and Blackfin DSP processors. It can also simulate multicore systems by using the multiple cores of the host.



A suite of Ada 2012 generics to facilitate iterative and recursive parallelism for multicore systems and provide safer recursion for single and multicore systems. Generics include Ravenscar-compatible versions for real-time systems. Also includes paraffinalia, which is a set of useful generics for parallel quicksort, fast Fourier transform, function integration, prefix sum, and red-black trees.

mTCP - A Highly Scalable User-level TCP Stack for Multicore Systems

  •    C

mTCP is a high-performance user-level TCP stack for multicore systems. Scaling the performance of short TCP connections is fundamentally challenging due to inefficiencies in the kernel. mTCP addresses these inefficiencies from the ground up - from packet I/O and TCP connection management all the way to the application interface. It translates expensive system calls to shared memory access between two threads within the same CPU core.

mtcp - mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems

  •    C

mTCP is a highly scalable user-level TCP stack for multicore systems. mTCP source code is distributed under the Modified BSD License. For more detail, please refer to the LICENSE. The license terms of the io_engine driver and ported applications may differ from mTCP's. We require the following libraries to run mTCP.

nodegarden - HTML5 Node Garden

  •    Javascript

Really simple node garden made with HTML5. No Barnes-Hut n-body optimization, just simple physics. I used to do these back in the Flash times, when I worked as a Flash developer. BIT-101 released a great article back then, which got me inspired.

QuickMAN - Fast Mandelbrot Generator

  •    C

QuickMAN is a Mandelbrot fractal generator with multicore support. ASM-optimized code reaches billions of iterations per second on fast CPUs. Features an easy-to-use GUI, realtime pan/zoom, multiple palettes, image logging, and saving in PNG format.
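
For readers unfamiliar with what a Mandelbrot generator iterates, here is a minimal escape-time sketch in Python. It is not QuickMAN's code (which the description says is ASM-optimized); the function names, the viewing window, and the iteration limit are all assumptions chosen for illustration.

```python
def escape_time(c: complex, max_iter: int = 100) -> int:
    """Return how many iterations of z -> z*z + c it takes for |z| to exceed 2.

    Returns max_iter if the point has not escaped by then (treated as "inside").
    """
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def mandelbrot_grid(width: int, height: int, max_iter: int = 100):
    """Compute escape times over the window x in [-2, 1], y in [-1, 1] (an assumed window)."""
    grid = []
    for row in range(height):
        y = -1.0 + 2.0 * row / max(height - 1, 1)
        line = []
        for col in range(width):
            x = -2.0 + 3.0 * col / max(width - 1, 1)
            line.append(escape_time(complex(x, y), max_iter))
        grid.append(line)
    return grid
```

Each pixel is computed independently of every other pixel, which is why this kind of workload splits naturally across cores, for example by handing out rows to separate workers.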


  •    Python

PyMW is a Python module for parallel master-worker computing in a variety of environments. With the PyMW module, users can write a single program that scales from multicore machines to global computing platforms.
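
The master-worker pattern itself can be sketched with Python's standard library alone. The example below does not use PyMW's API; `run_master` and `square` are hypothetical names, and `concurrent.futures` stands in for whatever backend a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Example worker task: any function of one argument would do."""
    return n * n


def run_master(tasks, worker, max_workers=4):
    """Master: submit every task to a pool of workers and collect results in task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, tasks))
```

Calling `run_master(range(5), square)` returns the squares in input order. For CPU-bound work on a multicore machine, `ProcessPoolExecutor` from the same module exposes the same `map` interface and runs workers in separate processes.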


  •    Assembly

Fractice is a fractal explorer/renderer for Windows. It supports navigation, thumbnails, previews, deep zoom, printing, posters, palettes, multicore and distributed processing, movie recording, undo/redo, job control, VJ mixing, dual-monitor, and MIDI.

Clavis: a user level scheduler for Linux

  •    C

The Clavis user-level scheduler is designed to implement various scheduling algorithms under the Linux operating system running on multicore and NUMA machines. It is written in C to make the integration with the default OS scheduling facilities seamless.

Auto-parallelizing compiler for multicore systems using Phoenix framework


The objective of this project is to develop plugins for the Phoenix compiler that divide the intermediate code into partitions. These partitions will be synthesized further in later phases and will eventually be ready to run in parallel on chip multiprocessors...

Transactional Entity Framework

  •    C++


Dambach Multi-Core Library


The Dambach Multi-Core Library makes it easy to create .NET programs that run faster on multi-core machines than their traditionally programmed counterparts.


  •    DotNet

Brahma is a C# library that provides high-level access to parallel streaming computations on a variety of processors. Brahma uses C#'s LINQ syntax to write kernels that are compiled dynamically. All the glue/kernel code required is *automatically* generated by Brahma.



HyperScan is a minimal port scanner. It can be used to check the status of the ports on a given host. HyperScan is optimized to work on multicore processors for better performance. It is developed in C, and uses OpenMP for multi-core optimization.