
prince - :crown: Python factor analysis library (PCA, CA, MCA, FAMD)

  •    Python

Prince uses pandas to manipulate dataframes, so it expects an initial dataframe to work with. In the following example, a Principal Component Analysis (PCA) is applied to the iris dataset. Under the hood, Prince decomposes the dataframe into two eigenvector matrices and one eigenvalue array using a Singular Value Decomposition (SVD). The eigenvectors can then be used to project the initial dataset onto lower dimensions. The first plot displays the rows of the initial dataset projected onto the first two right eigenvectors (the resulting projections are called principal coordinates). The ellipses are 90% confidence intervals.
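The SVD-based projection described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not Prince's actual implementation: the function name `pca_svd` is hypothetical, random data stands in for the iris dataset, and the columns are only centered (not rescaled) before the decomposition.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """Hypothetical helper: PCA of the rows of X via a thin SVD."""
    X = np.asarray(X, dtype=float)
    # Center each column so the SVD describes variance around the mean.
    Xc = X - X.mean(axis=0)
    # Xc = U @ diag(s) @ Vt; the rows of Vt are the right eigenvectors.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Eigenvalues of the sample covariance matrix.
    eigenvalues = s ** 2 / (X.shape[0] - 1)
    # Principal coordinates: rows projected onto the leading right eigenvectors.
    coords = Xc @ Vt[:n_components].T
    return coords, eigenvalues[:n_components], Vt[:n_components]

# Assumption: random data with the same shape as iris (150 rows, 4 columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
coords, eigenvalues, components = pca_svd(X)
print(coords.shape)
```

With this construction, the sample variance of each column of `coords` equals the corresponding eigenvalue, which is a quick sanity check on the projection.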

irlba - Fast truncated singular value decompositions

  •    R

Implicitly restarted Lanczos methods for fast truncated singular value decomposition of sparse and dense matrices (also referred to as partial SVD). IRLBA stands for Augmented, Implicitly Restarted Lanczos Bidiagonalization Algorithm. The package provides several functions; the help documentation for each includes extensive details and examples. Also see the package vignette, vignette("irlba", package="irlba").
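To illustrate what a partial SVD computes, the sketch below estimates only the largest singular triplet of a small matrix. It uses plain power iteration on A^T A, a much simpler stand-in for the implicitly restarted Lanczos bidiagonalization that irlba implements; it is written in Python rather than R, and the function name `top_singular_triplet` is hypothetical.

```python
import math

def top_singular_triplet(A, iterations=200):
    """Estimate the largest singular value of A and its singular vectors.

    A is a list of rows. Power iteration is used here, not the IRLBA method.
    It assumes the starting vector is not orthogonal to the top right
    singular vector.
    """
    n_rows, n_cols = len(A), len(A[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # unit-norm starting vector
    for _ in range(iterations):
        # u = A v
        u = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # w = A^T u, i.e. one multiplication by A^T A
        w = [sum(A[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The singular value is the length of A v; u is A v normalized.
    u = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v

# A 3x2 matrix whose singular values are 3 and 2 by construction.
A = [[3.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
sigma, u, v = top_singular_triplet(A)
print(round(sigma, 6))
```

Power iteration converges slowly when the leading singular values are close together, which is one reason Lanczos-type methods are preferred for large matrices.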




pca - Principal component analysis (PCA) in Ruby

  •    Ruby

Principal component analysis in Ruby. Uses GSL for calculations. PCA can be used to map data to a lower dimensional space while minimizing information loss. It's useful for data visualization, where you're limited to 2-D and 3-D plots.

osm-data-classification - OpenStreetMap Data Classification

  •    Python

Our first idea was to answer this question: can we assess the quality of OpenStreetMap data, and if so, how? This project is dedicated to exploring and analyzing the OpenStreetMap data history in order to classify the contributors.

cshl-singlecell-2017 - Single Cell Analysis course at Cold Spring Harbor Laboratory 2017

  •    Jupyter

This is one of many single-cell courses and tutorials. An excellent list of single-cell packages, courses, tutorials, and conference speakers can be found here. We'll use some additional dependencies outside of the scientific Python ecosystem.


h2o4gpu - H2Oai GPU Edition

  •    Python

H2O4GPU is a collection of GPU solvers by H2Oai with APIs in Python and R. The Python API builds upon the easy-to-use scikit-learn API and its well-tested CPU-based algorithms. It can be used as a drop-in replacement for scikit-learn (i.e. import h2o4gpu as sklearn), with GPU support for a selected and growing set of algorithms. H2O4GPU inherits all the existing scikit-learn algorithms and falls back to the CPU algorithm when the GPU algorithm does not support an important existing scikit-learn class option. The R package is a wrapper around the H2O4GPU Python package, and its interface follows standard R conventions for modeling. The DAAL library has been added for CPU; currently only the x86_64 architecture is supported.

Miscellaneous-R-Code - Code that might be useful to others for learning/demonstration purposes.

  •    R

This is a place for miscellaneous R and other code I've put together for clients, co-workers or myself for learning and demonstration purposes. The attempt is made to put together some well-commented and/or conceptually clear code from scratch, though most functionality is readily available in any number of well-developed R packages. Typically, examples are provided using such packages for comparison of results. I would say most of these are geared toward intermediate to advanced folks that want to dig a little deeper into the models and underlying algorithms. More recently, if it gets more involved, I usually just create a document of some kind rather than a standard *.R file, so you might check out the docs repo as well.

projector - Project Dense Vectors Text Representation on 2D Plane

  •    R

Project dense vector representations of texts onto a 2D plane to better understand neural models applied to NLP. Since the famous word2vec, embeddings are everywhere in NLP (and in closely related areas like IR). The main idea behind embeddings is to represent texts (made of characters, words, sentences, or even larger blocks) as numeric vectors. This works very well and provides some abilities unreachable with the classic BoW approach. However, embeddings (i.e. vector representations) are difficult for humans to understand, analyze, and debug because they have far more than 3 dimensions.

motionLib - quaternion, euler angle, interpolation, cubic bezier, cubic spline, PCA, etc.

  •    C++

Brief: motionLib is a small library for computer animation and graphics projects; I used it for data compression and motion synthesis. Note: these files may depend on each other, so in most cases you need to include them all in your project. The CASEParser class is absent because of licensing issues.

machine-learning-course - R code for the assignments of Coursera machine learning course

  •    R

This is the R version of the assignments from the online machine learning course (MOOC) on Coursera by Prof. Andrew Ng. This repository provides starter code for solving the assignments in R; the completed assignments are also available beside each exercise file.

2018-MachineLearning-Lectures-ESA - Machine Learning Lectures at the European Space Agency (ESA) in 2018

  •    Jupyter

In 2018, the European Space Agency (ESA) organized a series of 6 lectures on Machine Learning at the European Space Operations Centre (ESOC). This repository contains the lecture resources: presentations, notebooks, and links to the videos (presentation and hands-on).