
FTRLProximal - R package for online training of regression models using FTRL Proximal

  •    R

This is an R package of the FTRL Proximal algorithm for online learning of elastic net logistic regression models. For more info on the algorithm please see Ad Click Prediction: a View from the Trenches by McMahan et al. (2013).
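As a rough illustration of the per-coordinate update described in McMahan et al. (2013), here is a minimal Python sketch. It is not the R package's API: the class name `FTRLProximal`, its method names, the feature names, and the hyperparameter values are all assumptions made for this example. Features are passed as sparse dictionaries, and the L1 term keeps a weight at exactly zero until its accumulated gradient exceeds the threshold.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Hypothetical sketch of per-coordinate FTRL-Proximal for logistic regression."""

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # adjusted gradient sums, one per feature
        self.n = defaultdict(float)  # sums of squared gradients, one per feature

    def weight(self, i):
        """Closed-form weight for feature i; exactly zero inside the L1 threshold."""
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        step = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / step

    def predict(self, x):
        """Predicted probability for a sparse example x = {feature: value}."""
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clip to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example x with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for this coordinate
            w = self.weight(i)  # weight that was used for the prediction
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


# Toy stream with made-up feature names: one feature always seen with label 1,
# the other always seen with label 0.
model = FTRLProximal(alpha=0.1, beta=1.0, l1=0.1, l2=1.0)
for _ in range(100):
    model.update({"returning_user": 1.0}, 1)
    model.update({"first_visit": 1.0}, 0)

print(model.predict({"returning_user": 1.0}))
print(model.predict({"first_visit": 1.0}))
```

The combination of the L1 and L2 terms is what makes the resulting model an elastic net logistic regression; setting `l1=0` in this sketch would leave only the ridge-style shrinkage.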

keras-timeseries-prediction - Time series prediction with Sequential Model and LSTM units

  •    Python

The dataset is international-airline-passengers.csv which contains 144 data points ranging from Jan 1949 to Dec 1960. Each data point represents monthly passengers in thousands.

sjstats - Statistical Functions for Regression Models

  •    R

Collection of convenient functions for common statistical computations that are not directly provided by R's base or stats packages. This package aims first at providing shortcuts for statistical measures that could otherwise only be calculated with additional effort (like standard errors, Cronbach's Alpha or root mean squared errors), or for which no functions are currently available.

bayesian-basics - A document that introduces Bayesian data analysis

  •    Stan

This is a document that introduces Bayesian data analysis. It serves as a practical and applied introduction to Bayesian approaches for the uninitiated. The goal is to provide just enough information in a brief format to allow one to feel comfortable exploring Bayesian data analysis for themselves, assuming they have the requisite context to begin with. There is a shiny app to play with also.




walker - Bayesian dynamic linear regression models with Stan

  •    HTML

Walker provides a method for fully Bayesian generalized linear regression where the regression coefficients are allowed to vary over "time" as a first or second order integrated random walk. All computations are done using Hamiltonian Monte Carlo provided by Stan, using a state space representation of the model in order to marginalise over the coefficients for accurate and efficient sampling.

auditor - Audit of regression models

  •    R

A preprint of the article about auditor is available on arXiv. For more plot types and examples, see the A Short Overview of Plots section below.

dotwhisker - Dot-and-Whisker Plots of Regression Results

  •    R

dotwhisker is an R package for quickly and easily generating dot-and-whisker plots of regression results, either directly from model objects or from tidy data frames. It provides a convenient way to create highly customizable plots for presenting and comparing statistics. It can be used to plot coefficients or other estimates (e.g., predicted probabilities) within a model or compare them across different models. The estimates are presented as dots with confidence interval whiskers, and predictors can be grouped in brackets. Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

NYCBuildingEnergyUse - Creating Regression Models Of Building Emissions On Google Cloud

  •    Jupyter

In identifying outliers I will cover both visual inspection and a machine learning method called Isolation Forests. Since I will be completing this project over multiple days and using Google Cloud, I will go over the basics of using BigQuery for storing the datasets so I won't have to start all over again each time I work on it. At the end of this blog post I will summarize the findings and give some specific recommendations to reduce multifamily and office building energy usage. In this second post I cover imputation techniques for missing data using Scikit-Learn's impute module, with both point estimates (e.g. mean, median) using the SimpleImputer class and more complicated regression models (e.g. KNN) using the IterativeImputer class. The latter requires that the features in the model are correlated. This is indeed the case for our dataset, and in our particular case we also need to transform the features in order to discern a more meaningful and predictive relationship between them. As we will see, the transformation of the features also gives us much better results for imputing missing values.
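The two imputation approaches mentioned above could look roughly like the sketch below. It does not use the project's building data: the two-column array is synthetic and deliberately correlated, and the variable names, missing-value rate, and `n_neighbors=5` setting are assumptions chosen only for illustration. Note that scikit-learn requires the explicit `enable_iterative_imputer` import because `IterativeImputer` is still marked experimental.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required opt-in)
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.neighbors import KNeighborsRegressor

# Synthetic, strongly correlated features (assumption: stands in for real data).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=200)
X_true = np.column_stack([x, 2.0 * x + rng.normal(0.0, 0.1, size=200)])

# Knock out roughly 20% of the second column.
mask = rng.random(200) < 0.2
X_missing = X_true.copy()
X_missing[mask, 1] = np.nan

# Point estimate: every gap gets the column median.
median_filled = SimpleImputer(strategy="median").fit_transform(X_missing)

# Regression-based: each gap is predicted from the other feature with KNN.
knn_imputer = IterativeImputer(
    estimator=KNeighborsRegressor(n_neighbors=5),
    max_iter=10,
    random_state=0,
)
knn_filled = knn_imputer.fit_transform(X_missing)

# Mean absolute error on the entries that were removed.
err_median = np.abs(median_filled[mask, 1] - X_true[mask, 1]).mean()
err_knn = np.abs(knn_filled[mask, 1] - X_true[mask, 1]).mean()
print(f"median imputation error: {err_median:.3f}")
print(f"KNN iterative imputation error: {err_knn:.3f}")
```

On data like this, where one feature largely determines the other, the regression-based imputer has much more to work with than a single column-wide median, which is the point the paragraph makes about needing correlated features.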


Zelig - A statistical framework that serves as a common interface to a large range of models

  •    R

plot to plot the simulation results. Zelig 5 introduced reference classes. These enable a different way of working with Zelig that is detailed in a separate vignette. Directly using the reference class architecture is optional. They are not used in the examples below.





