rsmtool - RSMTool is a Python package for facilitating research on building and evaluating automated scoring models


Automated scoring of written and spoken test responses is a growing field in educational natural language processing. Automated scoring engines use machine learning models to predict scores for such responses from features extracted from their text or audio. Examples include Project Essay Grade for written responses and SpeechRater for spoken responses. Rater Scoring Modeling Tool (RSMTool) is a Python package that automates, and combines in a single pipeline, multiple analyses commonly conducted when building and evaluating such scoring models. Its output is a comprehensive, customizable HTML statistical report containing the results of these analyses. RSMTool makes it simple to run a set of standard analyses with a single command, and it is also fully customizable: users can exclude unneeded analyses, modify the default analyses, and include custom analyses in the report.
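As a rough illustration of the kind of evaluation such a pipeline performs, the sketch below compares machine-predicted scores with human scores using Pearson correlation and exact agreement. It does not use RSMTool's API. The function names and the sample scores are hypothetical, and the choice of these two metrics is an assumption rather than a statement of what RSMTool reports.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def exact_agreement(human, machine):
    """Fraction of responses whose rounded machine score equals the human score."""
    matches = sum(1 for h, m in zip(human, machine) if h == round(m))
    return matches / len(human)


# Hypothetical data: human scores on a 1-4 scale and raw model predictions.
human_scores = [1, 2, 3, 4, 2, 3]
machine_scores = [1.2, 2.1, 2.8, 3.6, 2.6, 3.1]

r = pearson_r(human_scores, machine_scores)
agreement = exact_agreement(human_scores, machine_scores)
print(f"Pearson r = {r:.3f}, exact agreement = {agreement:.3f}")
```

A real evaluation would typically add further agreement statistics and subgroup analyses; this sketch only shows the basic human-versus-machine comparison.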



Related Projects

RootTheBox - A Game of Hackers (CTF Scoreboard & Game Manager)

  •    HTML

Root the Box is a real-time scoring engine for computer wargames where hackers can practice and learn. The application can be easily configured and modified for any CTF game. Root the Box attempts to engage novice and experienced players alike by combining a fun, game-like environment with realistic challenges that convey knowledge applicable to the real world, such as penetration testing, incident response, digital forensics and threat hunting. Just as in traditional CTF games, each team or player targets challenges of varying difficulty and sophistication, attempting to collect flags. Root the Box brings additional options to the game. It can be configured to allow the creation of "Botnets" by uploading a small bot program to target machines, which grants periodic rewards of (in-game) money for each bot in the botnet; the larger the botnet, the larger the reward. Money can be used to unlock new levels, buy hints to flags, download a target's source code, or even "SWAT" other players by bribing the (in-game) police. Players' "bank account passwords" can also be publicly displayed by the scoring engine, allowing players to crack each other's passwords and steal each other's money.

benchm-ml - A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc

  •    R

This project aims at a minimal benchmark for scalability, speed and accuracy of commonly used implementations of a few machine learning algorithms. The target of this study is binary classification with numeric and categorical inputs (of limited cardinality, i.e. not very sparse) and no missing data, perhaps the most common problem in business applications (e.g. credit scoring, fraud detection or churn prediction). If the input matrix is n x p, n is varied as 10K, 100K, 1M, 10M, while p is ~1K (after expanding the categoricals into dummy variables/one-hot encoding). This particular type of data structure/size (the largest) stems from this author's interest in some particular business applications. Note: While a large part of this benchmark was done in Spring 2015, reflecting the state of ML implementations at that time, this repo is updated when I see significant changes in implementations or when new implementations become widely available (e.g. lightgbm). Also, please find a summary of the progress and learnings from this benchmark at the end of this repo.
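The expansion of categoricals into dummy variables mentioned above can be sketched as follows. This is a minimal, standard-library-only illustration, not the benchmark's own code; the `one_hot_encode` helper, the column names and the sample records are all hypothetical.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per distinct level."""
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for level in levels:
            new_row[f"{column}={level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records with one numeric and one categorical input.
records = [
    {"amount": 120.0, "channel": "web"},
    {"amount": 35.5, "channel": "store"},
    {"amount": 80.0, "channel": "phone"},
]

encoded = one_hot_encode(records, "channel")
for row in encoded:
    print(row)
```

Each categorical column with k distinct levels becomes k indicator columns, which is how a modest number of raw inputs can grow to p of roughly 1K columns.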

scikit-plot - An intuitive library to add plotting functionality to scikit-learn objects.

  •    Python

Scikit-plot is the result of an unartistic data scientist's dreadful realization that visualization is one of the most crucial components in the data science process, not just a mere afterthought. Gaining insights is simply a lot easier when you're looking at a colored heatmap of a confusion matrix complete with class labels rather than a single-line dump of numbers enclosed in brackets. Besides, if you ever need to present your results to someone (virtually any time anybody hires you to do data science), you show them visualizations, not a bunch of numbers in Excel.

pycon-2016-tutorial - Machine Learning with Text in scikit-learn

  •    Jupyter

Presented by Kevin Markham at PyCon on May 28, 2016. Watch the complete tutorial video on YouTube. Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we'll build and evaluate predictive models from real-world text using scikit-learn.
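The idea of turning raw text into data a model can use can be sketched with a tiny bag-of-words vectorizer. This is a standard-library illustration of the general technique (the same idea that scikit-learn's `CountVectorizer` implements far more completely), not code from the tutorial; the helper names and sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def vectorize(document, vocabulary):
    """Return a list of token counts, one entry per vocabulary column."""
    counts = Counter(tokenize(document))
    vector = [0] * len(vocabulary)
    for tok, count in counts.items():
        if tok in vocabulary:  # tokens unseen during vocabulary building are ignored
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical training documents.
documents = ["call me now", "please call me a cab", "now or never"]
vocabulary = build_vocabulary(documents)
matrix = [vectorize(doc, vocabulary) for doc in documents]
```

Each row of `matrix` is a fixed-length numeric vector, so the documents can be passed to any model that expects a numeric feature matrix.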

xcessiv - A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python

  •    Python

Stacked ensembles are simple in theory. You combine the predictions of smaller models and feed those into another model. However, in practice, implementing them can be a major headache. Xcessiv holds your hand through all the implementation details of creating and optimizing stacked ensembles so you're free to fully define only the things you care about.

SpamTestBuddy - Spam Scoring Tool

  •    C

SpamTestBuddy is a simple, light-weight, multiple-input spam scoring tool. It is standalone and can be used with simple procmail rules without root access or daemons. It has built-in support for simple DNS checks including DNSBL (DNS-based blocklist) queries, and can scan headers from filters such as SpamProbe, QSF, DSPAM that you already use. It reduces both false positives and false negatives with the benefit of extra spam tests.
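For readers unfamiliar with DNSBL queries, the sketch below shows how the lookup name for an IPv4 address is conventionally formed: the octets are reversed and the blocklist zone is appended. It is written in Python for illustration and is not SpamTestBuddy's code; the function name and the zone `dnsbl.example.org` are hypothetical, and no network request is made.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up when checking an IPv4 address against a DNSBL zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError for malformed input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical zone; 192.0.2.0/24 is a documentation-only address range.
query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
print(query)
```

A resolver would then look up an A record for that name: a listed address typically resolves to a value in 127.0.0.0/8, while an unlisted one returns no such record.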

featuretools - automated feature engineering

  •    Python

Featuretools is a Python library for automated feature engineering. Its introductory example applies Deep Feature Synthesis (DFS) to a multi-table dataset consisting of timestamped customer transactions.

jpmml-sklearn - Java library and command-line application for converting Scikit-Learn pipelines to PMML

  •    Java

Java library and command-line application for converting Scikit-Learn models to PMML

interpret - Fit interpretable models. Explain blackbox machine learning.

  •    C++

Historically, the most intelligible models were not very accurate, and the most accurate models were not intelligible. Microsoft Research has developed an algorithm called the Explainable Boosting Machine (EBM), which has both high accuracy and intelligibility. EBM uses modern machine learning techniques like bagging and boosting to breathe new life into traditional GAMs (Generalized Additive Models). This makes them as accurate as random forests and gradient boosted trees, and also enhances their intelligibility and editability. In addition to EBM, InterpretML also supports methods like LIME, SHAP, linear models, partial dependence, decision trees and rule lists. The package makes it easy to compare and contrast models to find the best one for your needs.

tpot - A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming

  •    Python

Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data.

scikit-learn-videos - Jupyter notebooks from the scikit-learn video series

  •    Jupyter

This video series will teach you how to solve machine learning problems using Python's popular scikit-learn library. It was featured on Kaggle's blog in 2015. There are 9 video tutorials totaling 4 hours, each with a corresponding Jupyter notebook. The notebook contains everything you see in the video: code, output, images, and comments.

gplearn - Genetic Programming in Python, with a scikit-learn inspired API

  •    Python

gplearn implements Genetic Programming in Python, with a scikit-learn inspired and compatible API. While Genetic Programming (GP) can be used to perform a very wide variety of tasks, gplearn is purposefully constrained to solving symbolic regression problems. This is motivated by the scikit-learn ethos of having powerful estimators that are straightforward to implement.

scikit-learn-doc-cn - Chinese translation project for the scikit-learn machine learning library documentation

  •    HTML


skll - SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.

  •    Python

This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. One of the primary goals of our project is to let you run scikit-learn experiments without writing any code other than what you used to generate/extract the features. For more information about getting started with run_experiment, please check out our tutorial or our config file specs.

Open Match Scoring System

  •    C++

Open Match Scoring System is a system for storing the results of Practical Shooting matches (conforming to the IPSC rules).

Dart Scorekeeper

  •    VB

Dart Scorekeeper is a freeware Windows application to assist in scoring dart games and tracking statistics. It includes a scoring window, graphical dart board, and displays real-time stats. It supports the dart games of Cricket, 301/501/etc., and Golf.

Network Security Scorebot

  •    Java

Scorebot is a scoring framework which monitors the integrity of various network services for the purpose of scoring a network security exercise.

Bibs

  •    Python

Bibs is a cross country (running) meet management and scoring program, which provides for teams entry, scoring, and results.
