AzureDSVM - AzureDSVM is an R package that offers convenient harness of Azure DSVM, remote execution of scalable and elastic data science work, and monitoring of on-demand resource consumption


AzureDSVM (Azure Data Science Virtual Machine) is an R package for data scientists working with the Azure compute platform. It complements the underlying AzureSMR for controlling Azure Data Science Virtual Machines. The Azure Data Science Virtual Machine (DSVM) is a data science development environment with pre-installed tools and packages for data wrangling, model building, and service deployment.



Related Projects


This repository contains walkthroughs, templates and documentation related to Machine Learning & Data Science services and platforms on Azure. Services and platforms include Data Science Virtual Machine, Azure ML, HDInsight, Microsoft R Server, SQL-Server, Azure Data Lake etc. There are also materials from tutorials we have delivered at KDD, Strata etc., using the above services and platforms.

MMLSpark - Microsoft Machine Learning for Apache Spark

MMLSpark provides a number of deep learning and data science tools for Apache Spark, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK) and OpenCV, enabling you to quickly create powerful, highly-scalable predictive and analytical models for large image and text datasets. MMLSpark requires Scala 2.11, Spark 2.1+, and either Python 2.7 or Python 3.5+. See the API documentation for Scala and for PySpark.

pachyderm - Reproducible Data Science at Scale!

Pachyderm is a tool for production data pipelines. If you need to chain together data scraping, ingestion, cleaning, munging, wrangling, processing, modeling, and analysis in a sane way, then Pachyderm is for you. If you have an existing set of scripts which do this in an ad-hoc fashion and you're looking for a way to "productionize" them, Pachyderm can make this easy for you. Install Pachyderm locally or deploy on AWS/GCE/Azure in about 5 minutes.

data-science-with-ruby - Practical Data Science with Ruby based tools.

Data Science is a new "sexy" buzzword without specific meaning but often used as a substitute for Statistics, Scientific Computing, Text and Data Mining and Visualization, Machine Learning, Data Processing and Warehousing as well as Retrieval Algorithms of any kind. This curated list comprises awesome tutorials, libraries, and information sources about various Data Science applications using the Ruby programming language.

connectthedots - Connect tiny devices to Microsoft Azure services to build IoT solutions

ConnectTheDots is an open source project created by Microsoft to help you get tiny devices connected to Microsoft Azure IoT and to implement IoT solutions that take advantage of Microsoft Azure advanced analytics services such as Azure Stream Analytics and Azure Machine Learning. The project is built on the assumption that the sensors collect the raw data and format it into a JSON string. That string is then sent to Azure IoT Hub, from which a web app gathers the data and displays it as a chart. Other, optional functions of the Azure cloud include detecting and displaying alerts and averages.
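
As a rough illustration of the first step in that flow, the sketch below formats one raw sensor reading as a JSON string using only the Python standard library. The function name and the field names (`deviceId`, `measure`, `value`, `unit`, `timestamp`) are assumptions made for this example, not the project's actual message schema, and sending the string to Azure IoT Hub is left out.

```python
import json
from datetime import datetime, timezone


def format_reading(device_id, measure, value, unit, timestamp=None):
    """Serialize one raw sensor reading as a JSON string.

    All field names are hypothetical; the real project may use a
    different schema.
    """
    if timestamp is None:
        # Default to the current UTC time in ISO 8601 format.
        timestamp = datetime.now(timezone.utc).isoformat()
    payload = {
        "deviceId": device_id,
        "measure": measure,
        "value": value,
        "unit": unit,
        "timestamp": timestamp,
    }
    return json.dumps(payload)


# Example: a temperature reading from a hypothetical device.
message = format_reading("sensor-01", "temperature", 21.5, "C")
print(message)
```

The resulting string is what a device would hand to whatever client library it uses to talk to the hub.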

r4ds - R for data science

This is code and text behind the R for Data Science book.

tpot - A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming

Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data.


Dashboard-AzureVM (codename DAVM) is a simple application for managing virtual machines created on Windows Azure. It is based on PowerShell cmdlets.

SQL Azure Federation Data Migration Wizard

SQL Azure Federation Data Migration Wizard simplifies the process of migrating data from a single database to multiple federation members in SQL Azure Federation.

Libretto - Golang library to create Virtual Machines (VMs) on any cloud

Libretto is a Golang library to create Virtual Machines (VM) on any cloud and Virtual Machine hosting platforms such as AWS, Azure, OpenStack, vSphere, VMware Workstation/Fusion, Exoscale or VirtualBox. Different providers have different utilities and API interfaces to achieve that, but the abstractions of their interfaces are quite similar.

pygdf - GPU Data Frame

PyGDF implements the Python interface to access and manipulate the GPU Dataframe of the GPU Open Analytics Initiative (GOAI). We aim to provide a simple interface that is similar to the Pandas dataframe and hides the details of GPU programming.

Intro - Course materials for "Introduction to Data Science with R", a video course by RStudio and O'Reilly Media

Course materials for "Introduction to Data Science with R", a video course by RStudio and O'Reilly Media. To purchase the course, or watch sample lessons, visit

Ulysses Agenda

Ulysses Agenda is a web-based MUD set in a futuristic, science fiction world where the only way to survive is to make money - one way or another. The game is built upon Windows Azure and ASP.NET MVC.

Data-Analysis-and-Machine-Learning-Projects - Repository of teaching materials, code, and data for my data analysis and machine learning projects

This is a repository of teaching materials, code, and data for my data analysis and machine learning projects. Each repository will (usually) correspond to one of the blog posts on my web site.

Azure Drive Explorer

AzureDriveExplorer is a tool to easily manage your drives that are mounted on a virtual machine on Windows Azure. Through a server-side Web service Windows Azure, or downloaded to a local file / folder on the drive of the VPC client Azure Local files or upload the folder to A...

Azure Table Encryption via Attribute

SSL isn't enough when storing data in the cloud. You need to protect data-at-rest from anyone who has access to your store. In addition your SSL data may be vulnerable to a man-in-the-middle technology or IT shops that inspect and log the SSL contents. Bluecoat is one examp...

azure-webjobs-sdk - Azure WebJobs SDK

The Azure WebJobs SDK is a framework that simplifies the task of writing background processing code that runs in Azure. The Azure WebJobs SDK includes a declarative binding and trigger system that works with Azure Storage Blobs, Queues and Tables as well as Service Bus. The binding system makes it incredibly easy to write code that reads or writes Azure Storage objects. The trigger system automatically invokes a function in your code whenever any new data is received in a queue or blob. In addition to the built in triggers/bindings, the WebJobs SDK is fully extensible, allowing new types of triggers/bindings to be created and plugged into the framework in a first class way. See Azure WebJobs SDK Extensions for details. Many useful extensions have already been created and can be used in your applications today. Extensions include a File trigger/binder, a Timer/Cron trigger, a WebHook HTTP trigger, as well as a SendGrid email binding.

GraphView - GraphView is a DLL library that enables users to use SQL Server or Azure SQL Database to efficiently manage graphs

GraphView is a DLL library that enables users to use SQL Server or Azure SQL Database to manage graphs. It connects to a SQL database locally or in the cloud, stores graph data in tables and queries graphs through a SQL-extended language. It is not an independent database, but a middleware that accepts graph operations and translates them to T-SQL executed in SQL Server or Azure SQL Database. As such, GraphView can be viewed as a special connector to SQL Server/Azure SQL Database. Developers will notice no difference from the default SQL connector provided by the .NET framework (i.e., SqlConnection), except that this new connector accepts graph-oriented statements. GraphView is a DLL library through which you manage graph data in SQL Server (version 2008 and onward) and Azure SQL Database (v12 and onward). It provides the features a standard graph database is expected to have. In addition, since GraphView relies on SQL databases, it inherits many features of the relational world that are often missing in native graph databases.

Azure Storage Inspector

Azure Inspector is a web interface for managing your Azure storage data. It allows you to work with Azure tables, blobs and queues.