Displaying 1 to 20 from 38 results

DataScienceR - a curated list of R tutorials for Data Science, NLP and Machine Learning

  •    R

This repo contains a curated list of R tutorials and packages for Data Science, NLP and Machine Learning. This also serves as a reference guide for several common data analysis tasks. Curated list of Python tutorials for Data Science, NLP and Machine Learning.

Skater - Python Library for Model Interpretation/Explanations

  •    Python

Skater is a unified framework to enable Model Interpretation for all forms of model to help one build an Interpretable machine learning system often needed for real world use-cases(** we are actively working towards to enabling faithful interpretability for all forms models). It is an open source python library designed to demystify the learned structures of a black box model both globally(inference on the basis of a complete data set) and locally(inference about an individual prediction). The project was started as a research idea to find ways to enable better interpretability(preferably human interpretability) to predictive "black boxes" both for researchers and practioners. The project is still in beta phase.

CleverCSV - CleverCSV is a Python package for handling messy CSV files

  •    Python

CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or generate Python code to import it. Click here to go to the introduction with more details about CleverCSV. If you're in a hurry, below is a quick overview of how to get started with the CleverCSV Python package and the command line interface.

industry-machine-learning - A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)

  •    Jupyter

Animated Investment Management Research at Sov.ai — Sponsoring open source AI, Machine learning, and Data Science initiatives. Have a look at the newly started FirmAI Medium publication where we have experts of AI in business, write about their topics of interest.




AIDL-Series - :books: Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc

  •    

:books: Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc. 💫 人工智能与深度学习实战,机器学习篇 | Tensoflow 篇

Vegas - The missing MatPlotLib for Scala + Spark

  •    Scala

Vegas aims to be the missing MatPlotLib for the Scala and Spark world. Vegas wraps around Vega-Lite but provides syntax more familiar (and type checked) for use within Scala. And then use the following code to render a plot into a pop-up window (see below for more details on controlling how and where Vegas renders).

business-machine-learning - A curated list of practical business machine learning (BML) and business data science (BDS) applications for Accounting, Customer, Employee, Legal, Management and Operations (by @firmai)

  •    Jupyter

Animated Investment Management Research at Sov.ai — Sponsoring open source AI, Machine learning, and Data Science initiatives. A curated list of applied business machine learning (BML) and business data science (BDS) examples and libraries. The code in this repository is in Python (primarily using jupyter notebooks) unless otherwise stated. The catalogue is inspired by awesome-machine-learning.

modin - Modin: Speed up your Pandas workflows by changing a single line of code

  •    Python

Modin uses Ray to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. To use Modin, you do not need to know how many cores your system has and you do not need to specify how to distribute the data. In fact, you can continue using your previous pandas notebooks while experiencing a considerable speedup from Modin, even on a single machine. Once you’ve changed your import statement, you’re ready to use Modin just like you would pandas.


tech.ml.dataset - A Clojure high performance data processing system

  •    Clojure

tech.ml.dataset is a Clojure library for data processing and machine learning. Datasets are currently in-memory columnwise databases and we support parsing from file or input-stream. We support these formats: raw/gzipped csv/tsv, xls, xlsx, json, and sequences of maps as input sources. SQL bindings are provided as a separate library. Data size in memory is minimized (primitive arrays), datetime types are often converted to an integer representation and strings are loaded into string tables. These features together dramatically decrease the working set size in memory. Because data is stored in columnar fashion columnwise operations on the dataset are very fast.

jupyter-notify - A Jupyter Notebook magic for browser notifications of cell completion

  •    Python

This package provides a Jupyter notebook cell magic %%notify that notifies the user upon completion of a potentially long-running cell via a browser push notification. Use cases include long-running machine learning models, grid searches, or Spark computations. This magic allows you to navigate away to other work (or even another Mac desktop entirely) and still get a notification when your cell completes. Clicking on the body of the notification will bring you directly to the browser window and tab with the notebook, even if you're on a different desktop (clicking the "Close" button in the notification will keep you where you are). The extension has currently been tested in Chrome (Version: 58.0.3029) and Firefox (Version: 53.0.3).

melusine - Melusine is a high-level library for emails classification and feature extraction "dédiée aux courriels français"

  •    Jupyter

Melusine is a high-level Python library for email classification and feature extraction, written in Python and capable of running on top of Scikit-Learn, Tensorflow 2 and Keras. Integrated models runs with Tensorflow 2.2. It is developed with a focus on emails written in French. Melusine is compatible with Python >= 3.6.

knyfe - knyfe is a python utility for rapid exploration of datasets.

  •    Python

knyfe is a python utility for rapid exploration of datasets. Use it when you have some kind of dataset and you want to get a feel for how it is composed, run some simple tests on it, or prepare it for further processing. The great thing about knyfe is that you don't have to know much about how your dataset is designed. You shouldn't have to remember in which variable resides in which column of your data matrix or how your structs are nested. Just get shit done.

krangl - krangl is a {K}otlin DSL for data w{rangl}ing

  •    Kotlin

krangl is a {K}otlin library for data w{rangl}ing. By implementing a grammar of data manipulation using a modern functional-style API, it allows to filter, transform, aggregate and reshape tabular data. krangl is heavily inspired by the amazing dplyr for R. krangl is written in Kotlin, excels in Kotlin, but emphasizes as well on good java-interop. It is mimicking the API of dplyr, while carefully adding more typed constructs where possible.

kravis - A {K}otlin g{ra}mmar for data {vis}ualization

  •    Kotlin

Visualizing tabular and relational data is the core of data-science. kravis implements a grammar to create a wide range of plots using a standardized set of verbs. You can also use JitPack with Maven or Gradle to build the latest snapshot as a dependency in your project.

docker-ocaml-jupyter-datascience - Dockerfiles for data science in OCaml on Jupyter

  •    Jupyter

A ready-to-use environment of Jupyter (IPython notebook) and OCaml Jupyter (OCaml kernel) with libraries for data science and machine learning. First, launch a Jupyter server as follows.

ocaml-jupyter - An OCaml kernel for Jupyter (IPython) notebook

  •    Jupyter

An OCaml kernel for Jupyter notebook. This provides an OCaml REPL with a great user interface such as markdown/HTML documentation, LaTeX formula by MathJax, and image embedding.

predict-opioid-prescribers - A pattern focusing on how to use scikit learn and python in Watson Studio to predict opioid prescribers based off of a 2014 kaggle dataset

  •    Jupyter

Read this in other languages: 日本語. Data Science Experience is now Watson Studio. Although some images in this code pattern may show the service as Data Science Experience, the steps and processes will still work.

my-awesome-AI-bookmarks - Curated list of my reads, implementations and core concepts of Artificial Intelligence, Deep Learning, Machine Learning by best folk in the world

  •    

Curated list of my reads, implementations and core concepts of Artificial Intelligence, Deep Learning, Machine Learning by best folk in the world. Have anything in mind that you think is awesome and would fit in this list? Feel free to send a pull request.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.