- 85

Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data analysis and visualization seamless and simple. With HoloViews, you can usually express what you want to do in very few lines of code, letting you focus on what you are trying to explore and convey, not on the process of plotting.

http://holoviews.orghttps://github.com/ioam/holoviews

Tags | visualization analysis bokeh matplotlib interactive data-science exploratory-data-analysis pandas |

Implementation | Python |

License | Public |

Platform | Windows Linux |

Chris Fonnesbeck is an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine. He specializes in computational statistics, Bayesian methods, meta-analysis, and applied decision analysis. He originally hails from Vancouver, BC and received his Ph.D. from the University of Georgia. This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects. Much of the work involved in analyzing data resides in importing, cleaning and transforming data in preparation for analysis. Therefore, the first half of the course is comprised of a 2-part overview of basic and intermediate Pandas usage that will show how to effectively manipulate datasets in memory. This includes tasks like indexing, alignment, join/merge methods, date/time types, and handling of missing data. Next, we will cover plotting and visualization using Pandas and Matplotlib, focusing on creating effective visual representations of your data, while avoiding common pitfalls. Finally, participants will be introduced to methods for statistical data modeling using some of the advanced functions in Numpy, Scipy and Pandas. This will include fitting your data to probability distributions, estimating relationships among variables using linear and non-linear models, and a brief introduction to bootstrapping methods. Each section of the tutorial will involve hands-on manipulation and analysis of sample datasets, to be provided to attendees in advance.

This introductory data science course that is our (working) answer to these questions. The courses focuses on data acquisition and wrangling, exploratory data analysis, data visualization, and effective communication and approaching statistics from a model-based, instead of an inference-based, perspective. A heavy emphasis is placed on a consitent syntax (with tools from the tidyverse), reproducibility (with R Markdown) and version control and collaboration (with git/GitHub). We help ease the learning curve by avoiding local installation and supplementing out-of-class learning with interactive tools (like learnr tutorials). By the end of the semester teams of students work on fully reproducible data analysis projects on data they acquired, answering questions they care about. This repository serves as a "data science course in a box" containing all materials required to teach (or learn from) the course described above.

rstats r education teaching data-scienceThe JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ. Implemented in TypeScript, used in JavaScript ES5+ or TypeScript.

data-wrangling data-forge data data-analysis nodejs pandas visualization data-visualization data-management data-manipulation data-munging data-cleaning data-cleansing csv json data-science data-clensingCourse materials for General Assembly's Data Science course in Washington, DC (8/18/15 - 10/29/15).

data-science machine-learning scikit-learn data-analysis pandas jupyter-notebook course linear-regression logistic-regression model-evaluation naive-bayes natural-language-processing decision-trees ensemble-learning clustering regular-expressions web-scraping data-visualization data-cleaningBokeh is a fiscally sponsored project of NumFOCUS, a nonprofit dedicated to supporting the open-source scientific computing community. If you like Bokeh and would like to support our mission, please consider making a donation. Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers. With Bokeh, you can quickly and easily create interactive plots, dashboards, and data applications.

bokeh interactive-plots visualization plotting plotsRead about the series, and view all of the videos on one page: Easier data analysis in Python with pandas.

data-science jupyter-notebook pandas tutorial data-analysis data-cleaningIPython Notebook(s) demonstrating deep learning functionality.IPython Notebook(s) demonstrating scikit-learn functionality.

machine-learning deep-learning data-science big-data aws tensorflow theano caffe scikit-learn kaggle spark mapreduce hadoop matplotlib pandas numpy scipy kerasxarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures. Our goal is to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays, rather than the tabular data for which pandas excels. Our approach adopts the Common Data Model for self- describing scientific data in widespread use in the Earth sciences: xarray.Dataset is an in-memory representation of a netCDF file.

scientific-computing netcdf numpy data-science pandas dataframes data-analysis pydataData Science is a new "sexy" buzzword without specific meaning but often used to substitute Statistics, Scientific Computing, Text and Data Mining and Visualization, Machine Learning, Data Processing and Warehousing as well as Retrieval Algorithms of any kind. This curated list comprises awesome tutorials, libraries, information sources about various Data Science applications using the Ruby programming language.

data-science data-visualization data-analysis data-mining data-analytics visualization awesome awesome-list list rubydatascienceZipline is a Pythonic algorithmic trading library. It is an event-driven system that supports both backtesting and live-trading. Zipline is currently used in production as the backtesting and live-trading engine powering Quantopian -- a free, community-centered, hosted platform for building and executing trading strategies.Note: Installing Zipline via pip is slightly more involved than the average Python package. Simply running pip install zipline will likely fail if you've never installed any scientific Python packages before.

algorithmic-trading trading machine-learning stock-analysispandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal. Binary installers for the latest released version are available at the Python package index and on conda.

data-analysis pandas flexible alignmentMisc data visualization projects, examples, and demos: mostly Python (pandas + matplotlib) and JavaScript (leaflet).

These series of tutorials on Data Science engineering will try to compare how different concepts in the discipline can be implemented in the two dominant ecosystems nowadays: R and Python. We will do this from a neutral point of view. Our opinion is that each environment has good and bad things, and any data scientist should know how to use both in order to be as prepared as posible for job market or to start personal project.

data-science data-science-engineering tutorial data-frame exploratory-data-analysis r jupyter notebook machine-learningpandas is a Python library for doing data analysis. It's really fast and lets you do exploratory work incredibly quickly. The goal of this cookbook is to give you some concrete examples for getting started with pandas. The docs are really comprehensive. However, I've often had people tell me that they have some trouble getting started, so these are examples with real-world data, and all the bugs and weirdness that entails.

Orange is a component-based data mining software. It includes a range of data visualization, exploration, preprocessing and modeling techniques. It supports . interactive data analysis workflows with a large toolbox.

data-mining data-science machine-learning data-visualization text-miningThis is the list of published articles on medium.com 🇬🇧, habr.com 🇷🇺, and jqr.com 🇨🇳. Icons are clickable. Also, links to Kaggle Kernels (in English) are given. This way one can reproduce everything without installing a single package. Assignments will be announced each week. Meanwhile, you can pratice with demo versions. Solutions will be discussed in the upcoming run of the course.

machine-learning data-analysis data-science pandas algorithms numpy scipy matplotlib seaborn plotly scikit-learn kaggle-inclass vowpal-wabbit ipynb docker mathSeaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.Online documentation is available at seaborn.pydata.org. Installation requires numpy, scipy, pandas, and matplotlib. Some functions will optionally use statsmodels if it is installed.

data-visualization visualization statisticsExploratory data analysis toolkit for Python. Dora is a Python library designed to automate the painful parts of exploratory data analysis.

Practice and tutorial-style notebooks covering wide variety of machine learning techniques

numpy statistics pandas matplotlib regression scikit-learn classification principal-component-analysis clustering decision-trees random-forest dimensionality-reduction neural-network deep-learning artificial-intelligence data-science machine-learning k-nearest-neighbours naive-bayesDex : The data explorer is a data visualization tool written in Java/JavaFX capable of powerful ETL and data visualization. There are 2 main ways to install Dex.

data-science data-visualization visualization data-analysis data-mining javafx d3 dataviz datavis datavisualization d3js
We have large collection of open source products. Follow the tags from
Tag Cloud >>

Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
**Add Projects.**