Repository to store sample python programs for python learning

https://github.com/codebasics/py

Tags | pandas pandas-dataframe pandas-tutorial numpy numpy-arrays numpy-tutorial python-tutorial python-tutorials python-pandas jupyter-notebook jupyter jupyter-notebooks jupyter-tutorial |

Implementation | Jupyter Notebook |

License | Public |

Platform |

Inspired by 100 NumPy exercises, here are 100* short puzzles for testing your knowledge of pandas' power. Since pandas is a large library with many different specialist features and functions, these exercises focus mainly on the fundamentals of manipulating data (indexing, grouping, aggregating, cleaning), making use of the core DataFrame and Series objects. Many of the exercises here are straightforward in that the solutions require no more than a few lines of code (in pandas or NumPy - don't go using pure Python!). Choosing the right methods and following best practices is the underlying goal.
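As an illustration of the kind of fundamentals these puzzles target, here is a minimal sketch. The DataFrame contents and the column names (animal, age, visits) are made up for this example and are not taken from the puzzle set itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "animal": ["cat", "dog", "cat", "snake", "dog"],
        "age": [2.5, np.nan, 3.0, 0.5, 7.0],
        "visits": [1, 3, 2, 1, 2],
    },
    index=list("abcde"),
)

# Indexing: boolean mask plus label-based column selection.
frequent = df.loc[df["visits"] > 1, ["animal", "age"]]

# Cleaning: fill each missing age with the mean age of the same animal.
df["age"] = df["age"].fillna(df.groupby("animal")["age"].transform("mean"))

# Grouping and aggregating: mean age and total visits per animal.
summary = df.groupby("animal").agg(
    mean_age=("age", "mean"),
    total_visits=("visits", "sum"),
)
print(summary)
```

Each step is a single vectorised call rather than a Python loop, which is the habit the exercises are meant to build.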

pandas numpy data-analysis

A wind rose is a graphic tool used by meteorologists to give a succinct view of how wind speed and direction are typically distributed at a particular location. It can also be used to describe air quality pollution sources. The wind rose tool uses Matplotlib as a backend. Data can be passed to the package using NumPy arrays or a Pandas DataFrame. Windrose is a Python library to manage wind data, draw windroses (also known as polar rose plots), and fit Weibull probability density functions.
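The core idea behind a wind rose, counting how often the wind blows from each direction sector, can be sketched without any plotting library. The following is a simplified illustration in plain Python and does not use the windrose package's API; the function name sector_frequencies and the sample observations are hypothetical.

```python
from collections import Counter

# Compass labels for 8 sectors of 45 degrees each, starting at north.
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def sector_frequencies(directions_deg):
    """Return the share of observations falling in each compass sector.

    Directions are in degrees clockwise from north (0 = N, 90 = E).
    Each sector is centred on its compass point, so N covers 337.5 to 22.5.
    """
    width = 360 / len(SECTORS)
    counts = Counter()
    for d in directions_deg:
        # Shift by half a sector so that bins are centred on the compass points.
        idx = int(((d % 360) + width / 2) // width) % len(SECTORS)
        counts[SECTORS[idx]] += 1
    total = len(directions_deg)
    return {name: counts[name] / total for name in SECTORS}


# Hypothetical wind directions (degrees) for illustration only.
observations = [350, 10, 45, 90, 100, 180, 185, 270]
freq = sector_frequencies(observations)
print(freq)
```

A full wind rose additionally splits each sector by wind speed class and draws the result on a polar axis.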

numpy pandas speed wind matplotlib windrose

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks. Run the code using the Jupyter notebooks available in this repository's notebooks directory.

scikit-learn numpy jupyter-notebook matplotlib pandas

Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas.

data-science jupyter-notebook pandas tutorial data-analysis data-cleaning

Chris Fonnesbeck is an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine. He specializes in computational statistics, Bayesian methods, meta-analysis, and applied decision analysis. He originally hails from Vancouver, BC and received his Ph.D. from the University of Georgia. This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects. Much of the work involved in analyzing data resides in importing, cleaning and transforming data in preparation for analysis. Therefore, the first half of the course comprises a 2-part overview of basic and intermediate Pandas usage that will show how to effectively manipulate datasets in memory. This includes tasks like indexing, alignment, join/merge methods, date/time types, and handling of missing data. Next, we will cover plotting and visualization using Pandas and Matplotlib, focusing on creating effective visual representations of your data, while avoiding common pitfalls. Finally, participants will be introduced to methods for statistical data modeling using some of the advanced functions in NumPy, SciPy and Pandas. This will include fitting your data to probability distributions, estimating relationships among variables using linear and non-linear models, and a brief introduction to bootstrapping methods. Each section of the tutorial will involve hands-on manipulation and analysis of sample datasets, to be provided to attendees in advance.
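To give a flavour of the bootstrapping idea mentioned above, here is a minimal sketch of a percentile bootstrap confidence interval. It uses only the standard library rather than the NumPy/SciPy/Pandas stack the tutorial itself relies on, and the function name bootstrap_ci, its parameters, and the sample data are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, recompute the statistic, and sort the results.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read the interval bounds off the empirical distribution of estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements for illustration only.
data = [4.1, 5.0, 5.3, 4.8, 6.2, 5.5, 4.9, 5.1, 5.7, 4.4]
low, high = bootstrap_ci(data)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The same pattern works for any statistic: pass a different function as stat, for example statistics.median.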

Practice and tutorial-style notebooks covering a wide variety of machine learning techniques

numpy statistics pandas matplotlib regression scikit-learn classification principal-component-analysis clustering decision-trees random-forest dimensionality-reduction neural-network deep-learning artificial-intelligence data-science machine-learning k-nearest-neighbours naive-bayes

This tutorial was presented by Kevin Markham at PyCon on May 2, 2019. Watch the complete tutorial video on YouTube. The pandas library is a powerful tool for multiple phases of the data science workflow, including data cleaning, visualization, and exploratory data analysis. However, the size and complexity of the pandas library make it challenging to discover the best way to accomplish any given task.

data-science tutorial pandas vizualisation

Fed up with a ton of tutorials but no easy way to find exercises, I decided to create a repo just with exercises to practice pandas. Don't get me wrong, tutorials are great resources, but to learn is to do. So unless you practice, you won't learn. My suggestion is that you learn a topic in a tutorial or video and then do exercises. Learn one more topic and do exercises. If you got the answer wrong, don't go directly to the solution with code.

pandas exercise practice tutorial data-analysis

This repository explains how to make a map of the solar system using open-source code and data from NASA. Software used includes Python 3.7.1, NASA HORIZONS, Illustrator CC 2019 and Photoshop CC 2019. If you have comments or suggestions for this tutorial, please let me know on my blog! You can buy the finished map here. Python dependencies: matplotlib, astropy, numpy, pandas, os, time, urllib. Dependencies can be installed with pip install -r requirements.txt.

dataviz astronomy cartography orbit

This repository contains lecture transcripts and homework assignments as Jupyter Notebooks for the first of three Kadenze Academy courses on Creative Applications of Deep Learning w/ Tensorflow. It also contains a Python package containing all the code developed during all three courses. The first course makes heavy use of Jupyter Notebook. This will be necessary for submitting the homeworks and interacting with the guided session notebooks I will provide for each assignment. Follow along with this guide and we'll see how to obtain all of the necessary libraries that we'll be using. By the end of this, you'll have installed Jupyter Notebook, NumPy, SciPy, and Matplotlib. While many of these libraries aren't necessary for performing the Deep Learning which we'll get to in later lectures, they are incredibly useful for manipulating data on your computer, preparing data for learning, and exploring results.

jupyter-notebook neural-network tensorflow deep-learning mooc dockerfile machine-learning tutorial workshop

Alphalens is a Python Library for performance analysis of predictive (alpha) stock factors. Alphalens works great with the Zipline open source backtesting library, and Pyfolio which provides performance and risk analysis of financial portfolios. Check out the example notebooks for more on how to read and use the factor tear sheet.

finance pandas numpy algorithmic-trading jupyter

Sparkmagic is a set of tools for interactively working with remote Spark clusters through Livy, a Spark REST server, in Jupyter notebooks. The Sparkmagic project includes a set of magics for interactively running Spark code in multiple languages, as well as some kernels that you can use to turn Jupyter into an integrated Spark environment. There are two ways to use sparkmagic. Head over to the examples section for a demonstration on how to use both models of execution.

spark kernel cluster livy magic sql-query pandas-dataframe jupyter pyspark kerberos notebook jupyter-notebook

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and many other libraries. More details about installing Mars can be found in the installation section of the Mars documentation.

machine-learning tensorflow numpy scikit-learn pandas pytorch xgboost lightgbm tensor dask ray dataframe statsmodels joblib

This repo is a growing list of Python cheat sheets, tailored for Data Science. If you want to install a package individually, go into the corresponding <package-name>.md file for instructions on how to install.

numpy python-cheat-sheets data-science pandas scikit-learn

IPython Notebook(s) demonstrating deep learning functionality. IPython Notebook(s) demonstrating scikit-learn functionality.

machine-learning deep-learning data-science big-data aws tensorflow theano caffe scikit-learn kaggle spark mapreduce hadoop matplotlib pandas numpy scipy keras

Animated investment research at Sov.ai, sponsoring open source initiatives. PandaPy software, similar to the original Pandas project, is developed to improve the usability of Python for finance. Structured datatypes are designed to be able to mimic ‘structs’ in the C language, and share a similar memory layout. PandaPy currently houses more than 30 functions. Structured NumPy arrays are meant for interfacing with C code and for low-level manipulation of structured buffers, for example for interpreting binary blobs. For these purposes they support specialized features such as subarrays, nested datatypes, and unions, and allow control over the memory layout of the structure.
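The struct-like layout described above can be sketched with a plain NumPy structured dtype. This example uses NumPy directly, not PandaPy's own functions, and the field names (ticker, price, volume) and values are hypothetical.

```python
import numpy as np

# Hypothetical record layout, similar in spirit to a C struct:
#   struct { char ticker[8]; double price; int64_t volume; }
quote_dtype = np.dtype([("ticker", "S8"), ("price", "f8"), ("volume", "i8")])

quotes = np.array(
    [(b"AAA", 10.5, 1000), (b"BBB", 20.25, 500), (b"CCC", 5.0, 2500)],
    dtype=quote_dtype,
)

# Fields are accessed by name and behave like ordinary arrays.
notional = quotes["price"] * quotes["volume"]

# Each record occupies a fixed number of bytes (packed, no padding by default).
print(quote_dtype.itemsize)
print(notional)
```

Because every record has the same fixed byte layout, an array like this can be read from or written to a binary buffer without per-row parsing.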

finance data-science machine-learning numpy pandas data-structures arrays structured-data algorithmic-trading

xarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures. Our goal is to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays, rather than the tabular data at which pandas excels. Our approach adopts the Common Data Model for self-describing scientific data in widespread use in the Earth sciences: xarray.Dataset is an in-memory representation of a netCDF file.

scientific-computing netcdf numpy data-science pandas dataframes data-analysis pydata

Pynamical uses pandas, numpy, and numba for fast simulation, and matplotlib for visualizations and animations to explore system behavior. Compatible with Python 2 and 3. Pynamical comes packaged with the logistic map, the Singer map, and the cubic map predefined. The models may be run with a range of parameter values over a set of time steps, and the resulting numerical output is returned as a pandas DataFrame. Pynamical can then visualize this output in various ways, including with bifurcation diagrams, two-dimensional phase diagrams, three-dimensional phase diagrams, and cobweb plots.
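The logistic map mentioned above is simple enough to iterate by hand. The sketch below is plain Python and does not use Pynamical's API; the function name logistic_map, its parameters, and the chosen growth rates are assumptions for illustration.

```python
def logistic_map(rate, x0=0.5, num_gens=100):
    """Iterate x_{t+1} = rate * x_t * (1 - x_t) and return the trajectory."""
    xs = [x0]
    for _ in range(num_gens - 1):
        xs.append(rate * xs[-1] * (1 - xs[-1]))
    return xs


# Growth rates chosen for illustration: a stable fixed point, a 2-cycle, chaos.
for rate in (2.8, 3.2, 3.9):
    tail = logistic_map(rate, num_gens=200)[-4:]
    print(rate, [round(x, 4) for x in tail])
```

Running the same iteration over a fine grid of growth rates and plotting the late values of each trajectory against the rate gives a bifurcation diagram.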

chaos nonlinear fractal logistic visualization modeling animation math physics pandas numba numpy matplotlib ipynb bifurcation-diagram fractals systems phase-diagram cobweb-plot

The first instance of this tutorial was delivered at PyCon 2015 in Montréal, but I hope that many other people will be able to benefit from it over the next few years, both on occasions on which I myself get to deliver it, and also when other instructors are able to do so. To make it useful to as many people as possible, I hereby release it under the MIT license (see the accompanying LICENSE.txt file) and I have tried to make sure that this repository contains all of the scripts needed to download and set up the data set that we used.

Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar interface to query data living in other data storage systems. We point Blaze to a simple dataset in a foreign database (PostgreSQL). Instantly we see results as we would see them in a Pandas DataFrame.
