sparkmagic - Jupyter magics and kernels for working with remote Spark clusters


Sparkmagic is a set of tools for interactively working with remote Spark clusters through Livy, a Spark REST server, in Jupyter notebooks. The Sparkmagic project includes a set of magics for interactively running Spark code in multiple languages, as well as some kernels that you can use to turn Jupyter into an integrated Spark environment. There are two ways to use sparkmagic; head over to the examples section of the repository for a demonstration of how to use both models of execution.

https://github.com/jupyter-incubator/sparkmagic
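
Because sparkmagic talks to the cluster through Livy's REST interface, it can help to see the general shape of the requests a Livy client sends. The following is a minimal sketch, not sparkmagic's own code: the helper names `create_session_request` and `run_statement_request` are hypothetical, the URL `http://localhost:8998` is an assumed local Livy address, and nothing is actually sent over the network. It only builds the request descriptions for Livy's `POST /sessions` and `POST /sessions/{id}/statements` endpoints.

```python
import json


def create_session_request(livy_url, kind="pyspark"):
    """Describe the HTTP request that asks Livy to start an interactive session.

    Hypothetical helper for illustration; `kind` selects the session language.
    """
    return {
        "method": "POST",
        "url": f"{livy_url.rstrip('/')}/sessions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"kind": kind}),
    }


def run_statement_request(livy_url, session_id, code):
    """Describe the HTTP request that submits a code snippet to a session.

    Hypothetical helper for illustration; `code` is run remotely by Livy.
    """
    return {
        "method": "POST",
        "url": f"{livy_url.rstrip('/')}/sessions/{session_id}/statements",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"code": code}),
    }


if __name__ == "__main__":
    # Assumed address of a Livy server; adjust for a real cluster.
    livy = "http://localhost:8998"
    print(create_session_request(livy))
    print(run_statement_request(livy, 0, "sc.version"))
```

Passing these dictionaries to any HTTP client would perform the actual calls; sparkmagic's magics and kernels hide this exchange behind notebook cells.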

Related Projects

spark-py-notebooks - Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

  •    Jupyter

This is a collection of IPython/Jupyter notebooks intended to train the reader on different Apache Spark concepts, from basic to advanced, using the Python language. If your language is R rather than Python, you may want to have a look at our R on Apache Spark (SparkR) notebooks instead. Additionally, if you are interested in being introduced to some basic Data Science Engineering, you might find this series of tutorials interesting. There we explain different concepts and applications using Python and R.

jupyter-notify - A Jupyter Notebook magic for browser notifications of cell completion

  •    Python

This package provides a Jupyter notebook cell magic %%notify that notifies the user upon completion of a potentially long-running cell via a browser push notification. Use cases include long-running machine learning models, grid searches, or Spark computations. This magic allows you to navigate away to other work (or even another Mac desktop entirely) and still get a notification when your cell completes. Clicking on the body of the notification will bring you directly to the browser window and tab with the notebook, even if you're on a different desktop (clicking the "Close" button in the notification will keep you where you are). The extension has currently been tested in Chrome (Version: 58.0.3029) and Firefox (Version: 53.0.3).

pandas-videos - Jupyter notebook and datasets from the pandas Q&A video series

  •    Jupyter

Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas.

IRkernel - R kernel for Jupyter

  •    Jupyter

Now both R versions are available as an R kernel in the notebook. If you have Jupyter installed, you can create a notebook using IRkernel from the dropdown menu.


jupyter-scala - Lightweight Scala kernel for Jupyter / IPython 3

  •    Scala

Jupyter Scala is a Scala kernel for Jupyter. It aims at being a versatile and easily extensible alternative to other Scala kernels or notebook UIs, building on both Jupyter and Ammonite. The current version is available for Scala 2.11. Support for Scala 2.10 could be added back, and 2.12 should be supported soon (via ammonium / Ammonite).

PythonDataScienceHandbook - Python Data Science Handbook: full text in Jupyter Notebooks

  •    Jupyter

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks. Run the code using the Jupyter notebooks available in this repository's notebooks directory.

jupyter-vim-binding - Jupyter meets Vim. Vimmer will fall in love.

  •    Javascript

Do you use Vim? And do you need to use Jupyter Notebook? This is a Jupyter Notebook (formerly known as IPython Notebook) extension that enables a Vim-like environment powered by CodeMirror's Vim. I'm sure that this plugin will help improve your QOL. Since I changed jobs, I no longer use Jupyter Notebook and I can't make enough time to maintain this plugin.

ansible-jupyter-kernel - Jupyter Notebook Kernel for running Ansible Tasks and Playbooks

  •    Python

The Ansible Jupyter Kernel adds a kernel backend for Jupyter that interfaces directly with Ansible, so you can construct plays and tasks and execute them on the fly. ansible-kernel can be installed from PyPI, but you can also install it locally. The setup package itself will register the kernel with Jupyter automatically.

100-pandas-puzzles - 100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

  •    Jupyter

Inspired by 100 NumPy exercises, here are 100* short puzzles for testing your knowledge of pandas' power. Since pandas is a large library with many different specialist features and functions, these exercises focus mainly on the fundamentals of manipulating data (indexing, grouping, aggregating, cleaning), making use of the core DataFrame and Series objects. Many of the exercises here are straightforward in that the solutions require no more than a few lines of code (in pandas or NumPy - don't go using pure Python!). Choosing the right methods and following best practices is the underlying goal.

jupyter-dark-theme - Dark theme for Jupyter Notebook (iPython 4) UI

  •    CSS

This is a completely dark theme for the Jupyter Notebook interface. Jupyter includes iPython 4 as its default kernel (which, confusingly, supports both Python 2.x and 3.x). Since the iPython 3 to 4 transition, it has gained better support for other interpreters like R and Ruby. It is possible to upgrade iPython 2 or 3 to Jupyter + iPython 4. Source code coloring is based on the Twilight theme for Textmate. Print preview output for notebooks retains a white background with printable foreground colors.

dashboards - Jupyter Dashboards Layout Extension

  •    Jupyter

The dashboards layout extension is an add-on for Jupyter Notebook. It lets you arrange your notebook outputs (text, plots, widgets, ...) in grid- or report-like layouts. It saves information about your layouts in your notebook document. Other people with the extension can open your notebook and view your layouts. For a sample of what's possible with the dashboard layout extension, have a look at the demo dashboard-notebooks in this repository.

clojupyter - a Jupyter kernel for Clojure

  •    Clojure

A Jupyter kernel for Clojure. This will let you run Clojure code from the Jupyter console and notebook. Installing it places a clojupyter executable and a configuration file, which tells Jupyter how to use clojupyter, in Jupyter's user kernel location (~/.local/share/jupyter/kernels on Linux and ~/Library/Jupyter/kernels on Mac).
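
The two user kernel locations named above can be resolved with a few lines of Python. This is a minimal sketch under stated assumptions: `user_kernel_dir` is a hypothetical helper, it covers only the Linux and Mac paths mentioned in the description (any platform other than `darwin` is treated as Linux), and it ignores environment-variable overrides and Windows.

```python
import sys
from pathlib import Path


def user_kernel_dir(platform=sys.platform, home=None):
    """Return the per-user Jupyter kernel directory for Linux or Mac.

    Hypothetical helper: only the two locations from the description are
    handled, and anything that is not "darwin" is assumed to be Linux.
    """
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Jupyter" / "kernels"
    return home / ".local" / "share" / "jupyter" / "kernels"


if __name__ == "__main__":
    # A kernel such as clojupyter would live in its own subdirectory here.
    print(user_kernel_dir())
```

On a machine with Jupyter installed, running `jupyter kernelspec list` shows which kernels Jupyter actually finds and where.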

spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines, that scale easily in a distributed environment. This library has been uploaded to the spark-packages repository https://spark-packages.org/package/JohnSnowLabs/spark-nlp .

IfSharp - F# for Jupyter Notebooks

  •    Jupyter

This is the F# implementation for Jupyter. View the Feature Notebook for some of the features that are included. You can use Jupyter F# Notebooks for free (with free server-side execution) at Azure Notebooks. If you select "Show me some samples", you will find an "Introduction to F#" that guides you through the language and its use in Jupyter.

beakerx - Beaker Extensions for Jupyter Notebook

  •    Java

BeakerX is a collection of JVM kernels and interactive widgets for plotting, tables, autotranslation, and other extensions to Jupyter Notebook. BeakerX is in beta and under active development. The documentation consists of tutorial notebooks on GitHub. You can try it in the cloud for free with Binder. And here is the cheatsheet.

CADL - Course materials/Homework materials for the FREE MOOC course on "Creative Applications of Deep Learning w/ Tensorflow" #CADL

  •    Jupyter

This repository contains lecture transcripts and homework assignments as Jupyter Notebooks for the first of three Kadenze Academy courses on Creative Applications of Deep Learning w/ Tensorflow. It also contains a Python package with all the code developed during all three courses. The first course makes heavy use of Jupyter Notebook, which is necessary for submitting the homework and interacting with the guided session notebooks I will provide for each assignment. Follow along with this guide and we'll see how to obtain all of the necessary libraries that we'll be using. By the end of this, you'll have installed Jupyter Notebook, NumPy, SciPy, and Matplotlib. While many of these libraries aren't necessary for performing the Deep Learning which we'll get to in later lectures, they are incredibly useful for manipulating data on your computer, preparing data for learning, and exploring results.

Agile_Data_Code_2 - Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

  •    Jupyter

Like my work? I am Principal Consultant at Data Syndrome, a consultancy offering assistance and training with building full-stack analytics products, applications and systems. Find us on the web at datasyndrome.com. There is now a video course using code from chapter 8, Realtime Predictive Analytics with Kafka, PySpark, Spark MLlib and Spark Streaming. Check it out now at datasyndrome.com/video.