Pandas - Powerful Python Data Analysis Toolkit

  •        0

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions. It supports aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets, High performance merging and joining of data sets, Time series-functionality, Hierarchical axis indexing and lot more.



comments powered by Disqus

Related Projects

R Language - Project for Statistical Computing

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.

Scikit Learn - Machine Learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy. It is simple and efficient tools for data mining and data analysis. It supports automatic classification, clustering, model selection, pre processing and lot more.


Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

HBase - Hadoop database

HBase provides support to handle BigTable - billions of rows X millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable and runs on top of HDFS (Hadoop Distributed Filesystem). It features compression, in-memory operation per-column. Data could be replicated between the nodes. HBase is used in Facebook and Twitter.


GNU Regression, Econometrics and Time-series Library

SciPy - software for Mathematics, Science, and Engineering

SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization.


WebSearch.Net is an open-source research platform that provides uniform data source access, data modeling, feature calculation, data mining, etc.

Live Graph - Plot and explore your data in real-time

LiveGraph is a framework for real-time data visualisation, analysis and logging. It has a real time plotter that can automatically update graphs of your data while it is still being computed by your application. LiveGraph reads files in a simple CSV-style format. For applications developed in Java, LiveGraph additionally provides an API that handles all data logging and persistency issues.

Insight Segmentation and Registration Toolkit

ITK is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.


Sphinix is free open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.