VIAME - Video and Image Analytics for Marine Environments


VIAME is a computer vision application designed for do-it-yourself artificial intelligence, including object detection, object tracking, image/video annotation, image/video search, image mosaicing, size measurement, rapid model generation, and tools for the evaluation of different algorithms. Originally targeting marine species analytics, it now contains many common algorithms and libraries and is also useful as a generic computer vision toolkit. It contains a number of standalone tools for accomplishing the above, a pipeline framework which can connect C/C++, Python, and MATLAB nodes together in a multi-threaded fashion, and multiple algorithms resting on top of the pipeline infrastructure. Both a desktop and a web version exist for deployments in different types of environments.

For a full installation guide and a description of the various flavors of VIAME, see the quick-start guide. The desktop version is provided as either a .msi, .zip or .tar file. Alternatively, Docker files are available for both VIAME Desktop and Web. A sample instance of VIAME Web is also online, hosted at viame.kitware.com. For desktop installs, extract the binaries (or use the .msi Windows installation wizard) and place them in a directory of your choosing, for example /opt/noaa/viame on Linux or C:\Program Files\VIAME on Windows. If using packages built with GPU support, make sure to have sufficient video drivers installed, version 451.82 or higher. The best way to install drivers depends on your operating system. Lastly, run through some of the examples to validate the installation. The binaries take up a lot of disk space because they include multiple default model files and programs; a source build containing only your desired features (e.g. for embedded apps) is much smaller.

http://www.viametoolkit.org/
https://github.com/VIAME/VIAME





Related Projects

cvat - Computer Vision Annotation Tool (CVAT) is a web-based tool which helps to annotate video and images for Computer Vision algorithms

  •    Javascript

CVAT is a completely redesigned and reimplemented version of the Video Annotation Tool from Irvine, California. It is a free, online, interactive video and image annotation tool for computer vision. It is being used by our team to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from a professional data annotation team. The code is released under the MIT License.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision including, but not limited to, image recognition in special environments and special fields.

labelme - Image Polygonal Annotation with Python (polygon, rectangle, line, point and image-level flag annotation)

  •    Python

Labelme is a graphical image annotation tool inspired by http://labelme.csail.mit.edu. It is written in Python and uses Qt for its graphical interface.

SerpentAI - Game Agent Framework. Helping you create AIs / Bots to play any game you own!

  •    Jupyter

The framework features a large assortment of supporting modules that provide solutions to commonly encountered scenarios when using video games as environments, as well as CLI tools to accelerate development. It provides some useful conventions but is absolutely NOT opinionated about what you put in your agents: Want to use the latest, cutting-edge deep reinforcement learning algorithm? ALLOWED. Want to use computer vision techniques, image processing and trigonometry? ALLOWED. Want to randomly press the Left or Right buttons? sigh ALLOWED. To top it all off, Serpent.AI was designed to be entirely plugin-based (for both game support and game agents) so your experiments are actually portable and distributable to your peers and random strangers on the Internet. You'll also be glad to hear that all 3 major OSes are supported: Linux, Windows & macOS.

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET

  •    CSharp

The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.


vatic - Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces. IJCV 2012

  •    HTML

VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon's Mechanical Turk. Our tool makes it easy to build massive, affordable video data sets. Note: VATIC has only been tested on Ubuntu with Apache 2.2 HTTP server and a MySQL server. This document will describe installation on this platform; however, it should work on any operating system and with any server.

jina - Cloud-native neural search framework for any kind of data

  •    Python

Jina is a neural search framework that empowers anyone to build SOTA & scalable deep learning search applications in minutes. All data types - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.

GPUImage2 - GPUImage 2 is a BSD-licensed Swift framework for GPU-accelerated video and image processing

  •    Swift

GPUImage 2 is the second generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac, iOS, and now Linux. The original GPUImage framework was written in Objective-C and targeted Mac and iOS, but this latest version is written entirely in Swift and can also target Linux and future platforms that support Swift code. The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. By relying on the GPU to run these operations, performance improvements of 100X or more over CPU-bound code can be realized. This is particularly noticeable in mobile or embedded devices. On an iPhone 4S, this framework can easily process 1080p video at over 60 FPS. On a Raspberry Pi 3, it can perform Sobel edge detection on live 720p video at over 20 FPS.

PixelAnnotationTool - Quickly annotate images.

  •    C++

Software that allows you to manually and quickly annotate images in directories. The method is pseudo-manual because it uses OpenCV's marker-based watershed algorithm. The general idea is to provide the markers manually with brushes and then launch the algorithm. If the segmentation needs to be corrected after the first pass, the user can refine the markers by drawing new ones on the erroneous areas.

GPUImage3 - GPUImage 3 is a BSD-licensed Swift framework for GPU-accelerated video and image processing using Metal

  •    Swift

GPUImage 3 is the third generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac and iOS. The original GPUImage framework was written in Objective-C and targeted Mac and iOS. The second iteration was rewritten in Swift using OpenGL to target Mac, iOS, and Linux, and this third generation is redesigned to use Metal in place of OpenGL. The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. Previous iterations of this framework wrapped OpenGL (ES), hiding much of the boilerplate code required to render images on the GPU using custom vertex and fragment shaders. This version of the framework replaces OpenGL (ES) with Metal. The change is largely driven by Apple's deprecation of OpenGL (ES) on their platforms in favor of Metal, and it will allow for exploring performance optimizations over OpenGL and a tighter integration with Metal-based frameworks and operations.

neural-doodle - Turn your two-bit doodles into fine artworks with deep neural networks, generate seamless textures from photos, transfer style from one image to another, perform example-based upscaling, but wait

  •    Python

Use a deep neural network to borrow the skills of real artists and turn your two-bit doodles into masterpieces! This project is an implementation of Semantic Style Transfer (Champandard, 2016), based on the Neural Patches algorithm (Li, 2016). Read more about the motivation in this in-depth article and watch this workflow video for inspiration. The doodle.py script generates a new image by using one, two, three or four images as inputs depending on what you're trying to do: the original style and its annotation, and a target content image (optional) with its annotation (a.k.a. your doodle). The algorithm extracts annotated patches from the style image, and incrementally transfers them over to the target image based on how closely they match.

FILTER

  •    Javascript

This is a library for processing images/video in pure JavaScript using HTML5 features like Canvas, WebWorkers, WebGL and SVG (in progress) or analogs in Node.js. Some filter code has been adapted from open source libraries (mostly C, Java and Flash, plus a couple of JavaScript libraries); see the comments in the code for details.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

BotSharp - The Open Source AI Chatbot Platform Builder in 100% C# Running in

  •    CSharp

BotSharp is an open source machine learning framework for building AI bot platforms. This project involves natural language understanding, computer vision and audio processing technologies, and aims to promote the development and application of intelligent robot assistants in information systems. Out-of-the-box machine learning algorithms allow ordinary programmers to develop artificial intelligence applications faster and more easily. It's written in C# and runs on .NET Core, which is a fully cross-platform framework. C# is an enterprise-grade programming language that is widely used to code business logic in information management systems, which makes BotSharp friendlier to corporate developers. BotSharp implements its machine learning algorithms directly in C#. This takes advantage of C# being a typed language and makes system-wide refactoring easier.

DeepVideoAnalytics - A distributed visual search and visual data analytics platform.

  •    Python

Deep Video Analytics is a platform for indexing and extracting information from videos and images. With the latest version of Docker installed correctly, you can run Deep Video Analytics locally in minutes (even without a GPU) using a single command. Deep Video Analytics implements a client-server architecture pattern, where clients can access the state of the server via a REST API. For uploading, processing data, training models and performing queries (i.e. mutating the state), clients can send DVAPQL (Deep Video Analytics Processing and Query Language) formatted as JSON. The query represents a directed acyclic graph of operations.

RUBRIX - Python framework to explore, label, and monitor data for NLP

  •    Python

Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects. Most annotation tools treat data collection as a one-off activity at the beginning of each project. In real-world projects, data collection is a key activity of the iterative process of ML model development. Once a model goes into production, you want to monitor and analyze its predictions, and collect more data to improve your model over time. Rubrix is designed to close this gap, enabling you to iterate as much as you need.

lightly - A python library for self-supervised learning on images.

  •    Python

Lightly is a computer vision framework for self-supervised learning. We, at Lightly, are passionate engineers who want to make deep learning more efficient. That's why - together with our community - we want to popularize the use of self-supervised methods to understand and curate raw image data. Our solution can be applied before any data annotation step and the learned representations can be used to visualize and analyze datasets. This makes it possible to select the best core set of samples for model training through advanced filtering.

Objectron - Objectron is a dataset of short, object-centric video clips

  •    Jupyter

Objectron is a dataset of short object centric video clips with pose annotations. The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists of 15K annotated video clips supplemented with over 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes. In addition, to ensure geo-diversity, our dataset is collected from 10 countries across five continents. Along with the dataset, we are also sharing a 3D object detection solution for four categories of objects — shoes, chairs, mugs, and cameras. These models are trained using this dataset, and are released in MediaPipe, Google's open source framework for cross-platform customizable ML solutions for live and streaming media.

Surface-Defect-Detection - Constantly summarizing open source dataset and critical papers in the field of surface defect research which are of great importance

  •    Python

At present, surface defect inspection equipment based on machine vision has widely replaced manual visual inspection in various industrial fields, including 3C, automobiles, home appliances, machinery manufacturing, semiconductors and electronics, chemical, pharmaceutical, aerospace, light industry and other industries. Traditional surface defect detection methods based on machine vision often use conventional image processing algorithms or hand-designed features plus classifiers. Generally speaking, imaging schemes are designed around the different properties of the inspected surface or defects. A reasonable imaging scheme helps to obtain images with uniform illumination that clearly show the surface defects of the object. In recent years, many defect detection methods based on deep learning have also been widely used in various industrial scenarios. Compared with the clearly defined classification, detection and segmentation tasks in computer vision, the requirements for defect detection are very general. In fact, its requirements can be divided into three different levels: "what is the defect" (classification), "where is the defect" (localization), and "how many defects are there" (segmentation).





