daily-paper-computer-vision - A daily record of curated papers on computer vision, deep learning, and machine learning

  •        17

A daily record of curated papers on computer vision, deep learning, and machine learning.

https://github.com/amusi/daily-paper-computer-vision


Related Projects

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern, cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and on IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

ImageAI - A Python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for a wider range of more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.

raster-vision - deep learning for aerial/satellite imagery

  •    Python

Note: this project is under development and may be difficult to use at the moment. The overall goal of Raster Vision is to make it easy to train and run deep learning models over aerial and satellite imagery. At the moment, it includes functionality for making training data, training models, making predictions, and evaluating models for the task of object detection, implemented via the TensorFlow Object Detection API. It also supports running experimental workflows using AWS Batch. The library is designed to be easy to extend to new data sources, machine learning tasks, and machine learning implementations.

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET

  •    CSharp

The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

luminoth - Deep Learning toolkit for Computer Vision

  •    Python

Luminoth is an open source toolkit for computer vision. Currently, we support object detection, but we are aiming for much more. It is built in Python, using TensorFlow and Sonnet. Read the full documentation here.


CatPapers - Cool vision, learning, and graphics papers on Cats!

  •    HTML

As reported by Cisco, 90% of net traffic will be visual, and indeed, most of the visual data are cat photos and videos. Thus, understanding, modeling and synthesizing our feline friends is becoming an increasingly important research problem, especially for cat lovers. Cat Paper Collection is an academic paper collection that includes computer graphics, computer vision, machine learning and human-computer interaction papers that produce experimental results related to cats. If you want to add or remove a paper, please send an email to Jun-Yan Zhu (junyanz at berkeley dot edu).

facenet - Face recognition using Tensorflow

  •    Python

This is a TensorFlow implementation of the face recognizer described in the paper "FaceNet: A Unified Embedding for Face Recognition and Clustering". The project also uses ideas from the paper "Deep Face Recognition" from the Visual Geometry Group at Oxford. The code is tested using Tensorflow r1.7 under Ubuntu 14.04 with Python 2.7 and Python 3.5. The test cases can be found here and the results can be found here.

opencv4nodejs - Asynchronous OpenCV 3

  •    C++

By its nature, JavaScript lacks the performance to implement Computer Vision tasks efficiently. Therefore this package brings the performance of the native OpenCV library to your Node.js application. This project targets OpenCV 3 and provides an asynchronous as well as a synchronous API. The ultimate goal of this project is to provide a comprehensive collection of Node.js bindings to the API of OpenCV and the OpenCV-contrib modules. An overview of available bindings can be found in the API Documentation. Furthermore, contributions are highly appreciated. If you want to get involved, you can have a look at the contribution guide.

T-CNN - ImageNet 2015 Object Detection from Video (VID)

  •    Python

The TCNN framework is a deep learning framework for object detection in videos. This framework was originally designed for the ImageNet VID challenge in ILSVRC2015. If you are using the T-CNN code in your project, please cite the following works.

jeelizWeboji - JavaScript/WebGL real-time face tracking and expression detection library

  •    Javascript

With this library, you can build your own animoji embedded in JavaScript/WebGL applications. You do not need any specific device except a standard webcam. By default, a webcam feedback image is displayed with the face detection frame. The face detection is quite robust to all lighting conditions, but the evaluation of expression can be noisy if the lighting is too directional, too weak, or if there is strong backlight. So the webcam feedback image is useful for checking the quality of the input video feed.

ihog - Visualizing Object Detection Features. ICCV 2013

  •    C++

This software package contains tools to invert and visualize HOG features. It implements the Paired Dictionary Learning algorithm described in our paper "HOGgles: Visualizing Object Detection Features" [1]. If you run into trouble compiling the SPAMS code, you might try opening the file /path/to/ihog/spams/compile.m and adjusting the settings for your computer.

jetson-inference - Guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson

  •    C++

Welcome to our training guide for the inference and deep vision runtime library for NVIDIA DIGITS and Jetson Xavier/TX1/TX2. This repo uses NVIDIA TensorRT for efficiently deploying neural networks onto the embedded platform, improving performance and power efficiency using graph optimizations, kernel fusion, and half-precision FP16 on the Jetson.

cvat - Computer Vision Annotation Tool (CVAT) is a web-based tool which helps to annotate video and images for Computer Vision algorithms

  •    Javascript

CVAT is a completely re-designed and re-implemented version of the Video Annotation Tool from Irvine, California. It is a free, online, interactive video and image annotation tool for computer vision. It is being used by our team to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from a professional data annotation team. Code released under the MIT License.

PreciseRoIPooling - Precise RoI Pooling with coordinate gradient support, proposed in the paper "Acquisition of Localization Confidence for Accurate Object Detection" (https://arxiv

  •    Cuda

This repo implements Precise RoI Pooling (PrRoI Pooling), proposed in the paper "Acquisition of Localization Confidence for Accurate Object Detection", published at ECCV 2018 (Oral Presentation). For clarity, we illustrate RoI Pooling, RoI Align and PrRoI Pooling in the following figure. More details, including the gradient computation, can be found in our paper.

fashion-detection - Fashion Detection in the Wild (Deep Clothes Detector)

  •    Matlab

Deep Clothes Detector is a clothes detection framework based on Fast R-CNN. Given a fashion image, this software finds and localizes potential upper-body clothes, lower-body clothes and full-body clothes in it, respectively. For further information, please contact Ziwei Liu.

3dmatch-toolbox - 3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds

  •    C++

Matching local geometric features on real-world depth images is a challenging task due to the noisy, low-resolution, and incomplete nature of 3D scan data. These difficulties limit the performance of current state-of-the-art methods, which are typically based on histograms over geometric properties. In this paper, we present 3DMatch, a data-driven model that learns a local volumetric patch descriptor for establishing correspondences between partial 3D data. To amass training data for our model, we propose an unsupervised feature learning method that leverages the millions of correspondence labels found in existing RGB-D reconstructions. Experiments show that our descriptor is not only able to match local geometry in new scenes for reconstruction, but also generalizes to different tasks and spatial scales (e.g. instance-level object model alignment for the Amazon Picking Challenge, and mesh surface correspondence). Results show that 3DMatch consistently outperforms other state-of-the-art approaches by a significant margin. This code is released under the Simplified BSD License (refer to the LICENSE file for details).

pyannote-audio - Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding

  •    Python

Open PhD/postdoc positions at LIMSI combining machine learning, NLP, speech processing, and computer vision. If you use pyannote.audio in your research, please use the following citations.

openpose - OpenPose: Real-time multi-person keypoint detection library for body, face, and hands estimation

  •    C++

OpenPose represents the first real-time multi-person system to jointly detect human body, hand, and facial keypoints (in total 135 keypoints) on single images. For further details, check all released features and release notes.