2018-10-22 18:19:39 qq_37717661 Views: 8526

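Challenges that computer vision algorithms face in image recognition:

1) Viewpoint variation: the same object can be captured by the camera from many angles;
2) Scale variation: the visible size of an object usually varies (not only in the image; sizes also vary in the real world);
3) Deformation: many objects do not have a fixed shape and can deform considerably;
4) Occlusion: the target object may be occluded, and sometimes only a small part of it (as little as a few pixels) is visible;
5) Illumination conditions: at the pixel level, the effect of lighting is very large;
6) Background clutter: an object may blend into the background, making it hard to recognize;
7) Intra-class variation: individuals within one class can look very different, e.g. chairs. The class contains many different objects, each with its own appearance.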

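A face recognition algorithm mainly consists of three modules:

  • Face Detection: determines the size and position of each face in the image, i.e. predicts anchors in the image;

  • Face Alignment: finds several key points on the face (landmarks such as the eye corners, nose tip, and mouth corners), then uses these corresponding key points to warp the face as close as possible to a standard face through a similarity transform (rotation, scaling, and translation);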

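  • Feature Representation: takes the normalized face image as input, builds a vectorized face feature through feature modeling, and finally obtains the recognition result from a classifier. The key is how to obtain features that discriminate between different faces, for example from the nose, mouth, and eyes.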
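
The alignment step described above can be sketched as a least-squares fit of a similarity transform between detected landmarks and a template. This is only a minimal illustration: the three-point template, the function names, and the complex-number formulation (a 2-D similarity written as z -> a*z + b) are assumptions made for the example, not part of any particular face recognition library.

```python
import cmath

def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are (x, y) tuples. Treating each point as a complex number z,
    a rotation + uniform scale + translation is z -> a*z + b.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    z_mean = sum(zs) / n
    w_mean = sum(ws) / n
    # Closed-form solution of the complex linear regression w = a*z + b.
    num = sum((w - w_mean) * (z - z_mean).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    a = num / den
    b = w_mean - a * z_mean
    return a, b

def apply_similarity(a, b, points):
    """Apply z -> a*z + b to a list of (x, y) points."""
    out = []
    for x, y in points:
        w = a * complex(x, y) + b
        out.append((w.real, w.imag))
    return out

# Hypothetical "standard face" template: left eye, right eye, nose tip.
TEMPLATE = [(38.0, 52.0), (74.0, 52.0), (56.0, 72.0)]

# Simulate detected landmarks: the template rotated 30 degrees, scaled 2x, shifted.
a_true = cmath.rect(2.0, cmath.pi / 6)
b_true = complex(10.0, -5.0)
detected = apply_similarity(a_true, b_true, TEMPLATE)

# Fit the transform that brings the detected landmarks back onto the template.
a, b = fit_similarity(detected, TEMPLATE)
aligned = apply_similarity(a, b, detected)
scale = abs(a)            # uniform scale factor of the fitted transform
angle = cmath.phase(a)    # rotation angle in radians
```

In practice the same transform would then be applied to the whole image (for example with an image-warping routine) rather than only to the landmark coordinates.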

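Early algorithms:

  • Subspace methods (linear dimensionality reduction)
    PCA (principal component analysis): retains as much of the main information in the original data as possible while reducing redundant information;
    LDA (linear discriminant analysis): increases the between-class distance and reduces the within-class distance.

  • Nonlinear dimensionality reduction: manifold learning, kernel methods.

  • ICA (independent component analysis): performs better than PCA, but depends heavily on the training and test scenario and is sensitive to illumination, facial expression, and pose, so it generalizes poorly.

  • HMM (hidden Markov model): compared with the algorithms above, it is more robust to changes in illumination, expression, and pose.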
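
As a rough illustration of the PCA idea in the list above (keep the direction that carries most of the variance and drop the rest), the sketch below computes only the first principal component. Power iteration, the toy 2-D data, and the function names are assumptions chosen to keep the example dependency-free; a real face pipeline would run a linear algebra library on flattened face images.

```python
import math

def first_principal_component(rows, iters=200):
    """Return (mean, unit vector) of the direction of largest variance."""
    n = len(rows)
    d = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    # Assumes the start vector is not orthogonal to the top eigenvector.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, v):
    """Reduce each sample to a single coordinate along v."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

# Toy data lying close to the line y = x.
DATA = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9), (4.0, 4.1)]
mean, v = first_principal_component(DATA)
coords = project(DATA, mean, v)  # 1-D representation of the 2-D samples
```

LDA differs in that it uses class labels to choose the projection, so it is not covered by this unsupervised sketch.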

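Early work focused on data and model structure;
later work focused on the loss, in order to obtain features that discriminate between different faces.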

Summary of common algorithms

Source code for computer vision algorithms
Blogs on common computer vision algorithms

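  1. Feature extraction (finding a number of key points)
    (1) SIFT (scale-invariant feature transform): scale invariant; detects key points in an image.
    (2) SURF (speeded-up robust features, an accelerated version of SIFT)
    Core idea: build the Hessian matrix and test whether the current point is brighter or darker than its neighborhood, which determines the key point locations.
    Pros: stable features;
    Cons: weak at extracting features from objects with smooth edges.
    (3) ORB
  • Combines the FAST and BRIEF algorithms, adds an orientation to FAST feature points so that they are rotation invariant, and builds an image pyramid to handle scale invariance.
  • ORB is about 100 times faster than SIFT and about 10 times faster than SURF.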

Visualizing the results shows that ORB marks comparatively few feature points.

SIFT, SURF, and ORB implementations

(4) FAST corner detection
FAST mainly considers the 16 pixels on a circular window around a pixel.
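To speed up detection, it is enough to check four points. First compare pixels 1 and 9: if both intensities lie within the threshold t of the center pixel's intensity (i.e. both are similar to the center), the point is not a corner. Otherwise check pixels 5 and 13: if at least three of the four points are dissimilar from the center, the point is treated as a corner candidate.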
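
The four-pixel pre-test described above can be sketched as follows. The image representation (a list of rows of intensities), the function name, and the toy images are assumptions for illustration. The sketch uses the slightly stricter form of the rule, in which the three dissimilar pixels must be all brighter or all darker than the center; a point that passes would still need the full 16-pixel test.

```python
def fast_quick_test(img, x, y, t):
    """Four-pixel FAST pre-test at (x, y) with intensity threshold t.

    img is a list of rows; the caller must keep (x, y) at least
    3 pixels away from the image border.
    """
    center = img[y][x]
    # Pixels 1, 9, 5, 13 of the radius-3 circle: top, bottom, right, left.
    offsets = [(0, -3), (0, 3), (3, 0), (-3, 0)]
    vals = [img[y + dy][x + dx] for dx, dy in offsets]
    # If pixels 1 and 9 are both similar to the center, reject early.
    if abs(vals[0] - center) <= t and abs(vals[1] - center) <= t:
        return False
    brighter = sum(v > center + t for v in vals)
    darker = sum(v < center - t for v in vals)
    # At least three of the four must differ in the same direction.
    return brighter >= 3 or darker >= 3

# Toy 7x7 images (hypothetical data).
spot = [[200 if 2 <= x <= 4 and 2 <= y <= 4 else 10 for x in range(7)]
        for y in range(7)]                       # small bright blob
edge = [[200 if x <= 3 else 10 for x in range(7)]
        for y in range(7)]                       # straight vertical edge
flat = [[10] * 7 for _ in range(7)]              # uniform region

spot_passes = fast_quick_test(spot, 3, 3, 20)
edge_passes = fast_quick_test(edge, 3, 3, 20)
flat_passes = fast_quick_test(flat, 3, 3, 20)
```

The straight edge is rejected by the first comparison alone, which is where most of the speed-up comes from.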

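Non-maximum suppression: if several key points are detected close together, the feature points with the smaller corner response are removed.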

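(5) HOG (histogram of oriented gradients)

(6) LBP (local binary patterns): work on LBP features argues that high-dimensional features and verification performance are positively correlated, i.e. the higher the dimensionality of the face features, the higher the verification accuracy.

(7) Haar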

Reading list of face recognition papers

ICCV 2017 open-access papers

The evolution of face recognition algorithms in one article

2019-01-26 12:38:46 qq_37899132 Views: 945

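Introduction:

Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It describes successful real-world examples of challenging vision applications, covering both specialized medical imaging and fun consumer-level tasks such as image editing and stitching, so that students can apply them to their own photos and videos and gain a sense of accomplishment and enjoyment. The book takes a scientific approach to basic vision problems, formulating physical models of the imaging process and then building realistic descriptions of the scene on that basis. The author also analyzes these problems with statistical models and solves them with rigorous engineering methods.
    As an ideal textbook for undergraduate and graduate "computer vision" courses, Computer Vision: Algorithms and Applications is suited to students of computer science and electrical engineering. It focuses on basic techniques that work in practice and encourages students to innovate through numerous applications and exercises. In addition, the book's careful design and organization make it a unique reference on fundamental techniques and on the latest research results in computer vision.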


2018-12-13 16:40:53 u011344545 Views: 530

Author's GitHub: https://github.com/MichaelBeechan

Author's CSDN blog: https://blog.csdn.net/u011344545

Computer vision datasets: https://github.com/MichaelBeechan/CV-Datasets

 Image Processing

1、OpenCV

        OpenCV (Open Source Computer Vision Library) is released under a BSD license and hence it’s free for both academic and commercial use. It has C++, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android.

2、PCL

        The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing.   PCL is released under the terms of the BSD license, and thus free for commercial and research use.       

3、Machine Vision Toolbox

        The Machine Vision Toolbox (MVTB) provides many functions that are useful in machine vision and vision-based control.  It is a somewhat eclectic collection reflecting my personal interest in areas of photometry, photogrammetry, colorimetry.  It includes over 100 functions spanning operations such as image file reading and writing, acquisition, display, filtering, blob, point and line feature extraction,  mathematical morphology, homographies, visual Jacobians, camera calibration and color space conversion.        

4、ImLab 3.1

        ImLab is a free open source graphical application for Scientific Image Processing that runs in Windows, Linux and many other UNIX systems. It supports multiple windows, data types including 32 bit integers, 32 bit real numbers and complex numbers. It is implemented in C++ and C to provide a very simple way to add new functions. It has many image operations and supports several file formats.

5、The CImg Library

        The CImg Library provides a minimal set of C++ classes that can be used to perform common operations on generic 2D/3D images. It is simple-to-use, efficient, and portable. It's a really pleasant toolkit for developing image processing stuffs in C++.

6、MRPT

        Mobile Robot Programming Toolkit provides developers with portable and well-tested applications and libraries covering data structures and algorithms employed in common robotics research areas. It is open source, released under the BSD license.

7、SimpleCV

        SimpleCV is an open source framework for building computer vision applications. With it, you get access to several high-powered computer vision libraries such as OpenCV – without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage. This is computer vision made easy.

Features Extraction & Matching

1、ASIFT

        ASIFT: An Algorithm for Fully Affine Invariant Comparison

2、OpenSURF

3、Autonomous Vision Group

4、libviso2

        LIBVISO2 (Library for Visual Odometry 2) is a very fast cross-platform (Linux, Windows) C++ library with MATLAB wrappers for computing the 6 DOF motion of a moving mono/stereo camera. The stereo version is based on minimizing the reprojection error of sparse feature matches and is rather general (no motion model or setup restrictions except that the input images must be rectified and calibration parameters are known). The monocular version is still very experimental and uses the 8-point algorithm for fundamental matrix estimation. It further assumes that the camera is moving at a known and fixed height over ground (for estimating the scale). Due to the 8 correspondences needed for the 8-point algorithm, many more RANSAC samples need to be drawn, which makes the monocular algorithm slower than the stereo algorithm, for which 3 correspondences are sufficient to estimate parameters.

5、KLT:An Implementation of the Kanade-Lucas-Tomasi Feature Tracker

        KLT is an implementation, in the C programming language, of a feature tracker for the computer vision community.  The source code is in the public domain, available for both commercial and non-commercial use.

The tracker is based on the early work of Lucas and Kanade [1], was developed fully by Tomasi and Kanade [2], and was explained clearly in the paper by Shi and Tomasi [3]. Later, Tomasi proposed a slight modification which makes the computation symmetric with respect to the two images -- the resulting equation is derived in the unpublished note by myself [4].  Briefly, good features are located by examining the minimum eigenvalue of each 2 by 2 gradient matrix, and features are tracked using a Newton-Raphson method of minimizing the difference between the two windows. Multiresolution tracking allows for relatively large displacements between images.  The affine computation that evaluates the consistency of features between non-consecutive frames [3] was implemented by Thorsten Thormaehlen several years after the original code and documentation were written.
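
The feature selection rule mentioned above (examine the minimum eigenvalue of each 2 by 2 gradient matrix) can be sketched as follows. This is not the KLT library's code: the window size, the central-difference gradients, the toy images, and the function name are all assumptions made for illustration.

```python
import math

def min_eigenvalue(img, x, y, half=1):
    """Smaller eigenvalue of the 2x2 gradient matrix around (x, y).

    img is a list of rows of intensities; the window is (2*half+1) square
    and must stay at least one pixel away from the image border.
    """
    gxx = gxy = gyy = 0.0
    for v in range(y - half, y + half + 1):
        for u in range(x - half, x + half + 1):
            # Central-difference image gradients.
            gx = (img[v][u + 1] - img[v][u - 1]) / 2.0
            gy = (img[v + 1][u] - img[v - 1][u]) / 2.0
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
    # Closed-form smaller eigenvalue of [[gxx, gxy], [gxy, gyy]].
    trace = gxx + gyy
    diff = math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy)
    return (trace - diff) / 2.0

# Toy 7x7 images (hypothetical data).
corner = [[100 if x >= 3 and y >= 3 else 0 for x in range(7)] for y in range(7)]
edge = [[100 if x >= 3 else 0 for x in range(7)] for y in range(7)]
flat = [[50] * 7 for _ in range(7)]

corner_score = min_eigenvalue(corner, 3, 3)
edge_score = min_eigenvalue(edge, 3, 3)
flat_score = min_eigenvalue(flat, 3, 3)
```

A corner has gradients in two directions, so both eigenvalues are large; an edge has gradient in only one direction, so the smaller eigenvalue stays near zero and the window is a poor feature to track.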

6、MultiCameraTracking

        A multi-camera tracking algorithm using OpenCV. It depends on OpenCV 2.3.1 (http://opencv.org/), Boost libraries (http://www.boost.org/) and Intel IPP (https://software.intel.com/en-us/intel-ipp).

SLAM

1、PTAM

        PTAM (Parallel Tracking and Mapping) is a camera tracking system for augmented reality. It requires no markers, pre-made maps, known templates, or inertial sensors. If you're unfamiliar with PTAM have a look at some videos made with PTAM.

2、Andrew Davison: Software

        For up-to-date SLAM software from my research group please visit the Dyson Robotics Lab Webpage or the older Robot Vision Group Software Page.

3、ORB-SLAM 1 2

4、CoSLAM   Collaborative Visual SLAM in Dynamic Environments

5、GTSAM     Georgia Tech Smoothing and Mapping library

GTSAM is a library of C++ classes that implement smoothing and mapping (SAM) in robotics and vision, using factor graphs and Bayes networks as the underlying computing paradigm rather than sparse matrices.

On top of the C++ library, GTSAM includes a MATLAB interface (enable GTSAM_INSTALL_MATLAB_TOOLBOX in CMake to build it). A Python interface is under development.

6、StructSLAM    Visual SLAM with Building Structure Lines

Indoor scenes usually exhibit strong structural regularities. In this talk, Danping will introduce a novel visual SLAM method that adopts building structure lines as novel features for localization and mapping. The advantages of the proposed method are twofold: 1) line features are easier to detect and track in texture-less scenes; 2) the orientation information encoded in the structure line feature can reduce drift error significantly. Experimental results show that the proposed method performs better than existing methods in terms of stability and accuracy. The talk will present the motivation and implementation in detail.

7、LSD-SLAM     Large-Scale Direct Monocular SLAM

8、DSO: Direct Sparse Odometry

DSO is a novel direct and sparse formulation for Visual Odometry. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry - represented as inverse depth in a reference frame - and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead sampling pixels evenly throughout the images. DSO does not depend on keypoint detectors or descriptors, thus it can naturally sample pixels from across all image regions that have intensity gradient, including edges or smooth intensity variations on mostly white walls. The proposed model integrates a full photometric calibration, accounting for exposure time, lens vignetting, and non-linear response functions. We thoroughly evaluate our method on three different datasets comprising several hours of video. The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.

9、DTAM     Dense tracking and mapping in real-time

10、KinectFusion

11、Monocular SLAM

Other SLAM and computer vision algorithms can be found on GitHub

2016-12-06 11:32:39 zhuquan945 Views: 3489

1. Feature Extraction:


2. Image Segmentation:

  • Normalized Cut [1] [Matlab code]

  • Greg Mori's Superpixel code [2] [Matlab code]

  • Efficient Graph-based Image Segmentation [3] [C++ code] [Matlab wrapper]

  • Mean-Shift Image Segmentation [4] [EDISON C++ code] [Matlab wrapper]

  • OWT-UCM Hierarchical Segmentation [5] [Resources]

  • TurboPixels [6] [Matlab code 32bit] [Matlab code 64bit] [Updated code]

  • Quick-Shift [7] [VLFeat]

  • SLIC Superpixels [8] [Project]

  • Segmentation by Minimum Code Length [9] [Project]

  • Biased Normalized Cut [10] [Project]

  • Segmentation Tree [11-12] [Project]

  • Entropy Rate Superpixel Segmentation [13] [Code]

  • Fast Approximate Energy Minimization via Graph Cuts[Paper][Code]

  • Efficient Planar Graph Cuts with Applications in Computer Vision[Paper][Code]

  • Isoperimetric Graph Partitioning for Image Segmentation[Paper][Code]

  • Random Walks for Image Segmentation[Paper][Code]

  • Blossom V: A new implementation of a minimum cost perfect matching algorithm[Code]

  • An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision[Paper][Code]

  • Geodesic Star Convexity for Interactive Image Segmentation[Project]

  • Contour Detection and Image Segmentation Resources[Project][Code]

  • Biased Normalized Cuts[Project]

  • Max-flow/min-cut[Project]

  • Chan-Vese Segmentation using Level Set[Project]

  • A Toolbox of Level Set Methods[Project]

  • Re-initialization Free Level Set Evolution via Reaction Diffusion[Project]

  • Improved C-V active contour model[Paper][Code]

  • A Variational Multiphase Level Set Approach to Simultaneous Segmentation and Bias Correction[Paper][Code]

  • Level Set Method Research by Chunming Li[Project]

  • ClassCut for Unsupervised Class Segmentation[code]

  • SEEDS: Superpixels Extracted via Energy-Driven Sampling [Project][other]


3. Object Detection:

  • A simple object detector with boosting [Project]

  • INRIA Object Detection and Localization Toolkit [1] [Project]

  • Discriminatively Trained Deformable Part Models [2] [Project]

  • Cascade Object Detection with Deformable Part Models [3] [Project]

  • Poselet [4] [Project]

  • Implicit Shape Model [5] [Project]

  • Viola and Jones’s Face Detection [6] [Project]

  • Bayesian Modelling of Dynamic Scenes for Object Detection[Paper][Code]

  • Hand detection using multiple proposals[Project]

  • Color Constancy, Intrinsic Images, and Shape Estimation[Paper][Code]

  • Discriminatively trained deformable part models[Project]

  • Gradient Response Maps for Real-Time Detection of Texture-Less Objects: LineMOD [Project]

  • Image Processing On Line[Project]

  • Robust Optical Flow Estimation[Project]

  • Where's Waldo: Matching People in Images of Crowds[Project]

  • Scalable Multi-class Object Detection[Project]

  • Class-Specific Hough Forests for Object Detection[Project]

  • Deformed Lattice Detection In Real-World Images[Project]


4. Saliency Detection:

  • Itti, Koch, and Niebur's saliency detection [1] [Matlab code]

  • Frequency-tuned salient region detection [2] [Project]

  • Saliency detection using maximum symmetric surround [3] [Project]

  • Attention via Information Maximization [4] [Matlab code]

  • Context-aware saliency detection [5] [Matlab code]

  • Graph-based visual saliency [6] [Matlab code]

  • Saliency detection: A spectral residual approach. [7] [Matlab code]

  • Segmenting salient objects from images and videos. [8] [Matlab code]

  • Saliency Using Natural statistics. [9] [Matlab code]

  • Discriminant Saliency for Visual Recognition from Cluttered Scenes. [10] [Code]

  • Learning to Predict Where Humans Look [11] [Project]

  • Global Contrast based Salient Region Detection [12] [Project]

  • Bayesian Saliency via Low and Mid Level Cues[Project]

  • Top-Down Visual Saliency via Joint CRF and Dictionary Learning[Paper][Code]

  • Saliency Detection: A Spectral Residual Approach[Code]


5. Image Classification, Clustering

  • Pyramid Match [1] [Project]

  • Spatial Pyramid Matching [2] [Code]

  • Locality-constrained Linear Coding [3] [Project] [Matlab code]

  • Sparse Coding [4] [Project] [Matlab code]

  • Texture Classification [5] [Project]

  • Multiple Kernels for Image Classification [6] [Project]

  • Feature Combination [7] [Project]

  • SuperParsing [Code]

  • Large Scale Correlation Clustering Optimization[Matlab code]

  • Detecting and Sketching the Common[Project]

  • Self-Tuning Spectral Clustering[Project][Code]

  • User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior[Paper][Code]

  • Filters for Texture Classification[Project]

  • Multiple Kernel Learning for Image Classification[Project]

  • SLIC Superpixels[Project]


6. Image Matting

  • A Closed Form Solution to Natural Image Matting [Code]

  • Spectral Matting [Project]

  • Learning-based Matting [Code]


7. Object Tracking:

  • A Forest of Sensors - Tracking Adaptive Background Mixture Models [Project]

  • Object Tracking via Partial Least Squares Analysis[Paper][Code]

  • Robust Object Tracking with Online Multiple Instance Learning[Paper][Code]

  • Online Visual Tracking with Histograms and Articulating Blocks[Project]

  • Incremental Learning for Robust Visual Tracking[Project]

  • Real-time Compressive Tracking[Project]

  • Robust Object Tracking via Sparsity-based Collaborative Model[Project]

  • Visual Tracking via Adaptive Structural Local Sparse Appearance Model[Project]

  • Online Discriminative Object Tracking with Local Sparse Representation[Paper][Code]

  • Superpixel Tracking[Project]

  • Learning Hierarchical Image Representation with Sparsity, Saliency and Locality[Paper][Code]

  • Online Multiple Support Instance Tracking [Paper][Code]

  • Visual Tracking with Online Multiple Instance Learning[Project]

  • Object detection and recognition[Project]

  • Compressive Sensing Resources[Project]

  • Robust Real-Time Visual Tracking using Pixel-Wise Posteriors[Project]

  • Tracking-Learning-Detection[Project][OpenTLD/C++ Code]

  • the HandVu:vision-based hand gesture interface[Project]

  • Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities[Project]


8. Kinect:


9. 3D-Related:

  • 3D Reconstruction of a Moving Object[Paper] [Code]

  • Shape From Shading Using Linear Approximation[Code]

  • Combining Shape from Shading and Stereo Depth Maps[Project][Code]

  • Shape from Shading: A Survey[Paper][Code]

  • A Spatio-Temporal Descriptor based on 3D Gradients (HOG3D)[Project][Code]

  • Multi-camera Scene Reconstruction via Graph Cuts[Paper][Code]

  • A Fast Marching Formulation of Perspective Shape from Shading under Frontal Illumination[Paper][Code]

  • Reconstruction:3D Shape, Illumination, Shading, Reflectance, Texture[Project]

  • Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers[Code]

  • Learning 3-D Scene Structure from a Single Still Image[Project]


10. Machine Learning Algorithms:

  • Matlab class for computing Approximate Nearest Neighbor (ANN) [Matlab class providing interface to ANN library]

  • Random Sampling[code]

  • Probabilistic Latent Semantic Analysis (pLSA)[Code]

  • FASTANN and FASTCLUSTER for approximate k-means (AKM)[Project]

  • Fast Intersection / Additive Kernel SVMs[Project]

  • SVM[Code]

  • Ensemble learning[Project]

  • Deep Learning[Net]

  • Deep Learning Methods for Vision[Project]

  • Neural Network for Recognition of Handwritten Digits[Project]

  • Training a deep autoencoder or a classifier on MNIST digits[Project]

  • THE MNIST DATABASE of handwritten digits[Project]

  • Ersatz:deep neural networks in the cloud[Project]

  • Deep Learning [Project]

  • sparseLM : Sparse Levenberg-Marquardt nonlinear least squares in C/C++[Project]

  • Weka 3: Data Mining Software in Java[Project]

  • Invited talk "A Tutorial on Deep Learning" by Dr. Kai Yu (余凯)[Video]

  • CNN - Convolutional neural network class[Matlab Tool]

  • Yann LeCun's Publications[Website]

  • LeNet-5, convolutional neural networks[Project]

  • Deep learning pioneer Geoffrey E. Hinton's HomePage[Website]

  • Multiple Instance Logistic Discriminant-based Metric Learning (MildML) and Logistic Discriminant-based Metric Learning (LDML)[Code]

  • Sparse coding simulation software[Project]

  • Visual Recognition and Machine Learning Summer School[Software]


11. Object, Action Recognition:

  • Action Recognition by Dense Trajectories[Project][Code]

  • Action Recognition Using a Distributed Representation of Pose and Appearance[Project]

  • Recognition Using Regions[Paper][Code]

  • 2D Articulated Human Pose Estimation[Project]

  • Fast Human Pose Estimation Using Appearance and Motion via Multi-Dimensional Boosting Regression[Paper][Code]

  • Estimating Human Pose from Occluded Images[Paper][Code]

  • Quasi-dense wide baseline matching[Project]

  • ChaLearn Gesture Challenge: Principal motion: PCA-based reconstruction of motion histograms[Project]

  • Real Time Head Pose Estimation with Random Regression Forests[Project]

  • 2D Action Recognition Serves 3D Human Pose Estimation[

  • A Hough Transform-Based Voting Framework for Action Recognition[

  • Motion Interchange Patterns for Action Recognition in Unconstrained Videos[

  • 2D articulated human pose estimation software[Project]

  • Learning and detecting shape models [code]

  • Progressive Search Space Reduction for Human Pose Estimation[Project]

  • Learning Non-Rigid 3D Shape from 2D Motion[Project]


12. Image Processing:

  • Distance Transforms of Sampled Functions[Project]

  • The Computer Vision Homepage[Project]

  • Efficient appearance distances between windows[code]

  • Image Exploration algorithm[code]

  • Motion Magnification [Project]

  • Bilateral Filtering for Gray and Color Images [Project]

  • A Fast Approximation of the Bilateral Filter using a Signal Processing Approach [


13. Useful Tools:

  • EGT: a Toolbox for Multiple View Geometry and Visual Servoing[Project] [Code]

  • a development kit of matlab mex functions for OpenCV library[Project]

  • Fast Artificial Neural Network Library[Project]


14. Hand and Fingertip Detection and Recognition:

  • finger-detection-and-gesture-recognition [Code]

  • Hand and Finger Detection using JavaCV[Project]

  • Hand and fingers detection[Code]


15. Scene Interpretation:

  • Nonparametric Scene Parsing via Label Transfer [Project]


16. Optical Flow:

  • High accuracy optical flow using a theory for warping [Project]

  • Dense Trajectories Video Description [Project]

  • SIFT Flow: Dense Correspondence across Scenes and its Applications[Project]

  • KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker [Project]

  • Tracking Cars Using Optical Flow[Project]

  • Secrets of optical flow estimation and their principles[Project]

  • Implementation of the Black and Anandan dense optical flow method[Project]

  • Optical Flow Computation[Project]

  • Beyond Pixels: Exploring New Representations and Applications for Motion Analysis[Project]

  • A Database and Evaluation Methodology for Optical Flow[Project]

  • optical flow relative[Project]

  • Robust Optical Flow Estimation [Project]

  • optical flow[Project]


17. Image Retrieval:

  • Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval [Paper][code]


18. Markov Random Fields:

  • Markov Random Fields for Super-Resolution [Project]

  • A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors [Project]


19. Motion Detection: