2016-03-18 16:04:36 · chentravelling · 10,277 views

Tracking Moving Objects with Optical Flow in OpenCV

email:chentravelling@163.com

I. An Introduction to Optical Flow

Excerpted from: zouxy09

The concept of optical flow was first proposed by Gibson in 1950. Optical flow is the instantaneous velocity of the pixel motion that a moving object produces on the imaging plane. Optical flow methods use the temporal changes of pixels in an image sequence, together with the correlation between adjacent frames, to find the correspondence between the previous frame and the current one, and thereby compute the motion of objects between adjacent frames. In general, optical flow arises from the movement of foreground objects in the scene, the motion of the camera, or both.

The goal of studying the optical flow field is to approximate, from an image sequence, the motion field that cannot be observed directly. The motion field is the true motion of objects in the three-dimensional world; the optical flow field is its projection onto a two-dimensional image plane (the human eye or a camera).

Put plainly: given an image sequence, finding the speed and direction of motion of every pixel in each image yields the optical flow field. How do we find it? The intuitive picture is: if point A is at (x1, y1) in frame t, and we find it again at (x2, y2) in frame t+1, then its motion is determined: (ux, vy) = (x2, y2) - (x1, y1).

But how do we know where point A is in frame t+1? That is exactly where the many optical flow algorithms differ.
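One classical answer is Lucas and Kanade's: assume brightness is conserved between frames and that the flow is constant over a small window, then solve a least-squares system built from the spatial and temporal image gradients. The numpy sketch below runs this on a synthetic pair of frames; the intensity function and the sub-pixel shift are made up for illustration:

```python
import numpy as np

def lucas_kanade_window(I1, I2):
    """Estimate a single (u, v) for the whole window by least squares."""
    Iy, Ix = np.gradient(I1)               # np.gradient returns d/drow, d/dcol
    It = I2 - I1                           # temporal gradient
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)           # (u, v) in pixels

# synthetic pair: the second frame is the first shifted by (0.3, 0.2) pixels
y, x = np.mgrid[0:32, 0:32].astype(float)
f = lambda px, py: np.sin(0.3 * px) + np.cos(0.2 * py)
I1, I2 = f(x, y), f(x - 0.3, y - 0.2)
u, v = lucas_kanade_window(I1, I2)         # close to the true shift (0.3, 0.2)
```

This only works for small displacements, which is why the pyramidal variant below processes coarse-to-fine.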

In 1981, Horn and Schunck creatively linked the two-dimensional velocity field to image intensity, introduced the optical flow constraint equation, and obtained the basic algorithm for computing optical flow. Many optical flow methods have since been proposed on different theoretical foundations, with varying performance. Barron et al. surveyed these techniques and, by theoretical basis and mathematical method, classified them into four groups: gradient-based, matching-based, energy-based, and phase-based methods. In recent years neurodynamic approaches have also attracted attention.

OpenCV implements quite a few optical flow algorithms.

See: http://www.opencv.org.cn/opencvdoc/2.3.2/html/modules/video/doc/motion_analysis_and_object_tracking.html

1) calcOpticalFlowPyrLK

Computes the optical flow of a point set (sparse optical flow) with the pyramidal Lucas-Kanade method. For the details, see the paper "Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm".

2) calcOpticalFlowFarneback

Computes dense optical flow (i.e. flow for every pixel in the image) with Gunnar Farneback's algorithm. The related paper is "Two-Frame Motion Estimation Based on Polynomial Expansion".

3) CalcOpticalFlowBM

Computes optical flow by block matching.

4) CalcOpticalFlowHS

Computes dense optical flow with the Horn-Schunck algorithm. The related paper appears to be "Determining Optical Flow".

5) calcOpticalFlowSF

An implementation of a paper from ECCV 2012: "SimpleFlow: A Non-iterative, Sublinear Optical Flow Algorithm"; the project site is http://graphics.berkeley.edu/papers/Tao-SAN-2012-05/. It was introduced in newer versions of OpenCV.

Dense optical flow must interpolate between easily tracked pixels to resolve pixels whose motion is ambiguous, so its computational cost is considerable. Sparse optical flow, by contrast, requires a set of points to be specified before tracking (points that are easy to track, such as corners); so before applying the LK method we use cvGoodFeaturesToTrack() to find corners, and then track their motion with the pyramidal LK algorithm.

II. Code

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace std;
using namespace cv;
const int MAX_CORNERS = 100;

int main()
{
	IplImage* preImage = cvLoadImage("image133.pgm", CV_LOAD_IMAGE_GRAYSCALE);
	IplImage* curImage = cvLoadImage("image134.pgm", CV_LOAD_IMAGE_GRAYSCALE);
	IplImage* curImage_ = cvLoadImage("image134.pgm");
	if (!preImage || !curImage || !curImage_)
		return -1;
	CvPoint2D32f* preFeatures = new CvPoint2D32f[MAX_CORNERS]; // feature coordinates in the previous frame (from cvGoodFeaturesToTrack)
	CvPoint2D32f* curFeatures = new CvPoint2D32f[MAX_CORNERS]; // feature coordinates in the current frame (from the optical flow)
	CvSize img_sz = cvGetSize(preImage);
	IplImage* eig_image = cvCreateImage(img_sz, IPL_DEPTH_32F, 1); // work buffers
	IplImage* tmp_image = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);
	int num = MAX_CORNERS;
	cvGoodFeaturesToTrack( // detect corners to track
		preImage,
		eig_image,
		tmp_image,
		preFeatures,
		&num,      // in: max corners; out: corners actually found
		0.01,      // quality level
		5.0,       // minimum distance between corners
		0,         // no mask
		3,         // block size
		0,         // use Shi-Tomasi rather than Harris
		0.04
		);
	CvSize pyr_sz = cvSize(curImage->width + 8, curImage->height / 3);
	IplImage* pyrA = cvCreateImage(pyr_sz, IPL_DEPTH_32F, 1);
	IplImage* pyrB = cvCreateImage(pyr_sz, IPL_DEPTH_32F, 1);

	char features_found[MAX_CORNERS];
	float feature_errors[MAX_CORNERS];

	cvCalcOpticalFlowPyrLK( // pyramidal Lucas-Kanade optical flow
		preImage,
		curImage,
		pyrA,
		pyrB,
		preFeatures,
		curFeatures,
		num,              // only the corners actually found
		cvSize(10, 10),   // search window size
		5,                // pyramid levels
		features_found,
		feature_errors,
		cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, .3),
		0
		);

	for (int i = 0; i < num; i++) // draw the flow vectors
	{
		if (!features_found[i]) // skip points that were lost
			continue;
		cvLine(
			curImage_,
			cvPoint((int)preFeatures[i].x, (int)preFeatures[i].y),
			cvPoint((int)curFeatures[i].x, (int)curFeatures[i].y),
			CV_RGB(255, 0, 0),
			4,
			CV_AA,
			0);
	}

	cvShowImage("outPutImage", curImage_);
	//cvSaveImage("outPutImage.pgm", curImage_);
	cvWaitKey(0);
	delete[] preFeatures;
	delete[] curFeatures;
	return 0;
}

Final result:



2017-08-16 22:12:01 · Terrenceyuu · 4,384 views

Pedestrian Tracking in Surveillance Video

Overview

  • Goal: given the provided surveillance video, track the pedestrians in it and predict their trajectories.

Implementation

  • Pedestrians are the foreground of the frame, so detecting them reduces to foreground/background segmentation, which can be done with a KNN background subtractor.
  • The segmentation produced by OpenCV's BackgroundSubtractorKNN is shown below:

(figure: KNN foreground segmentation result)

  • For the segmented foreground region, compute its HSV colour histogram and its back-projection, then track the region (the pedestrian) with CamShift, whose underlying principle is the mean-shift algorithm.
  • Likewise, OpenCV's cv2.CamShift() gives the result below.
  • In the figure, the red rectangle is the target region computed by CamShift.

(figure: CamShift tracking result)
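CamShift is built on mean shift: the search window is repeatedly moved to the centroid of the back-projection weights under it until it stops moving. A toy numpy sketch of that iteration follows; the weight map and window are invented, and unlike cv2.CamShift this does not adapt the window's size or orientation:

```python
import numpy as np

def mean_shift(weights, win, n_iter=20):
    """Move window (x, y, w, h) to the weighted centroid until it stops."""
    x, y, w, h = win
    for _ in range(n_iter):
        roi = weights[y:y + h, x:x + w]
        m = roi.sum()
        if m == 0:                        # no mass under the window: give up
            break
        ys, xs = np.mgrid[0:h, 0:w]
        cx = int(round((roi * xs).sum() / m))   # centroid inside the ROI
        cy = int(round((roi * ys).sum() / m))
        nx, ny = x + cx - w // 2, y + cy - h // 2
        if (nx, ny) == (x, y):            # converged
            break
        x, y = max(nx, 0), max(ny, 0)
    return x, y, w, h

# synthetic back-projection: a blob of weight centred near (30, 20)
wmap = np.zeros((64, 64))
wmap[16:25, 26:35] = 1.0
track = mean_shift(wmap, (20, 12, 12, 12))   # start the window off-target
```

The window slides onto the blob's centroid after a few iterations; cv2.CamShift additionally rescales and rotates the window each step, which is what distinguishes it from plain mean shift.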

  • Finally, predict the pedestrian's trajectory; this can be done with a Kalman filter.
  • Set the Kalman filter's measurement dimension to 2, the x, y coordinates of the target region, and its state dimension to 4: the coordinates x, y plus the velocities vx, vy. The pedestrian's vx and vy can each be modelled as constant (plus their respective noise terms).
  • Correct the Kalman filter with the centre of the bounding rectangle from the KNN background segmentation; the filter's prediction is then the centre of the target rectangle.
  • Under these assumptions, the model can be built with OpenCV's cv2.KalmanFilter().
  • Below is the result (the green dots are the predictions):

(figure: prediction result; green dots are the Kalman predictions)
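The bullet points above can be sketched without cv2 as a constant-velocity Kalman filter with a 4-dimensional state (x, y, vx, vy) and 2-dimensional measurements (x, y). The noise covariances below are illustrative guesses, not tuned values:

```python
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)     # constant-velocity transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)      # we only measure x and y
Q = np.eye(4) * 1e-4                     # process noise (illustrative)
R = np.eye(2) * 1e-2                     # measurement noise (illustrative)
x, P = np.zeros((4, 1)), np.eye(4)

def step(x, P, z):
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    x = x + K @ (z.reshape(2, 1) - H @ x)          # correct with the detection centre
    P = (np.eye(4) - K @ H) @ P
    return x, P

# feed detections moving at (2, 1) px/frame; the filter should recover that velocity
for t in range(20):
    x, P = step(x, P, np.array([2.0 * t, 1.0 * t]))
```

cv2.KalmanFilter(4, 2) sets up the same F (transitionMatrix) and H (measurementMatrix); the correction input is the rectangle centre, just as the text describes.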

2011-09-23 11:48:45 · zhubenfulovepoem · 4,005 views

Author: Zhu Benfu

I. Theory:

The first task in motion tracking in video is to determine which moving targets exist in the scene, i.e. moving-target detection. Detection should meet the following requirements:

(1) insensitivity to slow environmental changes (such as illumination changes);

(2) effectiveness for complex backgrounds and complex targets;

(3) tolerance of the motion of individual scene objects (such as swaying trees or rippling water);

(4) removal of the influence of target shadows;

(5) detection and segmentation results accurate enough for subsequent processing (such as tracking).

 

Common methods for moving-target detection:

(1) Frame differencing

Frame differencing is the simplest way to detect change between adjacent images; it detects and extracts targets from the difference between two (or a few) consecutive frames of a video sequence.

Characteristics of frame differencing:

Target detection and segmentation by frame differencing has low algorithmic complexity and is easy to use in real time. Because the interval between adjacent frames is usually short, the method is not very sensitive to changes in scene lighting and is fairly stable. But detection depends on the chosen inter-frame interval: fast-moving objects need a small interval, and if it is chosen badly, an object that does not overlap between the two frames is detected as two separate objects. Slow-moving objects need a larger interval, and if that is chosen badly, the object overlaps almost completely in the two frames and is not detected at all. Moreover, the method generally cannot extract all of a target's pixels, so holes tend to appear inside the moving region; even with morphological processing and connectivity analysis, the complete contour of the moving target cannot be recovered, which complicates later recognition and detection.
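The method described above fits in a few lines. A numpy sketch with synthetic frames and an ad-hoc threshold:

```python
import numpy as np

def frame_diff_mask(prev, curr, thresh=25):
    """Foreground mask: pixels whose absolute change exceeds the threshold."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return np.where(diff > thresh, 255, 0).astype(np.uint8)

# synthetic frames: a bright block on a flat background moves 3 px to the right
prev = np.full((20, 20), 100, np.uint8); prev[5:10, 5:10] = 200
curr = np.full((20, 20), 100, np.uint8); curr[5:10, 8:13] = 200
mask = frame_diff_mask(prev, curr)
```

Note that the region where the object's two positions overlap produces no response: that is exactly the "hole" phenomenon the paragraph above warns about.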

 

(2) Background subtraction

This method can extract the target pixels fairly completely, but it is sensitive to dynamic scene changes caused by illumination or other external conditions. If the reference image is chosen well, its advantage is that it segments the moving object accurately. Its success, however, depends on the background image used: obtaining a reasonably accurate background image is a central problem of background subtraction, and background updating is commonly used to compensate for illumination changes and other effects in dynamic scenes.

Causes of background change:

① Illumination change: gradual illumination change, typically outdoors; sudden illumination change, typically switching lights indoors or clouds covering the sun outdoors; and shadows cast into the background, either by the background itself (buildings, trees) or by foreground targets.

② Background disturbance: small movements in the background, such as leaves in the wind, glinting water, reflections on car windows and weather changes, all affect moving-target detection. Likewise, an outdoor camera shaking in the wind affects detection.

③ Background change: changes caused by moving targets, e.g. a person carrying something into or out of the background, cars driving in or out, or a person or object pausing in the scene for a while and then moving again.

④ Occlusion: occlusion is another hard problem in moving-target detection; an occluder in front of a moving target may be extracted as part of the target, severely deforming the detection or even making it fail. All of these factors greatly increase the difficulty of background modelling and updating, and affect the quality of detection.

 

Video data needs preprocessing before moving-target detection. Common approaches:

(1) Statistical averaging

This method obtains the background image by statistically averaging a sequence of consecutive frames, i.e. accumulating N frames and dividing by N: B(x, y) = (1/N) · Σ_{i=1..N} I_i(x, y), where N is the number of frames.

In its recursive form, each update uses the information of the current frame while discarding stale information, so the background adapts as the environment changes. The algorithm is simple to implement and, being recursive, cheap in time.

Its drawback is that the non-recursive form must store the frame data, which costs memory. The method is usually used when targets linger only briefly and appear infrequently in the scene; plain statistical averaging cannot meet real-time requirements.
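A sketch of the recursive form of the update, where a learning rate α plays the role of 1/N (the values are illustrative):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Recursive running average: old frames decay, the current frame blends in."""
    return (1 - alpha) * bg + alpha * frame

bg = np.zeros((4, 4))
for _ in range(200):                 # a static scene of brightness 100
    bg = update_background(bg, np.full((4, 4), 100.0))
# bg has converged close to the true background value of 100
```

Only the current background estimate is kept in memory, which is the recursive variant's advantage over storing all N frames.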

 

(2) Median filtering

During background-model extraction, moving targets may pass through the monitored area but do not stay long in any one position. Observing a given pixel of the video over time, its brightness changes substantially only while a foreground target passes through it. The idea of median filtering is to keep a sliding window buffering the last L video frames, and to take the median of the co-located pixels across all buffered frames as the background value at that position.

The value of L is determined by factors such as the speed of the moving targets and the camera's sampling rate.
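The idea above in numpy, with a hypothetical 7-frame buffer:

```python
import numpy as np

def median_background(buffer):
    """Per-pixel median over a sliding buffer of L frames."""
    return np.median(np.stack(buffer), axis=0).astype(np.uint8)

# 7-frame buffer: background brightness 50; a target crosses pixel (4, 4)
# during 2 of the 7 frames
frames = [np.full((8, 8), 50, np.uint8) for _ in range(7)]
frames[2][4, 4] = 255
frames[3][4, 4] = 255
bg = median_background(frames)       # the brief passage does not corrupt the median
```

As long as a target occupies a pixel for fewer than half of the L buffered frames, the median still recovers the background value there.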

 

(3) The GMM method

Sample a given pixel position over a fairly long period. As a point of the background image, its value would ideally be constant, but illumination and other natural factors introduce variation. Statistically, these samples follow a single-Gaussian model: they concentrate around the centre of the high-probability Gaussian, while a moving target appearing at that point belongs to the low-probability region far from the centre. We can therefore compute the pixel's mean brightness μ0 and its variance σ0², and use the image formed by the per-pixel brightness means as the Gaussian-distributed background image B0.

A single Gaussian, however, is not enough to describe the background. For example, in one frame a pixel may show sky, in another leaves, and in a third branches. The pixel brightness (colour) differs in each of these states, so such multimodal situations are modelled with a mixture of several Gaussians. If k Gaussian distributions are used to describe each pixel value, denoted η(x, μ_{i,t}, σ_{i,t}), i = 1, 2, ..., k, then the pixel's probability density is their weighted sum.

If no Gaussian matches the current brightness value, the component with the smallest weight is replaced by a new Gaussian whose mean is I_t, whose deviation is the maximum initial deviation, and whose weight is the minimum initial weight. The other Gaussian components keep their parameters and only have their weights reduced.

Because the mixture-of-Gaussians method tolerates the presence of moving targets during background modelling, it is particularly suited to small, fast targets outdoors under changing light and weather. Its drawbacks: first, it is computationally complex; it models not only the background but in effect the foreground too, its cost grows in proportion to the number of Gaussians, and its parameters are hard to tune. Second, it handles large, slow targets poorly, tending to absorb low-texture or low-contrast targets into the background. Third, it is very sensitive to sudden global brightness changes: if the scene stays unchanged for a long time, the background components barely change, and a sudden global brightness change then makes it label the whole frame as foreground.
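The single-Gaussian case described at the start of this method reduces to a per-pixel distance test against the learned mean; a numpy sketch (the simulated data and the 2.5σ threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
history = rng.normal(100, 5, size=500)   # simulated samples of one background pixel
mu, sigma = history.mean(), history.std()

def is_foreground(intensity, k=2.5):
    """Flag values that fall far outside the learned background Gaussian."""
    return abs(intensity - mu) > k * sigma

fg_bright = is_foreground(200)   # a bright moving object at this pixel
fg_noise = is_foreground(101)    # an ordinary background fluctuation
```

The full GMM keeps k such Gaussians per pixel with weights, matching each new value against all of them, which handles the multimodal sky/leaves/branches case the text describes.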

 

II. Engineering Practice:

(1) Choosing, installing and configuring the development platform:

Visual C++ 6.0: MFC programming;

DirectShow 9.0: video processing;

OpenCV 1.0: open-source computer vision library;

Speech SDK 5.1: speech library.

Install the software above, then configure Visual C++ 6.0 as follows:

1. Build the standard DirectShow link libraries with VC. Open the project file baseclasses.dsw and build both the Debug and Release versions. If the SDK was installed to the default directory, baseclasses.dsw is at:

C:\DXSDK\Samples\C++\DirectShow\BaseClasses

 

2. Global settings

Menu Tools -> Options -> Directories: first set the lib path; choose Library files and add the path:

C:\Program Files\OpenCV\lib

 

3. Set up the Visual C++ build environment. In Visual C++, open the "Tools" menu and choose "Options"; in the dialog select the "Directories" tab.

Under "Directories:" Include files, add the following paths:

C:\PROGRAM FILES\MICROSOFT SPEECH SDK5.1\INCLUDE

C:\Program Files\OpenCV\cxcore\include

C:\Program Files\OpenCV\cv\include

C:\Program Files\OpenCV\cvaux\include

C:\Program Files\OpenCV\ml\include

C:\Program Files\OpenCV\otherlibs\highgui

C:\Program Files\OpenCV\otherlibs\cvcam\include

C:\DXSDK\Include

C:\DXSDK\SAMPLES\C++\DIRECTSHOW\BASECLASSES

Move the added paths to the top.

Under "Directories:" Library files, add the following paths:

C:\PROGRAM FILES\MICROSOFT SPEECH SDK5.1\LIB\I386

C:\DXSDK\Lib

C:\DXSDK\SAMPLES\C++\DIRECTSHOW\BASECLASSES\DEBUG

C:\DXSDK\SAMPLES\C++\DIRECTSHOW\BASECLASSES\RELEASE

Under "Directories:" Source files, add the following paths:

C:\Program Files\OpenCV\cv\src

C:\Program Files\OpenCV\cxcore\src

C:\Program Files\OpenCV\cvaux\src

C:\Program Files\OpenCV\otherlibs\highgui

C:\Program Files\OpenCV\otherlibs\cvcam\src\windows

Adjust the paths above to match your own installation directories.

 

4. Project settings

Every VC project that uses OpenCV must be pointed at the libraries it needs. Menu: Project -> Settings; set "Settings for" to All Configurations, select the Link tab, and append to Object/library modules:

cxcore.lib cv.lib ml.lib cvaux.lib highgui.lib cvcam.lib

 

(2) Building a basic software simulation and experimentation platform:

1. Imgcx (learning edition): a fully runnable set of Visual C++ GUI sources covering image representation, loading and saving; it provides the framework in which the author validated his algorithms on single images.

Basic software framework: (figure omitted)

2. FstarVideo: a video-processing platform built on DirectShow, adapted from the software framework of the Bochuang "Future Star" kit; the author validated his algorithms by processing video on it.

Main code 1:

for (int x = 0; x < m_nWidth; x++)        // m_nWidth: video width
{
    for (int y = 0; y < m_nHeight; y++)   // m_nHeight: video height
    {
        // Median(x, y, prgb);            // median filter
        // prewitt(x, y, prgb, 2.0);      // Prewitt operator
        GetColor(x, y, prgb);             // read the pixel at (x, y)

        // convert the RGB colour space to luminance
        tY = 0.299 * prgb->rgbtRed + 0.587 * prgb->rgbtGreen + 0.114 * prgb->rgbtBlue;

        if (tY > 230)                     // track bright pixels; 230 is the luminance threshold
        {
            // SetColor(x, y, black);     // binarise: set the pixel at (x, y) black

            m_pParam->Yx += x;
            m_pParam->Yy += y;
            m_pParam->Ysum++;             // count the marked pixels

            // running centroid of the marked pixels
            m_nCX = m_pParam->Yx / m_pParam->Ysum;
            m_nCY = m_pParam->Yy / m_pParam->Ysum;

            prgb->rgbtBlue = 0;
            prgb->rgbtGreen = 0xff;
            prgb->rgbtRed = 0;
            DrawFocus(x, y, prgb);        // draw pixels above the threshold in green
        }
        else
            SetColor(x, y, white);
    }
}

Main code 2:

// turn the pixels of this region red
for (int m = 400; m < 426; m++)
{
    for (int n = 75; n < 86; n++)
    {
        SetColor(m, n, red);
    }
}

// operate on each pixel
for (int p = 400; p < 409; p++)      // query pixel values
{
    for (int q = 400; q < 409; q++)
    {
        prgb->rgbtBlue  = m_pImageBuf[q * m_nWidth * 3 + p * 3];
        prgb->rgbtGreen = m_pImageBuf[q * m_nWidth * 3 + p * 3 + 1];
        prgb->rgbtRed   = m_pImageBuf[q * m_nWidth * 3 + p * 3 + 2];
        // operate on green pixels
        if (prgb->rgbtBlue == 0 && prgb->rgbtGreen == 0xff && prgb->rgbtRed == 0)
        {
            SetColor(p, q, black);
        }
    }
}



The follow-up work is done; a summary document will be uploaded later…

2015-12-18 12:50:59 · u013476464 · 3,775 views

Tutorials on topics in 2D image analysis, computer vision

http://t.cn/R4AEWgN

Course notes and PPT/PDF slides on image processing, analysis, recognition and applications, plus online video tutorials on machine learning in computer vision, object recognition, segmentation, text recognition, fMRI analysis, motion and tracking, and more.

http://t.cn/R42SjhG

Online Written Course Notes

General
Dealing with Imprecise Spatial Information in Cognitive Vision (Isabelle Bloch)
Neural nets in Vision (Roger Boyle, David Hogg)
Vision Systems (A D Marshall)
Vision Computacional (L. E. Sucar and G. Gomez - in Spanish)
Vision Through Optimization (N. A. Thacker and T. F. Cootes)
Computer Vision Online Course (Andrew Wallace)
Textured Motion and Complex Motion Modeling (Yizhou Wang)
Physics/Devices
High dynamic range imaging for digital still camera: an overview (S. Battiato, A. Castorina, M. Mancuso)
Tsai Camera Calibration Method Revisited (Interior Orientation) (Berthold K. P. Horn)
Observations on the physics of imaging and image coding (Julio Marten)
Representation/recognition
Object Categorization (Axel Pinz)
Mathematics/Geometry
An Introduction to Projective Geometry (for computer vision) (Stan Birchfield)
Tutorial on Rectification of Stereo Images (Andrea Fusiello)
The Essential Matrix … (Coplanarity Condition) (Berthold K. P. Horn)
Quaternions and Rotation (Berthold K. P. Horn)
Resources for Discrete Geometry (IAPR TC 18)
Digital Geometry and Mathematical Morphology (Christer Kiselman)
Linear inverse problems: A discrete presentation (Ali Mohammad-Djafari)
Transforme de Radon et ses applications (French) (Ali Mohammad-Djafari)
Inference bayesienne pour les problemes inverses (French) (Ali Mohammad-Djafari)
Detection-Estimation (Ali Mohammad-Djafari)
Problemes inverses (French) (Ali Mohammad-Djafari)
Projective Geometry for Image Analysis (Roger Mohr, Bill Triggs)
Monte Carlo methods (Jonathan Pengelly)
Visual 3D Modeling from Images (Marc Pollefeys)
Principal Components Analysis (Lindsay Smith)
Fourier Analysis (Yerin Yoo)
Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting (Zhengyou Zhang)
Geometric Framework for Vision I: Single View and Two-View Geometry (Andrew Zisserman)
Image processing/analysis
Fixed Flow (Constant Optical Flow) (Berthold K. P. Horn)
Optical Flow with Fixed Translation and Rotation (Berthold K. P. Horn)
Principles for automatic scale selection (Tony Lindeberg)
Automatic scale selection as a pre-processing stage for interpreting the visual world (Tony Lindeberg)
Scale-space theory: A basic tool for analysing structures at different scales (Tony Lindeberg)
A Gentle Introduction to Bilateral Filtering and its Applications (Sylvain Paris, Pierre Kornprobst, Jack Tumblin, and Frédo Durand )
Applications
Particle Filtering for Visual Tracking (Pedram Azad)
Spatial Augmented Reality (Oliver Bimber, Ramesh Raskar, Masahiko Inami)
A Tutorial on CAPTCHA - Completely Automated Public Turing test to tell Computers and Humans Apart (Theo Pavlidis)
Modern Techniques In Remote Sensing (Maria Petrou)
Supervised Neural Networks in Machine Vision (Neil Thacker)
Performance Characterisation in Computer Vision: The Role of Statistics in Testing and Design (Neil Thacker)
An Empirical Design Methodology for the Construction of Machine Vision Systems (Neil Thacker)
Performance Characterisation in Computer Vision: A Guide to Best Practices (Neil Thacker)
New Trends in 3D Video (Christian Theobalt, Stephan Wuermlin, Edilson de Aguiar, Christoph Niederberger)
Introduction to Computer Vision from Automatic Face Analysis Viewpoint (Erno Makinen)
Online Tutorial PPT/PDF Slides

General
The Monogenic Framework: A Systematic Approach to Image Processing and Computer Vision (Michael Felsberg)
Introduction to some aspects of biological vision (Li Zhaoping)
Visual Recognition in Primates and Machines (Tomaso Poggio, Thomas Serre)
Computational Photography (Ramesh Raskar, Jack Tumblin)
Tutorial on Coded Light Projection Techniques (Joaquim Salvi, Jordi Pagis)
The business case for implementing machine vision (PPT) - selecting a vision system from a management perspective (Nello Zuech)
Physics/Devices
Optics (George Barbastathis)
Workshop on Diffusion Tensor Imaging (National Alliance for Medical Image Computing)
Spectral color imaging (Jussi Parkkinen)
Spectral image applications (Jussi Parkkinen)
8 lectures on color (Jussi Parkkinen)
Representation/recognition
Recognizing and Learning Object Categories (Li Fei-Fei, Rob Fergus, Antonio Torralba)
Object Recognition (Li Fei-Fei)
Geometric Model Acquisition (PPT) (Steve Maybank)
Probabilisitic models of visual object categories (Andrew Zisserman)
Template Matching Techniques in Computer Vision (Roberto Brunelli)
Mathematics/Geometry
Gradient Domain Manipulation Techniques in Vision and Graphics (Amit Agrawal, Ramesh Raskar)
Bayesian Techniques in Vision and Perception (Olivier Aycard and Luis Enrique Sucar)
Graph Cuts (Andrew Blake)
Graph-Cuts versus Level-Sets (Yuri Boykov, Daniel Cremers, Vladimir Kolmogorov )
Bayesian Methods for Multimedia Signal Processing (A. Taylan Cemgil)
Clustering and Robust Techniques in Computer Vision (Per-Erik Forssen)
Deep Belief Nets (Geoffrey Hinton)
Lectures on Discrete Geometry (IAPR TC 18)
Hidden Markov Models (Philip Jackson)
Discrete Optimization in Computer Vision (Nikos Komodakis, Philip Torr, Vladimir Kolmogorov, Yuri Boykov)
Hidden Markov Models (PPT) (Dimitrios Makris)
Bayes Optimality in Pattern Recognition (Aleix Martinez)
Stereo Vision: Algorithms and Applications (Stefano Mattoccia)
Tensor Voting: A Perceptual Organization Approach to Computer Vision and Machine Learning (Philippos Mordohai)
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics (Amnon Shashua)
Image processing/analysis
Graph Based Image Segmentation (Jianbo Shi, Charless Fowlkes, David Martin, Eitan Sharon)
Lectures on Image Processing (Richard Peters)
Texture (Mike Chantler)
Applications
Kernel Methods in Remote Sensing: Introduction, Applications and Research Opportunities (Gustavo Camps-Valls)
Visual servoing (Francois Chaumette)
Image Processing in Biomedical Applications (S. Colantonio, D. Moroni, O. Salvetti)
Tutorial Medical Image Retrieval (Thomas Deselaers, Henning Muller)
Iterative Methods for Image Reconstruction (Jeffrey Fessler)
Computer Vision for Augmented Reality (Jan-Michael Frahm)
3D Camera Tracking, Reconstruction and View Synthesis at Interactive Frame Rates (Jan-Michael Frahm, Reinhard Koch, Jan-Friso Evers-Senne)
Integration of Vision and Graphics, with Sports Applications part 1, part 2 (Oliver Grau)
Improving Image Quality in Image Compositing (Mark Grundland)
Open-source Insight Toolkit (ITK) for medical image segmentation and registration (Luis Ibanez, Lydia Ng, Joshua Cates, Stephen Aylward, Bill Lorensen, Julien Jomier)
Face Detection and Recognition using Machine Learning (Sebastien Marcel)
Face detection and recognition (Sebastien Marcel)
Face Recognition (Sebastien Marcel)
Computer Vision for Wearable Visual Interface (Walterio Mayol, Takeshi Kurata)
Animate Vision concepts for image retrieval (Vincenzo Moscato)
Spatio-temporal Data Mining (Shashi Shekhar)
Multimedia Information Retrieval (Alan Smeaton)
Multimedia Information Retrieval Evaluation Initiatives (Alan Smeaton)
Designing multi-scale algorithms medical image analysis (Bart M. ter Haar Romeny)
Biometrics for Surveillance (S. Kevin Zhou, Rama Chellappa)
Online Video Tutorials

General
Computational models of vision (Shimon Ullman, 1:30’)
Advanced Vision Course (Bob Fisher, 16:00)
Computer Vision (Andrew Blake, 3:00)
Applications to Machine Vision (Andrew Blake, 1:00)
Computer vision (Richard Hartley, 3:00)
Topics in image and video processing (Andrew Blake, 3:11’)
The Future of Image Search (Jitendra Malik, 1:00)
Introduction to Vision and Robotics Course (Bob Fisher, ~8:00 video + 20:00 audio)
Machine Learning in Computer Vision
Learning in Computer Vision (Simon Lucey, 5:31’)
Machine learning and kernel methods for computer vision (Francis R. Bach, 0:40’)
Multiple kernel learning for multiple sources (Francis R. Bach, 0:45’)
Machine Learning in Vision (Bill Triggs, 2:40’)
Learning shared representations for object recognition (Antonio Torralba, 1:20’)
Learning Visual Distance Function for Object Identification from one Example (Frederic Jurie, 0:20’)
Energy-based models & Learning for Invariant Image Recognition (Yann LeCun, 3:40’)
Toward Learning Mixture-of-Parts Pictorial Structures (Alan Fern, 0:36’)
Enhancing functional magnetic resonance imaging with supervised learning (Stephen LaConte, 0:26’)
Learning with spectral representations and use of MDL principles (Edwin Hancock, 0:41’ )
Object Recognition
Large-Scale Object Recognition Systems (Cordelia Schmid, 0:36’)
Global and Efficient Self-Similarity for Object Classification and Detection (Thomas Deselaers, 0:18’)
Object Recognition by Discriminative Combinations of Line Segments and Ellipses (Alex Chia, 0:18’)
Stereo Vision for Obstacle Detection: a Graph-Based Approach (Alessandro Limongiello, 0:40’)
101 Visual object classes - Introduction (Andrew Zisserman, 0:52’)
Recognising Animals (Allan Hanbury, 0:20’)
Generative Models for Visual Objects and Object Recognition via Bayesian Inference (Fei-Fei Li, 01:00)
Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities (Fei-Fei Li, 0:19’)
Detecting Motifs Under Uniform Scaling (Dragomir Yankov, 0:18’)
Research 17: Integrating and Querying Parallel Leaf Shape Descriptions (Shenghui Wang, 0:28’)
Object Identification by Statistical Methods (Hans-Joachim Lenz, 1:22’)
Feature extraction & content description I (Nicu Sebe, 2:27’)
Feature extraction & content description II (Milind Naphade, 2:40’)
Understanding Visual Scenes (Antonio Torralba, 2:00)
On Detection of Multiple Object Instances using Hough Transforms (Olga Barinova, 0:17’)
Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions (Bangpeng Yao, 0:18’)
Food Recognition Using Statistics of Pairwise Local Features (Shulin (Lynn) Yang, 0:14’)
Multimodal semi-supervised learning for image classification (Matthieu Guillaumin, 0:19’)
Object-Graphs for Context-Aware Category Discovery (Yong Jae Lee, 0:18’)
Common Visual Pattern Discovery via Spatially Coherent Correspondences (Hairong Liu, 0:17’)
Cascade Object Detection with Deformable Part Models (Ross B. Girshick, 0:17’)
A Novel Riemannian Framework for Shape Analysis of 3D Objects (Sebastian Kurtek, 0:16’)
Segmentation
Graph-Based Perceptual Segmentation of Stereo Vision 3D Images at Multiple Abstraction Levels (Rodrigo Moreno, 3:30’)
Adaptive Feature Selection in Image Segmentation (Volker Roth, 0:26’)
Object Recognition by Discriminative Combinations of Line Segments and Ellipses (Alex Chia, 0:18’)
Text Recognition
Book-Adaptive and Book-Dependent Models to Accelerate Digitization of Early Music (Douglas Eck, 0:09’)
Text Recognition Evaluation (Padmanabhan Soundararajan, 0:16’)
Videotext Recognition System (Rahid Prasad, 0:17’)
Detecting Text in Natural Scenes with Stroke Width Transform (Boris Epshtein, 0:17’)
The chains model for detecting parts by their context (Leonid Karlinsky, 0:17’)
fMRI Analysis
Multimodal Imaging: EEG-fMRI integration (Tom Eichele, 0:57’)
EEG/fMRI correlation analysis. A data and model driven approach (Jan de Munck, 1:00)
Introduction and overview of FMRI concepts and terminology (John-Dylan Haynes, 0:27’)
Scanning the brain and probing the mind (Nigel Leigh, Blaz Koritnik, 1:31’)
Implications of decoding for theories of neural representation (James Haxby, 0:36’)
Exploring human object-vision with hi-res fMRI and information-based analysis (Nikolaus Kriegeskorte, 0:30’)
Hierarchical Gaussian Naive Bayes Classifier for Multiple-Subject fMRI Data (Indrayana Rustandi, 0:11’)
Overview of decoding of mental states and processes (Tom Mitchell, 0:25’)
Symbolic Dynamics of Neurophysiological Data (Peter beim Graben 0:50’)
Unsupervised fMRI Analysis (David R. Hardoon, 0:12’)
From functional elements to networks in fMRI (Ricardo Vigario, 0:52’)
Classifying single trial fMRI: What can machine learning learn? (Paul Mazaika, 0:12’)
fMRI-based decoding of the modified default-mode network in mild cognitive impairment (Fabian Theis, 0:15’)
Generative Models for Decoding Real-Valued Natural Experience in FMRI (Greg Stephens, 0:15’)
Motion and Tracking
Visual localization and tracking (Andrew Blake, 3:00)
Dynamical Binary Latent Variable Models for 3D Human Pose Tracking (Graham Taylo, 0:18’)
Projective Kalman Filter: Multiocular Tracking of 3D Locations Towards Scene Understanding (Cristian Ferrer Canton, 0:15’)
Visual Tracking Decomposition (Junseok Kwon, 0:18’)
Tracking the Invisible: Learning Where the Object Might Be (Helmut Grabner, 0:18’)
Other Subjects
Qualitative Spatial Relationships for Image Interpretation by using Semantic Graph (Aline Deruyver, 0:20’)
Using computer vision in a rehabilitation method of a human hand (Jaka Katrasnik, 0:09’)
Large Scale Scene Matching for Graphics and Vision (James Hays, 01:00)
Graph-based Methods for Retinal Mosaicing and Vascular Characterization (M. Elena Martinez-Perez, 0:30’)
Grouping Using Factor Graphs: an Approach for Finding Text with a Camera Phone (Huiying Shen, 0:22’)
Introduction to Multimedia Digital Libraries (James Wang, 3:00)
Visual Categorization with Bags of Keypoints (Christopher Dance, 0:50’)
Spatiotemporal classification (Janaina Mourao-Miranda, 0:24’)
Image Classification Using Marginalized Kernels for Graphs (Emanuel Aldea, 0:20’)
Graph Cuts (Andrew Blake, ~1:00)
Graph Cuts (Shorter) (Andrew Blake, ~0:30 )
Probabilistic Relaxation Labeling by Fokker-Planck Diffusion on a Graph (Edwin Hancock, 0:21’ )
Graph Spectral Image Smoothing (Edwin Hancock, 0:28’ )
Videolectures->Computer Science->Computer Vision (videolectures.net)

2015-05-20 20:47:19 · u011630458 · 5,067 views

Overview

  In this post we implement motion tracking on video with two facilities OpenCV provides: BackgroundSubtractorMOG and CvBGCodeBookModel.

BackgroundSubtractorMOG

  First, thanks are due to the author of http://blog.csdn.net/yang_xian521/article/details/6991002; the example used here comes from that blog.

Code

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/video/video.hpp>
 
#include <iostream>
 
using namespace cv;
using namespace std;
 
int main(int argc, char **argv){
	Mat frame; 
	Mat foreground;	// foreground image
	VideoCapture capture(argv[1]);
 
	if (!capture.isOpened())
	{
		return 0;
	}
 
	namedWindow("Extracted Foreground");
	namedWindow("Source Video");
	// mixture-of-Gaussians background subtractor
	BackgroundSubtractorMOG mog;
	bool stop(false);
	while (!stop)
	{
		if (!capture.read(frame))
		{
			break;
		}
		// update the background model and output the foreground
		mog(frame, foreground, 0.01);
		// the foreground output is not binary; threshold it for display
		threshold(foreground, foreground, 128, 255, THRESH_BINARY_INV);
		// show foreground
		imshow("Extracted Foreground", foreground);
		imshow("Source Video", frame);
		if (waitKey(10) == 27)
		{
			stop = true;
		}
	}
	return 0;
}

Demo

  For a line-by-line explanation of this code, see the blog post mentioned above.
The result looks like this: (screenshot omitted)

CvBGCodeBookModel

  This example is simplified from bgfg_codebook.cpp, an official OpenCV sample.

Code

#include "opencv2/core/core.hpp"
#include "opencv2/video/background_segm.hpp"
#include "opencv2/imgproc/imgproc_c.h"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/legacy/legacy.hpp"
 
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
 
using namespace std;
using namespace cv;
 
CvBGCodeBookModel* model = 0;
const int NCHANNELS = 3;
bool ch[NCHANNELS]={true,true,true}; // This sets what channels should be adjusted for background bounds
 
int main(int argc, const char** argv){
 
    int nframesToLearnBG = atoi(argv[1]); // number of frames used to learn the background
    string filename = argv[2];
    IplImage* rawImage = 0, *yuvImage = 0; //yuvImage is for codebook method
    IplImage *ImaskCodeBook = 0,*ImaskCodeBookCC = 0;
    CvCapture* capture = 0;
 
    int c, n, nframes = 0;
 
    model = cvCreateBGCodeBookModel(); // initialise the codebook model
 
    //Set color thresholds to default values
    model->modMin[0] = 3;
    model->modMin[1] = model->modMin[2] = 3;
    model->modMax[0] = 10;
    model->modMax[1] = model->modMax[2] = 10;
    model->cbBounds[0] = model->cbBounds[1] = model->cbBounds[2] = 10;
 
    bool pause = false;
    bool singlestep = false;
 
    capture = cvCreateFileCapture( filename.c_str() );
    if( !capture ){
        return -1;
    }
 
    //MAIN PROCESSING LOOP:
    for(;;){
        if(!pause){
            rawImage = cvQueryFrame(capture);
            ++nframes;
            if(!rawImage)
                break;
        }
        if(singlestep)
            pause = true;
 
        //First time:
        if(nframes == 1 && rawImage){
            yuvImage = cvCloneImage(rawImage);
            ImaskCodeBook = cvCreateImage(cvGetSize(rawImage), IPL_DEPTH_8U, 1 );
            ImaskCodeBookCC = cvCreateImage(cvGetSize(rawImage), IPL_DEPTH_8U, 1 );
            cvSet(ImaskCodeBook,cvScalar(255));
 
            cvNamedWindow( "Raw", 1 );
            cvNamedWindow( "ForegroundCodeBook",1);
            cvNamedWindow( "CodeBook_ConnectComp",1);
        }
 
        if(rawImage){
            cvCvtColor(rawImage, yuvImage, CV_BGR2YCrCb); // YUV for the codebook method
            if(!pause && nframes-1 < nframesToLearnBG)
                cvBGCodeBookUpdate( model, yuvImage ); // update the background model
 
            if(nframes-1 == nframesToLearnBG)
                cvBGCodeBookClearStale( model, model->t/2 ); // clear stale codebook entries
 
            if( nframes-1 >= nframesToLearnBG){
                // Find foreground by codebook method
                cvBGCodeBookDiff(model, yuvImage, ImaskCodeBook); // background subtraction
                // This part just to visualize bounding boxes and centers if desired
                cvCopy(ImaskCodeBook,ImaskCodeBookCC);
                cvSegmentFGMask(ImaskCodeBookCC);
            }
            //Display
            cvShowImage("Raw", rawImage);
            cvShowImage("ForegroundCodeBook",ImaskCodeBook);
            cvShowImage("CodeBook_ConnectComp",ImaskCodeBookCC);
        }
 
        // User input: Esc quits
        c = cvWaitKey(100)&0xFF;
        if( c == 27 )
            break;
    }
 
    cvReleaseCapture( &capture );
    cvDestroyWindow( "Raw" );
    cvDestroyWindow( "ForegroundCodeBook");
    cvDestroyWindow( "CodeBook_ConnectComp");
    return 0;
}

Walkthrough

  1. The program takes two arguments at run time: (1) the number of frames used to learn the background; (2) the video file to use.
    int nframesToLearnBG = atoi(argv[1]);
    string filename = argv[2];
  2. Initialise the codebook model and set the thresholds it uses.
    CvBGCodeBookModel* model = 0;
 
    model = cvCreateBGCodeBookModel(); // initialise the codebook model
 
    //Set color thresholds to default values
    model->modMin[0] = 3;
    model->modMin[1] = model->modMin[2] = 3;
    model->modMax[0] = 10;
    model->modMax[1] = model->modMax[2] = 10;
    model->cbBounds[0] = model->cbBounds[1] = model->cbBounds[2] = 10;
  3. Open the video file that was passed in.
    capture = cvCreateFileCapture( filename.c_str() );
    if( !capture ){
        printf( "Can not initialize video capturing\n\n" );
        return -1;
    }
  4. In the endless for loop, keep fetching one frame at a time from the video file.
    rawImage = cvQueryFrame(capture);
    ++nframes;
    if(!rawImage)
        break;
  5. On the first frame, initialise the images used later (yuvImage, ImaskCodeBook, ImaskCodeBookCC) and create the display windows.
if(nframes == 1 && rawImage){
    yuvImage = cvCloneImage(rawImage);
    ImaskCodeBook = cvCreateImage(cvGetSize(rawImage), IPL_DEPTH_8U, 1 );
    ImaskCodeBookCC = cvCreateImage(cvGetSize(rawImage), IPL_DEPTH_8U, 1 );
    cvSet(ImaskCodeBook,cvScalar(255));
 
    cvNamedWindow( "Raw", 1 );
    cvNamedWindow( "ForegroundCodeBook",1);
    cvNamedWindow( "CodeBook_ConnectComp",1);
}
  6. Convert the frame to YUV. While the current frame index is below the background-learning frame count passed in, update the background model with cvBGCodeBookUpdate. Once the index reaches that count, clear stale codebook entries with cvBGCodeBookClearStale; at that point the background training is complete.
     cvCvtColor(rawImage, yuvImage, CV_BGR2YCrCb); // YUV for the codebook method
     if(!pause && nframes-1 < nframesToLearnBG)
          cvBGCodeBookUpdate( model, yuvImage ); // update the background model

     if(nframes-1 == nframesToLearnBG)
          cvBGCodeBookClearStale( model, model->t/2 ); // clear stale codebook entries
  7. Use cvBGCodeBookDiff to perform background subtraction, storing the motion mask in ImaskCodeBook; then run cvSegmentFGMask for connected-component segmentation to obtain a cleaner motion mask in ImaskCodeBookCC.
     if( nframes-1 >= nframesToLearnBG){
          // Find foreground by codebook method
          cvBGCodeBookDiff(model, yuvImage, ImaskCodeBook); // background subtraction
          // This part just to visualize bounding boxes and centers if desired
          cvCopy(ImaskCodeBook,ImaskCodeBookCC);
          cvSegmentFGMask(ImaskCodeBookCC);
      }
  8. Finally, display the original frame (rawImage), the background-subtracted image (ImaskCodeBook) and the connected-component segmentation (ImaskCodeBookCC).
     //Display
     cvShowImage("Raw", rawImage);
     cvShowImage("ForegroundCodeBook",ImaskCodeBook);
     cvShowImage("CodeBook_ConnectComp",ImaskCodeBookCC);

Demo

  The corresponding results show the original frame (rawImage), the background-subtracted image (ImaskCodeBook), and the image after connected-component segmentation (ImaskCodeBookCC). (screenshots omitted)