Object tracking with OpenCV’s Template Matching

I have recently been playing around with tracking objects using my computer’s webcam. In this article I am going to talk about how to track objects in video streams using OpenCV and cvMatchTemplate.

I have been interested in computer vision for a while now, but only recently have I really started playing around with it. Trying to make a computer “see” in any sense of the word seems like a daunting task. Fortunately, there is a great open source project out there that makes computer vision accessible to those of us who don’t understand Greek. I speak, of course, of OpenCV. Really smart people have come up with really clever algorithms, and the nice people at OpenCV have encapsulated them in a nice, easy to use framework. When I say easy to use, I mean use, not install. I had a rather difficult time trying to get OpenCV set up and working on my machine, so I ended up using openFrameworks and its OpenCV addon instead.

With OpenCV there are more than a few ways to approach object tracking. Here I will be discussing a relatively simple method that uses template matching to do the tracking. The idea behind template matching is to take a picture of the thing you want to track and then try to find it in the webcam’s video frames. So, say we are trying to track my face in the video: we would take a picture of my face and hand it to the cvMatchTemplate method, which slides that image across the video frame pixel by pixel, figuring out how close a match it is at each position. The output is a grayscale image where the brightness of each pixel corresponds to how close a match that location is. This means that all you have to do is find the brightest spot in the resulting image to find where your face is in that frame (don’t worry, OpenCV has a handy little method for finding this bright spot).
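If you want a feel for what cvMatchTemplate is doing under the hood, the slide-and-score loop can be sketched in plain C++ with no OpenCV at all. This is only a conceptual sketch (matchSSD is my name, not OpenCV’s); the images are flat float vectors, and the score is the sum of squared differences, the same measure CV_TM_SQDIFF uses:

```cpp
#include <vector>

// Score every placement of a tmplW x tmplH template inside a
// frameW x frameH frame using the sum of squared differences (SSD).
// Lower scores mean better matches, just like CV_TM_SQDIFF.
std::vector<float> matchSSD(const std::vector<float>& frame, int frameW, int frameH,
                            const std::vector<float>& tmpl, int tmplW, int tmplH) {
    int resW = frameW - tmplW + 1;  // valid anchor positions along x
    int resH = frameH - tmplH + 1;  // valid anchor positions along y
    std::vector<float> result(resW * resH, 0.0f);
    for (int y = 0; y < resH; ++y) {
        for (int x = 0; x < resW; ++x) {
            float ssd = 0.0f;
            for (int ty = 0; ty < tmplH; ++ty) {
                for (int tx = 0; tx < tmplW; ++tx) {
                    float d = frame[(y + ty) * frameW + (x + tx)] - tmpl[ty * tmplW + tx];
                    ssd += d * d;
                }
            }
            result[y * resW + x] = ssd;
        }
    }
    return result;
}
```

The index of the smallest value in the result tells you where the template fits best. The real thing is of course far more optimized, but the shape of the computation is the same.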

Enough with the theory; let’s see how to actually use this. I am going to be working with the openFrameworks toolkit, so you will need to get that set up before any of the rest of this will work. If you need help and are going to be using Visual Studio 2010 to do your programming, I have a small write-up to help here. If you’re not using Visual Studio, check out their download page for instructions.

Basic webcam access

To access the webcam in the first place we are going to use an ofVideoGrabber. To do OpenCV stuff on the images we are going to use an ofxCvColorImage object to store the current video frame. So in your app’s .h file add these variables to the public section:

ofVideoGrabber vidGrabber;
ofxCvColorImage colorImg;

and these constants to the top:

const int camWidth = 320;
const int camHeight = 240;

In your app’s setup() method we need to initialize the video grabber and allocate some space for the images, so add this code:

vidGrabber.initGrabber(camWidth, camHeight);
colorImg.allocate(camWidth, camHeight);

Then in your app’s update() method we need to get the latest frame data from the vidGrabber and push it into colorImg. So let’s update your update() method to look like this:

void testApp::update(){
   vidGrabber.update();
   if (vidGrabber.isFrameNew()){
      colorImg.setFromPixels(vidGrabber.getPixels(), camWidth, camHeight);
   }
}

Once we have the image data we need to draw it to the screen so we know what we are dealing with. So in the app’s draw() method add this:

colorImg.draw(0, 0);

That covers basic webcam access. If you run your project now you should see the feed from your webcam on the screen. The fact that we can get access to the webcam’s frames and draw them so easily is amazing. I have used DirectShow in the past to do this, and I can’t explain how much simpler this is.

Defining a template image

Where you get your template image (the thing that we want to track) is up to you, but for our purposes here I am going to let the user select it out of the current video feed by highlighting it with the mouse. The code for selecting a rectangular region on the screen is boring and long-winded, so I am not going to show it here, but if you need it, it is in the example project that you can download at the bottom of the page. The fun bit that I am going to show is how to pull a crop out of the video feed:

subjectImg.allocate(subjectFrame.width, subjectFrame.height); //Allocate space for the template
colorImg.setROI(subjectFrame); //Set region of interest (ROI)
subjectImg = colorImg; //Copy the specific area to the subject image
colorImg.resetROI(); //Reset the ROI or everything downstream will go crazy
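All the ROI trick above really does is copy a sub-rectangle of pixels out of the full frame. For the curious, here is the same idea in plain C++ (cropRegion is a made-up name, and the pixels are single-channel for brevity):

```cpp
#include <vector>

// Copy the w x h rectangle whose top-left corner is (rx, ry)
// out of a srcW-pixels-wide image into a new, smaller buffer.
std::vector<unsigned char> cropRegion(const std::vector<unsigned char>& src, int srcW,
                                      int rx, int ry, int w, int h) {
    std::vector<unsigned char> out(w * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            out[y * w + x] = src[(ry + y) * srcW + (rx + x)];
    return out;
}
```

Setting the ROI and assigning one ofxCvColorImage to another just gets OpenCV to do this copy for us.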

The fun bits

Alright, now that we have the webcam streaming video and a template image to look for, all that is left is to actually find the template image in the current video frame. If you recall the explanation of how cvMatchTemplate works, you will remember that it outputs a grayscale image where each pixel is basically a value indicating how likely it is that this location is where the template is. This means we need to allocate an image to put its data into. In your update() method add this:

IplImage *result = cvCreateImage(cvSize(camWidth - subjectImg.width + 1, camHeight - subjectImg.height + 1), IPL_DEPTH_32F, 1);

This gives us a spot to store the results. Directly under this line, add this line:

cvMatchTemplate(colorImg.getCvImage(), subjectImg.getCvImage(), result, CV_TM_SQDIFF);

This is the magic line of code. It is what finds what we are after. A quick note about that CV_TM_SQDIFF bit: it is the method we are telling cvMatchTemplate to use to calculate how similar our template is to each portion of the video frame. There are a few other options that you can read about in the OpenCV documentation.
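The size we allocated for the result image a moment ago falls straight out of how the sliding works: the template can only be anchored at positions where it fits entirely inside the frame, which leaves frame size minus template size plus one valid positions along each axis. A trivial helper (my name, not OpenCV’s) makes the arithmetic concrete:

```cpp
// Number of valid template anchor positions along one axis of the frame.
// A 320-wide frame and a 50-wide template give 271 positions, for example.
int matchResultSize(int frameSize, int templateSize) {
    return frameSize - templateSize + 1;
}
```

This is exactly the expression fed to cvSize when creating the result image above.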

cvMatchTemplate has found the location of our template for us. Unfortunately, like a long-winded tech blogger, it has also said a whole bunch of extra stuff. To pick out the actual screen coordinates we need to find the brightest spot in the image. As it happens, OpenCV has a handy little function, cvMinMaxLoc, that will look at the grayscale image and give us back the locations and brightness values of the brightest and darkest spots in the image. We can use it like this:

double minVal, maxVal;
CvPoint minLoc, maxLoc;
cvMinMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc, 0);
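Under the hood, cvMinMaxLoc is nothing more than a linear scan over the pixels. A minimal pure-C++ equivalent (the function and struct names here are mine, not OpenCV’s) shows exactly what it hands back:

```cpp
#include <vector>

struct MinMax { float minVal, maxVal; int minX, minY, maxX, maxY; };

// Scan a width-pixels-wide float image and record the darkest and
// brightest values along with the (x, y) locations where they occur.
MinMax minMaxScan(const std::vector<float>& img, int width) {
    MinMax r{img[0], img[0], 0, 0, 0, 0};
    for (int i = 1; i < (int)img.size(); ++i) {
        if (img[i] < r.minVal) { r.minVal = img[i]; r.minX = i % width; r.minY = i / width; }
        if (img[i] > r.maxVal) { r.maxVal = img[i]; r.maxX = i % width; r.maxY = i / width; }
    }
    return r;
}
```

One pass, two running extremes; that is all the function is doing for us.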

Remember when I said that we were looking for the brightest spot in the result image to tell us the location of our tracked object? When you use CV_TM_SQDIFF it is actually the darkest spot. So the location of our tracked object is actually in the variable minLoc, which we can transfer over into our subjectLocation variable like so:

subjectLocation.x = minLoc.x;
subjectLocation.y = minLoc.y;

Now that we have the location of our object, let’s just draw a box around it so we can see that it’s working. So in your draw() method just add this:

ofNoFill(); //Draw just the outline so the box doesn’t cover the object
ofRect(subjectLocation.x, subjectLocation.y, subjectFrame.width, subjectFrame.height);
ofFill(); //Restore the default fill state


Would you look at that? Now the computer can see our smiling face and actually do something with it. This is cool, but there are some things to know about it. You have to give it an image to start with, and that image has to match really closely; it isn’t very robust. If you are trying to track your face and the lighting changes dramatically, it will probably lose tracking. If the size of the face changes much, like when you lean in or out from the camera, it will lose you. It also doesn’t handle rotation very well.

The performance of this algorithm isn’t bad, but if you have it search the whole video frame for your face you might notice a slowdown. To combat this you can limit the search area by calling colorImg.setROI() before calling cvMatchTemplate. If you do this, don’t forget to reset the ROI afterwards and to adjust the amount of space you allocate for the result image. The resulting location will also need to be offset to account for the ROI’s origin. The general idea is to search only the area around where you last saw the object. So after each frame, set the search window to a rectangle two or three times larger than your template image and center it on the tracked location. This will speed up processing as well as help prevent false positives.
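The bookkeeping for that search window is easy to get wrong, so here is a sketch of the arithmetic in plain C++ (the names are mine, not openFrameworks’): center the window on the last known location, clamp it to the frame, and remember that the match location cvMinMaxLoc returns is relative to the window, so it has to be offset by the window’s origin:

```cpp
#include <algorithm>

struct Rect { int x, y, w, h; };

// Build a search window "scale" times the template size, centered on the
// last known location, clamped so it stays entirely inside the frame.
Rect searchWindow(int lastX, int lastY, int tmplW, int tmplH,
                  int frameW, int frameH, int scale) {
    int w = std::min(tmplW * scale, frameW);
    int h = std::min(tmplH * scale, frameH);
    int x = lastX + tmplW / 2 - w / 2;         // center on the tracked object
    int y = lastY + tmplH / 2 - h / 2;
    x = std::max(0, std::min(x, frameW - w));  // clamp to the frame
    y = std::max(0, std::min(y, frameH - h));
    return Rect{x, y, w, h};
}

// A match found at (mx, my) inside the window sits at this frame position.
int toFrameX(const Rect& win, int mx) { return win.x + mx; }
int toFrameY(const Rect& win, int my) { return win.y + my; }
```

The result image would also shrink to (win.w - tmplW + 1) by (win.h - tmplH + 1) when searching inside the window.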

Source Code

The source code for this project is available at: http://project-greengiant.googlecode.com/svn/trunk/Blog/TrackingWithTemplateMatching

Please note that to run it you will need to install it in the right location relative to your instance of openFrameworks. More information about that can be found here:


15 thoughts on “Object tracking with OpenCV’s Template Matching”

  1. joshan says:

    Hello sir, yours is really good. I have implemented template matching with a reference image for video, but I don’t know how to start by taking the first frame of the video as the reference and comparing it with the other frames. Can you please guide me with this? I will be thankful for your help.

  2. DUDESON says:

    Hi Matt, I wanted to use this as a basis for one of my uni projects, but I can’t seem to get it running as there is no file found :/ I did the setup as you said as well, but it doesn’t seem to be working at all. Any help?

  3. Woon says:

    I’m currently doing template matching using SURF, and the matching result is not accurate. I tried to control it with the max/min loc, but it’s still giving false results.

    I’m planning to set an ROI on the template location, thinking that it will be more specific and straightforward for the template matching, and save time. Am I correct?

  4. Tom says:

    FYI, this method is outdated now. Look here for a more current way to template match: http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html

  5. Akshaya Moorthy says:

    How do you know a template is not matching when using cvMatchTemplate, as its return type is 0?

  6. lenteken says:

    Good Day,

    I am a newbie to OpenCV and tried to make the same template matching app as yours, but the one I made is very slow. Can you give me the source code you made?

    Thanks in Advance.

  7. doel says:

    Hi, I am currently studying template matching using OpenCV. I have seen your video; really good. Can you give me the source code you made, as a reference and support for my research? This is my email if you want to send it. Thanks.

  8. Akash says:

    Can I use template matching for tracking facial expressions?

  9. Sarah Wever says:

    First I just wanted to say THANK YOU for this post! With some modifications I was able to get this to work with an uploaded image of a pattern instead of a selected one!

    Second, in your video there are moments where it doesn’t detect the face and the box just sort of floats around searching. Do you know if there is a way to create a boolean so that when the image isn’t detected the box turns off?

    • Matt Bell says:

      I haven’t tried it, but you could probably check the value at the subject location against a threshold and treat it like a confidence rating.

  10. […] Recently I noticed that ordinary digital cameras have a similar feature, so I looked into it. Below is a demo video; the full article is here. Note that it uses openFrameworks to drive OpenCV, which I personally find awkward; I would recommend using OpenCV directly, which actually keeps the program quite small. […]

  11. Nicely done! Bravo! One word of caution though: you need to release the IplImage at the end of the update method, or you will get a memory leak and the application will crash in about a minute.

    That worked for me:

    cvReleaseImage( & result );

  12. Tamercan says:

    Hi Matt. The source code link is broken. Can you send me the source code please?
