
Having fun with Face Tracking

Continuing our BabyCam series, the next step after Creating a C++ program to access your web camera is to find and track the baby’s face.

   

Face recognition is an amazing technology that has been around for a while now, from Wikipedia:

A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One of the ways to do this is by comparing selected facial features from the image and a facial database.

OpenCV provides a very powerful object detection mechanism using Haar feature-based cascade classifiers, from the OpenCV Documentation:

Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, “Rapid Object Detection using a Boosted Cascade of Simple Features” in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images.

In other words, using machine learning techniques we can train OpenCV to recognize any kind of object we want (OpenCV provides both Trainer and Detector classes for this purpose). The results of that training are saved in XML files called “Haar Cascade files”. OpenCV comes with many pre-trained classifiers for faces, eyes, smiles and more; the XML files for those classifiers are included in the opencv/data/haarcascades/ folder.
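
To make that concrete, here is a minimal sketch of how one of those pre-trained classifiers can be loaded and run against a single still image. The photo name is just a placeholder, and the xml file is assumed to have been copied from opencv/data/haarcascades/ next to the binary:

#include "opencv/cv.h"
#include "opencv/highgui.h"
using namespace cv;

int main()
{
    // Load the bundled frontal face classifier (copied from opencv/data/haarcascades/).
    CascadeClassifier faceClassifier;
    if( !faceClassifier.load( "haarcascade_frontalface_default.xml" ) ) return -1;

    // Load a test photo and build a grayscale copy for the detector.
    Mat image = imread( "photo.jpg" );
    if( image.empty() ) return -1;
    Mat gray;
    cvtColor( image, gray, CV_BGR2GRAY );

    // Run the detector with its default parameters and draw a box around every match.
    std::vector<Rect> faces;
    faceClassifier.detectMultiScale( gray, faces );
    for( size_t i = 0; i < faces.size(); i++ )
        rectangle( image, faces[i], Scalar( 255, 0, 0 ), 4 );

    imshow( "MATCHES", image );
    waitKey( 0 );
    return 0;
}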

   

For this exercise we’ll consume a video stream from the web camera and search for human faces in real time. Whenever a face is found we’ll also try to find the eyes inside the matching region, and finally we’ll draw geometrical shapes around all matches to mark the faces and eyes. We can reuse most of the code from the previous post; I’ll paste the final result next and then go over the new parts for extra clarification.

#include <iostream>
#include "opencv/cv.h"
#include "opencv/highgui.h"
using namespace std;
using namespace cv;

/** Function Headers */
void findAndRender( Mat frame );

String faceCascadeFile = "haarcascade_frontalface_default.xml";
CascadeClassifier faceClassifier;

String eyesCascadeFile = "haarcascade_eye.xml";
CascadeClassifier eyesClassifier;

int main()
{
    // Create the webcam window.
    cvNamedWindow( "CAMERA_STREAM", CV_WINDOW_AUTOSIZE );
    // Open the video stream using any connected cam.
    CvCapture* stream = cvCaptureFromCAM( CV_CAP_ANY );

    if ( !stream )
    {
        cout << "ERROR: The stream is null!\n";
        return -1;
    }

    IplImage* frame = NULL;

    if( !faceClassifier.load( faceCascadeFile ) ){ cout << "ERROR: Could not load the face classifier!\n"; return -1; }
    if( !eyesClassifier.load( eyesCascadeFile ) ){ cout << "ERROR: Could not load the eyes classifier!\n"; return -1; }

    char keypress;
    bool quit = false;

    while( !quit )
    {
        // Get a color frame from the cam.
        frame = cvQueryFrame( stream );
        // Find faces in the stream and render indicators.
        findAndRender( frame );
        // Wait 20ms.
        keypress = cvWaitKey(20);
        // Turn on the exit flag if the user presses escape.
        if (keypress == 27) quit = true;
    }

    // Cleaning up. Frames returned by cvQueryFrame are owned by the capture,
    // so we release the capture itself rather than the frame.
    cvReleaseCapture( &stream );
    cvDestroyAllWindows();
    return 0;
}

void findAndRender( Mat frame )
{
    std::vector<Rect> faces;
    Mat bwframe;

    // Get a grayscale version of the frame.
    cvtColor( frame, bwframe, CV_BGR2GRAY );

    // Equalize the image histogram to improve contrast.
    equalizeHist( bwframe, bwframe );

    // Try to find faces in the frame, discard any matches smaller than 300 x 300.
    faceClassifier.detectMultiScale( bwframe, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size( 300, 300 ) );

    for( size_t i = 0; i < faces.size(); i++ )
    {
        // Find the center of each face match, draw an ellipse around it.
        Point center( faces[i].x + faces[i].width * 0.5, faces[i].y + faces[i].height * 0.5 );
        ellipse( frame, center, Size( faces[i].width * 0.5, faces[i].height * 0.5 ), 0, 0, 360, Scalar( 255, 0, 0 ), 4, 8, 0 );

        Mat faceRegion = bwframe( faces[i] );
        std::vector<Rect> eyes;

        // Try to find eyes in each face region, discard matches smaller than 80 x 80.
        eyesClassifier.detectMultiScale( faceRegion, eyes, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size( 80, 80 ) );

        for( size_t j = 0; j < eyes.size(); j++ )
        {
            // Find the corners of each eye match, draw a box around it.
            Point point1 = Point( faces[i].x + eyes[j].x, faces[i].y + eyes[j].y );
            Point point2 = Point( faces[i].x + eyes[j].x + eyes[j].width, faces[i].y + eyes[j].y + eyes[j].height );
            rectangle( frame, point1, point2, Scalar( 0, 0, 255 ), 4, 8, 0 );
        }
    }
    // Render the processed frame on screen.
    imshow( "CAMERA_STREAM", frame );
}

Fork me on Github!


Alright, it’s a little more complicated than the last post’s code, but it isn’t that bad either. Let’s go over the new stuff. These are the Haar Cascade files we’ll be using: the frontal face and eye classifiers.

String faceCascadeFile = "haarcascade_frontalface_default.xml";
CascadeClassifier faceClassifier;

String eyesCascadeFile = "haarcascade_eye.xml";
CascadeClassifier eyesClassifier;

Next, we’ll re-use the code from the Previous Post to grab the camera stream and send each frame to the findAndRender method, which is in charge of finding and marking faces. We’ll need to convert the frame to grayscale before running a Haar Cascade Classifier on it:

cvtColor( frame, bwframe, CV_BGR2GRAY );

We’ll then equalize the image histogram to improve contrast:

equalizeHist( bwframe, bwframe );

And then we’re finally ready to run the face classifier on the stream:

faceClassifier.detectMultiScale( bwframe, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size( 300, 300 ) );

The parameters the detectMultiScale method can take are listed below; a quick tuning example follows the list:

void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size())
  • image – Matrix of the type CV_8U containing an image where objects are detected.
  • objects – Vector of rectangles where each rectangle contains the detected object.
  • scaleFactor – Parameter specifying how much the image size is reduced at each image scale.
  • minNeighbors – Parameter specifying how many neighbors each candidate rectangle should have to retain it.
  • flags – Parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
  • minSize – Minimum possible object size. Objects smaller than that are ignored.
  • maxSize – Maximum possible object size. Objects larger than that are ignored.
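
For instance, if the detector is too eager and flags things that are not faces, you could experiment with a smaller scale step, a higher minNeighbors value and an upper bound on the match size. The numbers below are only a starting point to play with, not values taken from the program above:

// A stricter variant of the detection call: finer scale step, more required
// neighbors, and a maximum match size of 600 x 600.
faceClassifier.detectMultiScale( bwframe, faces, 1.05, 4, 0|CV_HAAR_SCALE_IMAGE,
                                 Size( 300, 300 ), Size( 600, 600 ) );

Lowering scaleFactor makes the search slower but more thorough, while raising minNeighbors trades missed detections for fewer false positives.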

Next, if we find any faces in the stream we draw an ellipse around each of them:

ellipse( frame, center, Size( faces[i].width * 0.5, faces[i].height * 0.5), 0, 0, 360, Scalar( 255, 0, 0 ), 4, 8, 0 );

The rest of the code is pretty straightforward, rinse and repeat: if we do find any faces, we then try to find eyes inside each face region and draw rectangles around them:

eyesClassifier.detectMultiScale( faceRegion, eyes, 1.1, 2, 0 |CV_HAAR_SCALE_IMAGE, Size(80, 80) );

Just like before, let’s try to compile the file:

$ g++ -bind_at_load `pkg-config --cflags opencv` tracking.cpp -o tracking `pkg-config --libs opencv`

And then run the binary:

$ ./tracking

Boom! If the lighting conditions in the room are decent, the app should be able to recognize your face. The included face classifier works well, but since it’s made to recognize generic faces it isn’t perfect. The good news is that, since we only care about OpenCV recognizing our baby, we can train it to do that specifically and create our own Cascade Classifier for better accuracy.
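
Once such a custom cascade has been trained with OpenCV’s training tools, wiring it in is just a matter of pointing the program above at the new xml file. The file name here is only a hypothetical example of what the trained classifier might be called:

// Hypothetical cascade trained on photos of the baby; it replaces the
// generic frontal face classifier used above.
String faceCascadeFile = "baby_face_cascade.xml";
CascadeClassifier faceClassifier;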

[Image: face tracking in action]

   

That’s it for now. In the next post we’ll start playing with Eulerian video magnification!


