MP2: Vision Based Person Identification (Face Recognition)

Due date: February 17 before class starts.


In this machine problem, you will develop a simple vision-based person identification system. Along the way you will learn simple algorithms used for visual feature extraction, such as PCA and random projection. You are provided with data for four people, with 20 images of each.

Feature Extraction

You are going to work with raw pixels, PCA, and random projection, respectively.

  1. The first step is to read the image files in Matlab (imread). They are color images but you are going to work with grayscale images. Conversion from a color image into a grayscale image can be done through the MATLAB function rgb2gray. You might have to convert the data into double before further processing (double). These images are of size 90x70.
  2. Feature 1 (raw pixel): First, convert each 90x70 image (using the reshape function) into a column vector of size 6300 x 1 and take these column vectors as features. Second, reduce the size of the images to N1xN2 (using the imresize function) and convert them into (N1*N2)x1 vectors. Please try 70x90, 35x45, and 17x22. In addition, please pick a pair N1 and N2 such that N1xN2 is roughly equal to your choice of N in Feature 2, while keeping the ratio N1/N2 unchanged.
  3. Feature 2 (PCA): Do PCA on the 80 6300 x 1 image vectors (mean, repmat, eig, diag, sort, pinv). Please do NOT use Matlab's built-in PCA function. Choose N principal components (PCs), where N is chosen so that the energy kept is 95% of the total. Then project each of the 6300x1 image vectors onto the PCA subspace, resulting in 80 Nx1 vectors. Take these vectors as features.
  4. Feature 3 (random projection): Generate a random matrix of size 6300 x N (using randn). (Update: N should be the same as in Feature 2, i.e. PCA.) Project each of the 6300x1 image vectors onto the random subspace, resulting in 80 Nx1 vectors. Take these vectors as features.
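
The three feature types above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the images are replaced by random 90x70 arrays so the script runs without the data files, the variable names (imgs, X, Fpca, Frand, and so on) are made up for this example, and the PCA step eigen-decomposes the small 80x80 Gram matrix rather than the 6300x6300 covariance, which is one common choice among several. N is taken here as the smallest number of components reaching 95% of the energy.

```matlab
% Sketch of the three feature types on synthetic data (no image files needed).
% Assumption: 80 images of 90x70 pixels, replaced here by random arrays.
numImages = 80;
H = 90; W = 70;
imgs = rand(H, W, numImages);        % stand-in for double(rgb2gray(imread(...)))

% Feature 1 (raw pixel): one 6300x1 column per image.
X = reshape(imgs, H*W, numImages);   % 6300 x 80

% Feature 2 (PCA) without a built-in PCA routine.
mu = mean(X, 2);                     % 6300 x 1 mean image
Xc = X - repmat(mu, 1, numImages);   % centered data
% The Gram matrix Xc'*Xc shares its nonzero eigenvalues with Xc*Xc'.
G = Xc' * Xc;
G = (G + G') / 2;                    % enforce exact symmetry
[V, D] = eig(G);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);
lambda = max(lambda, 0);             % clip tiny negative round-off
energy = cumsum(lambda) / sum(lambda);
N = find(energy >= 0.95, 1);         % smallest N keeping at least 95% of the energy
U = Xc * V(:, 1:N);                  % map Gram eigenvectors to 6300-dim PCs
U = U ./ repmat(sqrt(sum(U.^2, 1)), size(U, 1), 1);   % unit-norm columns
Fpca = U' * Xc;                      % N x 80 PCA features

% Feature 3 (random projection) with the same N.
R = randn(H*W, N);                   % 6300 x N random matrix
Frand = R' * X;                      % N x 80 random-projection features
```

With real data, each slice imgs(:, :, i) would come from imread, rgb2gray, and double as described in step 1, and the reduced-size raw features would apply imresize before reshape.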

Pattern Matching algorithm

We are going to use the nearest neighbor algorithm for face recognition. It works as follows: suppose we have some labeled data and want to classify a new sample. We compute the distance from the new sample to each point in the labeled set (sometimes also called the training set) and assign the label of the nearest point to the new sample. In these experiments we use the Euclidean distance. The weakness of the nearest neighbor algorithm is that it is prone to error when the data is noisy. An alternative is K-nearest neighbor: the idea is the same, except that you look at the K nearest points and choose the label that occurs most often among them. In case of a tie, you may look at a few more points (increase K) to resolve it.
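
As a small illustration of the two classifiers described above, the sketch below runs them on a toy 2-D labeled set. The data, labels, and variable names are invented for this example, and growing k by one neighbor at a time is just one way to resolve a tie.

```matlab
% Toy labeled set: each column is a feature vector (made-up values).
train = [0 0 1 5 5 6; 0 1 0 5 6 5];   % 2 x 6
trainLabels = [1 1 1 2 2 2];
query = [4; 4];

% Euclidean distance from the query to every labeled point.
diffs = train - repmat(query, 1, size(train, 2));
dists = sqrt(sum(diffs.^2, 1));
[~, order] = sort(dists, 'ascend');

% Nearest neighbor: label of the single closest point.
nnLabel = trainLabels(order(1));

% K-nearest neighbor: majority vote among the k closest points; grow k on a tie.
k = 5;
knnLabel = [];
while isempty(knnLabel)
    kk = min(k, numel(order));
    votes = trainLabels(order(1:kk));
    classes = unique(votes);
    counts = arrayfun(@(c) sum(votes == c), classes);
    winners = classes(counts == max(counts));
    if numel(winners) == 1 || kk == numel(order)
        knnLabel = winners(1);        % unique winner, or no points left to add
    else
        k = k + 1;                    % tie: look at one more neighbor
    end
end
```

Here the query lies closest to the points labeled 2, so both nnLabel and knnLabel come out as 2.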


Same as in MP1. Here you are going to do just the person recognition. Pick one feature vector (corresponding to, say, image A1.jpg). Compute its distance from all other images. Find the label of the closest image and check whether the classification is correct. Repeat this for all the feature representations. Also repeat with the K-nearest neighbor algorithm (K = 5, or larger in case of a tie).
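
One way to turn this procedure into an accuracy figure is to use every image in turn as the query and count the correct matches. The sketch below does this with the nearest neighbor rule on a synthetic feature matrix; the matrix F, the well-separated class centers, and the assumption that images are grouped as 4 people with 20 images each are all stand-ins for illustration.

```matlab
% Synthetic stand-in for a feature matrix: 4 people x 20 images, 10-dim features.
numPeople = 4; perPerson = 20; dim = 10;
labels = kron(1:numPeople, ones(1, perPerson));   % 1 x 80 person labels
centers = 10 * eye(dim, numPeople);               % well-separated centers (assumption)
F = centers(:, labels) + 0.5 * randn(dim, numel(labels));

numImages = size(F, 2);
correct = 0;
for i = 1:numImages
    % Euclidean distance from image i to every image.
    d = sqrt(sum((F - repmat(F(:, i), 1, numImages)).^2, 1));
    d(i) = Inf;                                   % exclude the query image itself
    [~, j] = min(d);                              % index of the closest other image
    correct = correct + (labels(j) == labels(i));
end
accuracy = 100 * correct / numImages;
fprintf('Leave-one-out nearest neighbor accuracy: %.2f%%\n', accuracy);
```

With 80 images, each correct match contributes 1.25 percentage points, so a figure such as 88.75% corresponds to 71 correct matches out of 80.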

Extra credit

  1. Redo face recognition using different numbers of principal components (PCs), chosen so that the energy kept is 90% and 98% of the total, respectively.
  2. Report the face recognition accuracy for 10 more random projections. Report the average performance on these projections.
  3. Do LDA on the 80 6300 x 1 image vectors. You may choose N=1, 2, or 3, whichever gives you the best performance. Then project each of the 6300x1 image vectors onto the LDA subspace, resulting in 80 Nx1 vectors. Take these vectors as features.
  4. Apply a machine learning technique (other than PCA and nearest neighbor classifier) that you used in your own research to this face recognition task, and briefly explain the algorithm and report the performance.
  5. Run your best algorithm on a subset of LFW and report the performance.
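
For the LDA item, a minimal sketch of the projection step on synthetic low-dimensional data is given below. It is an illustration under assumptions, not a prescribed method: the data and variable names are invented, and pinv is used to invert the within-class scatter. On the raw 6300-dimensional vectors the within-class scatter would be 6300x6300 and singular with only 80 samples, so one common workaround (an assumption here, not something the assignment specifies) is to reduce the dimension with PCA first.

```matlab
% Synthetic stand-in: 4 people x 20 samples, 10-dim features (made-up data).
numPeople = 4; perPerson = 20; dim = 10;
labels = kron(1:numPeople, ones(1, perPerson));
centers = 10 * eye(dim, numPeople);
X = centers(:, labels) + 0.5 * randn(dim, numel(labels));

mu = mean(X, 2);                       % overall mean
Sw = zeros(dim);                       % within-class scatter
Sb = zeros(dim);                       % between-class scatter
for c = 1:numPeople
    Xcls = X(:, labels == c);
    mc = mean(Xcls, 2);
    Dc = Xcls - repmat(mc, 1, size(Xcls, 2));
    Sw = Sw + Dc * Dc';
    Sb = Sb + size(Xcls, 2) * (mc - mu) * (mc - mu)';
end

% Directions maximizing between-class over within-class scatter.
[V, D] = eig(pinv(Sw) * Sb);
[~, order] = sort(real(diag(D)), 'descend');
N = 3;                                 % at most numPeople - 1 useful directions
Wlda = real(V(:, order(1:N)));
Flda = Wlda' * X;                      % N x 80 LDA features
```

The resulting Flda can be fed to the same nearest neighbor evaluation as the other features.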

Solution submission

  • Results, in tabular form
  • Explanation and analysis of the three methods.
  • Matlab file
  • README file to tell us how we run your code to obtain the same results as you did.

Compress your items into an xxx.tar.gz file, where xxx is your Net ID. For example, if your Net ID is chang87, then your compressed file should be named chang87.tar.gz. Send this file with "ECE417 MP2" in the header.

Matlab related tips

Some of the commands that you may find useful are: imread, imagesc, imresize, reshape, double, rgb2gray, randn, svd.

Validation Numbers

On the first dataset you are expected to get an average accuracy of roughly 88.75% with raw features + NN, and roughly 96.25% with PCA (95% energy) + NN.

ECE 417 (Multimedia Signal Processing) covers characteristics of speech and image signals; important analysis and synthesis tools for multimedia signal processing including subspace methods, Bayesian networks, hidden Markov models, and factor graphs; applications to biometrics (person identification), human-computer interaction (face and gesture recognition and synthesis), and audio-visual databases (indexing and retrieval). Emphasis on a set of MATLAB machine problems providing hands-on experience. Prerequisite: ECE 310 and ECE 313.