The reduced image quality necessary to improve our processing speed limited our color histogram to a 512-bin color space (eight bins per RGB channel), meaning 512 distinct colors. Compared to the roughly 16 million distinct colors typical of standard 24-bit images, this is very limiting and created several problems in the implementation of our algorithm. Because of the interest in and importance of facial recognition in the security industry, we initially made selections around the head of each target to be tracked. Noisy environments proved difficult for most targets because skin-toned objects tended to register as positive matches for our algorithm. The small color space could not distinguish small changes in lighting, which eventually led to false matches or even erratic guesses that made the camera appear out of control. To address this issue, we made larger region-of-interest selections that included both the head and the upper torso. This adjustment served two purposes. The first was a more unique description of our target: with larger regions of interest that included more of the target's clothing, we were able to keep tracking targets even as they faced away from the camera or were partially occluded by objects. The second was improved tracking through a crowd of people, where there were sure to be people of similar skin tone but most likely dressed differently. To address this problem in future work, we recommend a data fusion algorithm that combines the color-based particle filter with a feature extraction algorithm such as the scale-invariant feature transform (SIFT) [2].
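A minimal sketch of the 512-bin histogram described above (eight quantization levels per RGB channel, so 8^3 = 512 bins). The function name and NumPy-based layout are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def quantized_histogram(image, bins_per_channel=8):
    """Coarse color histogram with bins_per_channel**3 bins.

    Eight bins per channel gives the 512-bin color space used in
    the text. `image` is an H x W x 3 uint8 RGB array; the result
    is a normalized 1-D histogram of length bins_per_channel**3.
    """
    # Map each 0-255 channel value down to one of 8 levels (0..7).
    step = 256 // bins_per_channel
    quantized = image.astype(np.int32) // step
    # Combine the three per-channel levels into a single bin index.
    idx = (quantized[..., 0] * bins_per_channel
           + quantized[..., 1]) * bins_per_channel + quantized[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()

# A solid-color patch lands in exactly one of the 512 bins, which
# illustrates why small lighting changes can jump between bins.
patch = np.zeros((16, 16, 3), dtype=np.uint8)
patch[..., 0] = 255  # pure red
h = quantized_histogram(patch)
```

Because every pixel of a uniform region falls into one coarse bin, a small lighting shift can move pixels wholesale into a neighboring bin, which is consistent with the false matches described above.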


Computation time was the greatest concern when making improvements and handling issues in our system. At maximum capacity, four cameras were connected to one CPU. Processing all four cameras, each of which must sample hundreds of guesses, create histograms, and compare them against the initial histogram for a match, caused slight fluctuations in camera motion and restricted the size of our targets. The combination of processing multiple cameras sequentially and the random distribution of the search algorithm led to a delay in data transmission. This delay was apparent in the live tracking videos, where the frame was not static even though our target was. The random sample of guesses appeared to the algorithm as small movements, and by the time the camera received the direction to move, the subsequent frame showed the same displacement in the opposite direction, so the camera fluctuated slightly around the target as it tried to keep it centered in frame. To control these oscillations, we defined a maximum target size and reduced the standard deviation of the randomly distributed samples. Both modifications reduced the per-camera processing time, so the cycling between cameras was condensed and there was less time for the target to drift from the center of frame, which had eventually resulted in lost targets.
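The sampling and matching steps above can be sketched as follows. This is an illustrative toy, not the paper's code: the function names, the Gaussian particle propagation, and the Bhattacharyya coefficient (a standard similarity measure for color-based particle filters [1]) are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate(particles, std_dev):
    """Gaussian scatter of (x, y) particle centers around their
    previous positions. Lowering std_dev, as described above,
    keeps the hundreds of guesses tighter around the last known
    target location and shortens per-frame processing."""
    return particles + rng.normal(0.0, std_dev, size=particles.shape)

def bhattacharyya(p, q):
    """Similarity of two normalized color histograms.
    1.0 means an identical distribution; 0.0 means disjoint."""
    return float(np.sum(np.sqrt(p * q)))

# 300 guesses scattered around the last target center (160, 120).
particles = propagate(np.full((300, 2), [160.0, 120.0]), std_dev=5.0)

# A histogram compared with itself scores a perfect match of 1.0.
h = np.full(512, 1.0 / 512)
uniform_vs_self = bhattacharyya(h, h)
```

Each guess would score the histogram of its own image region against the initial target histogram; the weighted mean of the guesses then gives the position estimate sent to the camera.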


Although losing the target was mostly avoided by testing different variables and making adjustments, our system required specific calibration in each new environment before tracking could begin. This included precise measurements of the distance and angle of each camera relative to our master camera; these variables were used in Jacobian transformations to map the coordinates of all of the drone cameras to those of the initial target-selecting camera. Having the cameras act like drones made target recapture very difficult, because a drone camera actively tracked only while the target was in frame. If the target was ever lost, the camera returned to its idle state and waited for a new target to enter its search area. The addition of a search algorithm would make the application of our system on a moving platform or base more reliable.
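One way the proposed search algorithm could slot into the existing idle/tracking behavior is a small state machine that sweeps the pan range before giving up. The mode names and transition logic below are hypothetical, sketched from the description above rather than taken from the system:

```python
from enum import Enum, auto

class Mode(Enum):
    IDLE = auto()       # wait for a target to enter the search area
    TRACKING = auto()   # particle filter is locked onto a target
    SEARCHING = auto()  # sweep pan angles to recapture a lost target

def next_mode(mode, target_in_frame, sweep_exhausted):
    """One transition step: instead of dropping straight back to
    IDLE when the target leaves the frame (the current behavior),
    sweep the pan range first and only then go idle."""
    if mode is Mode.TRACKING:
        return Mode.TRACKING if target_in_frame else Mode.SEARCHING
    if mode is Mode.SEARCHING:
        if target_in_frame:
            return Mode.TRACKING
        return Mode.IDLE if sweep_exhausted else Mode.SEARCHING
    # IDLE: begin tracking as soon as a target enters the search area.
    return Mode.TRACKING if target_in_frame else Mode.IDLE
```

On a moving platform the sweep could be biased toward the target's last known velocity, so recapture starts where the target is most likely to reappear.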



[1] Nummiaro, Katja, Esther Koller-Meier, and Luc Van Gool. "An Adaptive Color-based Particle Filter." Image and Vision Computing 21 (2003): 99-110.

[2] Lowe, David G. "Distinctive Image Features from Scale-Invariant Keypoints." International Journal of Computer Vision (2004).

[3] Bay, Herbert, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool. "Speeded-Up Robust Features (SURF)." Computer Vision and Image Understanding 110.3 (2008): 346-59.

[4] Canon Inc. "VB-C10/R Firmware Ver. 3.0." Canon Network Camera Server HTTP WebView Protocol Specification (2003).