04-07-2012, 12:02 PM
Image Patches based Markov Random Field (MRF) Model for Multiple Object Tracking (MOT)
INTRODUCTION
Reliable multiple object tracking (MOT) in video sequences plays an important role in various real-life applications. For example, in intelligent surveillance systems, tracking multiple humans is essential for detecting suspicious behaviour from patterns of movement. Similarly, in sports video analysis, the trajectories of players must be obtained in order to recognize patterns of play. Driven by such applications, MOT has emerged in recent years as a very active research topic, while remaining one of the most challenging tasks in computer vision.
MOT requires locating the objects and labelling their identities. When objects are separated and do not occlude each other, these problems can be solved easily by running multiple independent trackers, such as box-based tracking or the hybrid appearance-guided particle filter. However, when tracking in real scenes, occlusions (or interactions) among objects inevitably occur. For instance, in a typical surveillance environment a person may be partially or fully occluded by other persons. These occlusions make locating the objects and maintaining their identities difficult. Even partial occlusion can confound many trackers, leading to fragmented tracks or total loss of tracks.
LITERATURE SURVEY
There is a vast literature on MOT, so we restrict ourselves to recent approaches and refer to Ref. [4] for a review of earlier methods. Following a taxonomy similar to that of Ref. [4], we classify MOT techniques into two categories: merge-split methods and straight-through methods. In the first category, occlusions are handled according to a merge-split scheme: as soon as objects are declared to be occluded, the system merges them into a single new object. From that point on, the new object is tracked as an independent object. Takala and Pietikainen [8] built a unifying distance measure from color, texture and motion cues to solve the feature correspondence problem. Experiments showed that their method can effectively track multiple objects with fragmentation and grouping.
FOREGROUND AND BACKGROUND EXTRACTION
The problem is stated as follows: given an image X with known figure/ground labels L, infer the figure/ground labels L′ of a new image X′ closely related to X. For example, we may want to extract a walking person in an image using the figure/ground mask of the same person in another image of the same sequence. Our approach is based on training a classifier from the appearance of a pixel and its surrounding context (i.e., an image patch centred at the pixel) to recognize other similar pixels across images. To apply this process to a video sequence, we also evolve the appearance model over time. A key element of our approach is the use of a prior segmentation to reduce the complexity of the segmentation process. As has been argued, image segments are a more natural primitive for image modelling than pixels.
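The label-transfer idea can be illustrated with a minimal sketch. It assumes a plain nearest-neighbour rule on raw patch vectors as a stand-in for the trained classifier, which the text does not specify; the function names extract_patches and transfer_labels, the patch radius, and the edge padding are all illustrative choices, and the prior segmentation and the evolving appearance model are left out.

```python
import numpy as np

def extract_patches(image, radius):
    """Return one flattened patch per pixel (row-major); borders are edge-padded."""
    h, w, c = image.shape
    padded = np.pad(image, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    size = 2 * radius + 1
    patches = np.empty((h * w, size * size * c), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            patches[y * w + x] = padded[y:y + size, x:x + size].ravel()
    return patches

def transfer_labels(image, labels, new_image, radius=1):
    """Give each pixel of new_image the label of its most similar patch in image.

    Assumption: 1-nearest-neighbour on raw patch vectors replaces the
    (unspecified) classifier described in the text.
    """
    train = extract_patches(image, radius)
    query = extract_patches(new_image, radius)
    train_labels = labels.ravel()
    # Squared Euclidean distance between every query patch and every training patch.
    d2 = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return train_labels[nearest].reshape(new_image.shape[:2])
```

The full distance matrix grows with the product of the two pixel counts, so this form is only practical for small images. Restricting the comparison to segments rather than individual pixels, as the text suggests, is one way to reduce that cost.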
IMAGE PATCH REPRESENTATION AND MATCHING
Building stable appearance representations of image patches is fundamental to our approach. There are many derived features that can be used to represent the appearance of an image patch. In this paper, we evaluate our algorithm based on: 1) an image patch’s raw RGB intensity vector, 2) mean color vector, 3) color + texture descriptor (filter bank response or Haralick feature), and 4) PCA, LDA and NDA (Nonparametric Discriminant Analysis) features on the raw RGB vectors. For completeness, we give a brief description of each of these techniques.
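Three of the listed representations (raw RGB vector, mean color vector, and PCA on the raw RGB vectors) can be sketched as follows. This is a sketch under stated assumptions rather than the implementation evaluated here: the function names are hypothetical, patches are assumed to be height x width x 3 arrays, and PCA is computed with a plain SVD. The texture, LDA and NDA features are omitted.

```python
import numpy as np

def raw_rgb_feature(patch):
    """Feature 1: the patch's raw RGB intensities as one flat vector."""
    return patch.astype(np.float64).ravel()

def mean_color_feature(patch):
    """Feature 2: the mean of each color channel over the patch."""
    return patch.reshape(-1, patch.shape[-1]).astype(np.float64).mean(axis=0)

def fit_pca(features, n_components):
    """Fit PCA on a (n_patches, n_dims) matrix of raw RGB vectors."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_feature(feature, mean, components):
    """Feature 4 (PCA part): project a raw RGB vector onto the components."""
    return components @ (feature - mean)
```

For a 5 x 5 RGB patch, the raw vector has 75 entries, the mean color vector has 3, and the PCA feature has as many entries as components were kept.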
NONPARAMETRIC APPEARANCE REPRESENTATION
When the objects do not interact with each other, it is necessary to construct an appearance model for each object. In general, most widely used template-based appearance models maintain the spatial integrity of the object, which makes them especially suited to rigid objects. However, this rigid integrity rarely holds if the object deforms or undergoes severe appearance changes. We therefore learn a ‘‘bag of patches’’ appearance model for each independently tracked object.
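A bag of patches can be pictured as an unordered collection of patch feature vectors with no record of where each patch sat on the object. The sketch below is illustrative only: the class name PatchBag, the first-in-first-out update, the size limit, and the nearest-patch Euclidean distance are assumptions, since the text does not state how the bag is stored, updated, or compared.

```python
import numpy as np

class PatchBag:
    """Unordered set of patch feature vectors describing one tracked object."""

    def __init__(self, max_size=200):
        self.max_size = max_size  # assumed cap on the number of stored patches
        self.features = []

    def add(self, patch_features):
        """Add new patch features; the oldest ones are dropped first (assumed rule)."""
        self.features.extend(np.asarray(f, dtype=np.float64) for f in patch_features)
        self.features = self.features[-self.max_size:]

    def distance(self, feature):
        """Euclidean distance from feature to the closest patch in the bag."""
        bag = np.stack(self.features)  # requires a non-empty bag
        return float(np.sqrt(((bag - feature) ** 2).sum(axis=1)).min())

def classify_patch(feature, bags):
    """Return the key of the bag whose closest patch is nearest to feature."""
    return min(bags, key=lambda name: bags[name].distance(feature))
```

Because no patch positions are stored, the model is unaffected by how the patches are rearranged when the object deforms, which is the property that template-based models lack.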
Modelling with MRF
Although global spatial relationships are not maintained in the patch models, we assert that adjacent patches are locally dependent in space. An MRF [25] lets us make decisions using this spatial dependency. For the patches-based MRF, the eight-connected neighbourhood system is adopted. The patches within the interacting region form the observable nodes. The collection of all patch-label assignments forms the hidden nodes. In addition.
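To show how an eight-connected neighbourhood enters the labelling, here is a minimal sketch. It assumes a Potts smoothness penalty with weight beta and uses iterated conditional modes (ICM) for inference; neither is stated in the text. ICM only reaches a local minimum of the energy, so it is a simple stand-in rather than the inference method used here. The name icm_labelling and the layout of the unary cost array are hypothetical.

```python
import numpy as np

# Offsets of the eight-connected neighbourhood on the patch grid.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def icm_labelling(unary, beta=1.0, n_sweeps=10):
    """Label a grid of patches by iterated conditional modes.

    unary[y, x, k] is the cost of giving label k to the patch at (y, x),
    e.g. its distance to object k's appearance model (assumed layout).
    """
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)  # start from the independent per-patch decision
    for _ in range(n_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                cost = unary[y, x].astype(np.float64)  # astype returns a copy
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Potts term: pay beta for every label that
                        # disagrees with this neighbour's current label.
                        cost += beta * (np.arange(k) != labels[ny, nx])
                best = int(cost.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels
```

With beta set to 0 the result reduces to the independent per-patch decision; larger values of beta make isolated labels that disagree with all their neighbours increasingly costly.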
CONCLUSION AND FUTURE WORK
A new method is proposed for tracking multiple objects under occlusion, emphasizing the importance of interactions. With an adaptive patch bag representing the object appearance, the tracker discriminates an object from its interacting objects by a classification process that uses information about these objects competitively. Instead of directly employing a thresholding strategy, a patches-based MAP-MRF framework is presented to infer a globally optimal classification based on local spatial dependency, thus improving the accuracy of patch labelling. Accordingly, both the accuracy of object locations and the consistency of object identities are improved.