06-09-2017, 01:24 PM
A video is a sequence of images, called frames, displayed at a rate high enough that the human eye perceives continuous motion. Any image processing technique can therefore be applied to individual frames. In addition, the contents of two consecutive frames are usually closely related.
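To make the point about consecutive frames concrete, here is a minimal sketch that measures how much the gray levels change from one frame to the next. The function name `mean_abs_frame_difference` and the synthetic clip are assumptions made for illustration only; they do not come from the text above.

```python
import numpy as np

def mean_abs_frame_difference(video):
    """Mean absolute gray-level change between each pair of consecutive frames.

    `video` is an array of shape (num_frames, height, width).
    """
    frames = np.asarray(video, dtype=np.float64)
    diffs = np.abs(frames[1:] - frames[:-1])
    return diffs.mean(axis=(1, 2))

# Synthetic clip (assumed data): a bright 4x4 square moves one pixel
# to the right in every frame on a black 16x16 background.
video = np.zeros((5, 16, 16), dtype=np.uint8)
for t in range(5):
    video[t, 6:10, 2 + t:6 + t] = 255

changes = mean_abs_frame_difference(video)
print(changes)  # one value per pair of consecutive frames
```

Only the leading and trailing columns of the square change between frames, so the per-pixel average difference stays small compared with the full gray-level range of 255.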
Visual content can be modeled as a hierarchy of abstractions. At the first level are the raw pixels with color or brightness information. Additional processing produces features such as edges, corners, lines, curves, and color regions. A higher abstraction layer can combine and interpret these features as objects and their attributes. At the highest level are the human-level concepts that involve one or more objects and relationships between them.
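The lower levels of this hierarchy can be sketched in a few lines: raw pixels, then an edge feature map, then a crude "object" with attributes. This is only an illustration under simple assumptions (a gradient-magnitude threshold for edges, a brightness threshold for the object); the helper names `edge_map` and `bounding_box` are hypothetical.

```python
import numpy as np

def edge_map(image, threshold):
    """Feature level: mark pixels whose gradient magnitude exceeds `threshold`."""
    gy, gx = np.gradient(image.astype(np.float64))
    return np.hypot(gx, gy) > threshold

def bounding_box(mask):
    """Object level: (top, bottom, left, right) of the True pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

# Pixel level (assumed data): a uniform bright square on a dark background.
image = np.zeros((12, 12), dtype=np.uint8)
image[3:8, 4:9] = 200

edges = edge_map(image, threshold=50)   # features: edge pixels
object_mask = image > 100               # object: the bright region
box = bounding_box(object_mask)         # attribute: position and extent
area = int(object_mask.sum())           # attribute: size in pixels
```

The highest level (concepts relating several objects) is not something a snippet like this captures; it would need a model on top of such object descriptions.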
Object detection in video consists of verifying the presence of an object in a sequence of images and, where needed, locating it precisely for recognition. Object tracking is the monitoring of an object's spatial and temporal changes during a video sequence, including its presence, position, size, and shape. This is done by solving the temporal correspondence problem: matching the target region in successive frames of a sequence of images taken at closely spaced time intervals. The two processes are closely related, because tracking usually begins with object detection, while repeated detection of the object in subsequent frames is often needed to assist and verify the tracking.
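One simple way to illustrate the temporal correspondence problem is exhaustive block matching: take the target region from one frame and look for the best match in a small search window of the next frame. The sketch below uses the sum of squared differences (SSD) as the matching cost. It is a generic technique chosen for illustration, not the method described in this post, and `track_by_ssd` and `search_radius` are assumed names.

```python
import numpy as np

def track_by_ssd(prev_frame, next_frame, box, search_radius=3):
    """Find where the region `box` of prev_frame reappears in next_frame.

    box is (top, left, height, width). Returns the (top, left) position
    in next_frame with the smallest sum of squared differences.
    """
    top, left, h, w = box
    template = prev_frame[top:top + h, left:left + w].astype(np.float64)
    best_pos, best_cost = (top, left), np.inf
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            y, x = top + dy, left + dx
            # Skip candidate windows that fall outside the frame.
            if y < 0 or x < 0 or y + h > next_frame.shape[0] or x + w > next_frame.shape[1]:
                continue
            candidate = next_frame[y:y + h, x:x + w].astype(np.float64)
            cost = np.sum((candidate - template) ** 2)
            if cost < best_cost:
                best_cost, best_pos = cost, (y, x)
    return best_pos

# Assumed test data: a textured 4x4 patch moves 2 rows down and 1 column left.
patch = np.arange(16).reshape(4, 4) + 100
prev_frame = np.zeros((20, 20))
next_frame = np.zeros((20, 20))
prev_frame[5:9, 6:10] = patch
next_frame[7:11, 5:9] = patch

new_pos = track_by_ssd(prev_frame, next_frame, box=(5, 6, 4, 4))
```

In practice the initial `box` would come from a detection step, which is how detection and tracking feed each other.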
The MRF-MAP framework is computationally intensive because of its random initialization. To reduce this load, we propose a heuristic initialization technique based on change information. The scheme requires an initially segmented frame. For the initial frame segmentation, a compound MRF model is used to model the attributes, and the MAP estimate is obtained by a hybrid algorithm [simulated annealing (SA) combined with iterated conditional modes (ICM)] that converges rapidly. For temporal segmentation, instead of a change detection mask (CDM) based on gray-level differences, we propose a CDM based on the difference between the labels of two frames. The proposed scheme reduces the silhouette effect. In addition, a combination of spatial and temporal segmentation is used to detect moving objects. The results of the proposed spatial segmentation approach are compared with those of the JSEG method and of the edgeless and edge-based segmentation approaches. The proposed approach is observed to give better spatial segmentation than the other three methods.
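The difference between the two kinds of change detection mask can be sketched as follows. Note the assumptions: the segmentation here is a plain nearest-class-mean labeling that stands in for the MRF-MAP segmentation, which is not reproduced, and the function names, class means, and threshold are all made up for this toy example.

```python
import numpy as np

def nearest_mean_labels(frame, class_means):
    """Stand-in segmentation (assumption): label each pixel by its closest class mean."""
    frame = np.asarray(frame, dtype=np.float64)
    means = np.asarray(class_means, dtype=np.float64)
    return np.argmin(np.abs(frame[..., None] - means), axis=-1)

def gray_level_cdm(prev_frame, curr_frame, threshold):
    """CDM from thresholded gray-level differences."""
    diff = np.abs(curr_frame.astype(np.float64) - prev_frame.astype(np.float64))
    return diff > threshold

def label_cdm(prev_labels, curr_labels):
    """CDM from the difference of the segmentation labels of two frames."""
    return prev_labels != curr_labels

# Assumed toy frames: the object shifts two columns to the right while
# the overall brightness rises by 10 gray levels.
prev_frame = np.full((6, 10), 20, dtype=np.uint8)
prev_frame[1:5, 2:6] = 200
curr_frame = np.full((6, 10), 30, dtype=np.uint8)
curr_frame[1:5, 4:8] = 210

class_means = (25, 205)  # assumed background and object means
gray_mask = gray_level_cdm(prev_frame, curr_frame, threshold=5)
label_mask = label_cdm(nearest_mean_labels(prev_frame, class_means),
                       nearest_mean_labels(curr_frame, class_means))
```

In this toy setup the gray-level mask fires on every pixel because of the brightness shift, while the label-based mask marks only the pixels whose class actually changed. Whether real footage behaves this cleanly depends on the quality of the segmentation.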