Task relevant image content segmentation for compression
Abstract
This thesis is concerned with the automatic detection and segmentation of visually salient
image regions and subsequent targeted image compression in order to maintain observer
performance levels while reducing image filesize. In moving towards this goal, pertinent
issues have been addressed: the viability of "black-box" frequency transmission models,
statistical measures of the effect of image processing, observer perception of processed images
and how computer vision "feature points" correspond to visually salient image content.
We show that image feature points are distributed towards visually-salient image regions:
regions that are likely to attract observer attention. This remains true even when the "task"
of the observer is changed: observers performing a task generally direct their attention
towards image regions naturally rich in feature points.
A new algorithm based on feature points, "Visual Interest", is proposed to predict image
regions attended by observers. This method segments image content likely to attract visual
attention under a variety of viewing conditions: passive viewing and search-directed viewing
for different observer tasks. The algorithm improves the prediction of observer eye
fixations during object search tasks relative to "bottom-up" models. It responds only to
image content, requiring no prior machine learning, in contrast to the scientific state-of-the-art
which relies explicitly on object categorisation. "Visual Interest" can also be run with
object recognition to refine the segmentation for a particular object-category search task,
reducing the "salient" area to tighter image regions.
The resultant segmentation into salient and non-salient regions is used to generate
region-of-interest compressed images suitable for multi-task observer analysis. Using a
pre-blur before JPEG compression, we gain a 15% filesize reduction beyond global JPEG
application when acting on image content alone, and 25% when combined with object
recognition. Using JPEG2000 ROI reduces filesize to as little as 25% of the original while
achieving a gain in PSNR and SSIM statistics over the ROI, with the added benefit of ROI
priority transmission.
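
As an illustration of what a PSNR statistic "over the ROI" means, the following is a minimal sketch that evaluates the standard PSNR definition, 10 log10(peak^2 / MSE), only on pixels selected by a binary salient-region mask. The function name `roi_psnr`, the list-of-rows image representation and the 8-bit peak value of 255 are assumptions made for this example; it is not the implementation used in the thesis.

```python
import math


def roi_psnr(original, processed, mask, peak=255.0):
    """PSNR in dB, computed only over pixels where `mask` is truthy.

    `original`, `processed` and `mask` are equally sized 2-D lists
    (rows of numbers). Hypothetical helper for illustration only.
    """
    squared_error = 0.0
    count = 0
    for row_o, row_p, row_m in zip(original, processed, mask):
        for o, p, m in zip(row_o, row_p, row_m):
            if m:  # pixel belongs to the region of interest
                squared_error += (o - p) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # ROI pixels are identical
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 example: only the top-right pixel differs between the images.
original = [[100, 100], [100, 100]]
processed = [[100, 90], [100, 100]]
roi_without_error = [[1, 0], [1, 1]]   # ROI excludes the changed pixel
whole_image = [[1, 1], [1, 1]]         # ROI covers every pixel

psnr_roi = roi_psnr(original, processed, roi_without_error)
psnr_all = roi_psnr(original, processed, whole_image)
```

In this toy case the distortion lies entirely outside the ROI, so the ROI PSNR is unbounded while the whole-image PSNR is finite. This is the kind of separation an ROI-restricted statistic is meant to expose.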