This paper proposes the probabilistic semantic component descriptor (PSCD) for automatically extracting semantic information from a set of images. The basic idea of the PSCD is first to identify the hidden semantic concepts associated with regions in a set of images, and then to construct an image-based descriptor by integrating the hidden concepts of the regions in each image. First, low-level features of image regions are quantized into a set of visual words. The visual words, which represent region features, are then linked to the high-level concepts hidden in the images using probabilistic latent semantic analysis, an unsupervised method. Because this linkage is built over the entire set of images, it yields a set of hidden concepts that describes each region. Next, regions with unreliable concepts are eliminated, and a PSCD for each image is constructed by propagating the probabilities of the hidden concepts in the remaining regions of the image. We also present quantitative experiments to demonstrate the performance of the proposed PSCD.
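The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the visual words have already been obtained (for example by clustering region features), fits standard probabilistic latent semantic analysis by EM on an image-by-visual-word count matrix, and then fills in two steps the abstract does not specify. Treating a region as unreliable when the entropy of its concept posterior exceeds a threshold, and propagating probabilities by averaging over the remaining regions, are both assumptions made here for illustration, as are the function names `plsa`, `concept_posterior`, and `image_descriptor` and the parameter `max_entropy`.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard pLSA by EM on an (images x visual words) count matrix.

    Returns P(z|d) with shape (images, topics) and P(w|z) with shape
    (topics, words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post /= post.sum(axis=1, keepdims=True) + EPS
        # M-step: re-estimate both factors from expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + EPS
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + EPS
    return p_z_d, p_w_z


def concept_posterior(counts, p_z_d, p_w_z):
    """Return P(z|w), shape (words, topics), via Bayes' rule.

    Assumes every visual word occurs at least once in `counts`.
    """
    p_d = counts.sum(axis=1) / counts.sum()  # empirical P(d)
    p_z = p_d @ p_z_d                        # P(z) = sum_d P(d) P(z|d)
    joint = p_w_z * p_z[:, None]             # P(w|z) P(z), (topics, words)
    return (joint / (joint.sum(axis=0, keepdims=True) + EPS)).T


def image_descriptor(region_words, p_z_w, max_entropy):
    """Build one image's descriptor from the visual-word ids of its regions.

    ASSUMPTION: a region is "unreliable" if the entropy of its concept
    posterior exceeds `max_entropy`; the source does not state its criterion.
    ASSUMPTION: propagation is a plain average over the remaining regions.
    """
    probs = p_z_w[region_words]                          # (regions, topics)
    entropy = -(probs * np.log(probs + EPS)).sum(axis=1)
    keep = entropy <= max_entropy
    n_topics = p_z_w.shape[1]
    if not keep.any():
        # No reliable region left: fall back to a uniform descriptor.
        return np.full(n_topics, 1.0 / n_topics)
    desc = probs[keep].mean(axis=0)
    return desc / desc.sum()
```

In this sketch an image is passed to `image_descriptor` as an integer array of visual-word ids, one per region, and the result is a probability vector over hidden concepts that could be compared across images with any histogram distance.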