Multilabel Deep Visual-Semantic Embedding

Mei Chen Yeh*, Yi Nan Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)


Inspired by the great success of deep convolutional neural networks (CNNs) for single-label visual-semantic embedding, we explore extending these models to multilabel images. We propose a new learning paradigm for multilabel image classification, in which labels are ranked according to their relevance to the input image. In contrast to conventional CNN models that learn a latent vector representation (i.e., the image embedding vector), the developed visual model learns a mapping (i.e., a transformation matrix) from an image in an attempt to differentiate between its relevant and irrelevant labels. Despite the conceptual simplicity of our approach, the proposed model achieves state-of-the-art results on three public benchmark datasets.
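The core idea described above, scoring every label against an image in a shared visual-semantic space and training so that relevant labels outrank irrelevant ones, can be sketched with a pairwise margin ranking loss. The sketch below is illustrative only: the paper's visual model learns a transformation matrix per image, whereas this simplified stand-in uses a DeViSE-style linear projection into the label-embedding space; the dimensions, the random projection `A`, and the loss form are all assumptions, not the authors' actual implementation.

```python
import numpy as np

def pairwise_ranking_loss(scores, relevant, margin=1.0):
    """Hinge loss pushing every relevant label's score above every
    irrelevant label's score by at least `margin`."""
    pos = scores[relevant]      # scores of relevant labels
    neg = scores[~relevant]     # scores of irrelevant labels
    gaps = margin - pos[:, None] + neg[None, :]  # all (pos, neg) pairs
    return np.maximum(0.0, gaps).sum()

# Toy setup (hypothetical dimensions): a CNN image feature and
# pretrained word embeddings, one per label.
rng = np.random.default_rng(0)
d_img, d_emb, n_labels = 512, 300, 5
x = rng.standard_normal(d_img)                      # image feature from a CNN
label_emb = rng.standard_normal((n_labels, d_emb))  # one row per label

# Simplified stand-in for the learned mapping: a fixed linear map from
# image space to the label-embedding space (the paper instead learns a
# transformation matrix from the image itself).
A = rng.standard_normal((d_emb, d_img)) / np.sqrt(d_img)
scores = label_emb @ (A @ x)  # one relevance score per label

relevant = np.array([True, False, True, False, False])
loss = pairwise_ranking_loss(scores, relevant)
```

At inference time, sorting labels by `scores` yields the ranked label list the abstract describes; training would backpropagate `loss` through the mapping so that, over the dataset, relevant labels separate from irrelevant ones by the margin.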

Original language: English
Article number: 8691414
Pages (from-to): 1530-1536
Number of pages: 7
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Issue number: 6
Publication status: Published - 2020 Jun 1


Keywords

  • Multilabel classification
  • Convolutional neural networks
  • Visual-semantic embedding

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computational Theory and Mathematics
  • Artificial Intelligence
  • Applied Mathematics


