PGNet v2: Enhancing Aesthetic Image Critique Generation with Self-Resurrecting Activation and Gaussian Gated Units

  • Meng Luen Wu*
  • Po Cheng Yu
  • Chiung Yao Fang
*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

Abstract

With the widespread use of smartphones and digital cameras, automated aesthetic critique generation has become essential for helping users capture visually appealing photos. Traditional methods rely on rule-based techniques or handcrafted aesthetic features, which lack flexibility and struggle to provide context-aware critiques. In contrast, deep learning enables data-driven analysis, allowing models to learn complex aesthetic patterns and generate insightful feedback. However, existing deep learning approaches suffer from limited critique diversity, poor multimodal fusion, and difficulty capturing fine-grained aesthetic attributes, all of which reduce their effectiveness. To address these challenges, we propose PGNet v2, a transformer-based aesthetic critique generation system. PGNet v2 employs SwinV2 as a visual encoder and GPT-2 as a decoder, integrating visual and textual features through a cross-attention mechanism. Using FFN_GEGLU, PGNet v2 enhances text modeling and expressiveness, improving critique diversity, while the Self-Resurrecting Activation Unit (SRAU) strengthens multimodal fusion, yielding more coherent and contextually relevant critiques. Additionally, SwinV2's hierarchical feature extraction enables the model to capture fine-grained aesthetic attributes, improving the depth and accuracy of the generated feedback. Experimental results demonstrate that PGNet v2 outperforms PGNet v0, AMAN, and CNN-LSTM across 35 evaluation metrics, achieving a 94% improvement rate. The model excels at distinguishing high- and low-quality images, adapting to diverse lighting conditions, and generating more insightful and contextually relevant critiques. These findings confirm PGNet v2's superiority over existing methods, making it a valuable tool for real-world photography enhancement.

Original language: English
Pages: 102-107
Number of pages: 6
Publication status: Published - 2025
Event: 15th International Workshop on Computer Science and Engineering, WCSE 2025 - Jeju Island, Korea, Republic of
Duration: 2025 Jun 28 → 2025 Jun 30

Conference

Conference: 15th International Workshop on Computer Science and Engineering, WCSE 2025
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 2025/06/28 → 2025/06/30

Keywords

  • Aesthetic critique generation
  • computational aesthetics
  • transformer model
  • visual-text integration

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
