Abstract
With the widespread use of smartphones and digital cameras, automated aesthetic critique generation has become essential for helping users capture visually appealing photos. Traditional methods rely on rule-based techniques or handcrafted aesthetic features, which lack flexibility and struggle to provide context-aware critiques. In contrast, deep learning enables data-driven analysis, allowing models to learn complex aesthetic patterns and generate insightful feedback. However, existing deep learning approaches suffer from limited critique diversity, poor multimodal fusion, and difficulty capturing fine-grained aesthetic attributes, which reduces their effectiveness. To address these challenges, we propose PGNet v2, a transformer-based aesthetic critique generation system. PGNet v2 employs SwinV2 as a visual encoder and GPT-2 as a decoder, integrating visual and textual features through a cross-attention mechanism. Using FFN_GEGLU, PGNet v2 enhances text modeling and expressiveness, improving critique diversity, while the Self-Resurrecting Activation Unit (SRAU) strengthens multimodal fusion, ensuring more coherent and contextually relevant critiques. Additionally, SwinV2's hierarchical feature extraction enables the model to capture fine-grained aesthetic attributes, improving the depth and accuracy of generated feedback. Experimental results demonstrate that PGNet v2 outperforms PGNet v0, AMAN, and CNN-LSTM across 35 evaluation metrics, achieving a 94% improvement rate. The model excels at distinguishing high- and low-quality images, adapting to diverse lighting conditions, and generating more insightful and contextually relevant critiques. These findings confirm PGNet v2's superiority over existing methods, making it a valuable tool for real-world photography enhancement.
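The abstract names FFN_GEGLU without giving its layout, so the following is only a minimal sketch of the commonly used GEGLU feed-forward formulation, FFN_GEGLU(x) = (GELU(x W_gate) ⊙ (x W_value)) W_out, and not the authors' implementation. The function and weight names (`ffn_geglu`, `w_gate`, `w_value`, `w_out`), the omission of biases, and the toy dimensions are all assumptions made for illustration. Plain Python lists are used so the arithmetic is easy to follow.

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def matvec(x, w):
    """Multiply a row vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def ffn_geglu(x, w_gate, w_value, w_out):
    """GEGLU feed-forward layer (assumed form, biases omitted).

    The input is projected twice: one projection passes through GELU and
    acts as a gate, the other carries the values. Their elementwise
    product is then projected back to the model dimension.
    """
    gate = [gelu(g) for g in matvec(x, w_gate)]
    value = matvec(x, w_value)
    hidden = [g * v for g, v in zip(gate, value)]
    return matvec(hidden, w_out)

# Toy example: model dimension 2, hidden dimension 3 (hypothetical weights).
x = [1.0, -2.0]
w_gate = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
w_value = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
w_out = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = ffn_geglu(x, w_gate, w_value, w_out)
```

In a decoder such as GPT-2, a layer of this shape would replace the standard two-layer feed-forward block after each attention sublayer. How PGNet v2 wires it in is not stated in the abstract.
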
| Original language | English |
|---|---|
| Pages | 102-107 |
| Number of pages | 6 |
| Publication status | Published - 2025 |
| Event | 15th International Workshop on Computer Science and Engineering, WCSE 2025 - Jeju Island, Korea, Republic of; Duration: 2025 Jun 28 → 2025 Jun 30 |
Conference
| Conference | 15th International Workshop on Computer Science and Engineering, WCSE 2025 |
|---|---|
| Country/Territory | Korea, Republic of |
| City | Jeju Island |
| Period | 2025/06/28 → 2025/06/30 |
Keywords
- Aesthetic critique generation
- computational aesthetics
- transformer model
- visual-text integration
ASJC Scopus subject areas
- Computer Networks and Communications
- Computer Science Applications
- Information Systems