ATICVis: A Visual Analytics System for Asymmetric Transformer Models Interpretation and Comparison

Jian Lin Wu, Pei Chen Chang, Chao Wang, Ko Chih Wang*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

In recent years, natural language processing (NLP) technology has made great progress, and transformer-based models have performed well on a variety of NLP problems. However, a given task can be carried out by multiple models with slightly different architectures, such as different numbers of layers and attention heads. In addition to using quantitative indicators as the basis for selecting a model, many users also consider the model’s language understanding ability and the computing resources it requires. Comparing and deeply analyzing two transformer-based models with different numbers of layers and attention heads is not easy, because there is no inherent one-to-one match between the models. Comparing models with different architectures is therefore a crucial and challenging task when users train, select, or improve models for their NLP tasks. In this paper, we develop a visual analysis system to help machine learning experts deeply interpret and compare the pros and cons of asymmetric transformer-based models when the models are applied to a user’s target NLP task. We propose metrics that evaluate the similarity between layers or attention heads to help users identify valuable combinations of layers and attention heads to compare. Our visual tool provides an interactive overview-to-detail framework for users to explore when and why models behave differently. In the use cases, users apply our visual tool to find out and explain why a large model does not significantly outperform a small model, and to understand the linguistic features captured by layers and attention heads. The use cases and user feedback show that our tool can help people gain insight and facilitate model comparison tasks.

Original language: English
Article number: 1595
Journal: Applied Sciences (Switzerland)
Volume: 13
Issue number: 3
Publication status: Published - Feb 2023

ASJC Scopus subject areas

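  • General Materials Science
  • Instrumentation
  • General Engineering
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes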
