TY - JOUR
T1 - Comprehensive survey on the effectiveness of sharpness aware minimization and its progressive variants
AU - Rostand, Jules
AU - Hsu, Chen Chien James
AU - Lu, Cheng Kai
N1 - Publisher Copyright: © 2024 The Chinese Institute of Engineers.
PY - 2024
Y1 - 2024
AB - As advancements push for larger and more complex Artificial Intelligence (AI) models to improve performance, preventing the occurrence of overfitting when training overparameterized Deep Neural Networks (DNNs) remains a challenge. Despite the presence of various regularization techniques aimed at mitigating this issue, poor generalization remains a concern, especially when handling diverse and limited data. This paper explores one of the latest and most promising strategies to address this challenge, Sharpness Aware Minimization (SAM), which concurrently minimizes loss value and sharpness-related loss. While this method exhibits substantial effectiveness, it comes with a notable trade-off in increased training time and is founded on several approximations. Consequently, several variants of SAM have emerged to alleviate these limitations and bolster model performance. This survey paper examines the significant advancements achieved by SAM, delves into its constraints, and categorizes recent progressive variants that further enhance current State-of-the-Art results.
KW - Su, Shun-Feng
UR - http://www.scopus.com/inward/record.url?scp=85200330016&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200330016&partnerID=8YFLogxK
U2 - 10.1080/02533839.2024.2383592
DO - 10.1080/02533839.2024.2383592
M3 - Article
AN - SCOPUS:85200330016
SN - 0253-3839
VL - 47
SP - 795
EP - 803
JO - Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A
JF - Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A
IS - 7
ER -