Publication:
Comparatively Studying Modern Optimizers Capability for Fitting Vision Transformers

Date

2024

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

Transformer architectures have made great strides in both research and industry, gaining wide adoption due to their versatility and generality. These qualities, combined with the availability of internet-scale datasets, open the path to building deep learning systems that can target many modalities and several tasks within each modality. Over the years, many optimization algorithms have been proposed and used to fit deep learning models. Although many comparative assessments have analyzed and selected the best optimizers for architectures that predate Transformers, the literature lacks equally extensive assessments for optimizing Transformer-based deep learning models. In this paper, we investigate modern, recently introduced deep learning optimizers and apply a comparative assessment to multiple Transformer architectures implemented for the task of image classification. Our comparative study found experimentally that the novel LION optimizer provided the best performance on the target task and datasets, showing that algorithmically designed optimizers can compete with and surpass the handcrafted optimization schemes normally used to fit Transformer architectures. © 2024 Elsevier B.V., All rights reserved.
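For context on the LION optimizer highlighted above, the following is a minimal sketch of its update rule as generally described in the literature (sign of an interpolated momentum, with decoupled weight decay); the function name `lion_step` and the hyperparameter defaults are illustrative, not taken from this paper.

```python
import numpy as np

def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion update step (illustrative sketch, not the paper's code).

    The update direction is the sign of an interpolation between the
    momentum and the current gradient; weight decay is decoupled, as in
    AdamW; the momentum itself is updated with a separate factor beta2.
    """
    # Sign of the interpolated direction drives the parameter update.
    update = np.sign(beta1 * momentum + (1 - beta1) * grad)
    # Decoupled weight decay is folded into the same step.
    new_param = param - lr * (update + weight_decay * param)
    # Momentum tracks the gradient with its own interpolation factor.
    new_momentum = beta2 * momentum + (1 - beta2) * grad
    return new_param, new_momentum
```

Because the update magnitude is a fixed `lr` per coordinate (only its sign varies), Lion is typically run with a smaller learning rate than Adam-family optimizers.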
