TY - JOUR
T1 - Low Tensor-Rank Adaptation of Kolmogorov–Arnold Networks
AU - Gao, Yihang
AU - Ng, Michael K.
AU - Tan, Vincent Y.F.
N1 - The work of Michael K. Ng was supported in part by the National Key Research and Development Program of China under Grant 2024YFE0202900, in part by the RGC GRF under Grant 12300125, and in part by the Joint NSFC and RGC under Grant N-HKU769/21. The work of Vincent Y. F. Tan was supported in part by the Singapore Ministry of Education AcRF Tier 2 under Grant A-8000423-00-00 and in part by AcRF Tier 1 under Grants A-8000980-00-00 and A-8002934-00-00.
Publisher Copyright:
© 2025 IEEE.
PY - 2025/7/14
Y1 - 2025/7/14
AB - Kolmogorov–Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains, especially for science-related tasks. However, transfer learning of KANs remains a relatively unexplored area. In this paper, inspired by the Tucker decomposition of tensors and evidence of the low tensor-rank structure in KAN parameter updates, we develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs. We study the expressiveness of LoTRA based on Tucker decomposition approximations. Furthermore, we provide a theoretical analysis for selecting the learning rate of each LoTRA component to enable efficient training. Our analysis also shows that using identical learning rates across all components leads to inefficient training, highlighting the need for an adaptive learning rate strategy. Beyond theoretical insights, we explore the application of LoTRA to efficiently solve various partial differential equations (PDEs) by fine-tuning KANs. Additionally, we propose Slim KANs, which exploit the inherent low tensor-rank properties of KAN parameter tensors to reduce model size while maintaining superior performance. Experimental results validate the efficacy of the proposed learning rate selection strategy and demonstrate the effectiveness of LoTRA for transfer learning of KANs in solving PDEs. Further evaluations of Slim KANs on function representation and image classification tasks highlight the expressiveness of LoTRA and the potential for parameter reduction through low tensor-rank decomposition.
KW - low tensor-rank adaptation
KW - transfer learning
KW - fine-tuning
KW - Kolmogorov–Arnold networks
KW - physics-informed machine learning
KW - partial differential equations
UR - http://www.scopus.com/inward/record.url?scp=105011179816&partnerID=8YFLogxK
DO - 10.1109/TSP.2025.3588910
M3 - Journal article
AN - SCOPUS:105011179816
SN - 1053-587X
VL - 73
SP - 3107
EP - 3123
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
ER -