TY - JOUR
T1 - Toward Memory-Efficient and Interpretable Factorization Machines via Data and Model Binarization
AU - Geng, Yu
AU - Lan, Liang
AU - Cheung, William K.
N1 - Funding Information:
The authors would like to thank Hong Kong Baptist University for the Open Access fees based on the agreement between Hong Kong Baptist University and Institute of Electrical and Electronics Engineers (IEEE).
Publisher copyright:
© 2023 The Authors.
PY - 2023/11/7
Y1 - 2023/11/7
N2 - Factorization Machines (FM) is a general predictor that can efficiently model feature interactions in linear time and has thus been broadly used for regression, classification, and ranking tasks. Subspace Encoding Factorization Machine (SEFM) is a recent approach proposed to enhance FM's effectiveness through explicit nonlinear feature mapping for both individual features and feature interactions, using equal-width binning per input feature. Despite its effectiveness, SEFM has a major drawback: it increases the memory cost of FM by b times, where b is the number of bins adopted for the binning. To reduce the memory cost of SEFM, we propose Binarized FM (BiFM), in which each model parameter takes only a binary value (i.e., 1 or -1) and can thus be efficiently stored using a single bit. We derive an algorithm that learns the proposed FM under binary constraints using the Straight-Through Estimator (STE) with Adaptive Gradient Descent (Adagrad). For performance evaluation, we compare our proposed methods with a number of baselines on eight different classification datasets. Our experimental results demonstrate that BiFM achieves higher accuracy than SEFM at a much lower memory cost. BiFM also inherits the interpretability property of SEFM and, together with adaptive data binning methods, can yield a more compact and interpretable set of classification rules.
AB - Factorization Machines (FM) is a general predictor that can efficiently model feature interactions in linear time and has thus been broadly used for regression, classification, and ranking tasks. Subspace Encoding Factorization Machine (SEFM) is a recent approach proposed to enhance FM's effectiveness through explicit nonlinear feature mapping for both individual features and feature interactions, using equal-width binning per input feature. Despite its effectiveness, SEFM has a major drawback: it increases the memory cost of FM by b times, where b is the number of bins adopted for the binning. To reduce the memory cost of SEFM, we propose Binarized FM (BiFM), in which each model parameter takes only a binary value (i.e., 1 or -1) and can thus be efficiently stored using a single bit. We derive an algorithm that learns the proposed FM under binary constraints using the Straight-Through Estimator (STE) with Adaptive Gradient Descent (Adagrad). For performance evaluation, we compare our proposed methods with a number of baselines on eight different classification datasets. Our experimental results demonstrate that BiFM achieves higher accuracy than SEFM at a much lower memory cost. BiFM also inherits the interpretability property of SEFM and, together with adaptive data binning methods, can yield a more compact and interpretable set of classification rules.
KW - Binarization
KW - Factorization Machines
KW - Interpretability
KW - Memory-efficient Design
UR - http://www.scopus.com/inward/record.url?scp=85177024217&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3330779
DO - 10.1109/ACCESS.2023.3330779
M3 - Journal article
SN - 2169-3536
VL - 11
SP - 128633
EP - 128643
JO - IEEE Access
JF - IEEE Access
ER -