TY - JOUR
T1 - Feature mapping based on heterogeneous cross-company effort estimation
AU - Shen, Xiaoning
AU - Lu, Jiaqi
AU - Li, Shuxian
AU - Song, Liyan
N1 - This work was supported by State Key Laboratory of Robotics (Grant No. 2023-O11), Natural Science Foundation of Jiangsu Province of China under Grant No. BK20150924, National Natural Science Foundation of China (NSFC) under Grant Nos. 62002148, 62250710682, and 61502239, Guangdong Provincial Key Laboratory under Grant No. 2020B121201001, and Research Institute of Trustworthy Autonomous Systems (RITAS).
Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2024/12
Y1 - 2024/12
AB - Software Effort Estimation (SEE) is the process of predicting the effort required to develop a software product. Because software projects take a long time to complete, a company can collect SEE training examples only slowly, which often leaves the local company with a small data problem. To address this problem, many studies have used data from external companies to enlarge the local data. However, data from external companies often contain heterogeneous features, so they cannot be adopted directly as additional training examples for the local company's models. To address this issue, this paper proposes the Feature Alignment via Heterogeneous Data Mapping (FAHDM) framework. FAHDM aims to align the external data with the local company by mapping the heterogeneous features into the local feature space, thereby enabling their use as additional training examples. We conducted two sets of experiments on 16 datasets to verify the effectiveness and advantages of the proposed method. The experimental results based on seven datasets show that the proposed method improves performance for training sets of different sizes.
KW - Data selection
KW - Feature mapping
KW - Heterogeneous feature
KW - Small data problem
KW - Software effort estimation
UR - http://www.scopus.com/inward/record.url?scp=85207285435&partnerID=8YFLogxK
UR - https://link.springer.com/article/10.1007/s11219-024-09697-x
U2 - 10.1007/s11219-024-09697-x
DO - 10.1007/s11219-024-09697-x
M3 - Journal article
AN - SCOPUS:85207285435
SN - 0963-9314
VL - 32
SP - 1717
EP - 1761
JO - Software Quality Journal
JF - Software Quality Journal
IS - 4
ER -