Abstract
Software Effort Estimation (SEE) is the process of predicting the effort required to develop a software product. Because software projects take time to complete, collecting training examples for SEE is time-consuming, which often leaves the local company with only a small amount of data (the small data problem). To address this problem, many studies have used data from external companies to enlarge the local data. However, the heterogeneous features present in external data pose a challenge: such data cannot be directly adopted as additional training examples for the local company's models. To address this issue, this paper proposes the Feature Alignment via Heterogeneous Data Mapping (FAHDM) framework. FAHDM aligns external data with the local company by mapping the heterogeneous features into the local feature space, thereby enabling their use as additional training examples. We conducted two sets of experiments on 16 datasets to verify the effectiveness and advancement of the proposed method. The experimental results based on seven datasets show that the proposed method improves performance for training sets of different sizes.
| Original language | English |
|---|---|
| Pages (from-to) | 1717-1761 |
| Number of pages | 45 |
| Journal | Software Quality Journal |
| Volume | 32 |
| Issue number | 4 |
| Early online date | 17 Oct 2024 |
| DOIs | |
| Publication status | Published - Dec 2024 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9 Industry, Innovation, and Infrastructure
User-Defined Keywords
- Data selection
- Feature mapping
- Heterogeneous feature
- Small data problem
- Software effort estimation
Fingerprint
Dive into the research topics of 'Feature mapping based on heterogeneous cross-company effort estimation'. Together they form a unique fingerprint.