Feature mapping based on heterogeneous cross-company effort estimation

Xiaoning Shen, Jiaqi Lu, Shuxian Li, Liyan Song*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Software Effort Estimation (SEE) is a process to predict the effort required for developing a software product. It usually takes time for a company to complete software projects, and thus the collection of training examples for SEE would usually be time-consuming, often resulting in the small data problem within the local company. To address the small data problem, many studies have explored the data from external companies to enlarge the amount of local data. However, the presence of heterogeneous features in the data from external companies poses a challenge. These heterogeneous features cannot be directly adopted as additional training examples for models of the local company. To address this issue, this paper proposes the Feature Alignment via Heterogeneous Data Mapping (FAHDM) framework. FAHDM aims to align the external data with the local company by mapping the heterogeneous features into the local feature space, thereby enabling their use as additional training examples. We conducted two sets of experiments on 16 datasets to verify the effectiveness and advancement of the proposed method. The experimental results based on seven datasets show that our proposed method improves the performance of training sets with different sizes.

Original language: English
Pages (from-to): 1717-1761
Number of pages: 45
Journal: Software Quality Journal
Volume: 32
Issue number: 4
Early online date: 17 Oct 2024
Publication status: Published - Dec 2024

Scopus Subject Areas

  • Software
  • Safety, Risk, Reliability and Quality

User-Defined Keywords

  • Data selection
  • Feature mapping
  • Heterogeneous feature
  • Small data problem
  • Software effort estimation
