Cold-Start Active Sampling Via γ-Tube

Xiaofeng Cao, Ivor W. Tsang*, Jianliang Xu

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

2 Citations (Scopus)

Abstract

Active learning (AL) improves the generalization performance of the current classification hypothesis by querying labels from a pool of unlabeled data. The sampling process is typically guided by an informative, representative, or diverse evaluation policy. However, such a policy needs an initial labeled set to start, so its performance may degrade under a cold-start hypothesis. In this article, we first show that typical AL sampling can be equivalently formulated as geometric sampling over minimum enclosing balls (MEBs) of clusters. Following the γ-tube structure in geometric clustering, we then divide one MEB covering a cluster into two parts: 1) a γ-tube and 2) a γ-ball. By estimating the error disagreement between sampling in the MEB and in the γ-ball, our theoretical insight reveals that the γ-tube can effectively measure the disagreement between hypotheses in the original space over the MEB and in the sampling space over the γ-ball. To tighten our insight, we present a generalization analysis, and the results show that sampling in the γ-tube yields a higher probability bound for achieving a nearly zero generalization error. With these analyses, we finally apply the informative sampling policy of AL over the γ-tube to present a tube AL (TAL) algorithm that addresses the cold-start sampling issue. As a result, the dependency between the querying process and the evaluation policy of active sampling can be alleviated. Experimental results show that, by using the γ-tube structure to deal with cold-start sampling, TAL outperforms standard AL evaluation baselines with substantial accuracy improvements. An application to image edge recognition extends our theoretical results.
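The geometric split described in the abstract can be illustrated with a minimal sketch. It is not the authors' TAL implementation. It assumes that the γ-ball is a ball concentric with the MEB and that the γ-tube is the remaining outer shell. The names `approx_meb` and `split_tube_ball` and the parameter `inner_fraction` are hypothetical, and the MEB is only approximated with a simple Badoiu-Clarkson style iteration.

```python
import numpy as np


def approx_meb(points, n_iter=200):
    """Approximate the minimum enclosing ball of `points` (n_samples, n_dims).

    Badoiu-Clarkson style iteration: repeatedly move the center a shrinking
    step toward the current farthest point. This is an approximation, not an
    exact MEB solver.
    """
    center = points[0].astype(float)
    for i in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        center = center + (farthest - center) / (i + 1)
    # Use the largest distance to the final center so the ball encloses all points.
    radius = np.linalg.norm(points - center, axis=1).max()
    return center, radius


def split_tube_ball(points, inner_fraction=0.7, n_iter=200):
    """Split one cluster into an outer shell (tube) and an inner ball.

    `inner_fraction` is a hypothetical parameter of this sketch: the inner
    ball has radius inner_fraction * R, where R is the approximate MEB radius.
    The paper's exact definition of gamma may differ.
    """
    center, radius = approx_meb(points, n_iter)
    dists = np.linalg.norm(points - center, axis=1)
    in_ball = dists <= inner_fraction * radius
    tube_idx = np.flatnonzero(~in_ball)  # candidate pool for label queries
    ball_idx = np.flatnonzero(in_ball)
    return tube_idx, ball_idx, center, radius


# Toy cluster; in a cold-start setting no labels are needed for this step.
rng = np.random.default_rng(0)
cluster = rng.normal(size=(200, 2))
tube_idx, ball_idx, center, radius = split_tube_ball(cluster)
```

Under these assumptions, a cold-start sampler would draw its label queries from `tube_idx` rather than from the whole cluster, since the split itself depends only on geometry and not on an initial labeled set.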

Original language: English
Pages (from-to): 6034-6045
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 52
Issue number: 7
Publication status: Published - Jul 2022

Scopus Subject Areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Computer Science Applications

User-Defined Keywords

  • Active learning (AL)
  • generalization errors
  • hypothesis
  • minimum enclosing balls (MEBs)
  • γ-tube
