Fraud Detection under Multi-Sourced Extremely Noisy Annotations

Chuang Zhang, Qizhou Wang, Tengfei Liu, Xun Lu, Jin Hong, Bo Han, Chen Gong*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Fraud detection in e-commerce, which is critical to protecting the capital safety of users and financial corporations, aims to determine whether an online transaction or other activity is fraudulent. This problem has previously been addressed by various fully supervised learning methods. However, the true labels needed to train a supervised fraud detection model are difficult to collect in many real-world cases. To circumvent this issue, a series of automatic annotation techniques are employed instead to generate multiple noisy annotations for each unknown activity. To utilize these low-quality, multi-sourced annotations and still achieve reliable detection results, we propose an iterative two-stage fraud detection framework for multi-sourced, extremely noisy annotations. In the label aggregation stage, multi-sourced labels are integrated by voting with adaptive weights. In the label correction stage, the correctness of the aggregated labels is estimated with the help of a handful of exactly labeled data, and the results are used to train a robust fraud detector. The two stages benefit from each other, and their iterative execution leads to steadily improved detection results. Our method is therefore termed "Label Aggregation and Correction" (LAC). Experimentally, we collect millions of transaction records from Alipay in two different fraud detection scenarios, i.e., credit card theft and promotion abuse fraud. Compared with state-of-the-art counterparts, our method achieves improvements of at least 0.019 and 0.117 in average AUC on the two collected datasets, which clearly demonstrates its effectiveness.
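The aggregation stage described in the abstract (voting with adaptive weights, guided by a handful of exactly labeled data) can be illustrated with a minimal sketch. This is not the authors' LAC implementation: the weight update rule (each source's agreement with the current aggregated labels, with the clean labels overriding where available), the toy data, and the names `aggregate`, `update_weights`, `clean_idx`, and `clean_labels` are all assumptions made for illustration, and the detector training of the correction stage is omitted.

```python
import numpy as np

def aggregate(annotations, weights):
    """Weighted vote over binary annotations (rows: activities, columns: sources)."""
    scores = annotations @ weights / weights.sum()
    return (scores >= 0.5).astype(int)

def update_weights(annotations, aggregated, clean_idx, clean_labels, eps=1e-3):
    """Hypothetical update: weight each source by its agreement with the
    current aggregated labels, trusting the clean labels where they exist."""
    reference = aggregated.copy()
    reference[clean_idx] = clean_labels
    agreement = (annotations == reference[:, None]).mean(axis=0)
    return np.clip(agreement, eps, None)  # keep every weight strictly positive

# Toy data (made up): 5 activities annotated by 3 noisy sources, 1 = fraud.
annotations = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
])
clean_idx = np.array([0, 1])      # the few exactly labeled activities
clean_labels = np.array([1, 0])

weights = np.ones(annotations.shape[1])  # start from a plain majority vote
for _ in range(3):                        # alternate aggregation and reweighting
    aggregated = aggregate(annotations, weights)
    weights = update_weights(annotations, aggregated, clean_idx, clean_labels)

print(aggregated, weights)
```

In this toy run the second source agrees most often with the reference labels and therefore ends up with the largest vote weight. A fuller version would replace the agreement heuristic with an estimate of label correctness and retrain a detector in each round, as the abstract describes.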

Original language: English
Title of host publication: CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
Publisher: Association for Computing Machinery (ACM)
Pages: 2497-2506
Number of pages: 10
ISBN (Electronic): 9781450384469
Publication status: Published - 26 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Gold Coast, Queensland, Australia
Duration: 1 Nov 2021 to 5 Nov 2021
https://www.cikm2021.org/
https://dl.acm.org/doi/proceedings/10.1145/3459637

Publication series

Name: Proceedings of International Conference on Information and Knowledge Management

Conference

Conference: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Country/Territory: Australia
City: Gold Coast, Queensland
Period: 1/11/21 to 5/11/21

Scopus Subject Areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

User-Defined Keywords

  • crowdsourcing
  • fraud detection
  • label noise
