TY - JOUR
T1 - Robust Online Learning against Malicious Manipulation and Feedback Delay with Application to Network Flow Classification
AU - Li, Yupeng
AU - Liang, Ben
AU - Tizghadam, Ali
N1 - Funding Information:
Manuscript received December 16, 2020; revised March 21, 2021; accepted April 23, 2021. Date of publication June 14, 2021; date of current version July 16, 2021. This work was supported in part by TELUS, in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada, and in part by Hong Kong Baptist University (Start-up Grant and the AI-Info Communication Study Scheme). This article was presented at the IEEE INFOCOM [1]. (Corresponding author: Yupeng Li.) Yupeng Li was with the Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 1A1, Canada. He is now with the School of Communication, Hong Kong Baptist University, Hong Kong.
Publisher Copyright:
© 1983-2012 IEEE.
PY - 2021/8
Y1 - 2021/8
AB - Malicious data manipulation reduces the effectiveness of machine learning techniques, which rely on accurate knowledge of the input data. Motivated by real-world applications in network flow classification, we address the problem of robust online learning with delayed feedback in the presence of malicious data generators that attempt to gain favorable classification outcomes by manipulating the data features. For the case of static feedback delay, we propose online algorithms termed ROLC-NC and ROLC-C for non-clairvoyant and clairvoyant malicious data generators, respectively. We then consider the dynamic delay case, for which we propose online algorithms termed ROLC-NC-D and ROLC-C-D, again for non-clairvoyant and clairvoyant malicious data generators, respectively. We derive regret bounds for these four algorithms and show that they are sub-linear under mild conditions. We further evaluate the proposed algorithms in network flow classification via extensive experiments using real-world data traces. Our experimental results demonstrate that the proposed algorithms can approach the performance of an optimal static offline classifier that is not under attack, while outperforming the same offline classifier when tested with a mixture of normal and manipulated data.
KW - feedback delay
KW - Malicious manipulation
KW - network flow classification
KW - robust online learning
UR - http://www.scopus.com/inward/record.url?scp=85110595397&partnerID=8YFLogxK
U2 - 10.1109/JSAC.2021.3087268
DO - 10.1109/JSAC.2021.3087268
M3 - Journal article
AN - SCOPUS:85110595397
SN - 0733-8716
VL - 39
SP - 2648
EP - 2663
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 8
ER -