TY - JOUR
T1 - Instance-Dependent Positive and Unlabeled Learning With Labeling Bias Estimation
AU - Gong, Chen
AU - Wang, Qizhou
AU - Liu, Tongliang
AU - Han, Bo
AU - You, Jane J.
AU - Yang, Jian
AU - Tao, Dacheng
N1 - Publisher Copyright:
© 1979-2012 IEEE.
PY - 2022/8/1
Y1 - 2022/8/1
N2 - This paper studies instance-dependent Positive and Unlabeled (PU) classification, where whether a positive example will be labeled (indicated by s) is not only related to the class label y, but also depends on the observation x. Therefore, the labeling probability on positive examples is not uniform as previous works assumed, but is biased toward certain simple or critical data points. To depict this dependency, a graphical model is built in this paper, which further leads to a maximization problem on the induced likelihood function regarding P(s,y|x). By utilizing the well-known EM and Adam optimization techniques, the labeling probability of any positive example P(s=1|y=1,x), as well as the classifier induced by P(y|x), can be acquired. Theoretically, we prove that the critical solution always exists, and is locally unique for the linear model if some sufficient conditions are met. Moreover, we upper bound the generalization error for both linear logistic and non-linear network instantiations of our algorithm, with the convergence rate of expected risk to empirical risk as O(1/√k + 1/√(n-k) + 1/√n) (k and n are the sizes of the positive set and the entire training set, respectively). Empirically, we compare our method with state-of-the-art instance-independent and instance-dependent PU algorithms on a wide range of synthetic, benchmark, and real-world datasets, and the experimental results firmly demonstrate the advantage of the proposed method over the existing PU approaches.
AB - This paper studies instance-dependent Positive and Unlabeled (PU) classification, where whether a positive example will be labeled (indicated by s) is not only related to the class label y, but also depends on the observation x. Therefore, the labeling probability on positive examples is not uniform as previous works assumed, but is biased toward certain simple or critical data points. To depict this dependency, a graphical model is built in this paper, which further leads to a maximization problem on the induced likelihood function regarding P(s,y|x). By utilizing the well-known EM and Adam optimization techniques, the labeling probability of any positive example P(s=1|y=1,x), as well as the classifier induced by P(y|x), can be acquired. Theoretically, we prove that the critical solution always exists, and is locally unique for the linear model if some sufficient conditions are met. Moreover, we upper bound the generalization error for both linear logistic and non-linear network instantiations of our algorithm, with the convergence rate of expected risk to empirical risk as O(1/√k + 1/√(n-k) + 1/√n) (k and n are the sizes of the positive set and the entire training set, respectively). Empirically, we compare our method with state-of-the-art instance-independent and instance-dependent PU algorithms on a wide range of synthetic, benchmark, and real-world datasets, and the experimental results firmly demonstrate the advantage of the proposed method over the existing PU approaches.
KW - Generalization Bound
KW - Instance-Dependent PU Learning
KW - Labeling Bias
KW - Maximum Likelihood Estimation
KW - Solution Uniqueness
UR - http://www.scopus.com/inward/record.url?scp=85101746826&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2021.3061456
DO - 10.1109/TPAMI.2021.3061456
M3 - Journal article
AN - SCOPUS:85101746826
SN - 0162-8828
VL - 44
SP - 4163
EP - 4177
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 8
ER -