TY - JOUR
T1 - A Semisupervised Classification Approach for Multidomain Networks with Domain Selection
AU - Chen, Chuan
AU - Xin, Jingxue
AU - Wang, Yong
AU - Chen, Luonan
AU - Ng, Kwok Po
N1 - Funding Information:
Manuscript received August 2, 2016; revised April 21, 2017 and November 16, 2017; accepted May 9, 2018. Date of publication June 14, 2018; date of current version December 19, 2018. This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFA0505500 and Grant 2016YFB1000101, in part by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDB13040700, and in part by the National Natural Science Foundation of China under Grant 91529303, Grant 31771476, Grant 81471047, Grant 91730301, Grant 61671444, and Grant 61621003. (Corresponding authors: Luonan Chen; Michael K. Ng.) C. Chen is with the National Engineering Research Center of Digital Life, School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China, and also with the Department of Mathematics, Hong Kong Baptist University, Hong Kong.
PY - 2019/1
Y1 - 2019/1
N2 - Multidomain network classification, which can enhance network classification or prediction performance by integrating information from different sources, has attracted significant attention in data integration and machine learning. Despite previous successes, existing multidomain network learning methods usually assume that different views are available for the same set of instances, and thus they seek a consistent classification result for all domains. However, in many real-world problems, each domain has its own instance set, and one instance in one domain may correspond to multiple instances in another domain. Moreover, due to the rapid growth of data sources, different domains may not be relevant to each other, which calls for selecting the domains relevant to the target (focused) domain. A key challenge in this setting is how to achieve accurate prediction by integrating different data representations without losing information. In this paper, we propose a semisupervised classification approach for multidomain networks based on label propagation, namely multidomain classification with domain selection (MCS), which can handle cross-domain information and different instance sets across domains. In particular, owing to its sparse weight properties, the proposed MCS can automatically identify the domains relevant to the target domain by assigning them higher weights than the irrelevant domains. This not only significantly improves classification accuracy but also helps to obtain an optimal network partition for the target domain. From the theoretical viewpoint, we equivalently decompose MCS into two simpler subproblems with analytical solutions, each of which can be solved efficiently by its own computational procedure. Extensive experimental results on both synthetic and real-world data sets empirically demonstrate the advantages of the proposed approach in terms of both prediction performance and domain selection ability.
KW - Domain selection
KW - multidomain classification
KW - network integration
KW - semisupervised learning
KW - sparsity
UR - http://www.scopus.com/inward/record.url?scp=85048559920&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2018.2837166
DO - 10.1109/TNNLS.2018.2837166
M3 - Journal article
C2 - 29994273
AN - SCOPUS:85048559920
SN - 2162-237X
VL - 30
SP - 269
EP - 283
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 1
M1 - 8385196
ER -