Trustworthy Graph Representation Learning under Out-of-distribution Data

  • HAN, Bo (PI)
  • ZHANG, Yonggang (CoI)
  • WANG, Qizhou (CoI)
  • ZHU, Jianing (CoI)
  • ZHOU, Zhanke (CoI)
  • JIANG, Xue (CoI)
  • TIAN, Hongduan (CoI)
  • PENG, Xiong (CoI)

Project: Research project

Project Details

Description

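Graph representation learning is an emerging machine learning paradigm for deep learning on graph-structured data, with wide-ranging applications such as large language models, drug discovery, and social networks. Current graph representation learning methods implicitly assume that training and test graph data are independent and identically distributed, yet test graph data in real-world scenarios are often out-of-distribution. When faced with out-of-distribution test graph data, models trained with current methods often suffer a significant drop in performance. To address this problem, this project studies trustworthy graph representation learning under out-of-distribution data. Specifically, the project will explore trustworthy graph representation learning from four orthogonal perspectives. Task 1, from the perspective of invariance, will study the fundamental theories and assumptions needed to learn causally invariant graph representations that generalize to out-of-distribution graph data. Task 2, from the perspective of robustness, will explore how to adapt for graph out-of-distribution detection based on the reprogramming property of deep learning. Task 3, from the perspective of interpretability, will study how to provide explanations for predictions on out-of-distribution graph data. Task 4, from the perspective of expressiveness, will explore expressive graph neural architectures with better out-of-distribution generalization. The project will also apply the proposed trustworthy graph representation learning algorithms to areas such as foundation models, drug discovery, and social networks.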
This project aims to explore new directions in trustworthy graph representation learning (TGRL) under out-of-distribution data, with applications to foundation models (e.g., ChatGPT), drug discovery, and social networks, which are areas Tencent focuses on. It is motivated by the fact that most existing approaches to graph representation learning implicitly assume that the training and test graph data are independent and identically distributed (i.i.d.). This assumption rarely holds in real-world applications, and several recent studies have shown that the performance of graph representation learning methods can decrease dramatically on out-of-distribution (OOD) data. To overcome this limitation, the project addresses the challenge through four tasks. The main idea is to integrate trustworthy learning with graph representation learning, with the goal of preventing performance degradation under OOD data. Specifically, from the perspective of invariance, Task 1 will study the fundamental theories and assumptions required for learning causally invariant graph representations that generalize to OOD graph data. From the perspective of robustness, Task 2 will study how to adapt models for graph OOD detection, building on the reprogramming property of deep learning. From the perspective of interpretability, Task 3 will study how to provide explanations for predictions on OOD graph data. From the perspective of expressiveness, Task 4 will study expressive graph neural architectures that have better OOD generalization ability. Finally, we hope to apply the proposed TGRL algorithms to foundation models, drug discovery, and social networks.
Status: Active
Effective start/end date: 1/01/24 to 31/12/26
