Identifying High-Risk Early-Stage Chronic Liver Disease Patients: A Test-Time Training Approach

Project: Research project

Project Details


More than 30% of the world's population suffers from early-stage chronic liver disease, such as chronic viral hepatitis and fatty liver, and the prevalence is increasing. Around 3% of individuals with chronic liver disease may progress to advanced-stage liver disease and develop liver complications, namely cirrhosis and liver cancer. Worldwide mortality from chronic liver diseases reaches as many as two million deaths annually. With appropriate treatment, patients with early-stage chronic liver disease may improve and their liver disease may regress, but there are often no obvious symptoms or signs at the early stage. By the time patients become symptomatic, they are likely to already have advanced or even end-stage liver disease, when the chance of complete recovery is lower and the cost of treatment is high. Moreover, because the number of patients with early-stage chronic liver disease is huge, it is impractical to treat all of them. It is therefore important to develop a new machine learning method that uses electronic patient records to identify patients at high risk of progressing from early-stage to advanced-stage chronic liver disease, so that they can be treated in time. This would help to treat patients and reduce the burden of liver disease.

With the growth of transdisciplinary research, applying artificial intelligence to risk prediction from clinical data has become an important research topic. Chronic liver disease prediction methods based on traditional machine learning (e.g. random forest and logistic regression) and deep learning (e.g. recurrent neural networks and gated recurrent units) have been proposed. Experimental results have shown that these methods outperform existing clinical scoring tools. Although the reported results are encouraging, these methods have two limitations. First, the trained predictor generalizes imperfectly: its accuracy drops when it is used at other hospitals. Second, once the predictor has been trained, it is fixed and applied unchanged to all future test data, so it cannot easily adapt to new and unseen data. Because prediction uses clinical time-series data, each new patient examination can provide useful information about the test data distribution, which a fixed predictor cannot exploit.

To overcome the limitations of existing methods, this project aims to adapt the recently proposed test-time training strategy. Unlike existing test-time training methods, which are designed for imaging modalities, this project aims to propose a new test-time training algorithm for clinical time-series patient records that accounts for irregularly sampled multivariate data and missing data. The key research issue is how to design a self-supervised auxiliary (pretext) task that can be learned from a single test sample of clinical multivariate time-series data with missing values. To do this, the project will explore and utilise health status information from electronic patient records to design the imputation, augmentation and auxiliary tasks. Specifically, this project aims to develop (i) a health-status-aware algorithm that performs imputation and augmentation simultaneously and (ii) time-series auxiliary tasks, based on the uncertainty of the augmented data and the prediction uncertainty (risk) on the augmented data, for self-supervised learning of clinical time-series data. Finally, a new prediction/classification method will be developed to identify high-risk individuals with early-stage chronic liver disease. To the best of our knowledge, we will be the first research group to study test-time training on clinical time-series data.
Effective start/end date: 1/01/23 → …

