A Stage-varying Methodology That Can Reconstruct the Dynamic Gene Regulations During the Process of Cancer Progression From Stage-course Transcriptional Data

  • ZHU, Hailong (PI)

Project: Research project

Project Details


Gene regulatory networks (GRNs) have shown great promise for modeling the transcriptional regulation of biological systems. The development of GRNs is a focal point of computational biology and systems biology.

GRN modeling usually adopts either a static or a dynamic approach, depending on the nature of the problem and the information available. The static approach aims to generate a pooled-average model of regulatory dependencies based on correlation or mutual information among genes. It is quite effective for small datasets and efficient for large ones, but it cannot describe dynamic mechanisms. The dynamic approach, on the other hand, is designed to model the details of dynamic regulation from time-series transcriptional data (data acquired at a series of time points) of a biological process. However, current GRN methodologies are poorly suited to modeling the development of human diseases such as cancer, since time-series transcriptional data are rarely available.
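
As an illustration of the static, correlation-based approach described above, the following is a minimal sketch rather than the project's actual method. The function name `correlation_network` and the cutoff `threshold=0.5` are assumptions made for this example; the input is assumed to be a samples-by-genes expression array.

```python
import numpy as np


def correlation_network(expression, threshold=0.5):
    """Build a static, undirected network from pooled expression data.

    expression: array of shape (n_samples, n_genes).
    threshold:  hypothetical cutoff on |Pearson correlation| (an assumption).
    Returns a symmetric boolean adjacency matrix of shape (n_genes, n_genes).
    """
    # Pearson correlation between genes (columns), pooled over all samples.
    corr = np.corrcoef(expression, rowvar=False)
    # Keep an edge wherever the absolute correlation reaches the cutoff.
    adjacency = np.abs(corr) >= threshold
    # A gene is not treated as its own regulator.
    np.fill_diagonal(adjacency, False)
    return adjacency
```

Because every sample is pooled into one correlation matrix, the result is a single network with no notion of time or stage, which is the limitation noted above.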

In this proposal, we propose a stage-varying approach that captures dynamic gene regulation during cancer progression while overcoming the dependence on time-series data. Our approach is motivated by the fact that most profiling data from cancer samples are annotated with cancer staging information. Because the clinical cancer stage is a standard and stable indicator of cancer development, it can be used to align data from different centers into stage-course data.
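
The alignment idea can be sketched as pooling samples by their stage annotation. This is only an illustrative sketch: the stage labels in `STAGE_ORDER`, the record layout `(center, stage, expression)`, and the function name `build_stage_course` are all assumptions, and the sketch ignores any cross-center normalization that real data would likely need.

```python
from collections import defaultdict

# Assumed clinical stage labels, in order of progression (hypothetical).
STAGE_ORDER = ["I", "II", "III", "IV"]


def build_stage_course(samples):
    """Pool expression profiles from several centers by annotated stage.

    samples: iterable of (center, stage, expression) records.
    Returns a list of (stage, profiles) pairs ordered by STAGE_ORDER,
    skipping stages that have no samples.
    """
    pooled = defaultdict(list)
    for _center, stage, expression in samples:
        if stage not in STAGE_ORDER:
            raise ValueError(f"unknown stage label: {stage!r}")
        pooled[stage].append(expression)
    return [(stage, pooled[stage]) for stage in STAGE_ORDER if stage in pooled]
```

The ordered list plays the role that a time axis plays in time-series data: each entry is one point along the stage course.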

Our stage-varying GRN is based on three assumptions: the stagewise steady-speed assumption, the continuity assumption, and the i.i.d. (independent and identically distributed) assumption on cancer progression time. We show that the stage-varying model is composed of a series of cascaded networks for the different cancer stages. As in other GRNs, each node denotes a regulator or target, each edge represents a regulatory direction, and each connection parameter represents the regulatory strength.

Development of the model includes identification of regulators and optimization of network connections. The regulators, including transcription factors (TFs) and regulatory circuits, will be initially identified using correlation or mutual information and then refined during optimization of the model. The optimization employs LASSO (least absolute shrinkage and selection operator) together with a bootstrapping strategy to control the flexibility and sparseness of the model and to prevent overfitting during training.
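
One common way to combine LASSO with bootstrapping is to refit the sparse regression on resampled data and keep regulators that are selected consistently. The sketch below shows that idea for a single target gene; it is not the project's actual optimization procedure. The function names, the penalty `alpha`, and the number of resamples `n_boot` are assumptions, and the LASSO is solved here with a plain coordinate-descent loop for the objective (1/2n)·||y − Xb||² + alpha·||b||₁.

```python
import numpy as np


def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha (the LASSO proximal step)."""
    return np.sign(value) * max(abs(value) - alpha, 0.0)


def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate-descent LASSO for (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Remove regulator j's current contribution, then refit it alone.
            residual += X[:, j] * beta[j]
            rho = X[:, j] @ residual / n
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * beta[j]
    return beta


def bootstrap_selection_frequency(X, y, alpha, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each regulator is selected.

    X: candidate regulator expression, shape (n_samples, n_regulators).
    y: target gene expression, shape (n_samples,).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample with replacement
        Xb = X[idx] - X[idx].mean(axis=0)      # center so no intercept is needed
        yb = y[idx] - y[idx].mean()
        counts += lasso_cd(Xb, yb, alpha) != 0.0
    return counts / n_boot
```

Regulators with a selection frequency near 1 are stable candidates for network edges, while those near 0 are pruned; a larger `alpha` yields a sparser network.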

The main novelties of our approach are as follows: 1) for the first time, it turns the time-varying regulatory network into a stage-varying network, making dynamic GRNs applicable to modeling cancer progression; 2) it is based on the stagewise steady-speed assumption instead of the steady-state assumption. We show that the former is a generalization of the latter, and this relaxation allows a more flexible model to be produced; 3) it assumes that gene expression varies continuously during cancer progression. We show that the continuity assumption is more natural than the discontinuity assumption used in the stagewise pooled-average approach. The continuity assumption allows us to connect the networks of different stages into an integrated model.
Effective start/end date: 1/11/11 to 30/04/15

