A systems biology approach to construct the gene regulatory network of systemic inflammation via microarray and databases mining
© Chen et al; licensee BioMed Central Ltd. 2008
Received: 22 May 2008
Accepted: 30 September 2008
Published: 30 September 2008
Inflammation is a hallmark of many human diseases. Elucidating the mechanisms underlying systemic inflammation has long been an important topic in basic and clinical research. When the primary pathogenetic events remain unclear because of their immense complexity, constructing and analyzing the gene regulatory network of inflammation can be the best way to understand the detrimental effects of disease. However, it is difficult to recognize and evaluate relevant biological processes from the huge quantities of experimental data. It is hence appealing to find an algorithm that can generate a gene regulatory network of systemic inflammation from high-throughput genomic studies of human diseases. Such a network will be essential for extracting valuable information from the complex and chaotic network under diseased conditions.
In this study, we construct a gene regulatory network of inflammation using data extracted from the Ensembl and JASPAR databases. We also integrate and apply a number of systematic algorithms, such as a cross correlation threshold, the maximum likelihood estimation method and the Akaike Information Criterion (AIC), to time-course microarray data in order to refine the genome-wide transcriptional regulatory network in response to bacterial endotoxins in the context of dynamically activated genes, which are regulated by transcription factors (TFs) such as NF-κB. This systematic approach is used to investigate the stochastic interactions represented by the dynamic leukocyte gene expression profiles of human subjects exposed to an inflammatory stimulus (bacterial endotoxin). Based on the kinetic parameters of the dynamic gene regulatory network, we identify important properties of the immune system (such as susceptibility to infection), which may be useful for translational research. Finally, the robustness of the inflammatory gene network is inferred by analyzing the hubs and "weak ties" structures of the gene network.
In this study, data mining and dynamic network analyses were integrated to examine the gene regulatory network in the inflammatory response system. Compared with methodologies previously reported in the literature, the proposed gene network perturbation method shows a great improvement in analyzing systemic inflammation.
Recently, microarray technology has rapidly produced vast catalogs of gene expression activities. The immense volume of data highlights the need for a systematic tool to identify and analyze the underlying gene regulatory networks [1, 2]. Several computational methods for inferring transcriptional regulatory networks from experimental microarray data in Saccharomyces cerevisiae have been published [3, 4]. Studies of the genome-wide transcriptional responses of inflammation usually focus on the known functional interactions of the master switch proteins, such as Rel or NF-κB proteins [5–7]. The identification of NF-κB as a key player in the pathogenesis of inflammation suggests that NF-κB-targeted therapeutics might be effective in treating diseases such as rheumatoid arthritis (RA), a well-known disease in which the inflammatory response causes the primary damage. However, inflammation is usually a life-preserving response, as reflected by the increased risk of grave infections in people with genetic deficiencies in key components of the inflammatory signaling pathways.
Although inflammation is a hallmark of many human diseases [10, 25], few studies have evaluated the genome-wide responses induced by systemic inflammation in humans. DNA microarrays allow the semi-quantitative measurement of gene expression programs in great depth and on a broad scale. However, it is a challenge to recognize and evaluate relevant biological processes from vast quantities of experimental data. Recently, systems biology has gained much attention due to emerging experimental and computational methods [1, 2]. Systems biology is the coordinated study of biological systems by (1) investigating the components of networks and their interactions, (2) applying experimental high-throughput and whole-genome techniques, and (3) integrating computational methods with experimental efforts. It is therefore appealing to adopt a systems biology approach to study the mechanism of inflammation via high-throughput transcriptomic studies of human disease. Such a systematic approach can provide insights into the regulation of immune cell activities, the tolerance of the innate immune system, and the susceptibility to infection in humans. Based on a structured network-based approach and a statistical likelihood method, a network-based analysis of systemic inflammation in humans has been performed to evaluate genome-wide transcriptional responses in the context of known functional relationships among proteins, small molecules, and phenotypes [10, 25]. In that analysis, the genome-wide interaction network is probed to identify functional modules that are perturbed in response to endotoxin exposure. A dynamic Bayesian network approach has also been developed to predict gene regulatory networks from time course expression data.
Gene expression is transcriptionally controlled by inducible transcription factors. The transcription factor NF-κB in particular is pivotal in the regulation of inflammation. For example, in an unstimulated macrophage, NF-κB is kept inactive by being retained in the cytoplasm through interaction with inhibitory proteins known as IκB. Cell stimulation by bacterial endotoxin triggers a signaling pathway that results in the degradation of IκB, leading to nuclear translocation of NF-κB and activation of the transcription of various proinflammatory cytokines (IL1A, IL1B, TNFA, IL6, IL8, etc.). Much crosstalk among the signaling pathways has been recognized. It is now known that the biological functions of IL1A and TNFA overlap and complement each other [4, 14]. Thus, blocking only one mediator may not effectively reduce the overall inflammatory response. Both IL1B and TNFA act at an early stage of inflammation, and the use of their inhibitors at a later stage may not be able to reverse the most damaging events that they initiate. As a result, IL1B and TNFA may not represent the best targets for intervention in the systemic inflammatory response. In another study, TNFA and IL1 were shown to have positive feedback loops to TNFR and IL1R, respectively. On the other hand, NF-κB also initiates the transcription of an inhibitory protein (A20), which can inactivate NF-κB by suppressive phosphorylation in IKK. The other important receptors in the immune system are members of the TLR family (TLR2 and TLR4), which recognize pathogens by means of conserved structural features of the microbes, such as LPS for Gram-negative bacteria. These receptors activate the MyD88/IRAK signaling cascade, which bifurcates and leads to NF-κB and c-Jun/ATF2/TCF activation.
Because microarray data contain vast catalogs of dynamic expression patterns of the activated genes, we need systematic tools to identify the interaction architecture and the dynamics of the underlying gene networks. Indeed, the system identification problem for the underlying dynamic gene networks falls naturally into the category of reverse engineering: a complex genetic network underlies a massive set of gene expression data, and the task is to infer the connectivity of the gene circuit through a dynamic gene regulatory model. Understanding complex gene networks therefore requires a systematic approach that integrates microarray data and dynamic modeling. Such an approach has to include computational dynamic modeling coupled with microarray data, data mining, a dynamic view of rapid responses and a structural view of the network arising from high-throughput analysis of the interacting species. To this end, a dynamic Bayesian network (DBN) method has been developed to predict gene regulatory networks from time series data. However, that method has not been combined with other network algorithms and knowledge-based databases, and it has two fundamental problems that greatly reduce its effectiveness. The first is its inherently low prediction accuracy, and the second is its excessive computation time.
Since the identification of perturbed biological networks under the effect of bacterial endotoxin is an important topic in basic and clinical research, it is imperative to conduct a systematic analysis based on the expression profiles of microarray data. An approach that combines genome-wide expression analysis with a clustering method has been introduced to identify functional networks, using the GRAM (Genetic Regulatory Modules) algorithm to provide biological insights into gene regulatory networks. Because clustering algorithms identify sets of co-expressed and potentially co-regulated genes from gene expression data, they are better suited to finding gene modules, that is, sets of co-expressed genes whose promoter regions are bound by the same set of transcription factors. They are therefore not suitable for constructing transcriptional regulatory networks as dynamic models. It is hence essential to provide a new way to identify the perturbed biological networks. To achieve this, systems biology and computational biology methods need to be employed to describe the biological functions from a dynamic systems perspective [20, 21].
In the present study, a systems biology approach is proposed to achieve a gradual refinement of the inflammatory regulatory network. We first use database mining to construct a rough gene regulatory network of inflammation from information extracted from the Ensembl database (http://www.ensembl.org/index.html) and the JASPAR database (http://jaspar.genereg.net/). We then build a dynamic regulatory model according to the rough gene network, taking into account the time delay between each regulatory gene and its target gene. Based on the dynamic regulatory model and the microarray data in [10, 25], a maximum likelihood method is used to identify the regulatory parameters of the upstream regulatory genes for each target gene. Finally, we prune away the insignificant regulatory genes using the AIC model order detection method from system identification to refine the gene regulatory network of the inflammatory response to bacterial endotoxin. By comparing it with the normal gene regulatory network, we obtain the perturbed gene network and use it to analyze the effect of the inflammatory stimulus on the immune system. The hubs and "weak ties" are also discussed in relation to the robustness of the inflammatory gene network.
Construction of Rough Gene Regulatory Network of Inflammation
A total of 49 genes selected from the published literature
ATP-binding cassette, sub-family F (GCN20), member 1
Adenosine A2a receptor
Adenosine A3 receptor
Interleukin-1 receptor-associated kinase 1
integrin, beta 2
Acyloxyacyl hydrolase (neutrophil)
mitogen-activated protein kinase 10
Nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 3
Chemokine (C-C motif) ligand 18 (pulmonary and activation-regulated)
Nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (p105)
Chemokine (C-C motif) receptor 7
nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha
CCAAT/enhancer binding protein (C/EBP), delta
Nuclear factor related to kappa B binding protein
Chemokine (C-X-C motif) ligand 14
Nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor)
chemokine (C-X-C motif) ligand 2
Phospholipase A2, group IVB (cytosolic)
Cytochrome b-245, beta polypeptide (chronic granulomatous disease)
phospholipase A2-activating protein
V-fos FBJ murine osteosarcoma viral oncogene homolog
G protein-coupled receptor 132
kallikrein 7 (chymotryptic, stratum corneum)
Histone deacetylase 4
Small inducible cytokine subfamily E, member 1 (endothelial monocyte-activating)
Histone deacetylase 5
Tachykinin receptor 1
Histone deacetylase 7A
Toll-like receptor adaptor molecule 2
Histone deacetylase 9
toll-like receptor 4
toll-like receptor 7
tumor necrosis factor
Tumor necrosis factor receptor superfamily member 1A precursor
Toll interacting protein
Interleukin-1 receptor type I precursor
Our goal is to select the candidate regulators (i.e. TFs) of 49 target genes in inflammatory response to construct the rough gene regulatory network of inflammation by linking these target genes to their regulators.
We explore the Ensembl database (http://www.ensembl.org/index.html) to retrieve the promoter sequences of the 49 target genes and then conduct sequence similarity analysis to identify candidate regulators of these target genes in JASPAR (http://asp.ii.uib.no:8090/cgi-bin/jaspar2005/jaspar_db.pl), which is a high-quality transcription factor database. At this stage, we hypothesize that if a TF is selected by the JASPAR predictions under our criteria, the gene encoding that TF can be considered a candidate regulator of the target gene.
After this step, we obtain a set of candidate regulators from the JASPAR analysis [see Additional file 2, column (A)]. However, there are still many false positive errors in our hits because the outcome has listed all possible regulators in conditions beyond inflammatory response. Some pruning methods based on microarray data of inflammatory response are necessary.
The pruning procedure is described after step 5.
We screen and select potential regulators from the JASPAR hits by applying a cross correlation threshold to the gene expression data. This is based on the assumption that the expression of a target gene may be correlated with that of its upstream regulators, with or without a time delay. We compute the cross correlation between each target gene and each of its candidate regulatory genes separately. The cross correlation values are then used to identify the candidate regulators, on the assumption that a regulatory gene and a target gene have a positively (or negatively) correlated temporal relationship if the target gene's expression profile is positively (or negatively) correlated with the regulatory gene's profile, with or without a time lag.
Here we make the first selection from the candidate regulators in Step 3. If the cross correlation between a candidate regulator and the target gene is greater than 0.46451, the regulator is retained as a candidate regulator for that target gene. After the potential regulators have been selected by the cross correlation threshold, the target genes and their candidate regulators are integrated to construct a preliminary gene regulatory network of the inflammatory response. The results of the first selection are listed in the supplemental material [see Additional file 2, column (B)].
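The selection step above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the use of the Pearson coefficient on the overlapping samples, the comparison on the absolute value (to admit both positive and negative regulation), and the default of four half-hour steps for the maximal lead or lag are all assumptions. Only the cutoff 0.46451 is taken from the text.

```python
import numpy as np

THRESHOLD = 0.46451  # cutoff quoted in the text


def max_lagged_correlation(regulator, target, max_lag):
    """Return (correlation, lag) with the largest absolute correlation.

    Lags run from -max_lag to +max_lag time steps. A positive lag means the
    regulator profile leads the target profile by that many steps.
    """
    regulator = np.asarray(regulator, dtype=float)
    target = np.asarray(target, dtype=float)
    best_corr, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = regulator[: len(regulator) - lag], target[lag:]
        else:
            x, y = regulator[-lag:], target[: len(target) + lag]
        # Skip degenerate overlaps where the correlation is undefined.
        if len(x) < 3 or x.std() == 0 or y.std() == 0:
            continue
        corr = np.corrcoef(x, y)[0, 1]
        if abs(corr) > abs(best_corr):
            best_corr, best_lag = corr, lag
    return best_corr, best_lag


def select_candidates(target, regulators, max_lag=4, threshold=THRESHOLD):
    """Keep regulators whose best lagged |correlation| exceeds the threshold.

    `regulators` maps a (hypothetical) regulator name to its expression profile.
    """
    selected = {}
    for name, profile in regulators.items():
        corr, lag = max_lagged_correlation(profile, target, max_lag)
        if abs(corr) > threshold:
            selected[name] = (corr, lag)
    return selected
```

The lag returned for each retained regulator can then serve as the time delay used for that regulator in the dynamic model.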
Pruning the Preliminary Gene Regulatory Network via a Dynamic Model
By this point we have constructed a preliminary network via the first five selection steps using statistical inference. However, we have yet to consider the dynamic properties of this network. To take the dynamic parameters into account, we apply the Akaike Information Criterion (AIC) to make a more comprehensive selection. The AIC algorithm is denoted as Step 7 in Figure 1. A dynamic regulatory model is proposed to describe the gene regulatory network of inflammation parsimoniously. The time delays from the regulators to their target genes, which have been detected by the cross correlation algorithm, are included in the dynamic regulatory model to mimic the delay caused by the transduction relay of the metabolic and signaling pathways in the real transcriptional regulatory process. Details of the pruning are presented in the following paragraph and in the Materials and Methods section.
In this study, based on the possible interactions in the preliminary gene regulatory network of inflammation [10, 24, 25] obtained in the previous sections, a dynamic regulatory model is developed for the transcription of each target gene of interest in systemic inflammation. This model describes how the upstream regulatory genes control their target genes to produce the output mRNA expression through the transcriptional regulatory network. From the rough gene network obtained through database-predicted information, we construct a dynamic regulatory model for each target gene of systemic inflammation in humans. Then, according to the microarray gene expression data, we identify the number of connections in the dynamic regulatory model of the rough gene network in the inflammatory system. Based on the degree of interaction in the regulatory network, we prune the preliminary gene regulatory network of inflammation one target gene at a time via the Akaike Information Criterion (AIC). The pruning procedure used to obtain a refined gene regulatory network (see Figure 1) is given in the following steps.
By combining the maximum likelihood parameter estimation method with the most parsimonious model order detection using the Akaike Information Criterion (AIC) (see Materials and Methods), we can prune the rough gene network into a more refined gene network through the most parsimonious gene transcription regulatory model in equation (1); that is, the insignificant interactions (those with small b_i) can be deleted by AIC. Treating the upstream regulatory genes as target genes in turn, we can trace back their own upstream regulatory genes by a similar construction procedure. Iterating in this way, we can construct the whole gene regulatory network of systemic inflammation in the innate immune system. The results of the selection are listed in the supplemental material [see Additional file 2, column (C)].
Construction of inflammatory gene network in immune system
We further lay out the perturbed inflammatory gene regulatory network to locate the significant differential connections of the key components. Many differences between the normal and inflammatory conditions can be observed in the perturbed gene network. In Figure 5, the gene network of connections found only in the normal condition contains 64 nodes with 131 edges, and 4 hubs (FOXD1, SPIB, YY1 and TLR4) appear to be highly connected. In Figure 6, the gene network of connections found only in the inflammatory condition contains 70 nodes with 159 edges, and there are clearly 3 hubs (FOXL1, TFAP2A and SOX9) within this perturbed network. It is noteworthy that these highly connected hubs have been mentioned in several previous studies. For example, TFAP2A is inactivated and SOX9 is inhibited in response to inflammation, as shown in Figure 6. FOXL1 is dramatically induced during hepatic stellate cell activation, and preliminary experimental data indicate that FOXL1 is involved in the regulation of the adhesion molecule ICAM-1, an important mediator of neutrophil recruitment in liver injury. Current investigation is focused on delineating the mechanism by which FOXL1 regulates inflammatory signaling in the liver.
It has been shown that a robust gene network can form a scale-free network, i.e. genes prefer to form links with other genes that already have the highest number of links [36, 37]. Scale-free gene networks can tolerate the random removal of nodes but are vulnerable to the loss of highly interactive hubs [36, 37]. Targeting highly connected hubs may therefore have a lethal effect on the system's behavior. In the inflammatory gene network shown in Fig. 3(A), genes such as NF-κB, TNF-α, RELA, etc. can be considered highly connected hubs of signal transduction. If they are inactivated by mutation or disease, the inflammatory gene network will eventually collapse. To overcome this lethal outcome, "weak linkage" architectures have evolved by natural selection to improve the robustness of gene regulatory networks. We argue that such versatile mechanisms underlie the essential regulatory processes of a robust gene network. As a result, some connections can easily be removed from, and others easily added to, the gene regulatory network. This concept is also known as "weak ties" in network theory. "Weak ties" structures in biological networks enable the removal of old processes and the addition of new processes to the existing core processes, improving information exchange and signal transduction through common versatile mechanisms that operate on diverse inputs in response to various stimuli. As a consequence, "weak ties" can improve a network's robustness against external stimuli. The connections of the perturbed gene network in Fig. 4(a) are present only in the normal condition. The perturbed gene network in Fig. 4(b) can hence be considered as the additional connections in the inflammatory gene network. In response to bacterial endotoxin, the connections in Fig. 4(a) are removed and the connections in Fig. 4(b) are added.
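The comparison described above amounts to set differences between two edge lists followed by a degree count. The following Python sketch illustrates the idea under stated assumptions: each network is represented as a collection of (regulator, target) pairs, a gene's connectivity is the number of edges that touch it, and the hub cutoff `min_degree` is a hypothetical parameter rather than a value taken from this section.

```python
from collections import Counter


def perturbed_edges(normal_edges, inflamed_edges):
    """Split edges into those lost and those gained under inflammation.

    Returns (only_in_normal, only_in_inflamed) as sets of
    (regulator, target) pairs.
    """
    normal, inflamed = set(normal_edges), set(inflamed_edges)
    return normal - inflamed, inflamed - normal


def find_hubs(edges, min_degree=5):
    """Return {gene: degree} for genes touched by at least min_degree edges.

    min_degree is an assumed, adjustable cutoff for calling a gene a hub.
    """
    degree = Counter()
    for regulator, target in edges:
        degree[regulator] += 1
        degree[target] += 1
    return {gene: d for gene, d in degree.items() if d >= min_degree}
```

Applying `find_hubs` separately to the two sets returned by `perturbed_edges` gives the hubs of the normal-only and inflammation-only networks.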
Apparently, this agrees with the concept of the so-called "strength of weak ties" in network theory, where the most important interactions and information exchanges sometimes occur via nodes from otherwise unrelated networks. This implies that non-hubs may play a pivotal role in the gene regulation [36, 37]. Similarly, the "weak ties" architecture in NF-kB gene network in inflammatory condition is shown in the removal and addition of connections of gene network in Fig. 6(a) and Fig. 6(b).
In summary, the regulators of target genes are first selected by JASPAR, then truncated by the threshold of Cross correlation and finally pruned by AIC via microarray data and a dynamic model. We combine several algorithms and tools to improve the performance of the gene network construction of the target inflammatory system. All the data sources are independently produced by various research groups and the results are verified with more independent studies published previously. It is clear that the top-down procedures can predict the target genes and their candidate regulatory transcription factors well. More biological insight into the perturbed inflammatory network is given in the Discussion section below and details of the proposed gene regulatory network construction algorithm are shown in Material and Methods.
For Step 7, the identified time delays and the estimated parameters are shown in the supplemental material [see Additional file 5 and 6]. Although we consider the effect of the time lag τ_i in our model, it is plausible that not all regulators act on transcription with a delay. Regulation in inflammation may act so swiftly that the parameter τ_i cannot be detected (i.e., it is less than one time unit of the microarray data, or one half-hour). However, several regulators of IL8 do act with a time lag, such as SOX9, MEF2A, NFIL3, ELK1, FOXF1, FOXD1, GATA2, FOXI1, REL and RELA. This is because IL8 has a more complicated regulatory mechanism that passes through other pathways with considerable delay. The dynamic model assumes that the expression profile of a target gene results from the kinetic activity of one or more specific regulators, which bind to the promoter site of the downstream target gene and initiate its transcription so that it can exert its effect on the inflammation network. In other words, it is possible to generate the expression profile of a target gene from the expression profiles of its upstream transcription factors using the dynamic regulatory model and its kinetic parameters in equation (1). The continuous gene expression profiles in Figure 11 [also see Additional file 5] are generated by the identified dynamic model for all target genes and their corresponding regulators, and they fit the microarray data reasonably well. Dynamic modeling of biological systems, including genetic networks and cell regulatory networks, has been applied in functional analyses for a long time. However, very few other models have included the time delay parameter, which is comprehensively factored into our stochastic dynamic model.
The findings of this study demonstrate that the gene regulatory network of systemic inflammation in humans can be refined efficiently via microarray data, and that the signal transduction delay in the transcriptional regulatory process can be mimicked.
Combining the cross correlation selection algorithm and the Akaike Information Criterion, we created a novel dynamic modeling algorithm to trim down the tangled regulatory genetic network of the human inflammatory system without loss of biological meaning. The algorithm presented here can model all combinations of target genes and regulators and selects the combination for which the dynamic regulatory model best predicts gene expression. Instead of attempting to model the whole complicated regulatory process, with its high risk of incorrect prediction, our dynamic model focuses on a concise set of target genes and gives a more reliable outcome. Iterating in this way, we can eventually construct the whole gene regulatory network of systemic inflammation in response to bacterial endotoxin from microarray data.
An essential problem in applying multivariate procedures to microarray gene expression data, as noted in recent publications, is the reproducibility of the complex constructions that result from such analyses. To confirm the reproducibility of the proposed method, we used our algorithm to rebuild the gene regulatory network from microarray data in a previously published study. That study reported 19 genes with significant inflammatory responses, and we reconstructed the inflammatory gene network based on these 19 genes. Comparing the reconstructed inflammatory gene regulatory network with the one presented in the text, we found both similarities and differences. The highly connected hubs shared by both networks are GATA2, AML1 (RUNX1) and YY1, each of which has more than 5 connections in both perturbed inflammatory networks. However, because that data set lacks expression data for some specific genes, we were unable to verify some of the highly interactive genes in the text (i.e. FOXL1, TFAP2A and SOX9). Interestingly, we also found hubs that are present only in the reconstructed network and not in the text, such as GATA3 and FPR, which would be involved in host defense against bacterial infection and in the clearance of damaged cells. New hubs were discovered with these 19 candidate genes because some of them are not included in the original 49 genes. Data pools from different publications may differ in experimental conditions, research topics and technology platforms. Since the candidate target genes chosen here differ from those in the text, the computational results are not expected to be identical [see Additional file 7].
In this study, we use a multi-input/single-output regulatory model (i.e. multiple regulators and one target gene) to describe our gene regulatory system dynamically, so that it can mimic real gene regulation in response to inflammation. The model can infer the regulatory relationships and the time lags between upstream regulators and downstream target genes from time-series microarray data. Zou et al. used the concept of time delay only in a static analysis of the gene network, without applying it to dynamic modeling to mimic actual gene regulatory behavior. Furthermore, an apparent shortcoming of the static analysis is its limitation to a single-input/single-output system (i.e. one regulator and one target gene). Such single-input/single-output systems rarely exist in actual gene regulation.
While our method achieves a significant improvement in network construction, this study still has two drawbacks. First, although we present a multi-input/single-output system, it still cannot represent the actual biological conditions, which in most situations involve multi-input/multi-output systems. This means that when using AIC to trim the initial tangled gene regulatory network, we should prune all target genes simultaneously rather than separately. However, such an approach would increase the computational complexity combinatorially and thus become computationally infeasible. The second drawback, shared by all published algorithms for inferring transcriptional regulatory networks in inflammation, including this study, is that the candidate regulators are selected from a pool of potential regulators that is typically defined by computational prediction, either by sequence similarity analysis or by other genome annotation methods. If a true regulator is not included in the pool, it will inevitably escape identification by the modeling approach. This type of error is likely to become a very significant problem for the poorly characterized genome of a model organism.
Our dynamic modeling represents a new approach to the study of the gene regulatory network in the inflammatory response. It is based on database mining to construct an inflammatory regulatory network. It is also a systems biology approach because we process the complex regulatory network of numerous genes and regulators from various data sources at the same time. The pruning algorithm presented here can also be extended in the future to global gene regulatory network analysis beyond the inflammatory system. The curve fits generated by the proposed method indicate that its performance is satisfactory. By comparison with the normal gene regulatory network, we obtain the perturbed gene network and use it to analyze the effect of the inflammatory stimulus on the immune system. The hubs and "weak ties" are also discussed in relation to the robustness of the inflammatory gene network. The proposed gene regulatory network is also supported by published evidence in the literature. In future research, we will investigate the dynamic networks of host-pathogen interaction in an animal model organism. We will also consider extending the algorithm to the identification and analysis of cross-talking transcriptional regulatory networks.
Materials and methods
We used previous microarray data [10, 25] as our mRNA expression profiles. Gene expression in whole blood leukocytes was determined at 0, 2, 4, 6, 9 and 24 hours after the intravenous administration of bacterial endotoxin to four healthy human subjects. In those experiments, four additional subjects were studied under identical conditions but without endotoxin administration. The infusion of endotoxin activates innate immune responses and presents physiological responses of brief duration. It should be noted that there is an initial proinflammatory phase and a subsequent counter-regulatory phase, with resolution of virtually all clinical perturbation within 24 hours.
Construction of Rough Gene Networks of Systemic Inflammation
Here M is the maximal time lead or lag between any two genes. Both leads and lags are considered because we initially do not know which genes are targets and which are regulators. Since each time interval in h is a half-hour, we allow a lead or lag of up to 2 hours and compute the correlation between a gene and a TF for all possible time lags or leads of less than 2 hours for the regulatory response.
Finally, we take the maximum correlation between two genes over the different time delays or leads as their correlation, and we rank these correlations for all regulatory genes in Figure 2. From the ranks we obtain the distribution of the correlations, and we then choose a threshold for a possible regulatory relationship between regulators and their target genes (see Figure 2). In this study, a correlation larger than 30% (or 0.46451) is selected as the threshold for possible regulators, and it is used to remove implausible regulators from the pool of regulators suggested by JASPAR via DNA sequence similarity analysis. We then link the remaining regulators selected by the cross correlation threshold with their target genes to construct a rough gene regulatory network. Once the rough gene regulatory network of the inflammation system has been constructed in this way, it is modeled by the dynamic equation in (1) for further pruning. The kinetic parameters of the regulatory dynamic model are identified from microarray data by maximum likelihood parameter estimation. After parameter identification, the insignificant interaction coefficients of the dynamic model are pruned by the most parsimonious Akaike Information Criterion (AIC) to refine the gene regulatory network in the inflammatory condition. The possible regulators selected by the JASPAR algorithm are thus pruned twice in our method, once by the cross correlation threshold and again by AIC via the dynamic model and microarray data. The details are described below.
Constructing a dynamic model for gene regulatory network via microarray data
where ϕ[t] denotes the regression vector, which can be obtained from microarray data, and θ ∈ R^p denotes the parameter vector of dimension p in regression equation (4).
where y[t + k] and ϕ[t + k] are the k-th elements of Y and Φ in (6), respectively.
where the estimated parameters are obtained from (11) and the variance of the residual noise is obtained from (12).
Iteratively, one target gene at a time, we can construct the overall dynamic equations of the transcriptional regulatory network of inflammation, which are interconnected through the regulations of TFs.
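For a single target gene, the estimation step can be sketched as below. The model form is an assumption made for illustration, since the dynamic equation is not reproduced here: a first-order discrete-time model x[t+1] = a·x[t] + Σ b_j·y_j[t] + k + noise, with hypothetical function names. Under Gaussian white noise, the maximum likelihood estimate of the parameters coincides with the least-squares solution, and the maximum likelihood estimate of the noise variance is the mean squared residual.

```python
import numpy as np

def build_regression(target, regulators):
    """Stack the regression for one target gene.

    target: array of shape (T,), expression profile x[t] of the target gene.
    regulators: array of shape (T, n), expression profiles y_j[t] of its regulators.
    Returns Phi with rows [x[t], y_1[t], ..., y_n[t], 1] and Y with entries x[t+1].
    """
    target = np.asarray(target, dtype=float)
    regulators = np.asarray(regulators, dtype=float)
    ones = np.ones(len(target) - 1)
    Phi = np.column_stack([target[:-1], regulators[:-1], ones])
    Y = target[1:]
    return Phi, Y

def ml_estimate(Phi, Y):
    """Least-squares parameter estimate and ML estimate of the noise variance."""
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    resid = Y - Phi @ theta
    sigma2 = float(resid @ resid) / len(Y)
    return theta, sigma2

# Noise-free synthetic data generated from known (assumed) parameters.
rng = np.random.default_rng(0)
T = 30
regs = rng.normal(size=(T, 2))
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.5 * x[t] + 0.8 * regs[t, 0] - 0.3 * regs[t, 1] + 0.1
Phi, Y = build_regression(x, regs)
theta_hat, sigma2_hat = ml_estimate(Phi, Y)
```

Repeating this fit for each target gene gives one set of coefficients per gene, which together form the preliminary network.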
Since some interaction coefficients of the gene regulatory network in (13) are insignificant, they should be pruned by the AIC, which favors parsimonious models. This is discussed in the next section.
Pruning the Gene Regulatory Network
First, in this study, we use the JASPAR database to identify plausible TF binding motifs and select candidate regulators from the pool produced by DNA sequence similarity analysis. A rough gene regulatory network of inflammation is then constructed by linking target genes to those regulators whose cross correlation exceeds the 30% threshold (see Figure 2). We then use the maximum likelihood estimation method to estimate the parameters of the dynamic model, giving a preliminary gene regulatory network of the inflammatory system.
Although maximum likelihood estimation quantifies the regulatory ability of every candidate regulator on its target genes, it does not tell us how large a regulatory ability must be for the candidate to be regarded as a true regulator. To determine whether a regulator is significant, we use a statistical approach based on model validation to evaluate the significance of the model parameters and prune the preliminary gene network. In this study, the Akaike Information Criterion (AIC) is employed to select the model order (the number of model parameters) and thereby determine which parameters of our dynamic model are significant.
where ŷ denotes the estimated expression profile of the target gene, i.e. ŷ = ϕ·θ̂.
This is a tradeoff between residual variance and model order. Minimizing equation (14) yields an estimate of the true model order (i.e. the number of regulators of the target gene) of the gene regulatory system.
After selecting the p parameters that minimize the AIC, we can determine whether each candidate regulatory TF is significant or merely a false positive, and then construct a refined gene regulatory network model for inflammation. Finally, evidence from previous studies provides an important validation of our refined gene regulatory network.
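The second pruning step can be sketched as a search over regulator subsets. Several things here are assumptions: the criterion is taken in the common form AIC(p) = log(residual variance) + 2p/N, the self-regulation and constant terms are always kept, the search is exhaustive (feasible only for a small candidate set, since it fits 2^n models), and all names are hypothetical.

```python
import itertools
import numpy as np

def aic(Phi, Y):
    """AIC in the assumed form log(residual variance) + 2 * p / N."""
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    resid = Y - Phi @ theta
    sigma2 = max(float(resid @ resid) / len(Y), 1e-300)  # guard against log(0)
    return float(np.log(sigma2) + 2.0 * Phi.shape[1] / len(Y))

def select_regulators(base, candidates, Y):
    """Return the candidate-column subset that minimizes AIC, and its score.

    base: (N, b) columns that are always kept (here: self term and constant).
    candidates: (N, n) columns for the candidate regulators.
    """
    n = candidates.shape[1]
    best_score, best_subset = np.inf, ()
    for k in range(n + 1):
        for subset in itertools.combinations(range(n), k):
            Phi = np.column_stack([base, candidates[:, list(subset)]])
            score = aic(Phi, Y)
            if score < best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score

# Synthetic data: only candidate columns 0 and 2 truly influence the target.
rng = np.random.default_rng(1)
N = 40
x_prev = rng.normal(size=N)
cands = rng.normal(size=(N, 4))
noise = 0.05 * rng.normal(size=N)
Y = 0.5 * x_prev + 0.8 * cands[:, 0] - 0.6 * cands[:, 2] + 0.1 + noise
base = np.column_stack([x_prev, np.ones(N)])
best_subset, best_score = select_regulators(base, cands, Y)
```

Candidates left out of `best_subset` are the ones treated as false positives. Because AIC penalizes model size only mildly, a spurious regulator can occasionally survive, which is one reason validation against previous studies remains useful.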
We thank Tse-Ming Hsieh for the simulations. This study was supported by NSC grant No. NSC 96-2627-B-007-004.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1755-8794/1/46/prepub