Statistics information of integrated databases

| Database | Organism | # of genes | # of relations |
|---|---|---|---|
| PID+KEGG+TRANSFAC | Homo sapiens | 8173 | 9308 |
| Reactome | Homo sapiens | 538 | 31240 |

Statistics information on each of the three databases

| Database | # of TFs | # of target genes parsed | # of pairwise regulatory relations parsed |
|---|---|---|---|
| TRANSFAC | 157 | 825 | 529625 |

| Database | # of pathways | # of genes, proteins, enzymes parsed | # of relations parsed |
|---|---|---|---|
| PID + KEGG | 197 | 18937 | 8880 |
| PID | 60 | | |
| KEGG | 137 | | |

- We integrated the public PID (version dated July 15, 2008), KEGG (release 47.0, July 1, 2008), and TRANSFAC (version 7.0) databases and removed duplicated reactions and elements, which left 8173 genes and 9308 interactions. To assess the importance of genes within each filtered pathway, we computed the betweenness centrality and degree centrality of each node. The degree and betweenness centrality of genes were also calculated using the Reactome database [31] as a baseline to cross-validate our experimental results. Pathways downloaded from PID and KEGG were parsed by batch processing. Because a gene (or protein) may be involved in several pathways, some entities are counted more than once; the number of parsed entities (genes, proteins, and enzymes) was therefore 18937. Likewise, one gene may be regulated by several TFs, and one TF may regulate numerous target genes, so the total number of pairwise regulatory relations parsed was 529625.
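
The merge-and-score step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`merge_relations`, `build_adjacency`, `degree_centrality`, `betweenness_centrality`) are hypothetical, the toy edge lists are invented for illustration rather than taken from PID, KEGG, or TRANSFAC, and the network is assumed to be undirected and unweighted, which the text does not state. Betweenness is computed with Brandes' algorithm, and degree centrality is normalized by n - 1.

```python
from collections import defaultdict, deque


def merge_relations(*sources):
    """Union several edge lists, dropping self-loops and duplicate pairs.

    Assumption: relations are treated as undirected, so (a, b) == (b, a).
    """
    merged = set()
    for edges in sources:
        for a, b in edges:
            if a != b:
                merged.add(frozenset((a, b)))
    return merged


def build_adjacency(edges):
    """Turn a set of two-element frozensets into an adjacency dict."""
    adj = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: score / 2 for v, score in bc.items()}


# Toy edge lists (illustrative only, not real database content).
pid_edges = [("TP53", "MDM2"), ("MDM2", "AKT1")]
kegg_edges = [("MDM2", "TP53"), ("AKT1", "MTOR")]  # first pair duplicates PID
transfac_edges = [("TP53", "CDKN1A")]

relations = merge_relations(pid_edges, kegg_edges, transfac_edges)
network = build_adjacency(relations)
degree = degree_centrality(network)
betweenness = betweenness_centrality(network)

print(len(network), "genes,", len(relations), "relations")
for gene in sorted(network):
    print(gene, degree[gene], betweenness[gene])
```

In this toy network the duplicated TP53/MDM2 pair is counted once, so five genes and four relations remain after merging. If the relations in the integrated network are directed (for example, TF to target gene), the adjacency construction and the final halving step would need to change accordingly.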