Flowchart for constructing the network-based biomarker for lung cancer investigation and diagnosis. Red denotes the required input data, blue the processing steps, green the intermediate results of each step, and orange the overall results of the method. The method requires two kinds of data: microarray data and protein-protein interaction (PPI) information. These data are first used for protein pool selection. The selected proteins and the input data are then used to construct two protein association networks, the cancer protein association network (CPAN) and the non-cancer protein association network (NPAN). Together, the two networks form the network-based biomarker, which can be used either to determine significant proteins or for diagnostic evaluation. For the former, a carcinogenesis relevance value (CRV) is computed for each protein from the network-based biomarker, and the proteins significant in lung carcinogenesis are determined from their CRVs; these proteins provide targets for further characterization. For the latter, given microarray data from a smoker suspected of having cancer, mapping errors are computed for CPAN and NPAN separately, and these errors help diagnose whether the smoker has cancer.
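The two uses of the network-based biomarker can be illustrated with a minimal sketch. The caption does not define the CRV or the mapping error, so the forms used here are assumptions made for illustration only: each network is taken to be a matrix of association weights, the CRV of a protein is taken to be the summed absolute difference between its CPAN and NPAN weights, and the mapping error is taken to be the squared residual when each protein's expression is predicted from the others through the network. The function names (`crv`, `mapping_error`, `diagnose`) and the toy values are hypothetical.

```python
# Illustrative sketch only: the CRV and mapping-error definitions below are
# assumptions, not the formulas of the proposed method.

def crv(cpan, npan):
    """Assumed CRV: per-protein sum of absolute CPAN/NPAN weight differences."""
    return [
        sum(abs(c - n) for c, n in zip(cpan_row, npan_row))
        for cpan_row, npan_row in zip(cpan, npan)
    ]

def mapping_error(network, expression):
    """Assumed mapping error: squared residual of a linear association model.

    Each protein's expression is predicted as the weighted sum of the other
    proteins' expression values, using the network's association weights.
    """
    n = len(expression)
    error = 0.0
    for i in range(n):
        predicted = sum(network[i][j] * expression[j] for j in range(n) if j != i)
        error += (expression[i] - predicted) ** 2
    return error

def diagnose(cpan, npan, expression):
    """Label a sample by whichever network reproduces its expression better."""
    cancer_error = mapping_error(cpan, expression)
    non_cancer_error = mapping_error(npan, expression)
    label = "cancer" if cancer_error < non_cancer_error else "non-cancer"
    return label, cancer_error, non_cancer_error

# Toy two-protein example with made-up association weights.
toy_cpan = [[0.0, 1.0], [1.0, 0.0]]
toy_npan = [[0.0, 0.0], [0.0, 0.0]]
toy_sample = [2.0, 2.0]

scores = crv(toy_cpan, toy_npan)
label, cancer_error, non_cancer_error = diagnose(toy_cpan, toy_npan, toy_sample)
```

In this toy case the sample is reproduced exactly by the hypothetical CPAN and not at all by the NPAN, so the smaller mapping error assigns it to the cancer class. A real application would use the association weights identified from the microarray and PPI data rather than hand-set values.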