Table 3 Ten most frequently used methods to analyze each dataset

From: Developing a healthcare dataset information resource (DIR) based on Semantic Web

Dataset | Methods
NHANES | EM algorithm | Neural network model | Wilcoxon signed-rank test | Poisson regression | Chi-squared test
  | 29.55% | 19.63% | 16.69% | 15.02% | 14.85%
  | Kruskal-Wallis test | Logistic regression | Log-rank test | Linear regression | T-test
  | 14.32% | 12.56% | 12.17% | 10.04% | 8.51%
SEER-medicare | Chi-squared test | Logistic regression | Cox regression | Log-rank test | Survival analysis
  | 54.52% | 50.83% | 39.64% | 17.46% | 14.87%
  | T-test | Regression model | Kaplan-Meier survival estimates | Linear regression | Propensity score matching
  | 11.12% | 10.45% | 9.34% | 8.85% | 7.01%
Add health | Logistic regression | Chi-squared test | Linear regression | Regression model | Principal component analysis
  | 50.00% | 33.17% | 13.13% | 9.82% | 8.07%
  | ANOVA | Poisson regression | T-test | Propensity score matching | Cox regression
  | 7.49% | 5.74% | 5.06% | 3.40% | 3.40%
HCUP | Logistic regression | Chi-squared test | Linear regression | T-test | Regression model
  | 57.91% | 48.44% | 20.24% | 18.03% | 15.61%
  | ANOVA | Poisson regression | Cox regression | Mann-Whitney U test | Bootstrap
  | 9.87% | 9.06% | 7.45% | 7.35% | 4.23%
MDS | Logistic regression | Chi-squared test | Linear regression | Regression model | T-test
  | 42.12% | 39.73% | 17.29% | 14.90% | 13.53%
  | ANOVA | Cox regression | Mann-Whitney U test | Bootstrap | Survival analysis
  | 13.18% | 9.93% | 7.19% | 4.11% | 3.77%
CPRD | Logistic regression | Cox regression | Chi-squared test | Poisson regression | Propensity score matching
  | 42.35% | 31.03% | 18.87% | 12.37% | 10.48%
  | Linear regression | Regression model | Survival analysis | T-test | Kaplan-Meier survival estimates
  | 9.85% | 8.60% | 6.08% | 5.66% | 4.61%
MarketScan | Chi-squared test | Logistic regression | Cox regression | T-test | Poisson regression
  | 47.88% | 43.32% | 19.22% | 12.87% | 12.21%
  | Propensity score matching | Linear regression | Regression model | ANOVA | Fisher’s exact test
  | 10.91% | 9.93% | 9.77% | 6.68% | 5.86%
THIN | Logistic regression | Cox regression | Chi-squared test | Poisson regression | Regression model
  | 37.33% | 26.04% | 23.27% | 12.44% | 9.91%
  | Inverse probability weighting | Linear regression | T-test | Survival analysis | Propensity score matching
  | 8.99% | 8.53% | 8.06% | 6.91% | 6.68%
MIMIC | Logistic regression | Chi-squared test | T-test | Mann-Whitney U test | Regression model
  | 45.39% | 20.39% | 17.76% | 15.79% | 14.47%
  | Support vector machine | Linear regression | Cox regression | Kolmogorov-Smirnov test | K-nearest neighbors
  | 14.47% | 11.84% | 11.18% | 9.87% | 9.21%
Premier | Chi-squared test | K-means | Decision tree model | Logistic regression | Propensity score matching
  | 41.05% | 38.95% | 27.37% | 21.05% | 14.74%
  | Kruskal-Wallis test | Linear discriminant analysis | Regression model | Linear regression | T-test
  | 13.68% | 11.58% | 11.58% | 8.42% | 8.42%
Clinformatics | Linear regression | Bootstrap | Regression model | Kruskal-Wallis test | Chi-squared test
  | 44.90% | 28.57% | 20.41% | 14.29% | 12.24%
  | F-test | Cox regression | Logistic regression | ANOVA | Survival analysis
  | 12.24% | 10.20% | 10.20% | 8.16% | 6.12%
Humedica | Chi-squared test | Logistic regression | Bootstrap | Fisher’s exact test | Cox regression
  | 33.33% | 22.22% | 22.22% | 22.22% | 11.11%
  | T-test | Linear regression | Propensity score matching | Survival analysis | Ensemble learning
  | 11.11% | 11.11% | 11.11% | 11.11% | 11.11%