Table 1 Performance of Random Forest classifier for HMB boundaries relative to other genomic regions

From: Distinct genomic and epigenomic features demarcate hypomethylated blocks in colon cancer

Comparison                   Sensitivity  Specificity  F-measure  AUC   Size of data set
Boundary vs. Inside          0.90         0.89         0.90       0.96  41,425
Boundary vs. Outside         0.84         0.81         0.83       0.91  41,430
Boundary vs. Promoter        0.98         0.97         0.98       0.99  31,051
Boundary vs. Promoter (SVM)  0.97         0.97         0.97       0.99  31,051

‘Inside’ and ‘outside’ refer to regions inside or outside HMBs, respectively. These regions were selected to match the length and CG content of HMB boundaries (see Methods). The last row contains the results of a Support Vector Machine classifier that was used to replicate the Random Forest result on the HMB boundary vs. Promoter region classification. In all cases, 70 % of the data was used for training and 30 % for testing. Sensitivity, Specificity and F-measure were noted as the optimal F-measure
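
For readers less familiar with the metrics reported in the table, the following minimal sketch shows how sensitivity, specificity and F-measure are conventionally derived from the counts of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical, chosen only for illustration. They are not taken from the study, and this is not the authors' evaluation code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, f_measure) from confusion-matrix counts.

    tp/fn: positive-class regions (e.g. boundaries) classified correctly/incorrectly.
    tn/fp: negative-class regions classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # fraction of positive calls that are correct
    # F-measure: harmonic mean of precision and sensitivity
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f_measure


# Hypothetical counts for a held-out test set (illustration only, not study data)
sens, spec, f1 = classification_metrics(tp=90, fp=11, tn=89, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} F-measure={f1:.2f}")
```

The AUC column is not computed here because it requires the classifier's continuous scores across all decision thresholds rather than a single confusion matrix.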