
Table 1 Distribution of training and hold-out test datasets utilized for algorithm development

From: Leveraging artificial intelligence to predict ERG gene fusion status in prostate cancer

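| Dataset | Subset | Patients | ERG positive, n (%) | ERG negative, n (%) | Gleason grade group 1 | Gleason grade group 2 | Gleason grade group 3 | Gleason grade group 4 | Gleason grade group 5 |
|---|---|---|---|---|---|---|---|---|---|
| TCGA cohort | Training set | 235 | 123 (52%) | 112 (48%) | 41 | 68 | 60 | 32 | 34 |
| Internal cohort | Training set | 26 | 11 (42%) | 15 (58%) | 1 | 17 | 6 | 0 | 2 |
| Internal cohort | Hold-out test set | 131 | 60 (46%) | 71 (54%) | 0 | 67 | 31 | 4 | 29 |
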
Training subset includes initial training, cross-validation and testing sets. Hold-out test set refers to a separate subset of cases not included as part of the training subset. TCGA: The Cancer Genome Atlas.