Table 1 Distribution of training and hold-out test datasets utilized for algorithm development

From: Leveraging artificial intelligence to predict ERG gene fusion status in prostate cancer

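| Dataset | Subset | Patients | ERG positive, n (%) | ERG negative, n (%) | Gleason grade group 1 | Gleason grade group 2 | Gleason grade group 3 | Gleason grade group 4 | Gleason grade group 5 |
|---|---|---|---|---|---|---|---|---|---|
| TCGA cohort | Training set | 235 | 123 (52%) | 112 (48%) | 41 | 68 | 60 | 32 | 34 |
| Internal cohort | Training set | 26 | 11 (42%) | 15 (58%) | 1 | 17 | 6 | 0 | 2 |
|  | Hold-out test set | 131 | 60 (46%) | 71 (54%) | 0 | 67 | 31 | 4 | 29 |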

  1. The training set includes the initial training, cross-validation, and testing sets. The hold-out test set is a separate subset of cases not included in the training set. TCGA: The Cancer Genome Atlas.
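
The counts in Table 1 can be cross-checked with simple arithmetic: in each row, the ERG-positive and ERG-negative counts should sum to the number of patients, as should the five Gleason grade group counts, and the reported percentages should match the rounded ratios. The following is a minimal sketch of that check, not part of the original article. The names `ROWS`, `check_row`, and `summarize` are hypothetical, and the hold-out test row is assigned to the internal cohort on the assumption that its blank Dataset cell continues the row above.

```python
# Counts transcribed from Table 1:
# (dataset, subset, patients, ERG positive, ERG negative, grade groups 1-5).
# Assumption: the hold-out test row belongs to the internal cohort,
# because its Dataset cell is blank in the table.
ROWS = [
    ("TCGA cohort", "Training set", 235, 123, 112, [41, 68, 60, 32, 34]),
    ("Internal cohort", "Training set", 26, 11, 15, [1, 17, 6, 0, 2]),
    ("Internal cohort", "Hold-out test set", 131, 60, 71, [0, 67, 31, 4, 29]),
]


def check_row(patients, erg_pos, erg_neg, grades):
    """Confirm the counts add up, then return rounded ERG percentages."""
    assert erg_pos + erg_neg == patients, "ERG counts do not sum to patients"
    assert sum(grades) == patients, "grade group counts do not sum to patients"
    return round(100 * erg_pos / patients), round(100 * erg_neg / patients)


def summarize(rows=ROWS):
    """Map (dataset, subset) to (ERG positive %, ERG negative %)."""
    return {
        (dataset, subset): check_row(patients, pos, neg, grades)
        for dataset, subset, patients, pos, neg, grades in rows
    }


if __name__ == "__main__":
    for key, percentages in summarize().items():
        print(key, percentages)
```

All three rows pass both sum checks, and the rounded percentages agree with the values printed in the table.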