Fig. 1 | BMC Cancer

Fig. 1

From: Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA

Model training overview and CV procedures. a All methods were trained using k-fold cross-validation (CV), and the best-performing method was then chosen to train models under the other CV procedures. The diagram shows the individual steps common to all methods. Models are trained on a given dataset with a given set of methods (i.e., dimension reduction and classification) and then evaluated, resulting in a performance estimate. b Illustration of the CV procedures: k-fold, k-batch, ordered k-batch, and balanced k-batch. Each square represents a single sample, with the fill color indicating class label, the border color representing a confounding factor such as institution, and the number indicating processing batch. Each column represents a possible fold constructed for the given CV procedure. The dashed line separates the test set of held-out samples from the training set.
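
The difference between k-fold and a batch-aware procedure can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the caption does not specify how folds are constructed, so the code assumes that k-fold ignores the processing batch while k-batch keeps every sample of a batch in the same fold. The function names (`k_fold_splits`, `k_batch_splits`, `leaky_folds`) and the toy batch labels are hypothetical, and the ordered and balanced k-batch variants are not covered.

```python
from collections import defaultdict


def _to_train_test(folds, n_samples):
    """Turn a list of test folds into (train_indices, test_indices) pairs."""
    splits = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        splits.append((train, sorted(test)))
    return splits


def k_fold_splits(n_samples, k):
    """Plain k-fold (assumed): samples go to folds round-robin, ignoring batch."""
    folds = [[] for _ in range(k)]
    for i in range(n_samples):
        folds[i % k].append(i)
    return _to_train_test(folds, n_samples)


def k_batch_splits(batches, k):
    """Batch-aware CV (assumed): all samples of a batch share one fold."""
    by_batch = defaultdict(list)
    for i, batch in enumerate(batches):
        by_batch[batch].append(i)
    folds = [[] for _ in range(k)]
    for j, batch in enumerate(sorted(by_batch)):
        folds[j % k].extend(by_batch[batch])
    return _to_train_test(folds, len(batches))


def leaky_folds(splits, batches):
    """Count folds in which some batch appears in both train and test sets."""
    count = 0
    for train, test in splits:
        if {batches[i] for i in train} & {batches[i] for i in test}:
            count += 1
    return count


# Toy data: 12 samples processed in 4 batches of 3 (hypothetical labels).
batches = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]

plain = k_fold_splits(len(batches), k=4)
grouped = k_batch_splits(batches, k=4)

print("k-fold folds sharing a batch across train/test:", leaky_folds(plain, batches))
print("k-batch folds sharing a batch across train/test:", leaky_folds(grouped, batches))
```

Under these assumptions, the batch-aware split never places samples from the same processing batch on both sides of the dashed line, whereas plain k-fold can, which is one way batch effects may inflate a performance estimate.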
