
Table 3 Statistical analysis process

From: The EPICURE study: a pilot prospective cohort study of heterogeneous and massive data integration in metastatic breast cancer patients

STEP 1: Collection of data

STEP 2: Preprocessing

STEP 3: Mathematical development

Collection of data:

. Coordination with the clinical teams and clinical research units identified in the EPICURE project

. Coordination of all platforms and units to avoid missing data

DATABASE

PREPROCESSING:

Use of efficient statistical tools for dimension reduction, in order to bring the number of variables down to at least p < 5000 for n = 300 patients:

. Sparse Canonical Correlation Analysis (within each class of variables)

. Principal Component Analysis, Partial Least Squares
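As an illustration of the dimension-reduction idea only, here is a minimal, dependency-free sketch of Principal Component Analysis that extracts the leading component by power iteration. The toy data, the function names and the iteration count are assumptions made for this example; they are not taken from the EPICURE project, and a real analysis would rely on an established statistical library.

```python
import math
import random


def center_columns(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in rows]


def first_principal_component(rows, n_iter=200, seed=0):
    """Leading principal axis by power iteration on X^T X (X centered).

    Returns (axis, scores): a unit vector of length p and one score per row.
    """
    x = center_columns(rows)
    n = len(x)
    p = len(x[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(n_iter):
        # scores = X v, then w = X^T scores, i.e. w = (X^T X) v
        scores = [sum(r[j] * v[j] for j in range(p)) for r in x]
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(r[j] * v[j] for j in range(p)) for r in x]
    return v, scores


# Toy data (hypothetical): 6 patients, 3 variables. The first two variables
# move together; the third is nearly constant.
toy = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.4],
    [6.0, 12.0, 0.6],
]
axis, scores = first_principal_component(toy)
```

Here the three variables are summarised by a single score per patient, which is the sense in which such tools reduce p. The sign of the axis is arbitrary, so only the relative magnitudes of its entries are meaningful.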

Mathematical development

Adaptation of recent statistical data-mining methods to handle this high-dimensional problem, such as:

. LASSO/SLOPE methods (which select solutions with a small number of active, i.e. nonzero, variables)

and their variants adapted to the problem (sparse Cox model to handle censored data)
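To make the variable-selection behaviour concrete, the following is a minimal sketch of the standard LASSO for a linear model, solved by coordinate descent with soft thresholding. It is not the EPICURE implementation and does not cover SLOPE or the sparse Cox variant; the penalty value `lam`, the toy design matrix and all function names are assumptions chosen for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(x, y, lam, n_iter=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 over b."""
    n = len(x)
    p = len(x[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, with beta = 0 at the start
    col_sq = [sum(x[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(
                x[i][j] * (residual[i] + x[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= x[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy design (hypothetical): three mutually orthogonal variables, with the
# response driven almost entirely by the first one.
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [2.1, 1.9, -2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
```

With this penalty the two weakly related variables receive coefficients of exactly zero while the first is kept, although shrunk toward zero. That is the sparsity property the table refers to.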