
Table 1 Datasets and sample distributions

From: On Predicting lung cancer subtypes using ‘omic’ data from tumor and tumor-adjacent histologically-normal tissue

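| Dataset source | Tissue type | ADC | SCC |
|---|---|---|---|
| GEO: GDS3257 (gene expression) | Tumor | 58 | *** |
| GEO: GDS3257 (gene expression) | TAHN | 49 | *** |
| TCGA: LUAD+LUSC (gene expression) | Tumor | 32 | 153 |
| TCGA: LUAD+LUSC (gene expression) | TAHN | *** | *** |
| TCGA: LUAD+LUSC (DNA methylation) | Tumor | 65 | 132 |
| TCGA: LUAD+LUSC (DNA methylation) | TAHN | 24 | 27 |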

  1. For entries marked ***, see the challenge described in Background on the lack of TAHN tissue availability. GEO gene expression platform: Affymetrix Human Genome U133A Array (22,283 features). TCGA gene expression platform: Agilent 244K Custom Gene Expression (17,814 features). TCGA methylation platform: Illumina Infinium HumanMethylation 27k (27,578 features).