The BPC3 has been described in detail elsewhere. Briefly, the consortium includes large, well-established cohorts assembled in the United States and Europe that have both DNA samples and extensive questionnaire information. These include the American Cancer Society Cancer Prevention Study II (CPS-II), the European Prospective Investigation into Cancer and Nutrition (EPIC), the Harvard Nurses' Health Study (NHS) and Women's Health Study (WHS), and the Multiethnic Cohort (MEC).
Cases were identified in each cohort by self-report with subsequent confirmation of the diagnosis from medical records or tumor registries, and/or by linkage with population-based tumor registries (the method of confirmation varied by cohort). Controls were matched to cases by ethnicity and age and, in some cohorts, by additional criteria such as country of residence in EPIC.
Most of the subjects were Caucasians of European descent. One cohort (MEC) provided most of the non-Caucasian samples. In total, we genotyped 4,401 Caucasian cases and 5,966 controls, 329 Latino cases and 385 controls, 341 African American cases and 426 controls, 425 Japanese American cases and 418 controls, and 107 Native Hawaiian cases and 285 controls.
Written informed consent was obtained from all subjects, and the project was approved by the relevant institutional review board for each cohort.
Selection of haplotype tagging single nucleotide polymorphisms (htSNPs)
We sequenced exons and intron/exon junctions of GNRH1 and GNRHR in a panel of 95 metastatic breast cancer cases from the MEC and EPIC. These included 19 cases from each ethnic group represented in the study (African American, Latino, Japanese, Native Hawaiian, and Caucasian). About 45 kb were surveyed for GNRH1 and about 56 kb for GNRHR. No non-synonymous or splice-site variants were identified in sequencing of the exons.
Based on the resequencing and SNPs available in dbSNP, we identified 17 SNPs in GNRH1 and 36 SNPs in GNRHR with minor allele frequency greater than 5% in any of the five ethnic groups or greater than 1% overall. These SNPs were genotyped at the Broad Institute (Cambridge, MA, USA) using the Sequenom (San Diego, CA, USA) and Illumina (San Diego, CA, USA) platforms in a reference panel of 349 healthy women from the MEC cohort who had not been diagnosed with breast cancer at the time of the study (70 African Americans, 68 Latinos, 72 Japanese, 70 Caucasians, and 69 Hawaiians; average age 65.1 years, standard deviation 8.5).
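The frequency criterion described above can be sketched as a simple filter. This is a minimal illustration and not the pipeline used in the study: the function names, the genotype-count input format, and the reading of "overall" as the frequency pooled across all groups are assumptions made here.

```python
def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Return the minor allele frequency from diploid genotype counts."""
    n_chromosomes = 2 * (n_hom_ref + n_het + n_hom_alt)
    if n_chromosomes == 0:
        raise ValueError("no genotyped subjects")
    alt_freq = (n_het + 2 * n_hom_alt) / n_chromosomes
    return min(alt_freq, 1.0 - alt_freq)


def passes_frequency_filter(counts_by_group, group_cutoff=0.05, overall_cutoff=0.01):
    """Keep a SNP if MAF > 5% in any group or > 1% overall.

    counts_by_group maps a group label to (n_hom_ref, n_het, n_hom_alt).
    'Overall' is taken as the pooled frequency (an assumption of this sketch).
    """
    group_mafs = [minor_allele_frequency(*c) for c in counts_by_group.values()]
    pooled_counts = [sum(column) for column in zip(*counts_by_group.values())]
    overall_maf = minor_allele_frequency(*pooled_counts)
    return any(m > group_cutoff for m in group_mafs) or overall_maf > overall_cutoff
```

For example, a SNP with counts `{"A": (60, 10, 0), "B": (70, 0, 0)}` is kept because its frequency in group A (10/140) exceeds 5%.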
Haplotype tagging SNPs (htSNPs) were then selected using the method of Stram et al. to maximize $R^2_H$ among Caucasians. Three htSNPs were selected for GNRH1 (including one located in the 5' neighboring gene, KCTD9, and one in the 3' neighboring gene, DOCK5) and seven for GNRHR.
Genotyping of htSNPs was performed in three laboratories (University of Southern California, Los Angeles, CA, USA; Harvard School of Public Health, Boston, MA, USA; International Agency for Research on Cancer, Lyon, France) using a fluorescent 5' endonuclease assay and the ABI-PRISM 7900 for sequence detection (TaqMan). Initial quality control checks of the SNP assays were performed by the manufacturer (Applied Biosystems, Foster City, CA, USA); an additional 500 test reactions were run at the University of Southern California. Characteristics of the 10 TaqMan assays are available on a public website (http://www.uscnorris.com/mecgenetics/CohortGCKView.aspx). Sequence validation for each SNP assay was performed on samples from the SNP500 project (http://snp500cancer.nci.nih.gov), and 100% concordance was observed. To assess inter-laboratory variation, each genotyping center ran assays on a designated set of 94 samples from the Coriell Biorepository (Camden, NJ, USA) included in SNP500. The internal quality of genotype data at each genotyping center was assessed by typing 5–10% blinded samples in duplicate or triplicate (depending on study).
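A concordance check on blinded duplicate or triplicate samples could look like the following minimal sketch. The data layout (a mapping from a hypothetical sample identifier to its replicate genotype calls, with `None` for a failed call) and the function name are assumptions for illustration only; the source does not describe how concordance was computed.

```python
def duplicate_concordance(calls_by_sample):
    """Fraction of replicated samples whose non-missing genotype calls all agree.

    calls_by_sample maps a sample ID to a list of two or three calls such as
    "AG"; None marks a failed call. Allele order is ignored ("AG" == "GA").
    """
    compared = 0
    concordant = 0
    for calls in calls_by_sample.values():
        observed = ["".join(sorted(c)) for c in calls if c is not None]
        if len(observed) < 2:
            continue  # fewer than two successful calls: nothing to compare
        compared += 1
        if len(set(observed)) == 1:
            concordant += 1
    if compared == 0:
        raise ValueError("no replicated samples with at least two calls")
    return concordant / compared
```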
Circulating serum hormones were measured at the International Agency for Research on Cancer for EPIC and MEC samples and at the Harvard School of Public Health for NHS samples, for a total of 4,713 subjects (1,405 cases and 3,308 controls; 1,120 pre-menopausal and 3,593 post-menopausal subjects). The assays for hormone analyses were chosen on the basis of a previously published comparative validation study. Estradiol (E2), estrone (E1) and androstenedione (Δ4) were measured by direct double-antibody radioimmunoassays from DSL (Diagnostic Systems Laboratories, Texas, USA), while testosterone (T) was measured by direct radioimmunoassays from Immunotech (Marseille, France). Measurements were performed on never-thawed serum sample aliquots. Mean intrabatch and interbatch coefficients of variation were 5.8 and 13.1%, respectively, for E2 (at a concentration of 250 pmol/l), 10.2 and 12.6% for E1 (at 75 pmol/l), 4.8 and 18.9% for Δ4 (at 1.40 nmol/l), and 10.8 and 15.3% for T (at 1.40 nmol/l).
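For readers unfamiliar with these quality measures, the sketch below shows one common way to compute intrabatch and interbatch coefficients of variation from repeated measurements of a quality-control sample. The definitions used here (mean of the within-batch CVs, and CV of the batch means) are an assumption; the source does not state which formulas were applied.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def batch_cvs(batches):
    """Return (intrabatch CV, interbatch CV) in percent.

    batches is a list of lists; each inner list holds replicate measurements
    of the same quality-control sample within one assay batch.
    """
    intrabatch = mean([coefficient_of_variation(b) for b in batches])
    interbatch = coefficient_of_variation([mean(b) for b in batches])
    return intrabatch, interbatch
```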
We used conditional multivariate logistic regression to estimate odds ratios (ORs) for invasive breast cancer, using a linear (log-odds additive) scoring for 0, 1, or 2 copies of the minor allele of each SNP. To estimate haplotype-specific ORs, we also used conditional logistic regression with additive scoring and the most common haplotype as the referent; haplotypes were assigned from the unphased genotype data with an expectation-substitution approach, which accounts for uncertainty in assignment [16, 17]. Haplotype frequencies and expected subject-specific haplotype indicators were calculated separately for each cohort (and country within EPIC or ethnicity in the MEC). We combined rare haplotypes (those with estimated individual frequencies less than 3% in all cohorts) into a single category, which had a combined frequency of less than 1% of the controls for both genes and both linkage disequilibrium (LD) blocks of GNRHR. To test the global null hypothesis of no association between variation in GNRH1/GNRHR haplotypes and htSNPs and risk of invasive breast cancer (or subtypes defined by receptor status), we used a likelihood ratio test comparing a model with additive effects for each common haplotype (treating the most common haplotype as the referent) to the intercept-only model.
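The additive scoring and the conditional likelihood can be illustrated with a deliberately simplified sketch: a single SNP, no other covariates, and matched sets that each contain exactly one case. These restrictions, the function names, and the Newton-Raphson fitting loop are assumptions made for illustration; the actual analyses were multivariate and would normally be run in a statistical package.

```python
import math


def additive_score(genotype, minor_allele):
    """Count copies (0, 1 or 2) of the minor allele in a genotype such as 'AG'."""
    return sum(1 for allele in genotype if allele == minor_allele)


def conditional_loglik(beta, matched_sets):
    """Conditional log-likelihood; each set is (case_score, [control_scores])."""
    total = 0.0
    for case_x, control_xs in matched_sets:
        denom = sum(math.exp(beta * x) for x in [case_x] + list(control_xs))
        total += beta * case_x - math.log(denom)
    return total


def fit_conditional_logit(matched_sets, max_iter=50, tol=1e-10):
    """Newton-Raphson estimate of the per-allele log odds ratio."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for case_x, control_xs in matched_sets:
            xs = [case_x] + list(control_xs)
            weights = [math.exp(beta * x) for x in xs]
            total_w = sum(weights)
            mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
            mean_x2 = sum(w * x * x for w, x in zip(weights, xs)) / total_w
            score += case_x - mean_x          # first derivative
            info += mean_x2 - mean_x ** 2     # negative second derivative
        if info == 0.0:
            break  # no informative (discordant) sets
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


def likelihood_ratio_test(matched_sets):
    """Compare the one-parameter model with beta = 0; returns (statistic, p)."""
    beta_hat = fit_conditional_logit(matched_sets)
    stat = 2.0 * (conditional_loglik(beta_hat, matched_sets)
                  - conditional_loglik(0.0, matched_sets))
    stat = max(stat, 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
    return stat, p_value
```

With 1:1 matched pairs and scores of 0 or 1, the fitted odds ratio `math.exp(beta)` reduces to the ratio of the two kinds of discordant pairs, which gives a quick sanity check on the fitting loop.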
We performed subgroup analyses stratifying by cohort, ethnicity, country within EPIC, estrogen receptor/progesterone receptor status, metastatic vs. localized disease, and age at diagnosis (≤55 years vs. >55 years). We also investigated interactions between single SNPs or haplotypes and completion of a full-term pregnancy (ever/never), age at first full-term pregnancy (in three categories: nulliparous, ≤24 years, >24 years), body mass index (BMI in kg/m² in three categories: <25, 25–29, ≥30), height (<160 cm, 160–165 cm, >165 cm), smoking status (never/former/current smoker), and use of menopausal hormone therapy (ever/never). Other common risk factors, including family history of breast cancer, personal history of benign breast disease, and age at menopause, were unavailable for large numbers of women and were therefore not included in the models.
Associations between genetic variants and serum hormone levels were estimated with standard regression models adjusted for BMI, age, assay batch, ethnicity, and country within EPIC. These analyses were performed both in all study subjects for whom hormone levels had been measured and in the controls only, who represent the populations giving rise to the cases.
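As a minimal sketch of such an adjusted model, the code below fits an ordinary least-squares regression of a hormone level on the additive genotype score together with BMI and age. The choice of ordinary least squares, the omission of the batch, ethnicity, and country indicators, and all function names are assumptions for illustration; the source specifies only "standard regression models".

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(covariate_rows, outcome):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations.

    covariate_rows holds one tuple per subject, e.g. (genotype, bmi, age).
    """
    design = [[1.0] + [float(v) for v in row] for row in covariate_rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

The second element of the returned list is the estimated change in hormone level per copy of the minor allele, holding BMI and age fixed.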