Skip to main content

Runs of homozygosity and inbreeding in thyroid cancer



Genome-wide association studies (GWASs) have identified several single-nucleotide polymorphisms (SNPs) influencing the risk of thyroid cancer (TC). Most cancer predisposition genes identified through GWASs function in a co-dominant manner, and studies have not found evidence for recessively functioning disease loci in TC. Our study examines whether homozygosity is associated with an increased risk of TC and searches for novel recessively acting disease loci.


Data from a previously conducted GWAS were used for the estimation of the proportion of phenotypic variance explained by all common SNPs, the detection of runs of homozygosity (ROH) and the determination of inbreeding to unravel their influence on TC.


Inbreeding coefficients were significantly higher among cases than controls. Association on a SNP-by-SNP basis was controlled by using the false discovery rate at a level of q* < 0.05, with 34 SNPs representing true differences in homozygosity between cases and controls. The average size, the number and total length of ROHs per person were significantly higher in cases than in controls. A total of 16 recurrent ROHs of rather short length were identified although their association with TC risk was not significant at a genome-wide level. Several recurrent ROHs harbor genes associated with risk of TC. All of the ROHs showed significant evidence for natural selection (iHS, Fst, Fay and Wu’s H).


Our results support the existence of recessive alleles in TC susceptibility. Although regions of homozygosity were rather small, it might be possible that variants within these ROHs affect TC risk and may function in a recessive manner.

Peer Review reports


Thyroid cancer (TC) is the most common malignancy of the endocrine system with incidence rates being 2 to 3 times higher in women compared with men [1, 2]. In economically developed countries, 0.5 to 10 TC cases are diagnosed per 100 000 individuals each year [1]. Significant regional differences are seen in Europe with Italy being among the countries with the highest incidence rates in the world (Cancer Incidence in Five Continents, IX, 2000, While exposure to ionizing radiation or insufficient iodine intake is an established risk factor, anthropometric risk factors such as high body surface area, great height, or excess weight have been associated with increased TC risk [3]. However, TC is also characterized by having one of the highest familial risks of any cancer supporting heritable predisposition [46]. A high risk of TC is associated with some genetic disorders, but most of the familial risk of TC remains unexplained [7]. During the last years genome-wide association studies (GWASs) have provided robust evidence for common susceptibility to TC. At least four GWASs have identified a set of genes with susceptibility for TC [811]. These studies suggest that much of the familial risk of TC may be due to the coinheritance of multiple low/moderate-penetrant alleles, some of which may be common. The majority of cancer predisposition genes identified through the GWASs function in a co-dominant manner, and no evidence has been found for recessively functioning disease loci in TC, although the risk for TC among siblings is much higher than the parent-offspring risk, suggesting recessive inheritance [6]. Recessive inheritance has been associated with consanguinity or an increased risk in populations characterized by a higher degree of inbreeding and corresponding homozygosity [12]. A consecutive pattern, called runs of homozygosity (ROH), appears mainly in an increased frequency due to a high level of relatedness between individuals within a population or due to selection [13]. These ROHs are shown to predispose to many genetic diseases including cancers [1416]. The siblings-risk and the fact and that TC is part of recessively inherited syndromes such as the Werner syndrome make TC an ideal target to search for recessively acting disease loci [6, 7].

In a first step we estimated the proportion of the total phenotypic variance explained by all common SNPs for TC risk. This was followed by a whole-genome homozygosity analysis based on our previous GWAS in the high-incidence Italian population. The aim of our study was to examine whether inbreeding or homozygosity is associated with an increased risk of TC and to search for novel recessively acting disease loci.


Ethics statement

Study participants were recruited according to the protocols approved by the institutional review boards in accordance with the Declaration of Helsinki. All subjects provided written informed consent. This study was approved by the ethics committees of the University Hospitals of Cisanello and Santa Chiara in Pisa, Italy and of the Meyer Hospital in Florence, Italy.

Genomic data - quality control of SNP genotyping

The study is based on the genotyping data of our previously performed GWAS on the Italian cases and controls, and did not include any new participants [11, 17]. All patients were ascertained with papillary thyroid cancer (PTC) through the University Hospital Cisanello in Pisa. After a stringent quality control procedure the final set consisted of 649 cases and 431 controls with genotype data on 536 270 SNPs [18, 19]. Data have been submitted to a central database:

Proportion of the total phenotypic variance explained by all common SNPs

The approach of Yang et al. was used to estimate the proportion of the total phenotypic variance explained by all common SNPs [20]. First, we estimated the genetic relationship matrix (GRM) for each individual autosome of all the individuals and fitted the GRMs in a mixed linear model (MLM) to estimate the proportion of the phenotypic variance explained by all common SNPs. We repeated this scenario after excluding 15 identified GWAS regions for TC including the genomic region 500 kb upstream and downstream [11, 17]. This left us with a total of 520 137 autosomal SNPs.

For both scenarios sex and eigenvectors from 10 principal components of the population structure were used as covariates. Consecutive estimates on the observed 0–1 scale are linearly transformed to that on the unobserved continuous liability scale such that h 2l  = h 20 K(1 − K)/z2 [21], where K is the prevalence of the disease and z is the value of the standard normal probability density function at the threshold t. Given an incidence of 8 – 9/100 000/year will result in a cumulative risk of ~ 6 in 1000 as an estimate of the prevalence. Estimation was performed using restricted maximum likelihood (REML) via the genome-wide complex trait analysis (GCTA) software [22].

Genome-wide assessment of associations between homozygosity at individual SNPs and TC

A chi2-test was performed to test for any association between homozygosity and susceptibility of TC on a SNP-by-SNP basis in our entire sample series [14]. To control the problem of multiple testing the false discovery rate (FDR) was calculated and controlled at an arbitrary level q* < 0.05 [23].

Statistical and bioinformatics analysis

We defined ROHs following recommendations in Howrigan et al. [24] ROHs were detected using PLINK (v1.07) software. To prevent overestimating the number and size of ROHs no heterozygous SNPs were permitted in any window. We kept the remaining options to default values. The parameter for the “homozyg-kb” option was also kept at the default value of 1000 kb to select individual segments of minimal length. We only varied the parameter “homozyg-snp” option according to the definition of ROHs as below. Subsequent statistical analyses were performed using packages available in the R statistics package [25]. Comparison of the distribution of categorical variables was performed using the chi2-test. To compare the difference in the average number of ROHs between cases and controls, we used the Student’s t-test. Naive adjustment for multiple testing was based on the Bonferroni correction.

Identification of homozygosity

We used the method of Lencz et al. to estimate the minimum number of consecutive homozygous SNPs required to form a ROH that was more than an order of magnitude larger than the mean haploblock size in the human genome without being too large to be very rare [26]. In our TC data, with 1080 individuals and 536 270 SNPs, the mean heterozygosity in controls was calculated to be 35 %. Thus, a minimum length of 53 would be required to produce <5 % randomly generated ROHs across all subjects: ((1–0.35)53 × 536 270 × 1080 ≤ 0.05). Due to linkage disequilibrium (LD) between the SNPs, the SNP genotypes are not always independent. Pairwise LD was estimated using the SNP pruning function of PLINK, with a default value of r 2 > 0.8 and restricting the search of tagging SNPs within each 250 kb window. Approximately 377 000 separable tag groups were discovered, representing an >25 % reduction of information compared with the original number of SNPs. Thus, ROH length of 75 was used to approximate the degrees of freedom of 53 independent SNP calls.

The R statistics package was used to identify a list of ‘common’ ROHs with 75 consecutive homozygous SNP calls across a certain number of samples and with each ROH having identical start and end locations across the individuals. The “homozyg-group” option of the PLINK package was used to produce a file of the overlapping ROHs separated into pools containing the number of cases and controls carrying the ROH. We considered pools with more than five samples and at least 500 kb of length as recurrent ROHs. A consensus SNP set representing the minimal overlapping region across all samples in the pool was used to define the recurrent ROHs. The association of the recurrent ROHs was then tested for differences of the average proportion of ROHs among cases and controls. Within each overlapping ROH the proportion of homozygous genotypes at each SNP was calculated for cases and controls separately, and the significance of the difference was tested by a one-tailed t-test.

Testing the effects of natural selection

We used three metrics, the integrated haplotype score (iHS), the fixation index (Fst) and Fay and Wu’s H to investigate the selective pressure due to demographic events (e.g. bottleneck events, founder effects or population isolation) on each recurrent ROH [27, 28]. All metrics were obtained from Haplotter Software (University of Chicago, Chicago, IL, USA; [28, 29].

Testing the effects of inbreeding

To test whether inbreeding influenced the susceptibility to TC, three different inbreeding coefficients (F I, F II and F III) were derived for each individual based on their SNP data using GCTA [22]. The coefficients were tested for differences between cases and controls using a Student’s t-test. We also used a generalized linear regression model (GLM) and regressed F I, F II or F III as explanatory variables on the disease status of the TC patient as the binary response (0/1). We included several covariates in the model: the sex of the individuals, the first 10 ancestry-informative principal components and the percentage of SNPs missing for an individual.

A genomic measure of individual homozygosity (FROH) was calculated by a method proposed by McQuillan et al. [30] in which LROH is the sum of ROHs per individual above a certain criterion length (i.e. 1000 kb as defined beforehand) and LAUTO is the total SNP-mappable autosomal genome length, excluding the centromeres:

$$ {\mathrm{F}}_{\mathrm{ROH}}={\displaystyle \sum {\mathrm{L}}_{\mathrm{ROH}}/\ {\mathrm{L}}_{\mathrm{AUTO}}} $$

The estimate of the total genome captured was 2 677 608 286 bp. FROH estimates inbreeding differently compared to the coefficients based on SNP-by-SNP indices F I, F II and F III as it considers only homozygous regions above a pre-defined length criterion (i.e. 1000 kb). Due to the FROH distribution in our sample we divided ROHs into two classes, below and above 1500 kb, and FROH was calculated overall, and for the two subclasses using the R statistics package [25]. The overall FROH was also tested for differences between cases and controls using a Student’s t-test.


After stringent quality control and exclusion of extreme population outliers the overall genetic matching was satisfying with a genomic control inflation factor at λgc = 1.00 within the prior GWAS, indicating that no population stratification was present [11].

Proportion of total phenotypic variance explained by SNPs

The proportion of the total phenotypic variance explained by SNPs from the joint analysis transformed to the liability scale after Dempster and Lerner showed a value of 0.51 (SE 0.16 at P ≤ 1.97 × 10−7) [21]. After the exclusion of the regions covered by the previously identified TC risk SNPs the proportion of the total phenotypic variance explained by the so far unidentified SNPs was 0.33 (SE 0.15 at P ≤ 0.003). While most of variance explained by common SNPs for individual autosomes stayed constant, a major drop was detected for chromosome 2 encompassing DIRC3 (from 0.11 to 0.03) and for chromosome 9 encompassing FOXE1 (from 0.17 to 0.08).

Genome-wide assessment of associations between homozygosity at individual SNPs and susceptibility to TC

Results of the association between homozygosity and the susceptibility to TC on a SNP-by-SNP basis are shown in Table 1. The FDR was calculated and controlled at an arbitrary level q* < 0.05, for which 34 SNP were significant [23]. Corresponding odds ratios (ORs) of the one-sided Fisher’s exact test to prove the hypothesis that increased homozygosity is associated with higher risk of TC showed a minimum of OR = 1.85 with a 95 % confidence interval of 1.23–3.41 for all SNPs in Table 1.

Table 1 Association between homozygosity and susceptibility to TC for individual SNPs

Identification of ROHs and association between ROHs and TC susceptibility

We identified a total of 12 306 individual ROHs greater than 1000 kb across all 1080 individuals with 7523 ROHs in cases and 4783 ROHs in controls. On average 11.39 ROH segments with a total overall length of 22 980 kb per individual were detected. The average number of ROH segments per person in cases was 11.59 and in controls 11.09 (P diff  = 4.00 × 10−2), the total length of ROHs per person was 4761 kb higher in cases than in controls (P diff  = 1.95 × 10−5), and the average ROH length per person in kb was significantly higher in cases (1988 kb) than in controls (1788 kb) (P diff  = 3.29 × 10−8).

We extended the tests for association between ROHs and susceptibility to TC by categorizing the number of ROHs and the total length of ROHs in Mb by forming control groups of similar size. They were compared with the numbers of cases within the corresponding classes (Table 2). Cases had more ROHs and the total length of ROHs was also longer than in controls. (e.g. for entire data set >15 ROHs, OR = 1.55, P = 0.02; for >25.4 Mb, OR = 1.45, P = 0.03).

Table 2 Association between overall ROH and TC (min. 75 SNPs per ROH)

For further association analysis 2262 consensus groups were formed, of which a total of 225 ROHs were identified, that fulfilled the criteria of identical start and end location and at least 75 consecutive homozygous SNPs [26]. An example for an overlapping region is given in the Additional file 1: Figure S1. None of the ROHs were associated with susceptibility to TC after correction for multiple testing. However, 16 ROHs were associated at a suggestive level (P < =0.05) (Table 3). None of them encompassed the centromeric regions.

Table 3 List of ROHs associated with TC

Intriguingly, several recurrent ROHs harbor genes that have been associated with risk or progression of TC (Table 3). The first consensus region, located on chromosome 2, shows the strongest association with TC susceptibility (uncorrected P value = 0.002, ROH1 in Table 3). Six cases and 15 controls carried a ROH spanning this region of 79 homozygous SNPs. Another consensus region on chromosome 3 (ROH2) spans 672 kb and contains 98 SNPs. Genes and predicted transcripts include GSK3B, FSTL1, LRRC58, GPR156. A consensus region on chromosome 10 spanning 81 SNPs on a length of 959 kb (ROH3, P = 0.01) also hosts a considerable number of genes.

To scrutinize the significant ROH consensus regions, the average homozygosity for all SNP loci within a corresponding ROH was computed for cases and controls separately and tested for a difference with a one-tailed Student’s t-test (Table 3, column 9). Ten ROHs showed significant differences at P < 0.05 level, of which 6 had more cases than controls.

Natural selection as a cause of ROHs

To assess the influence of selection on the recurrent ROH regions, we used the measures iHS, Fst and Fay and Wu’s H [28, 29, 31, 32]. Every recurrent ROH showed significant values for the three estimates (iHS >2.0, Fst >0.2 and Fay and Wu’s H < <−10; Table 3), except for ROH3, for which the iHS value was 1.85. This indicates that each of the 16 ROH regions might be the result of a selective sweep.

Inbreeding and association between homozygosity and TC

We formally calculated the inbreeding coefficients (so called F I, F II and F III) after Yang et al. for all samples [22]. The means (SDs) for F I in cases and controls were 0.003 (0.01) and -0.0005 (0.006), respectively, and significantly different from each other (P = 2.94 × 10−13, by Student’s t-test). Thus, there was significant evidence that cases were more inbred than controls. This was supported by the inbreeding coefficient F III, which also differed significantly between cases and controls at P = 3.77 × 10−6 with cases being more inbred. The inbreeding coefficient F II was in cases 0.002 (0.01) and in controls 0.001 (0.007), but differences were not significant. Table 4 lists the P values for the test of true differences of F I, F II and F III between cases and controls separately for each chromosome. Chromosomes 2, 4, 5 and 8 were significantly different. For all chromosomes cases showed higher values for F I, F II and F III than controls.

Table 4 P-values for differences of inbreeding coefficients F I, F II and F III between cases and controls

When using a GLM with several covariates and regressing the explanatory variables F I, F II or F III on the disease status of the TC patient as the binary response (0/1), F I and F III remained significant at P = 0.003 with a positive effect estimate of 32.19 and 64.38, respectively. This results in an increasing slope of the regression line towards the diseased individuals. F II was also significant at P = 0.01.

A more detailed overview on the characteristics of the inbreeding coefficient for cases and controls is demonstrated in Fig. 1, which shows the variation of the inbreeding coefficient between chromosomes. The mean is rather constant across the chromosomes but the variation is increasing from chromosome 1 to 22 while the length of the chromosomes in base pairs is decreasing (r = −0.80, P = 6.51 × 10−6).

Fig. 1

Variation of inbreeding coefficients among chromosomes for cases (a) and controls (b). The boxplot figure shows the means and variation of the inbreeding coefficient F I for autosomal chromosomes 1 to 22 for cases (red) and controls (green)

Three additional associations for different consanguinity measures were tested (Fig. 2). The total length of individual ROHs was highly correlated with the total number of ROHs per individual (r = 0.77, P < 2.20 × 10−16). A significant association was also determined for the total length of ROHs per individual and the individual inbreeding coefficient F III (r = 0.83, P < 2.20 × 10−16) and for the total number of ROHs per individual and the individual inbreeding coefficient F III (r = 0.55, P < 2.20 × 10−16).

Fig. 2

Pearson's correlation coefficients for different consanguinity measures. The total length of individual ROHs is highly correlated with the total number of ROHs per individual (r = 0.77, P < 2.20*10-16) (a). A high association is determined for the total length of ROHs in Mb and the individual inbreeding coefficient F III (r = 0.83, P < 2.20*10-16) (b), while the lowest association was determined for the total number of ROHs per individual and the individual inbreeding coefficient F III (r = 0.55, P < 2.20*10-16) (c)

Finally, FROH was also 0.22 units of standard deviation SD (P = 1.95 × 10−5) higher in cases than controls. The correlation between the inbreeding coefficients and FROH were also highly significant (F I: r = 0.71, P = 2.20 × 10−16; F II: r = 0.72, P = 2.20 × 10−16; F III: r = 0.83, P = 2.20 × 10−16).


Based on our previous GWAS we showed here that the proportion of the total phenotypic variance in TC risk explained by all common SNPs is about 0.51. After correcting for identified TC risk loci about two-thirds of the genetic variance remain to be identified [11, 17]. This fact clearly shows the high influence of both the genetic factors and the environment on the susceptibility of TC. In the present study, we sought to find other genetic explanations than genes identified through previous GWASs that function in a co-dominant manner. The focus was shifted towards recessive inheritance. The current work is to our knowledge the first analysis of the influence of genomic homozygosity and genomic inbreeding on the susceptibility to TC.

Already the genome-wide SNP-by-SNP analysis showed significantly higher proportion of homozygous genotypes among the cases than controls. Further downstream analyses revealed significant differences between cases and controls in terms of the number and length of ROHs per person.

It is known that homozygosity can be caused by demographic events, consanguinity/inbreeding or selective pressure [33, 34]. Most of the ROHs in our study were rather short though. This excludes recent consanguinity as the cause of inbreeding. However, the significant genomic inbreeding coefficients point to a certain level of relatedness that might remain from distant consanguinity. All the ROHs of interest showed significant evidence for natural selection (iHS, Fst, Fay and Wu’s H) [28]. The influence of selective pressure on the ROH length can therefore not be excluded.

The analysis of specific overlapping ROHs did not result in a genome-wide significance, however, several ROHs were matching with regions that contain genes related to TC susceptibility. The majority of overlapping ROHs was absent in controls. Homozygosity in these ROHs might have been disappeared over time due to recombination. Only for ROH1, ROH3, ROH4 and ROH5 we detected more controls than cases to be homozygous for an overlapping ROH region. One of these, ROH5 overlaps with long contiguous stretches of homozygosity from other studies [35, 36]. However, in 10 out of 16 consensus regions significantly higher amount of homozygous SNPs were observed among cases than among controls. Thus, the inheritance of recessive genes harbored in these regions might be possible.

Our study shows some evidence of an association between extended stretches of homozygosity and an increased TC risk. This result is not unexpected as several studies before have detected association between ROHs and cancer susceptibility [16].

The novel result of our study is the significant effect of genomic inbreeding among cases and its relevant effect on the development of the disease. The inbreeding coefficients F I, F II, and F III were significantly higher in cases than in controls, even after correcting for numerous covariates using GLM. Inbreeding is supposed to reduce fitness by causing an overabundance of homozygous loci and increasing the probability of deleterious rare alleles that lead to inbreeding depression [37]. As inbreeding is related to homozygosity, the chances of offspring being affected by recessive or deleterious traits are therefore increased [38]. In fact, the assumption that a higher level of inbreeding or increased homozygosity correlates with cancer incidence has been proven already before on the genomic level [16].

Even the results of the FROH support the higher inbreeding among cases compared with controls, although FROH is discarding SNPs in regions outside of ROHs that are below our stringent length criterion. The fact, that we found no significant differences among cases and controls in the mean sum of shorter ROHs but highly significant differences for the longer ROHs supports the view that the differences in ROH length longer than 20 Mb reflect effects of more recent consanguinity rather than LD pattern of ancient origin. It has been shown that consanguinity increased in Italy early in the 20th century and subsequently decreased. This has been explained by population growth in the early 20th century and changing demographics since then [39]. Another reason is the very large number of distantly related spouses in determining the population level of inbreeding [39]. With this source of a consanguineous population we had the unique opportunity to detect recessively inherited genomic regions for TC.


We showed evidence for long ROHs to increase the risk of TC. Higher inbreeding among cases supports the existence of recessive alleles affecting TC risk. The genetic architecture of TC is highly supported by a genetic model, in which the variants of a complex disease are more likely to be rare than common. They are also likely to be numerous with highly polygenic architecture and of a small individual effect at the population level. If this view of the genetic architecture of common complex diseases is correct, then it would be important to consider inbreeding as a factor having an influence on the disease.

Supplementary information is available at the journals website.



base pair


Utah residents with ancestry from northern and western Europe




confidence interval


inbreeding coefficient


false discovery rate

Fst :

Fixation index


genome-wide complex trait analysis


generalized linear regression model


genetic relationship matrix


Genome-wide association study


integrated haplotype score


Kilo-base pair


linkage disequilibrium


Mega base pair


mixed linear model


odds ratio


papillary thyroid cancer


restricted maximum likelihood


runs of homozygosity


standard deviation


standard error


single-nucleotide polymorphism


thyroid cancer


  1. 1.

    Agate L, Lorusso L, Elisei R. New and old knowledge on differentiated thyroid cancer epidemiology and risk factors. J Endocrinol Investig. 2012;35(6 Suppl):3–9.

    CAS  Google Scholar 

  2. 2.

    Li N, Du XL, Reitzel LR, Xu L, Sturgis EM. Impact of enhanced detection on the increase in thyroid cancer incidence in the United States: review of incidence trends by socioeconomic status within the surveillance, epidemiology, and end results registry, 1980-2008. Thyroid. 2013;23(1):103–10.

    Article  PubMed  PubMed Central  Google Scholar 

  3. 3.

    Tehranifar P, Wu HC, Shriver T, Cloud AJ, Terry MB. Validation of family cancer history data in high-risk families: the influence of cancer site, ethnicity, kinship degree, and multiple family reporters. Am J Epidemiol. 2015;181(3):204–12.

    Article  PubMed  PubMed Central  Google Scholar 

  4. 4.

    Cardis E, Kesminiene A, Ivanov V, Malakhova I, Shibata Y, Khrouch V, Drozdovitch V, Maceika E, Zvonova I, Vlassov O, et al. Risk of thyroid cancer after exposure to 131I in childhood. J Natl Cancer Inst. 2005;97(10):724–32.

    Article  PubMed  Google Scholar 

  5. 5.

    Hemminki K, Eng C, Chen B. Familial risks for nonmedullary thyroid cancer. J Clin Endocrinol Metab. 2005;90(10):5747–53.

    CAS  Article  PubMed  Google Scholar 

  6. 6.

    Hemminki K, Sundquist J, Lorenzo Bermejo J. Familial risks for cancer as the basis for evidence-based clinical referral and counseling. Oncologist. 2008;13(3):239–47.

    Article  PubMed  Google Scholar 

  7. 7.

    Bonora E, Tallini G, Romeo G. Genetic predisposition to familial nonmedullary thyroid cancer: An update of molecular findings and state-of-the-art studies. Journal of oncology. 2010;2010:385206.

    Article  PubMed  PubMed Central  Google Scholar 

  8. 8.

    Gudmundsson J, Sulem P, Gudbjartsson DF, Jonasson JG, Masson G, He H, Jonasdottir A, Sigurdsson A, Stacey SN, Johannsdottir H, et al. Discovery of common variants associated with low TSH levels and thyroid cancer risk. Nat Genet. 2012;44(3):319–22.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  9. 9.

    Gudmundsson J, Sulem P, Gudbjartsson DF, Jonasson JG, Sigurdsson A, Bergthorsson JT, He H, Blondal T, Geller F, Jakobsdottir M, et al. Common variants on 9q22.33 and 14q13.3 predispose to thyroid cancer in European populations. Nat Genet. 2009;41(4):460–4.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  10. 10.

    Takahashi M, Saenko VA, Rogounovitch TI, Kawaguchi T, Drozd VM, Takigawa-Imamura H, Akulevich NM, Ratanajaraya C, Mitsutake N, Takamura N, et al. The FOXE1 locus is a major genetic determinant for radiation-related thyroid carcinoma in Chernobyl. Hum Mol Genet. 2010;19(12):2516–23.

    CAS  Article  PubMed  Google Scholar 

  11. 11.

    Kohler A, Chen B, Gemignani F, Elisei R, Romei C, Figlioli G, Cipollini M, Cristaudo A, Bambi F, Hoffmann P, et al. Genome-wide association study on differentiated thyroid cancer. J Clin Endocrinol Metab. 2013;98(10):E1674–1681.

    Article  PubMed  Google Scholar 

  12. 12.

    Bener A, El Ayoubi HR, Chouchane L, Ali AI, Al-Kubaisi A, Al-Sulaiti H, Teebi AS. Impact of consanguinity on cancer in a highly endogamous population. Asian Pac J Cancer Prev. 2009;10(1):35–40.

    PubMed  Google Scholar 

  13. 13.

    Kijas JW. Detecting regions of homozygosity to map the cause of recessively inherited disease. Methods Mol Biol. 2013;1019:331–45.

    Article  PubMed  Google Scholar 

  14. 14.

    Spain SL, Cazier JB, Consortium C, Houlston R, Carvajal-Carmona L, Tomlinson I. Colorectal cancer risk is not associated with increased levels of homozygosity in a population from the United Kingdom. Cancer Res. 2009;69(18):7422–9.

    CAS  Article  PubMed  Google Scholar 

  15. 15.

    Enciso-Mora V, Hosking FJ, Houlston RS. Risk of breast and prostate cancer is not associated with increased homozygosity in outbred populations. Eur J Hum Genet. 2010;18(8):909–14.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  16. 16.

    Wang C, Xu Z, Jin G, Hu Z, Dai J, Ma H, Jiang Y, Hu L, Chu M, Cao S, et al. Genome-wide analysis of runs of homozygosity identifies new susceptibility regions of lung cancer in Han Chinese. Journal of biomedical research. 2013;27(3):208–14.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  17. 17.

    Figlioli G, Kohler A, Chen B, Elisei R, Romei C, Cipollini M, Cristaudo A, Bambi F, Paolicchi E, Hoffmann P, et al. Novel genome-wide association study-based candidate loci for differentiated thyroid cancer risk. J Clin Endocrinol Metab. 2014;99(10):E2084–2092.

    CAS  Article  PubMed  Google Scholar 

  18. 18.

    Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  19. 19.

    Anderson CA, Pettersson FH, Clarke GM, Cardon LR, Morris AP, Zondervan KT. Data quality control in genetic case-control association studies. Nat Protoc. 2010;5(9):1564–73.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  20. 20.

    Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–9.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  21. 21.

    Dempster ER, Lerner IM. Heritability of threshold characters. Genetics. 1950;35(2):212–36.

    CAS  PubMed  PubMed Central  Google Scholar 

  22. 22.

    Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76–82.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  23. 23.

    Weller JI, Song JZ, Heyen DW, Lewin HA, Ron M. A new approach to the problem of multiple comparisons in the genetic dissection of complex traits. Genetics. 1998;150(4):1699–706.

    CAS  PubMed  PubMed Central  Google Scholar 

  24. 24.

    Howrigan DP, Simonson MA, Keller MC. Detecting autozygosity through runs of homozygosity: a comparison of three autozygosity detection algorithms. BMC Genomics. 2011;12:460.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  25. 25.

    Team RC. R: A language and environment for statistical computing. 2013.

    Google Scholar 

  26. 26.

    Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan TV, Kane JM, Kucherlapati R, Malhotra AK. Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. Proc Natl Acad Sci U S A. 2007;104(50):19942–7.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  27. 27.

    Pemberton TJ, Absher D, Feldman MW, Myers RM, Rosenberg NA, Li JZ. Genomic patterns of homozygosity in worldwide human populations. Am J Hum Genet. 2012;91(2):275–92.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  28. 28.

    Voight BF, Kudaravalli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Biol. 2006;4(3):e72.

    Article  PubMed  PubMed Central  Google Scholar 

  29. 29.

    Fay JC, Wu CI. Hitchhiking under positive Darwinian selection. Genetics. 2000;155(3):1405–13.

    CAS  PubMed  PubMed Central  Google Scholar 

  30. 30.

    McQuillan R, Leutenegger AL, Abdel-Rahman R, Franklin CS, Pericic M, Barac-Lauc L, Smolej-Narancic N, Janicijevic B, Polasek O, Tenesa A, et al. Runs of homozygosity in European populations. Am J Hum Genet. 2008;83(3):359–72.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  31. 31.

    Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, Absher D, Myers RM, Cavalli-Sforza LL, Feldman MW, Pritchard JK. The role of geography in human adaptation. PLoS Genet. 2009;5(6):e1000500.

    Article  PubMed  PubMed Central  Google Scholar 

  32. 32.

    Oleksyk TK, Smith MW, O’Brien SJ. Genome-wide scans for footprints of natural selection. Philos Trans R Soc Lond Ser B Biol Sci. 2010;365(1537):185–205.

    CAS  Article  Google Scholar 

  33. 33.

    Siraj AK, Khalak HG, Sultana M, Al-Rasheed M, Bavi P, Al-Sanea N, Al-Dayel F, Uddin S, Alkuraya FS, Al-Kuraya KS. Colorectal cancer risk is not associated with increased levels of homozygosity in Saudi Arabia. Genet Med. 2012;14(8):720–28.

    CAS  Article  Google Scholar 

  34. 34.

    Woods CG, Cox J, Springell K, Hampshire DJ, Mohamed MD, McKibbin M, Stern R, Raymond FL, Sandford R, Malik Sharif S, et al. Quantification of homozygosity in consanguineous individuals with autosomal recessive disease. Am J Hum Genet. 2006;78(5):889–96.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  35. 35.

    Li LH, Ho SF, Chen CH, Wei CY, Wong WC, Li LY, Hung SI, Chung WH, Pan WH, Lee MT, et al. Long contiguous stretches of homozygosity in the human genome. Hum Mutat. 2006;27(11):1115–21.

    CAS  Article  PubMed  Google Scholar 

  36. 36.

    Gibson J, Morton NE, Collins A. Extended tracts of homozygosity in outbred human populations. Hum Mol Genet. 2006;15(5):789–95.

    CAS  Article  PubMed  Google Scholar 

  37. 37.

    Spielman D, Brook BW, Briscoe DA, Frankham R. Does inbreeding and loss of genetic diversity decrease disease resistance? Conserv Genet. 2004;5(4):439–48.

    Article  Google Scholar 

  38. 38.

    Nabulsi MM, Tamim H, Sabbagh M, Obeid MY, Yunis KA, Bitar FF. Parental consanguinity and congenital heart malformations in a developing country. Am J Med Genet A. 2003;116A(4):342–7.

    Article  PubMed  Google Scholar 

  39. 39.

    Cavalli-Sforza LL, Moroni A, Zei G. Consanguinity, inbreeding, and genetic drift in Italy (MPB-39). Princeton: University Press; 2013.

    Google Scholar 

Download references


The Italian part of the study has received financial support from the Istituto Toscano Tumori.

Author information



Corresponding author

Correspondence to Hauke Thomsen.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

HT, AF, FG, KH, SL organized and designed the study. BC, GF, PH and SH performed the GWAS RE, CR, MC, AC and FB were responsible of the collection of samples. HT performed the statistical analysis. HT, AF, KH wrote and reviewed the manuscript. All authors read and approved the final manuscript.

Additional file

Additional file 1: Figure S1.

Example for recurrent ROHs in the telomeric region of chromosome 15 for 6 cases. (PDF 335 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Thomsen, H., Chen, B., Figlioli, G. et al. Runs of homozygosity and inbreeding in thyroid cancer. BMC Cancer 16, 227 (2016).

Download citation


  • Thyroid cancer
  • Runs of homozygosity
  • Inbreeding
  • GWAS