Cancer modeling has a long history (see, for example, ref [37]), and many models can be fit to cancer data. Such modeling is complicated because many parameter values are uncertain and likely to differ between individuals and populations, and through time. Ideally, experimentalists and modelers interact, but many cancer equations are incomprehensible to many students and experimentalists. The current equation incorporates some of the assumptions in other cancer models (see Table 2), but its algebraic format may be easier to understand and manipulate [2].

What "causes" cancer? This model examines whether colorectal cancers can arise within a lifetime from normal division and mutation rates, without serial selection and clonal expansion (a parsimonious pathway). Whereas the accumulation of sufficient numbers of driver mutations might be highly unlikely with normal mutation rates [38], new experimental data illustrate that colorectal cancer mutation frequencies are relatively low and consistent with normal mutation and division rates [3]. These new data constrain models because proliferation or mutation rates need not be, and are not, significantly altered during most of progression. Stem cells, which are the long-lived lineages that can accumulate mutations during progression [39], might seldom divide, but recent studies in mice suggest that crypt stem cells are not quiescent but actively divide about once per day [13]. An important distinction is that an individual gets cancer when the first cell, and not the average cell, accumulates a critical number of driver mutations.

Here we illustrate that mutation accumulation from normal cell replication can account for the low per-cell transformation rates and low cancer genome mutation frequencies. Progression to cancer is complex and variable, but certain biological features are likely to be fundamentally important when averaged over many individuals and many years. These factors are the number of divisions (*d*), the number of stem cells (*N* × *m*), the number of critical rate-limiting driver pathway mutations (*k*), and the mutation rate (*u*). The probability that at least one stem cell accumulates the required number of driver mutations in an individual's lifetime is substantially greater than the probability that a typical stem cell acquires these mutations. Given a 5% risk of colorectal cancer by 100 years of age, only about five cells in 100 individuals transform within 100 years. There are ~15 million crypts per colon and therefore at least ~15 million stem cells at risk for colorectal cancer in an individual, or ~1.5 billion stem cell lineages in 100 individuals. Therefore, only ~five of 1.5 billion crypt stem cell lineages transform within 100 years, or a single transformation event per ~30 billion crypt stem cell years (stem cell lineage transformation efficiency ~3 × 10^{-9} over 100 years). Chance and the enormous variation generated by replication errors in millions of stem cell lineages may be sufficient for the selection of low frequency cancer phenotypes within a lifetime.
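The arithmetic in the preceding paragraph can be checked with a short Python sketch. The risk, crypt count, and time span are taken from the text; the variable names, the choice of one stem cell lineage per crypt (the text's lower bound), and the assumption that lineages transform independently are ours and are for illustration only.

```python
import math

# Values stated in the text
lifetime_risk = 0.05        # 5% colorectal cancer risk by 100 years of age
crypts_per_colon = 15e6     # ~15 million crypts per colon
lifetime_years = 100
cohort_size = 100           # individuals in the text's example

# Assumption: one stem cell lineage per crypt (the text's "at least" lower bound)
lineages_per_crypt = 1

lineages_per_person = crypts_per_colon * lineages_per_crypt
cohort_lineages = cohort_size * lineages_per_person           # 1.5 billion
transformations = lifetime_risk * cohort_size                 # about five

# Fraction of lineages that transform within 100 years
efficiency = transformations / cohort_lineages                # about 3.3e-9

# Stem cell years elapsed per transformation event
stem_cell_years_per_event = cohort_lineages * lifetime_years / transformations  # about 3e10

# "First cell" versus "typical cell": if each lineage transforms independently
# with probability `efficiency`, the chance that at least one lineage in a
# colon transforms is 1 - (1 - efficiency)**n, computed here in a stable way.
risk_any_lineage = -math.expm1(lineages_per_person * math.log1p(-efficiency))

print(f"per-lineage efficiency:     {efficiency:.2e}")
print(f"stem cell years per event:  {stem_cell_years_per_event:.2e}")
print(f"risk that any lineage hits: {risk_any_lineage:.4f}")
```

Under these assumptions the individual-level risk comes out slightly below 5% because a colon could, in principle, contain more than one transformed lineage; the gap between the per-lineage and per-individual probabilities is roughly the number of lineages at risk.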

A probabilistic description of cancer has several aspects consistent with cancer genome data, which show relatively low mutation frequencies, diverse combinations of mutations between different tumors, and a high proportion (>80%) of neutral passenger mutations [4, 5, 11, 12]. Equation [8] models random mutation that starts from conception, and therefore the numbers and types of mutations in a cancer genome are highly dependent on what happens in normal colon (Fig 1). Mutations (predominantly passenger mutations) may arise as replication errors, and cancer results by chance from rare and diverse driver mutation combinations that confer a malignant phenotype in a single cell. Certain APC mutations may be common in colorectal cancer because they enhance stem cell survival during niche clonal evolution and shorten pathways by effectively increasing subsequent numbers of stem cells at risk. The similar base mutation spectrum in colorectal, pancreatic, and glial tumors [11] is consistent with a common underlying mechanism such as replication errors.

A cancer model that includes epidemiology data needs an age parameter. Implicit in Equation [8] is that progression starts at conception and most mutations accumulate in normal-appearing colon (Fig 1). Once visible tumorigenesis occurs, this equation does not readily apply because it calculates the risk for the entire colon and does not model the adenoma-cancer sequence [20]. However, progression to cancer may be dominated by its passage in normal colon because tumors before the age of 50 years are rare. The accumulation of somatic driver mutations in normal tissues is poorly documented, but mouse models demonstrate that many oncogenic mutations are also compatible with normal phenotypes [40], illustrating that some driver mutations can potentially arise earlier in life and persist in normal colon. Transformation of primary human cells has been engineered in vitro, but tumorigenesis in nude mice required the simultaneous combination of all three changes in a single cell [41].
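How age might enter such a model can be sketched in Python. Equation [8] is not reproduced in this section, so the functional form below is an assumption: each of *k* driver pathways is hit independently with probability 1 − (1 − *u*)^*d* after *d* divisions, and the colon is at risk if any one of its stem cell lineages carries all *k*. Age enters only through *d*. The function names and all parameter values are hypothetical and chosen for illustration; only the ~15 million lineages and the roughly daily division rate echo figures mentioned in the text.

```python
import math

def lineage_probability(u, d, k):
    """Chance that one lineage carries all k driver mutations after d divisions.

    Assumed form (not taken from the source): (1 - (1 - u)**d)**k,
    with the k pathways mutating independently at rate u per division.
    """
    p_one_pathway = -math.expm1(d * math.log1p(-u))  # 1 - (1 - u)**d, stable for small u
    return p_one_pathway ** k

def colon_risk(age_years, u, k, n_stem_cells, divisions_per_year):
    """Chance that at least one of n_stem_cells lineages has transformed by a given age."""
    # Simplification: divisions are counted from birth at a constant rate,
    # ignoring gestation and any change in rate during progression.
    d = divisions_per_year * age_years
    q = lineage_probability(u, d, k)
    return -math.expm1(n_stem_cells * math.log1p(-q))  # 1 - (1 - q)**n

# Hypothetical, illustrative parameter values
U = 1e-6              # mutation rate per driver pathway per division (assumed)
K = 6                 # required driver pathway mutations (assumed)
N_STEM = 15e6         # stem cell lineages at risk (lower bound quoted in the text)
DIVISIONS = 365       # about one division per day

for age in (25, 50, 75, 100):
    risk = colon_risk(age, U, K, N_STEM, DIVISIONS)
    print(f"age {age:3d}: risk {risk:.2e}")
```

Because the per-lineage probability scales roughly with the *k*-th power of the number of divisions, risk in this sketch stays very low early in life and rises steeply with age; changing `K`, `U`, or `N_STEM` shows how sensitive the result is to each assumed value.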

This simple equation does not include copy number or epigenetic variations, or the very likely possibility that error and division rates may change during progression, and should be viewed as an exploratory or teaching tool. Many other quantitative models of cancer have been published, including a model of cancer genome data [42], but the algebraic format of this equation may be more familiar to students, and the equation can also be manipulated with an Excel spreadsheet (see Additional file 2). By this equation, cancer is "caused" by replication errors, a large number of cells at risk, and "bad luck" [43], with cancer risks increased by stem cell divisions that normally occur with aging [33]. The examples analyzed with Equation [8] illustrate that subtle rather than dramatic cell changes are consistent with risk changes measured in large populations. From a broader perspective, the "integrative" nature of the equation relates how cancer incidence may depend on effective stem cell numbers, division and error rates, and numbers of required rate-limiting driver pathway mutations. Many progression pathways from the zygote are possible, but the shorter, more parsimonious pathways may allow cancers to appear within a lifetime.