Burden of cancers in India - estimates of cancer crude incidence, YLLs, YLDs and DALYs for 2021 and 2025 based on National Cancer Registry Program

Background Cancer is the major cause of morbidity and mortality worldwide. The cancer burden varies within the regions of India posing great challenges in its prevention and control. The national burden assessment remains as a task which relies on statistical models in many developing countries, including India, due to cancer not being a notifiable disease. This study quantifies the cancer burden in India for 2016, adjusted mortality to incidence (AMI) ratio and projections for 2021 and 2025 from the National Cancer Registry Program (NCRP) and other publicly available data sources. Methods Primary data on cancer incidence and mortality between 2012 and 2016 from 28 Population Based Cancer Registries (PBCRs), all-cause mortality from Sample Registration Systems (SRS) 2012–16, lifetables and disability weight from World Health Organization (WHO), the population from Census of India and cancer prevalence using the WHO-DisMod-II tool were used for this study. The AMI ratio was estimated using the Markov Chain Monte Carlo method from longitudinal NCRP-PBCR data (2001–16). The burden was quantified at national and sub-national levels as crude incidence, mortality, Years of Life Lost (YLLs), Years Lived with Disability (YLDs) and Disability Adjusted Life Years (DALYs). The projections for the years 2021 and 2025 were done by the negative binomial regression model using STATA. Results The projected cancer burden in India for 2021 was 26.7 million DALYsAMI and expected to increase to 29.8 million in 2025. The highest burden was in the north (2408 DALYsAMI per 100,000) and northeastern (2177 DALYsAMI per 100,000) regions of the country and higher among males. More than 40% of the total cancer burden was contributed by the seven leading cancer sites — lung (10.6%), breast (10.5%), oesophagus (5.8%), mouth (5.7%), stomach (5.2%), liver (4.6%), and cervix uteri (4.3%). Conclusions This study demonstrates the use of reliable data sources and DisMod-II tools that adhere to the international standard for assessment of national and sub-national cancer burden. A wide heterogeneity in leading cancer sites was observed within India by age and sex. The results also highlight the need to focus on non-leading sites of cancer by age and sex. These findings can guide policymakers to plan focused approaches towards monitoring efforts on cancer prevention and control. The study simplifies the methodology used for arriving at the burden estimates and thus, encourages researchers across the world to take up similar assessments with the available data. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-022-09578-1.


Background
Cancer ranks either first or second among the leading causes of death before the age of 70 years across 91 out of the 172 countries worldwide [1]. The GLOBOCAN 2018, reported 18.1 million new cancer cases and 9.6 million deaths globally [2]. By 2040, the cancer incidence and mortality are expected to rise to 29.5 million and 16.3 million, respectively [2]. New and challenging problems -rapid urbanization, population ageing, inactive and unhealthy lifestyles, indoor and outdoor air pollution, etc., are responsible for the emerging cancer burden across the globe, majorly impacting the middle-to-low socio-economic countries including India [3].
There have been previous attempts to estimate the cancer burden in different parts of India [3][4][5][6][7][8][9][10]. The Global Burden of Disease (GBD) 2016 study, attributed 8.3% of deaths and 5% of disability-adjusted life years (DALYs) to cancer alone [3]. The GLOBOCAN 2018, reported 1.1 million cancer cases and more than 0.7 million cancer deaths [11]. The Medical Certification of Cause of Death, 2018 reported cancer as the fifth leading cause of death amounting to 5.7% of all deaths in India [12]. The cancer burden has shown a steady increase with an estimated 0.8 million new cancer cases every year [11]. In 2040, nearly 2 million new cancer cases and more than 1 million deaths are estimated [11]. The heterogeneities in cancer epidemiology within India are well-documented [3][4][5][6][7][8][9][10]. The latest publication from the National Cancer Registry Program (NCRP), India points to the differences in cancer incidence rates. Aizawl district of Mizoram showed 7 times higher incidence rates of cancer in males and 4 times in females from that in Osmanabad and Beed in Maharashtra [6,7].
The population-based cancer registries (PBCRs) are the only reliable, long period sources of data on the magnitude and patterns of cancer in the country [6,7]. The data has been used in multiple studies, including the GBD study, to estimate the cancer burden for India [3,5]. The Indian Council of Medical Research (ICMR) -National Centre for Disease Informatics and Research (NCDIR) -NCRP is a valuable data repository on cancer since its establishment in 1981 [6,7]. To date, only constructed methods have been adopted to estimate the DALYs for cancer as a metric to quantify cancer burden in India. These studies have used econometric models that have inconsistent methodology and provide diverse results over different years. Moreover, all these results would also depend on availability of incidence and mortality data. In India due to limited availability of mortality data we have used the real-time data on cancer incidence and mortality from the PBCRs. Thus, we aimed to analyze and report cancer burden estimates for 2021 and 2025 in India at the national and sub-national level using the simplest methods that can be easily adapted and replicated. Also, these methods can be easily applied by other similar countries having inadequate representative data sets. Such attempts have been made in the past by ICMR to assess the cancer burden [8]. Additionally, the study aimed to derive adjusted mortality to incidence (AMI) ratio for India. In this paper, we present the methodology, results and discuss the implications of cancer burden in India using PBCR-NCRP data of 2012-16.

Overview
As a continued effort to the ICMR -Burden of Disease (BOD) study in 2004 [8], the BOD -Noncommunicable Disease (NCD) study was undertaken by ICMR -Ministry of Health and Family Welfare, Government of India collaboration to update the burden estimates for 2015. The cancer burden aspect of the study was tasked to the ICMR-NCDIR, a nodal institute for the NCRP [6, 7] to assess metrics of cancer burden in India. Incidence, mortality, prevalence, years of life lost (YLLs), years lived with disability (YLDs) and DALYs were the metrics derived. This analysis supersedes the 2015 update with new projections for 2021 and 2025 using 2012-16 NCRP data.

Data sources
To meet the specific data requirements for estimating the national burden of cancer metrics and its future projections, we adhered to freely available standard data sources (Additional figure 1a) and adapted the recommended GBD-WHO-DisMod II tool [13].
Cancer-specific data on incidence and mortality by age and sex for the period 2012-16 was extracted from the 28 NCRP-PBCRs, which collect data on all 'ICD -O3' behaviour code cancers [6,7]. For every registry, the pooled incidence rate for quinquennial age groups for sites was observed within India by age and sex. The results also highlight the need to focus on non-leading sites of cancer by age and sex. These findings can guide policymakers to plan focused approaches towards monitoring efforts on cancer prevention and control. The study simplifies the methodology used for arriving at the burden estimates and thus, encourages researchers across the world to take up similar assessments with the available data.
Keywords: Burden, Cancer, Disability adjusted life years, Epidemiology, India both sexes, every cancer site (C00 -C96) were calculated as the number of new cases reported per year divided by the mid-year population of that year. The number of deaths for every cancer site was divided by the mid-year population to determine the pooled mortality rates. The PBCR reported mortality to incidence (RMI) ratio for males and females in each age group was estimated by dividing the number of deaths from cancer with the corresponding incidence estimates for the defined year from the cancer registry data [6,7].
Considering the limitations of under reporting in the mortality systems in India, we searched for available evidence on the national MI ratio. We came across two national studies reporting an average of 35.0 and 75.4% as MI ratio for India [4,14]. Accordingly, longitudinal data points with reported MI ratio of ≥35.0% [14] from NCRP-PBCRs between 2001 and 2016 was extracted to derive at the standard MI ratio for India by sex. Total of 256 data points had an MI ratio of ≥35.0%, of these 150 were for males and 106 for females. The Akaike's and Bayesian Information criterion showed that the Gamma distribution as the best fit to the data (R software, ver- . The estimated mean of AMI converged after 10,000 iterations was found to be 50.3% (95% uncertainty interval (UI): 48.7-51.9) for males and 46.6% (95% Ul: 44.9-48.4%) for females. We replaced this mean AMI for those registries (2012-16) with RMI < 50.3% (males) and < 46.6% (females). The results thus derived from the AMI are presented as YLL AMI , YLD AMI and DALY AMI for all cancer by sex.
The state-level population was obtained from the 2001 and 2011 Census of India, to arrive at the projected population by age and sex from 2012 to 2016 using the difference distribution method [15]. Age and sex-specific all-cause mortality were procured from the Sample Registration System (SRS), Office of Registrar General of India for 2012-16 [16]. The standard lifetables provided by WHO were used to arrive at the life expectancy for age groups and sex [17] (Additional Tables 1, 2 and 3). The disability weight (DW) for cancer were obtained from the published GBD-2004 update by WHO. The DW for malignant neoplasm and their long-term sequel is 0.75 for all the metastatic stages of cancer [18]. The prevalence of cancer by type, age and sex were obtained using the cancer incidence, mortality, MI ratio, population and all-cause mortality rate as inputs in the DisMod-II tool [5,13] (Additional Figure 1a).
The 28 states and 2 out of 8 Union Territories (Delhi and Jammu & Kashmir) were grouped into six regions based on the pooled NCRP-PBCR reporting for regions [6]. The regions with existing PBCRs were the data sources for those respective regions corresponding to their location and, regions that had less number or lacked PBCRs, the nearest PBCRs data was used (Additional Table 4a) [6].

Statistical analyses
To obtain YLLs, the total number of deaths due to cancer in a defined age group was multiplied by the standard life expectancy of that age group. YLD estimates were generated by multiplying the total number of prevalent cancer cases at respective age groups by disability-weight of cancer. DALY metrics were the sum of the YLL and YLD estimates [5]. All cancer burden metrics -YLL, YLD and DALY as well as YLL AMI , YLD AMI and DALY AMI were estimated in age-standardized rates (ASR) using the WHO World Population Standard distribution  to assess the comparability of results with previous GBD estimates [19].
The cancer burden metrics projections for 2021 and 2025 (using adjusted MI ratio) was done assuming the pooled incidence and mortality rates obtained from 28 PBCRs that represents India's incidence and mortality from cancer and age-specific cancer incidence. With the obtained information on cancer type, sex and age groups, the cancer burden estimates at the national and sub-national level were projected using 16 years of data between 2001 and 16 [20]. The negative binomial regression model was applied using STATA 14.2 (StataCorp, College Station, Texas, USA) to project the cancer burden metrics for 2021 and 2025. This model was preferred over the Poisson regression model as the conditional variance of the cancer burden metrics were greater than the conditional mean. We applied the Lagrange multiplier test to examine the presence of over dispersion in the data. The results for all cancers are presented as YLL, YLD, DALY with reported MI ratio and YLL AMI , YLD AMI , DALY AMI with adjusted MI ratio per 100,000 population.

National and sub-national burden of cancer in India
In 2016, men from southern, northern and eastern regions had the highest crude incidence rates of lung cancer, while mouth and oesophageal cancers ranked first in the western, central and northeast regions of the country followed by lung cancer. Oesophageal cancer ranked fifth to eleventh in incidence in other regions (Fig. 1a). In females, breast cancer and cervical cancer ranked first and second in incidence irrespective of regions, while ovarian cancer occupied the third rank across all regions except in the north and northeast parts of India. Mouth cancer ranked fourth to eighth and oesophageal cancer was between fifth -eleventh rank in incidence (Fig. 1b).
Registry wise information has been provided in Additional Table 5.
The total estimated burden of cancer for India in 2016 was 1277 DALYs per 100,000. After adjusting the MI ratio, the national cancer burden was 1908 DALYs AMI per 100,000 (Additional Table 4a). The burden imposed from cancer for males was higher than females (Additional Table 4b and c). The majority of the cancer burden was in the northeast region (1428 DALYs per 100,000), followed by southern (1353 DALYs per 100,000) and central (1351 DALYs per 100,000) regions of India. After adjusting the MI ratio, north (2408 DALYs AMI per 100,000), north-east (2177 DALYs AMI per 100,000) southern (2138 DALYs AMI per 100,000) and central (2024 DALYs AMI per 100,000) regions had the highest DALYs AMI from cancer. North-east and northern India (103 YLDs AMI per 100,000) showed the highest YLDs AMI (Additional Table 4a and Fig. 2). Across all Fig. 1 a Ranking of crude incidence rate of leading cancer sites in males for India, by region and PBCR. b Ranking of crude incidence rate of leading cancer sites in females for India, by region and PBCR the states/UTs, Mizoram (3424 DALYs AMI per 100,000) followed by Delhi (2651 DALYs AMI per 100,000) and Meghalaya (2609 DALYs AMI per 100,000) had the highest cancer DALYs. Mizoram (153 per 100,000) and Arunachal Pradesh (140 per 100,000) had the highest YLDs AMI from cancer (Additional Table 4a).
Both sexes combined, the proportion of YLDs increased with age, and subsequently, after 45-49 years there was a nearly four-fold increase in YLDs. The highest YLDs was observed between 75 and 79 years (14.6%) and decreased thereafter. Among females, YLDs were highest in the age group of 70-74 years (13.1%) (Additional Figure 2b).

Distribution of total cancer DALYs in percentage (%) by age group and site
Among both sexes, the highest DALYs at 0-14 years were from cancer of eye (50.7%), lymphoid leukaemia (44.6%); and at 15-34 years it was from cancer of testis (53.1%) and malignant bone tumours (37.9%). Nearly 24 cancer sites added to more than 50% of total cancer DALYs at 35-59 years, the highest from cancer of cervix uteri and breast (63.5% each). While at 60+ years, cancer of prostate and ureter contributed to 83.2% and 73.0% cancer DALYs, respectively. Thus, cancer prostate and ureter contributed to the highest DALYs amongst all other cancer sites irrespective of age (Additional Figure 3).

Projections for 2021 and 2025
In India, the burden for cancer was projected to be 26.7 million DALYs AMI in 2021 and 29.8 million DALYs AMI in 2025. The burden was higher among males than females (Table 2). Males contributed to 14.7 million YLLs AMI , 0.72 million YLDs AMI and 15.5 DALYs AMI , whereas females contributed to 13.6 million YLLs AMI , 0.69 million YLDs AMI and 14.3 DALYs AMI from cancer in 2025 (Table 2).

Change in cancer DALYs per 100,000 from 2004 to 2021
When examining the DALYs by cancer sites from 2004 to 2021, cancer lung showed an increase in DALYs from tenth place to first. The number of DALYs contributed by oesophageal cancer was expected to increase from 22.7 per 100,000 to 63.5 per 100,000 moving from eighth to seventh place in 2021. Cancer cervix, ovary, lymphoma and multiple myeloma and stomach that led rankings in 2004 showed decline in 2021 (Fig. 4).

Discussion
As per the National Health Policy 2017 of India, the estimation of DALYs is recommended as a key epidemiological tool to assess epidemiological transitions and study macro-level policies on the expected health care use, evaluate the impact of prevention and control programs, allocate resources, and benchmark the progress being made in the country [21][22][23][24]. This study provides robust country-specific burden imposed by cancer using the PBCR data (2012-16). However, challenges arise in providing timely data updates on cancer burden to the government due to existing limitations in the cancer registration and reporting systems. Real time availability of cancer incidence is lacking by 2-3 years and underreporting of deaths with inaccuracy in reporting of cancer specific deaths in the Civil Registration System would have its effects on estimation of AMI ratio. In addition, PBCRs cover approximately 10% of total population and the distribution of PBCR location and coverage is not uniform across all the states. Nevertheless, the burden imposed by cancer reinforces the need to view cancer burden metrics with the available data sources in the best possible context to inform cancer care and prevention strategies. This study calculated the burden of cancer in a standardized way in a developing country with limited information, and must be read with an understanding of the existing limitations in the reporting systems.
The main strengths of this study include the use of original, reliable and robust data to estimate national, regional and state-level cancer burden in India. Primarily, we used the ICMR-NCDIR-NCRP data as a source on cancer incidence and mortality in India and other established national data sources locally available to India -Census of India and SRS [6, 12,13]. The use of WHO-DisMod II enables the replication of such studies by other researchers as well [5,8,25]. With better coverage of the PBCRs, quality and access to data in recent times, quantifying cancer burden from available data resources has improved [7]. Additionally, the study estimated standardized MI ratio derived from real-time longitudinal PBCR data points. The combination of sources and methods used in this study to estimate cancer burden are easily reproducible by other investigators.
There was an evident increase in the reported proportion of deaths from neoplasms between 2000 and 2018 ranging from 3.6% to 6.4%, while the cancer incidence in the country was projected to rise by 12% from 2020 to 2025 [7,12]. In the previous ICMR 2004 -cancer burden estimates for India, 5.9 million DALYs were arrived from NCRP-PBCR data and WHO-DisMod-II tool [8]. The current study estimates for 2016 are 22.6 million DALYs AMI , indicating an evident rise of more than 3 times cancer DALYs since 2004, primarily due to the growing and ageing population. Further, our study projections reveal a 11.4% rise in cancer DALYs from 2021 to 2025. However, these projections are influenced by the future investment decisions in health care, cancer research, public awareness on cancer risk factor reduction, other social, economic changes and cancer notifiability [26][27][28].
The cancer burden varies between regions within the country. The current study reports a high cancer burden in the northern region followed by the north-east. Breast cancer in females, lung cancer and oesophageal cancer in males contributed to the highest burden in northern and northeast region, respectively. The estimated DALYs (reported and adjusted MI) in the current study varied with those reported by GBD in 2016 [3,29]. The resulting estimates are highly dependent on the data quality, sources of data, completeness of data (incidence and mortality), data collection period, methodology, statistical modelling and assumptions [3,5,30,31]. Nevertheless, the reliability on the NCRP is deemed to provide robust data to estimate national, regional and state burden on cancer as it adheres to standard methods established by the WHO-International Agency for Research in Cancer for the last 40 years [3,6,20]. NCRP has been the primary data source even for the GBD and the GLO-BOCAN publications [3,26]. With every 1% increase in adjusted MI from reported MI ratio, the DALYs per 100,000 increased by 37 for both sex, 36 for males and 39 for females. Furthermore, the inconsistencies between studies are not surprising even for high-income countries with complete cancer incidence and mortality data and, such variations in epidemiological analysis are common [31][32][33][34]. It is not possible to evaluate the causes for  these inconsistencies as the complex constructed statistical models, as well as data adjustments used by GBD or other relevant studies, are not fully available in the public domain [31,34].
The nations within a nation description of India best describes the heterogeneities in the epidemiological transition levels within its states. Most of the increase in cancer incidence in India can be attributed to its epidemiologic transition and the commitment of Government of India in improving the use of cancer diagnostics in the country. However, the burden imposed by cancer in India will continue to increase as a result of the continuing pace of cancer risk factors and its determinants. Maximum increases will occur in the most populous and least affluent states, where the current cancer diagnostic and treatment facilities are inadequate [22,26,35,36].
In the current study, the YLLs contribution to DALYs is around 93% for all sites of cancer, and we have relied completely on the population-based registries for all cancer-specific data. Adding on to the merits of the current study is the use of recent 2012-16 period PBCR data to arrive at cancer burden metrics, that has not yet been utilized by any other study.
The study results were derived from the use of simple, locally relevant and reproducible methods replicable at the sub-national level for policy and planning purposes. However, the estimates are very sensitive to changes in the environment influenced by health insurance systems, certification of cause of death, number of cancer treating hospitals, number of cancers registrations and PBCR coverage. Despite the caveats, our results are the best available real-time and ongoing estimates of the cancer burden in India. They are best suited for use for priority setting and planning of cancer resource allocation and management across the nation [26].
Recognizing these inadequacies, focus on expansion and strengthening of available data systems in coverage, completeness, quality and access is an important investment to future research and development. Cross talk between data sources -mortality databases, cancer registration systems, Ayushman Bharat, Hospital Information System recording, and national as well as state cancer control programmes would play a significant role in strengthening the completeness of local data. Bringing in place cancer notifiability within the country will further strengthen the already available data through wider coverage with limited resource and funding [3,6,7].

Conclusion
The detailed epidemiology of all cancer types by sex, age and at sub-national level described in this paper will be a blueprint for specifically targeted policy and program planning appropriate to the regional burden and needs. Concerted and sustained efforts for strategic allocation of resources in terms of finance and health infrastructure should be made for improved access to care, prevention, early detection and, management to meet the rising burden of cancer in India. Also, quantifying burden through locally relevant, regionalized approaches helps make relevant decisions assess set targets of Sustainable Development Goals, Universal Health Coverage and National Multisectoral Action Plan for Prevention and Control of Common Noncommunicable Diseases. This study would also encourage other low-middle income countries to adopt similar methods for strengthening their cancer disease burden estimation needed for policy and program interventions.