Data sources
The Pediatric Oncology Group of Ontario Networked Information System (POGONIS) is a population-based registry capturing data on all cases of Ontario pediatric cancer diagnosed at pediatric oncology centers. Pediatric oncology care in Ontario is delivered through five tertiary centers and their associated satellite centers. POGONIS collects data through an active process: trained data managers at each of the five tertiary centers prospectively abstract demographic, disease, treatment and outcome data for all new cancer cases. Data managers routinely attend tumor boards and other medical rounds to ensure completeness and validity of the data, including diagnosis. Senior POGONIS administrators also review these data centrally to assess accuracy; data managers are routinely contacted for clarification of certain data elements, and can be asked to return to the patient chart if necessary. Data managers in turn contact treating clinicians if necessary. Previous work has shown that POGONIS identifies more than 96% of Ontario children with cancer aged 0–14 years [11]. Adolescents treated at adult institutions are not identifiable through POGONIS, leading to lower capture rates within POGONIS for patients aged 15–18 years.
Covering a population of approximately 13 million, the Ontario Cancer Registry (OCR) is a population-based tumor registry that relies on the passive receipt of reports from four sources: pathology reports with a diagnosis of cancer from all pathology labs across the province, hospital discharge records containing a diagnosis of cancer from all hospitals across Ontario, electronic health records from specific treatment centers, and any electronic death record with cancer as one of the underlying causes [13, 14]. During the study period, computerized algorithms employing deterministic and probabilistic linkage were used to link multiple records pertaining to the same individual. In contrast to POGONIS, during the study period OCR employed a set of computerized rules to passively assign the site and histology of the primary malignancy. Given this passive process, OCR was not able to return to source documents for additional data or clarifications, nor was OCR able to incorporate information from additional data sources such as POGONIS.
Study population
All Ontario residents diagnosed with any malignancy between 2000 and 2011, less than 18 years of age at diagnosis, and treated and registered at a pediatric oncology center were identified using POGONIS and included. Patients with histiocytic disorders such as Langerhans cell histiocytosis or hemophagocytic lymphohistiocytosis were excluded given their variable inclusion in OCR over the study period. The study eligibility end date of 2011 was chosen to maximize the likelihood that eligible cases were registered in both databases.
Determining OCR diagnoses
As noted above, and in contrast to adult classification systems, pediatric cancers are generally categorized according to morphology and not primary site of origin. The third edition of the International Classification of Childhood Cancer (ICCC-3) is currently accepted as the standard classification system for childhood cancer [15]. The ICCC-3 operates hierarchically, with 12 main Level 1 diagnostic groups and 47 Level 2 diagnostic subgroups. For certain heterogeneous subgroups, Level 3 optional extended classifications are provided. As an example, Diagnostic Group I corresponds to leukemias, myeloproliferative diseases, and myelodysplastic diseases, with Diagnostic Subgroup Ia pertaining to lymphoid leukemias and extended classification Ia.1 designating precursor cell leukemias. The ICCC-3 also includes an algorithm that converts International Classification of Diseases for Oncology, third edition (ICD-O-3), codes (ICD-O-M, ICD-O-T) to ICCC-3 diagnostic groups and subgroups [15]. ICCC-3 diagnostic groups are based mainly on morphology codes (ICD-O-M), but sometimes also depend on topography codes (ICD-O-T) [16]. For example, cases with morphology codes indicating histologies consistent with germ cell tumors (e.g. 9071 – yolk sac tumor, 9080 – teratoma) are further classified by ICCC-3 as gonadal, intracranial/intraspinal, or extracranial/extragonadal based upon ICD-O-T codes.
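The dependence on both morphology and topography can be illustrated with a minimal Python sketch of the germ cell example. This is not the published ICCC-3 conversion table: only the two morphology codes come from the text, while the function names and the topography ranges used for central nervous system and gonadal sites are assumptions made for illustration.

```python
# Illustrative sketch only. The topography ranges below are assumptions,
# not a transcription of the published ICCC-3 conversion table.

# Morphology codes named in the text: yolk sac tumor, teratoma.
GERM_CELL_MORPHOLOGIES = {"9071", "9080"}


def topography_number(icd_o_t):
    """Convert an ICD-O-T code such as 'C62.9' to the number 62.9."""
    return float(icd_o_t.upper().lstrip("C"))


def germ_cell_category(icd_o_m, icd_o_t):
    """Return the germ cell category implied by site, or None if the
    morphology is not one of the germ cell histologies listed above."""
    morphology = icd_o_m.split("/")[0]  # drop the behavior digit, e.g. '/3'
    if morphology not in GERM_CELL_MORPHOLOGIES:
        return None
    site = topography_number(icd_o_t)
    # Assumed CNS sites: meninges, brain, spinal cord, pituitary/pineal region.
    if 70.0 <= site <= 72.9 or 75.1 <= site <= 75.3:
        return "intracranial/intraspinal"
    # Assumed gonadal sites: ovary (C56.9) and testis (C62.x).
    if site == 56.9 or 62.0 <= site <= 62.9:
        return "gonadal"
    return "extracranial/extragonadal"


print(germ_cell_category("9071/3", "C62.9"))
print(germ_cell_category("9080/3", "C71.9"))
```

The same morphology code therefore lands in different categories depending on site, which is why a topography code is needed before the conversion can be applied.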
Similar to most population-based cancer registries, the OCR uses ICD-O-3 codes to classify incident cases. Using the aforementioned algorithm, we converted these ICD-O-3 codes to ICCC-3 diagnostic groups and subgroups (see Fig. 1 for schematic overview). As ICD-O-T codes were unavailable for the study population, we first converted OCR ICD-9 codes that indicated site of disease to ICD-O-T codes (Fig. 1). For example, we converted the ICD-9 code 189.0 (malignant neoplasm of kidney, except pelvis) and its derivatives to the ICD-O-T code C64.9, which indicates a primary renal site of disease. Additional examples may be seen in Additional file 1.
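A minimal sketch of this site conversion step is given below. Only the 189.0 to C64.9 pair is taken from the text; the lookup table name, the function name, and the assumption that "derivatives" are longer codes sharing the parent code as a prefix are hypothetical, and the full mapping is the one given in Additional file 1.

```python
# Hypothetical lookup table. Only this single pair appears in the text;
# the complete mapping is described in Additional file 1.
ICD9_TO_ICD_O_T = {
    "189.0": "C64.9",  # malignant neoplasm of kidney, except pelvis
}


def icd9_to_topography(icd9_code):
    """Map an ICD-9 site code to an ICD-O-T code, or return None.

    Assumption: a derivative code is a longer code that starts with a
    mapped parent code, so trailing characters are dropped until a
    mapped code is found.
    """
    code = icd9_code.strip()
    while code:
        if code in ICD9_TO_ICD_O_T:
            return ICD9_TO_ICD_O_T[code]
        code = code[:-1]
    return None


print(icd9_to_topography("189.0"))
```

Codes with no mapped parent return None, so unmapped sites can be reviewed rather than silently assigned.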
Importantly, as ICD-O-M codes were available from OCR, the information from these ICD-O-M codes was the only data source used to determine histology/morphology, even if additional or contrary information was available from the OCR ICD-9 codes. This decision was made because the published conversion algorithm to ICCC-3 uses only ICD-O-M codes for histology/morphology data and not ICD-9 codes [15]. Our approach should thus mimic that endorsed by the literature for use by researchers and cancer registrars.
Several additional modifications were necessary:
1. A small number of morphology codes from either previous ICD-O editions or more recently introduced were encountered in the OCR. These codes were first mapped to ICD-O-3 morphology/histology codes before the conversion to ICCC-3 (Additional file 2).
2. Patients for whom no ICD-O-M code was listed, or for whom the only code listed indicated solely “malignant primary”, were classified as Diagnostic Subgroup XIIb, or “Other unspecified malignant tumors”. Patients coded as 9990/3, or “No microscopic proof”, were similarly classified. As noted above, ICD-9 codes were not used to clarify histology/morphology in these cases.
3. Rhabdoid tumors are rare malignancies that share a characteristic histology and which occur primarily in the brain (known as atypical teratoid/rhabdoid tumors – AT/RT) and kidney [17]. The ICCC-3 algorithm classifies any tumor with ICD-O-M code 9508/3 (AT/RT) as an AT/RT (IIIc.4) regardless of site, and tumors with ICD-O-M code 8963/3 (malignant rhabdoid tumor) as either rhabdoid renal tumors (VIa.2) or extrarenal rhabdoid tumors (IXd.3) depending on site. However, the algorithm does not account for patients with ICD-O-M code 8963/3 and a central nervous system primary site. We classified these patients as having AT/RT (IIIc.4).
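The second and third modifications amount to a small set of rules applied before the standard conversion, and can be sketched as follows. The function names are hypothetical, the definition of a central nervous system site (C70 to C72) is an assumption, and the specific codes that indicate solely "malignant primary" are not given in the text, so they are left as an empty, caller-supplied set.

```python
def is_cns_site(icd_o_t):
    """Assumption: CNS primaries are ICD-O-T codes C70.x through C72.x."""
    return bool(icd_o_t) and icd_o_t.upper().startswith(("C70", "C71", "C72"))


def classify_special_case(icd_o_m, icd_o_t, malignant_primary_codes=frozenset()):
    """Return an ICCC-3 category for the special cases described above,
    or None when the standard conversion algorithm should be used.

    malignant_primary_codes is a hypothetical placeholder for the codes
    that indicate solely "malignant primary"; the text does not list them.
    """
    # Modification 2: missing or uninformative morphology.
    if not icd_o_m or icd_o_m == "9990/3" or icd_o_m in malignant_primary_codes:
        return "XIIb"
    # Standard rule: AT/RT morphology is IIIc.4 regardless of site.
    if icd_o_m == "9508/3":
        return "IIIc.4"
    if icd_o_m == "8963/3":
        if icd_o_t == "C64.9":      # primary renal site
            return "VIa.2"
        if is_cns_site(icd_o_t):    # Modification 3: study-specific rule
            return "IIIc.4"
        return "IXd.3"              # extrarenal rhabdoid tumor
    return None


print(classify_special_case("8963/3", "C71.9"))
```

Checking the renal and CNS sites before falling through to the extrarenal category keeps the study-specific rule explicit and separate from the published algorithm.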
Determining POGONIS diagnoses
POGONIS was established in 1985, prior to the existence of internationally recognized classification systems for childhood cancer. Malignancies are therefore categorized in POGONIS using a unique classification system derived by local clinicians that nonetheless shares similarities with the ICCC-3. Each POGONIS diagnosis code was mapped to the appropriate ICCC-3 category; an example is illustrated in Additional file 3. Rare pediatric malignancies such as squamous cell carcinomas and malignant carcinoid tumors were coded in POGONIS as single diagnostic categories independent of site, unlike in the ICCC-3. For these tumors, site of disease information was also extracted from POGONIS to allow accurate ICCC-3 categorization. Rare cases for which only general diagnoses were available (e.g. “bone tumor”) were mapped to ICCC-3 Level 1 diagnostic groups only. Other demographic variables were also obtained from POGONIS, including age at diagnosis, gender, and time period (early, 2000–2005 vs. late, 2006–2011).
Validation of OCR diagnoses
Cohort patients were linked to the OCR deterministically by individually assigned Ontario Health Insurance Plan numbers. Where deterministic linkage was not possible due to the lack of an exact health insurance number match, probabilistic linkage using name, date of birth and gender was employed. All patients linked probabilistically were reviewed for linkage quality. For those patients successfully linked, OCR-based and POGONIS-based ICCC-3 classifications were compared. Comparisons were made by Level 1 diagnostic groups and, where appropriate, Level 2 and Level 3 subgroups. Given its use of pediatric-trained data managers, active capture of data, clinician involvement, and ability to correct and supplement data when needed, the POGONIS-based classification was considered the gold standard against which the OCR-based classification was validated.
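The two-stage linkage can be sketched in Python as below. This is a simplified illustration and not the linkage software used in the study: a plain count of agreeing fields stands in for true probabilistic linkage, which would normally weight each field (for example with Fellegi-Sunter weights). The record type, field names, and threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    health_number: Optional[str]
    name: str
    date_of_birth: str  # ISO format, e.g. '2004-03-17'
    gender: str


def agreement_score(a, b):
    """Count agreeing fields; a stand-in for weighted probabilistic scores."""
    return sum([
        a.name.strip().lower() == b.name.strip().lower(),
        a.date_of_birth == b.date_of_birth,
        a.gender == b.gender,
    ])


def link(patient, registry, min_score=3):
    """Return (matched record or None, linkage method)."""
    # Stage 1: deterministic match on the health insurance number.
    if patient.health_number:
        for record in registry:
            if record.health_number == patient.health_number:
                return record, "deterministic"
    # Stage 2: fall back to name, date of birth and gender.
    best = max(registry, key=lambda r: agreement_score(patient, r), default=None)
    if best is not None and agreement_score(patient, best) >= min_score:
        return best, "probabilistic, flag for review"
    return None, "unlinked"
```

Returning the linkage method alongside the match makes it straightforward to pull out the probabilistically linked patients for the manual quality review described above.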
Analyses
Successfully and unsuccessfully linked patients were compared by age, gender, and time period of diagnosis using the chi-square test or the Wilcoxon rank sum test as appropriate. Guidelines pertaining to studies validating health administrative data have recommended the use of multiple statistical measures of agreement [18]. Agreement between the POGONIS-based and OCR-based classifications was therefore assessed by calculating the kappa statistic, sensitivity, specificity, positive predictive value and negative predictive value. Statistical significance was defined as p < 0.05.
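For a single diagnostic category, these measures can all be derived from a 2×2 table that treats the POGONIS-based classification as the reference. The sketch below assumes each category is evaluated as "this category versus all others"; the function name is hypothetical and the counts in the usage line are made up for illustration, not study results.

```python
def agreement_measures(tp, fp, fn, tn):
    """Compute agreement measures from a 2x2 table.

    tp: category assigned by both the test (OCR) and reference (POGONIS)
    fp: assigned by the test only
    fn: assigned by the reference only
    tn: assigned by neither
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement expected from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for illustration only.
for name, value in agreement_measures(tp=90, fp=5, fn=10, tn=895).items():
    print(f"{name}: {value:.3f}")
```

Reporting kappa alongside the predictive values is useful because kappa corrects for chance agreement, which can be substantial when a category is rare.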