The reporting of this population-based retrospective cohort study is based on the Reporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement [8]. The study was approved by the WA Department of Health Human Research Ethics Committee (2012/42), which exempted the study from obtaining individual patient consent.
Data sources and linkage
The data sources analysed were: (i) the WACR; (ii) the hospital morbidity data collection (HMDC, from 1998); and (iii) death registrations. These datasets are routinely linked by the WA Data Linkage Branch [9]. At June 2016 WA had a population of ~2.56 million (10.6% of the national population of ~24.2 million [10]).
Description of participants
WACR data from 1 January 1982 to 31 December 2016 were used. All new, invasive malignant cases among WA residents were included in the incidence and prevalence analyses, including cancers of unknown primary site. Multiple primary cancers were included, with the WACR following the International Agency for Research on Cancer (IARC) rules for multiple primary cancers [11]. Multiple primary cancers are recorded as separate records for the same individual and exclude metastases from an initial primary. Multiple primaries usually arise in separate topographical (anatomical) sites, but histologically different malignancies in the same site are considered two separate primaries (e.g. a breast carcinoma and a Phyllodes tumour of the breast would both be recorded). Under the IARC rules, Kaposi sarcoma is counted only once per individual, even if identified in multiple body sites at different times. For the survival analyses, records were excluded if they had an unknown age at diagnosis, a death certificate only diagnosis, an age > 115 years at censoring, a date of death prior to the date of diagnosis, or no survival time (i.e. date of diagnosis equal to date of death) [12]. For the hospital analysis, the chronologically first sarcoma from the group used in the survival analysis was linked to subsequent hospitalisations (i.e. only hospitalisations occurring post-diagnosis were included).
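The survival-analysis exclusions listed above can be sketched as a simple record filter. This is a minimal illustration only: the field names (`age_at_diagnosis`, `exit_date`, and so on) are hypothetical, since the registry's variable names are not given here, and age at censoring is assumed to be age at diagnosis plus elapsed follow-up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CancerRecord:
    # Hypothetical field names, chosen for illustration only.
    age_at_diagnosis: Optional[float]  # years; None if unknown
    diagnosis_date: date
    death_date: Optional[date]         # None if no death recorded
    exit_date: date                    # end of follow-up (death or censoring)
    death_certificate_only: bool


def eligible_for_survival(rec: CancerRecord) -> bool:
    """Return True if the record passes the exclusions described in the text."""
    if rec.age_at_diagnosis is None:
        return False  # unknown age at diagnosis
    if rec.death_certificate_only:
        return False  # death certificate only diagnosis
    if rec.death_date is not None and rec.death_date <= rec.diagnosis_date:
        # Covers death before diagnosis and zero survival time.
        return False
    # Assumption: age at censoring = age at diagnosis + years of follow-up.
    years_followed = (rec.exit_date - rec.diagnosis_date).days / 365.25
    if rec.age_at_diagnosis + years_followed > 115:
        return False
    return True
```

Applying the filter before the survival estimation keeps the exclusion logic in one place, which makes it easier to report how many records each criterion removed.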
Selection of cancer types
Sarcoma was selected using the latest definitions and three-tier hierarchy reported by the Information Network on Rare Cancers (RARECARENet) [13] (Additional file 1). Tier 1 refers to soft tissue sarcoma, bone sarcoma, gastro-intestinal stromal tumour (GIST) and Kaposi sarcoma. For this study, sarcoma was considered the sum of the tier 1 entities. This is consistent with the approach by Gatta et al. [14], although those authors classified Kaposi sarcoma under skin cancers and non-cutaneous melanoma. Tier 2 separates soft tissue sarcoma by anatomical site and bone sarcoma by tissue of origin (bone, cartilage, etc.). Tier 3 considers histology. Histological classification systems have evolved considerably over the study period, with the description of new entities and reclassification of others. For the purposes of the study, diagnoses recorded at the time were retained, even if some have since been modified or replaced (e.g. malignant fibrous histiocytoma).
This topographical and histological definition differs from the allocation of sarcoma published on the Cancer Australia website [1], but allowed for a more detailed description of sarcoma epidemiology. The reference cancer types were female breast (International Classification of Diseases (ICD)-10 code C50), colorectal (ICD-10 codes C18-C20, C218), prostate (ICD-10 code C61), and lung (ICD-10 codes C33, C34).
Outcomes
Study outcomes were: (i) incidence; (ii) corrected prevalence; (iii) relative survival; and (iv) annual total and rate of hospitalisation and associated costs.
Statistical analysis
All analyses were conducted using Stata SE Version 15 (College Station, Texas).
Descriptive statistics
Descriptive statistics were generated stratified by the following diagnostic periods: 1982–87; 1988–93; 1994–99; 2000–05; 2006–11; and 2012–16. Differences in categorical variables between periods were assessed using Pearson’s chi-squared test or Fisher’s exact test (the latter for small cell sizes), while differences in continuous variables were assessed using the Kruskal-Wallis test.
Incidence
Age-standardised incidence per 100,000 was calculated using the WA mid-year populations published by the Australian Bureau of Statistics (ABS), stratified into 5-year age groups [10]. The European Standard Population (2013) was used as the reference for age-standardisation, so that incidence could be compared between periods and with the incidence reported by RARECARENet [15]. Crude incidence by broad diagnostic age group and sex was also reported for 2016.
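As an illustration of the age-standardisation step, the sketch below applies direct standardisation: age-specific rates are weighted by a standard population and summed. The function name and the toy inputs are assumptions for illustration; the actual analysis was run in Stata, and the real weights would be those of the European Standard Population (2013) for each 5-year age group.

```python
from typing import Sequence


def age_standardised_rate(
    cases: Sequence[float],
    population: Sequence[float],
    standard_weights: Sequence[float],
    per: float = 100_000,
) -> float:
    """Directly age-standardised rate per `per` persons.

    cases[i] and population[i] are the incident cases and mid-year
    population in age group i; standard_weights[i] is the standard
    population count (or proportion) for the same age group.
    """
    if not (len(cases) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(standard_weights)
    # Weighted sum of age-specific rates, normalised by total weight.
    weighted = sum(
        w * c / p for c, p, w in zip(cases, population, standard_weights)
    )
    return per * weighted / total_weight


# Toy example with two age groups and illustrative (not ESP 2013) weights.
rate = age_standardised_rate([1, 2], [1000, 1000], [1, 3])
```

Because the same weights are applied to every period, differences in the resulting rates reflect changes in age-specific incidence rather than changes in the age structure of the WA population.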
Prevalence
Prevalence was calculated at 30 June for each diagnostic year by summing incident cases prior to this date among individuals who had not died. Because the prevalent period was 34.5 years (i.e. 1 January 1982 to 30 June 2016), it was assumed that the prevalence in 2016 was accurate. For earlier years, however, there was less look-back and thus a higher chance that people who were still alive had been diagnosed with sarcoma before 1982 and were therefore not counted among the prevalent population. The approach taken in an earlier analysis of WA data by Maxwell et al. [16] was adopted to correct for this. First, the number of individuals who would be prevalent in 2016, had the start of the study period been 1 January 2016, was calculated. This was then repeated working backwards by 1 year (i.e. 1 January 2015, 1 January 2014 and so on). Each count, expressed as a proportion of the ‘actual’ prevalent population at mid-2016, gave a ‘correction factor’ by which the apparent prevalence for the corresponding year was divided, according to the equation:
$$ \mathrm{CP}={\mathrm{P}}_{\mathrm{X}}/\left({\mathrm{P}}_{2016,\mathrm{Y}\ \mathrm{years}}/{\mathrm{P}}_{2016,34.5\ \mathrm{years}}\right) $$
Where CP = corrected prevalence, P = prevalence, X = the year to be corrected, and Y = the number of years of look-back data available for year X. For example, if there were 1000 cases prevalent in mid-2016, but that number would have been 50 if the start of the study period had been 2016 instead of 1982, this would yield a ‘correction factor’ of 50/1000 (= 0.05) for mid-1982, when only 6 months of diagnostic data were available. If the measured prevalence in mid-1982 was 10, the corrected prevalence would then be 200 (10 divided by 0.05). As GIST was first recognised as a diagnostic entity during the study period, it was not reported separately in the prevalence analysis.
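The correction described above can be sketched in a few lines. This is a hedged illustration rather than the study's implementation (which used Stata): the `Case` tuple layout and the function names are assumptions, and each case is counted once as in the text's description of summing incident cases.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: (diagnosis_date, death_date or None).
Case = Tuple[date, Optional[date]]

STUDY_START = date(1982, 1, 1)
INDEX_2016 = date(2016, 6, 30)


def prevalent_cases(cases: Iterable[Case], start: date, index: date) -> int:
    """Count cases diagnosed in [start, index) and still alive at index."""
    return sum(
        1
        for dx, death in cases
        if start <= dx < index and (death is None or death > index)
    )


def correction_factor(cases: Iterable[Case], year_x: int) -> float:
    """Proportion of the mid-2016 prevalent pool that would be observed
    with only as much look-back as was available in year X."""
    cases = list(cases)
    # Year X has (X - 1982) full years plus 6 months of look-back, so the
    # equivalent 2016 window starts on 1 January of 2016 - (X - 1982).
    shifted_start = date(2016 - (year_x - 1982), 1, 1)
    limited = prevalent_cases(cases, shifted_start, INDEX_2016)
    full = prevalent_cases(cases, STUDY_START, INDEX_2016)
    return limited / full


def corrected_prevalence(apparent: float, factor: float) -> float:
    """Divide the apparent prevalence for year X by its correction factor."""
    return apparent / factor


# Worked example from the text: factor 50/1000, apparent prevalence 10.
example = corrected_prevalence(10, 50 / 1000)
```

With the numbers from the worked example, `example` evaluates to approximately 200, matching the corrected prevalence for mid-1982. The sketch raises a `ZeroDivisionError` if no cases are prevalent at mid-2016, a situation that would need separate handling.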
Relative survival
Relative survival was estimated using the Ederer II method and a period approach for 2012–2016, using the user-written strs command [17, 18]. The relative survival approach compared the survival of the group with sarcoma to that of the general WA population. Relative survival is one of several cancer survival measures (see Baade et al. [19]). It was selected for this study because the measure is reported for sarcoma by RARECARENet [20] and Cancer Australia [1]. Single-year age- and sex-specific death rates for WA published by the ABS were used [21]. For ages where there were no data, the mean of the previous and subsequent years was used. Individual cancer records ‘entered’ follow-up on 1 January 2012 and were followed to the earlier of all-cause death (failure) or 31 December 2016. The date of death according to the mortality registry was used, unless there was date uncertainty or the date of death was missing, in which case the date of death recorded in the cancer registry was used.
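To make the Ederer II idea concrete, the sketch below computes cumulative relative survival from life-table intervals: in each interval, the observed conditional survival is divided by the mean expected survival of the patients still at risk at the start of that interval, and the ratios are multiplied. This is a simplified illustration under stated assumptions, not a reproduction of strs: the `Interval` structure is hypothetical, censoring is handled with the usual actuarial half-withdrawal adjustment, and the late entry required by a period analysis is ignored.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Interval:
    # Hypothetical structure for one follow-up interval.
    n_at_risk: int   # patients alive at the start of the interval
    deaths: int      # all-cause deaths during the interval
    censored: int    # patients censored during the interval
    # Expected (population) probability of surviving the interval for each
    # patient at risk at its start, taken from matched life tables.
    expected_probs: Sequence[float]


def ederer2_relative_survival(intervals: Sequence[Interval]) -> List[float]:
    """Cumulative relative survival at the end of each interval."""
    cumulative = 1.0
    result = []
    for iv in intervals:
        # Actuarial assumption: censored patients are at risk for half
        # of the interval on average.
        effective_n = iv.n_at_risk - iv.censored / 2
        observed = 1 - iv.deaths / effective_n
        # Ederer II: expected survival is averaged over the patients who
        # are still at risk at the start of this interval.
        expected = sum(iv.expected_probs) / len(iv.expected_probs)
        cumulative *= observed / expected
        result.append(cumulative)
    return result
```

A value below 1 indicates that patients fared worse than a population of the same age and sex would be expected to over the same interval.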
Health service use
Linked hospital admission records were considered cancer-related if they contained a cancer principal diagnosis or a chemotherapy or radiotherapy procedure code [16] (Additional file 1). The total number of episodes and mean episodes per corrected prevalent person, along with the total cost and mean cost per prevalent person, were reported by year of admission. Inter-hospital transfers were considered a single episode. The cost of each episode of care was assigned based on the average price weight for each Australian Refined Diagnosis Related Group (AR-DRG) code specific to the date of separation of each hospital record [22]. Costs were reported in Australian dollars ($), adjusted to March 2019 using consumer price indices [23].
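The price adjustment and per-person averaging can be sketched as follows. The function names and the index values in the example are hypothetical and chosen for illustration; the actual consumer price indices are those cited in the text, and scaling by the ratio of the two indices is an assumption about how the adjustment was applied.

```python
from typing import Sequence


def adjust_to_reference(
    cost: float, cpi_at_separation: float, cpi_reference: float
) -> float:
    """Re-express a cost in reference-period dollars (e.g. March 2019)
    by scaling with the ratio of the two consumer price indices."""
    return cost * cpi_reference / cpi_at_separation


def mean_cost_per_prevalent_person(
    adjusted_costs: Sequence[float], corrected_prevalence: float
) -> float:
    """Total adjusted cost for a year divided by the corrected number
    of prevalent people in that year."""
    return sum(adjusted_costs) / corrected_prevalence


# Illustrative values only: an episode costed at $1,000 when the index
# stood at 100, re-expressed for a reference index of 114.
adjusted = adjust_to_reference(1000.0, 100.0, 114.0)
```

Adjusting every episode to a common price level before summing means that trends in annual cost reflect changes in service use and case mix rather than general inflation.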