Epidemiological features and survival outcomes in patients with malignant pulmonary blastoma: a US population-based analysis

Background Pulmonary blastoma (PB) is a rare primary lung malignancy with poorly understood risk factors and prognosis. We sought to investigate the epidemiologic features and long-term outcomes of PB. Methods A population-based cohort study was conducted to quantify the risk of death in PB patients. All patients diagnosed with malignant PB from 1988 to 2016 were identified from the Surveillance, Epidemiology and End Results database. A Cox regression model of all-cause death and a competing-risk analysis of cause-specific death were performed. Results We identified 177 PB patients with a median survival of 108 months. The 5- and 10-year survival rates in all PB patients were 58.2% and 48.5%, and the 5- and 10-year disease-specific mortality rates were 33.5% and 38.6%. No sex or race disparities in incidence or prognosis were observed. The risk of death was significantly associated with age at diagnosis, clinical stage, histologic subtype and surgical treatment (p<0.01). In multivariable regression analyses, older age, regional stage and no surgery predicted a higher risk of both all-cause and disease-specific death in PB patients. Conclusion We described the epidemiological characteristics of PB and identified prognostic factors that were independently associated with worse clinical outcomes.

biphasic PB (CBPB) containing tissue of both fetal adenocarcinoma (typically of low grade) and primitive mesenchymal stroma [8,9]. Partially overlapping genetic abnormalities have been demonstrated in PPB, CBPB and WDFA [10], and the original pathological grouping and histological characteristics of these three subtypes are similar and coherent. Because PB is so rare, few studies have explored the long-term outcomes of these patients. Most previous studies are case reports and literature reviews covering a small number of subjects, and their results are ambiguous and even contradictory. The aims of our study were to describe the epidemiological features of malignant PB in detail and to identify the independent prognostic factors for PB patients.

Study population
The Surveillance, Epidemiology and End Results (SEER) database (https://seer.cancer.gov/), a publicly available cancer database covering 34.6% of the US population, was used to identify patients diagnosed with malignant PB between 1988 and 2016, using the National Cancer Institute's SEER*Stat software (version 8.3.5). The diagnosis of PB had to be histologically confirmed by surgery or lymph node biopsy. The International Classification of Diseases for Oncology, third edition (ICD-O-3) histology codes 8972/3, 8973/3 and 8333/3 were used to identify all CBPB, PPB and WDFA cases, respectively. International Classification of Diseases-10 (ICD-10) codes were used to identify the underlying causes of death. PB patients with unavailable cancer-specific data or vital status were excluded. Informed consent was not required for the analysis of SEER data.

Clinically applicable predictors and primary outcome
The primary focus of this study was the potential predictors of overall survival (OS) and disease-specific survival (DSS) in patients with malignant PB. Predictors were specified based on their availability in clinical practice and on the published literature [2,11-13]. Age at diagnosis was divided into 3 groups: ≤14 years, 15-64 years, and ≥65 years. Race was classified as white, black or other (American Indian/Alaska Native, Asian/Pacific Islander). The year of diagnosis was categorized into two periods: 1988-2006 and 2007-2016. Clinical variables included anatomical laterality (left, right and others), primary site (upper lobe, lower lobe, and other sites), histological subtype (CBPB, PPB and WDFA), clinical stage (localized, regional and distant), surgery status (yes/no) and presence of a second or further primary malignancy (yes/no). "Others" in anatomical laterality was defined as "bilateral sites" or "unspecified site". Other primary sites included main bronchus, pleura, subcutaneous tissue and other soft tissue, and overlapping lesion of lung, heart, mediastinum and pleura. According to the SEER staging system, tumor stage was described as "localized" if the tumor was entirely confined to the organ of origin, "regional" if it extended to regional lymph nodes and/or surrounding organs or tissues, and "distant" if it had metastasized to distant organs or lymph nodes. We chose not to include tumor grade as a predictor for two reasons: first, this information was unknown for almost 60% of the cases; second, in clinical practice PPB and CBPB are generally not graded and WDFA is by definition grade I (although high-grade fetal adenocarcinoma has also been described). Follow-up time was defined as the time from diagnosis to the date of death, last contact or the end of the study period (31 December 2016), whichever occurred first. Subjects with any missing data relevant to the outcome were excluded in order to perform a complete case analysis.
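The age strata and the follow-up rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, the date-based inputs and the 30.4375 days-per-month conversion are assumptions made for the example, not fields or conventions taken from SEER or from the authors' analysis.

```python
from datetime import date

# End of the study period as stated in the text.
STUDY_END = date(2016, 12, 31)


def age_group(age_years):
    """Map age at diagnosis to the three strata used in the analysis."""
    if age_years <= 14:
        return "<=14"
    if age_years <= 64:
        return "15-64"
    return ">=65"


def follow_up(diagnosis, death=None, last_contact=None, study_end=STUDY_END):
    """Return (follow-up in months, died) using the earliest end point.

    Death is listed first so that a tie between the date of death and
    another end point is counted as a death rather than as censoring.
    """
    candidates = []
    if death is not None:
        candidates.append((death, True))
    if last_contact is not None:
        candidates.append((last_contact, False))
    candidates.append((study_end, False))
    end, died = min(candidates, key=lambda c: c[0])
    # Hypothetical conversion: average month length of 30.4375 days.
    months = (end - diagnosis).days / 30.4375
    return months, died


if __name__ == "__main__":
    print(age_group(70))
    print(follow_up(date(2010, 1, 1), death=date(2012, 1, 1)))
```

A death recorded after 31 December 2016 is treated as censored at the end of the study period, which matches the "whichever occurred first" rule.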

Statistical analysis
Baseline characteristics were summarized as frequencies for categorical variables and compared using the chi-square test. The median follow-up time was estimated using the reverse Kaplan-Meier method. Hazard ratios (HRs) and 95% confidence intervals (95% CIs) for mortality associated with each potential predictor were calculated using univariate Cox analysis. For multivariate analysis, Cox proportional hazards regression was used to identify the predictors independently associated with the risk of death after adjusting for a large set of covariates. R (version 3.6.3, R Core Team) was used to perform the statistical analyses and produce the figures according to an a priori defined study protocol. All tests were 2-sided, and statistical significance was set at a p-value of < 0.05.
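As an illustration of the reverse Kaplan-Meier idea mentioned above, the following minimal Python sketch computes a Kaplan-Meier curve and then estimates median follow-up by swapping the roles of death and censoring. It is not the authors' R code; the function names and the toy data are hypothetical, and ties between deaths and censorings at the same time point are handled in the simplest possible way.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up times
    events : 1/True if the event occurred at that time, 0/False if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            removed += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def median_time(curve):
    """First time at which the estimated curve drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


def median_follow_up(times, events):
    """Reverse Kaplan-Meier: censoring is the 'event', death is 'censoring'."""
    reversed_events = [not e for e in events]
    return median_time(kaplan_meier(times, reversed_events))


if __name__ == "__main__":
    # Toy data (hypothetical): months of follow-up and death indicator.
    times = [2, 4, 6, 8, 10]
    died = [1, 0, 1, 0, 0]
    print(kaplan_meier(times, died))
    print(median_follow_up(times, died))
```

In the toy data the median survival is not reached, while the reverse estimate still yields a median follow-up, which is why the reverse method is preferred for describing follow-up length.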

Sensitivity analysis
In addition to the primary analysis, a sensitivity analysis was conducted to evaluate the robustness of our findings. We applied a competing-risk model to test how the conclusions would change once events that compete with the event of interest were taken into account. The usual estimand in competing-risk analysis is the cumulative incidence function (CIF). Crude cumulative mortality was calculated and plotted for disease-specific death and death from other causes among PB patients. Additionally, stratified analyses were performed for the predictors that reached statistical significance. The competing-risk model was fitted using the R package 'cmprsk' (R Foundation for Statistical Computing, Vienna, Austria) [14,15].
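To make the CIF concrete, here is a minimal Python sketch of the standard nonparametric cumulative incidence estimator, in which each cause-specific increment is weighted by the all-cause survival just before the event time. It is an illustration only and not the 'cmprsk' implementation used in the study; the function name, the cause coding (0 = censored, 1 = disease-specific death, 2 = death from other causes) and the toy data are assumptions.

```python
def cumulative_incidence(times, causes, cause):
    """Nonparametric cumulative incidence for one cause of death.

    times  : follow-up times
    causes : 0 = censored, otherwise an integer cause code
    cause  : the cause code whose cumulative incidence is returned
    Returns [(t, CIF(t))] at each time with at least one death of any cause.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    overall_surv = 1.0  # all-cause survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = 0
        d_any = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            if c == cause:
                d_cause += 1
            if c != 0:
                d_any += 1
            removed += 1
            i += 1
        if d_any:
            # Increment uses survival BEFORE this time point is applied.
            cif += overall_surv * d_cause / n_at_risk
            overall_surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Toy data (hypothetical): six patients, two causes of death.
    times = [1, 2, 3, 4, 5, 6]
    causes = [1, 2, 0, 1, 2, 0]
    print(cumulative_incidence(times, causes, cause=1))
    print(cumulative_incidence(times, causes, cause=2))
```

Because patients who die of a competing cause stay in the accounting through the all-cause survival term, the cause-specific curves and the all-cause survival always sum to one.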

Baseline characteristics of study population
The demographic and clinical characteristics of all PB patients are shown in Table 1.
Patients who were 65 years and older had the worst clinical outcome, and almost half of them died within 2 years (Fig. 2a, p<0.001). Patients with regional or distant tumor stage had a dramatically increased risk of all-cause death (regional, HR: 2.16, 95% CI: 1.28-3.66; distant, HR: 3.28, 95% CI: 1.87-5.77) compared to those with localized stage. The 5-year OS for patients with localized stage was more than 75%, while for patients with regional or distant stage it was less than 40% (Fig. 2c, p<0.001). As expected, the absence of surgical treatment portended a worse outcome, with a markedly higher risk of death (HR: 6.93, 95% CI: 4.07-11.78). Nearly 70% of non-operated patients died in the first year, whereas the 5-year OS for operated patients was above 60% (Fig. 2d, p<0.001).
As to the survival difference among the three histological subtypes of PB, PPB patients showed a significantly lower risk of all-cause death than CBPB patients (HR: 0.39, 95% CI: 0.21-0.73), and their OS stabilized at around 75% after 30 months of follow-up, after which all survivors achieved long-term survival. WDFA patients had a lower risk of death than CBPB patients, but the difference was not statistically significant (Fig. 2b, p = 0.005). Moreover, patients with multiple malignant cancers had a higher risk of all-cause death than those with PB only, although the difference between the two groups was not significant (HR: 1.57, 95% CI: 0.21-0.73, p = 0.068) (Supplement Fig. 1A). No significant decrease in all-cause death was observed in patients diagnosed between 2007 and 2016 compared to those diagnosed between 1988 and 2006, despite the great advances in medical care during this decade. Other factors, such as the sex and race of the patients and the primary site and anatomical laterality of the tumor, were not significantly associated with the clinical outcome of PB (Supplement Fig. 1B-F).

Cumulative incidence of disease-specific death
The cumulative mortality for the various causes of death among all patients is illustrated in Fig. 3a. PB was by far the leading cause of death, with the majority of disease-specific deaths occurring within 30 months. During this time, the cumulative disease-specific mortality rose rapidly to nearly 30%, and thereafter gradually stabilized below 40%. The 5- and 10-year disease-specific mortality of PB was 33.5% and 38.6%, respectively. By contrast, the cumulative mortality from other causes showed a relatively flat upward trend as follow-up lengthened, and the 10-year mortality from other causes remained below 15%.

Disease-specific mortality stratified by age and clinical stage
The cumulative incidences of disease-specific death and other-cause death among PB patients were plotted separately for each stratification. There were 60 disease-specific deaths (33.9%) and 18 other-cause deaths (10.2%) during the follow-up period. When stratified by age group (Fig. 3b), patients aged 65 years or older had the highest cumulative disease-specific mortality and other-cause mortality. Other-cause mortality increased sharply with age at diagnosis (p<0.001) and length of follow-up, but disease-specific mortality did not differ significantly among the three age groups (p = 0.066). The association between clinical stage and disease-specific mortality was strong: regional or distant stage carried a much worse outcome, with markedly higher disease-specific mortality than localized stage (p<0.001). Strikingly, almost half of the patients with distant stage died of this tumor within 20 months (Fig. 3c).

Disease-specific mortality stratified by histological subtype and other factors
Figure 3d shows the disease-specific mortality among patients with different histological subtypes. The association between histological subtype and clinical outcome remained pronounced for the risk of disease-specific death (p<0.05). Few patients with PPB died from causes other than this disease, and almost 80% of them achieved long-term survival after 30 months of follow-up. Both the disease-specific mortality and the other-cause mortality of patients with CBPB were higher than those of patients with PPB or WDFA. The disease-specific mortality of CBPB patients rose sharply to 40% within the first 3 years and then gradually approached 50% over the ten years of follow-up. The death rate from other causes in CBPB was close to 10%, accounting for a quarter of the overall cumulative mortality in the fifth year. As to the effect of surgery (Fig. 3e), the cumulative incidence of disease-specific death in patients without surgery was close to 80% within 20 months, which was significantly higher than that in operated patients (p<0.001). In addition, to examine the association between co-existing primary cancers and cause of death, we stratified the cohort into two groups, with or without a second or further primary cancer (Fig. 3f). The pattern of causes of death differed between the two groups. Although the cumulative incidence of disease-specific death did not differ significantly between the two groups, the main cause of death in patients without another primary malignancy was PB itself, while nearly half of the deaths in patients with multiple malignant cancers were due to causes other than PB (p<0.001).

Fig. 1 Forest plot of HR for all-cause death in PB patients. A total of 177 patients with malignant PB were stratified by different factors. HRs and 95% CIs for all-cause death in different stratifications were calculated using Cox models, with the first subgroup as reference. The p values were for the difference between subgroups in each stratification. All tests were 2-sided, and statistical significance was set as p-value of < 0.05. *, ** and *** indicated p<0.05, p<0.01 and p<0.001 respectively (R program, Version 3.6.3, R core team). PB, pulmonary blastoma; PPB, pleuropulmonary blastoma; WDFA, well-differentiated fetal adenocarcinoma; CBPB, classic biphasic PB; HR, hazard ratio; CI, confidence interval

Independent predictors of all-cause death and disease-specific death
Multivariate analysis was then performed to identify the independent prognostic factors for survival among PB patients. All of the above predictors that reached statistical significance in univariate analysis, namely age stratification, histological subtype, clinical stage, surgical treatment, and one or more other primary malignancies, were entered into a Cox proportional hazards regression model and a proportional subdistribution hazards regression model, respectively. The two multivariate models yielded nearly identical results. Age stratification, clinical stage and surgical treatment were independently associated with both all-cause death and disease-specific death in PB patients, while the association between histological subtype and survival was no longer significant. The estimated effect sizes and p values from the two models are reported and compared in Table 2.

Discussion
Although PB has been known for over 70 years, its prognostic factors remain largely unknown. Limited evidence suggests that age of onset, gender, anatomical location, tumor size and stage, histologic subtype, comorbidities, metastasis status and surgical resection may be associated with different outcomes, but these results warrant further validation [2,6-9]. PB has previously been reported to be an aggressive tumor with a relatively poor prognosis. Some previous reports state that two-thirds of PB patients die within 2 years of diagnosis and that only 16% and 8% survive 5 and 10 years after diagnosis, respectively [13,16,17]. In our study, however, nearly half of the PB patients achieved long-term survival: the 5- and 10-year survival rates in all PB patients were 58.2% and 48.5%, and even 40% of patients with metastatic PB survived longer than 5 years. The survival rate of PB in our study was therefore considerably higher than that in previous reports [18]. In addition, we found a slight female preponderance in the incidence of PB, and female patients were more common in all three histological subtypes. However, clinical outcomes did not differ significantly between the two sexes. Nearly a quarter of PB patients had other malignancies, which had not been reported before. Considering that previously published papers were almost all case reports and literature reviews with small sample sizes, our study was more informative.
The three subtypes of PB, namely CBPB, PPB and WDFA, are reported to be distinguishable on morphological, immunohistochemical, radiographic and clinical outcome grounds. The 2004 World Health Organization (WHO) classification of lung tumors classifies CBPB as a lung sarcomatoid carcinoma, PPB as a pulmonary soft tissue tumor and WDFA as a lung adenocarcinoma variant [4]. Patients with CBPB generally present with common symptoms of lung cancer and larger-diameter tumors [19]. WDFA often manifests radiographically as peripheral asymptomatic nodules with mixed solid and cystic components [20]. The biological characteristics of PPB are unique, and the tumors often undergo a transition from cystic to solid depending on subtype and disease progression; type I is associated with a better prognosis and type III has the worst prognosis [21]. The 5-year survival for CBPB has been reported as about 15%, versus 62% for PPB and about 75% for WDFA [2,5,11,20]. It is worth noting that the 5-year survival rates of the different histological subtypes in our study did not show such a large disparity, despite the statistical differences (Fig. 2b), and the mortality gap among the three subtypes was even smaller for disease-specific death (Fig. 3d). Further multivariate analysis also indicated that histological subtype was not an independent predictor of prognosis in PB patients.
As our results suggested, the peak ages of onset of the three subtypes were quite different. PPB occurred almost exclusively in children aged 14 years or younger, which was why the OS and DSS of PPB patients were almost identical. By contrast, CBPB occurred at all ages but mostly in middle-aged and elderly patients, and the chance of dying from causes other than PB itself increased dramatically with age (Fig. 2b). Other-cause deaths accounted for as much as 30% of all deaths in patients aged 65 years or older at diagnosis (9 of 30). Undoubtedly, the difference in mortality between histological subtypes could be influenced by the uneven age distribution of patients across subtypes. The impact of PB on OS was much more potent in the older cohort. New molecular data indicate that PB patients share some overlapping molecular profiles, and DICER1 mutations are found to be important drivers and are likely to be associated with the later presentation of both CBPB and WDFA, as well as PPB [10]. The similarities and differences among the three subtypes of PB should be explored further.
Surgical excision is regarded as the optimal treatment for well-localized masses and regional disease, and 86.4% of the patients in our study underwent surgery. Consistent with previous reports [22,23], surgical treatment significantly prolonged the survival of PB patients, and the cumulative incidence of disease-specific death in operated patients was much lower than that in non-operated patients. Extended resection plus lymph node dissection is the preferred method of PB treatment, and the extent of the operation should be tailored to individual clinical features. Postoperative chemoradiotherapy can be given when lymph node metastasis or involvement of surrounding tissue is observed. However, only a few cases have been reported to be sensitive to radiotherapy [24]; Cutler et al. summarized the clinical outcome of 468 patients who underwent postoperative chemotherapy and found that the effect of single-agent or combined medication was not satisfactory, with a median survival of only 14.7 months [25]. Some scholars believe that survival in PB is mainly related to the extent of resection and that the prognosis of patients with complete resection is better. Others think that the effect of PB surgery largely depends on the differentiation of the mesenchymal components, and that patients with immature, undifferentiated and embryonic-like tumor tissue have a better prognosis. Because few cases have continuous long-term follow-up before and after surgery, the optimal therapeutic regimen for PB needs to be explored further.
In this registry-based cohort study, the SEER database, a large population-based resource, was used to provide valuable information on these low-incidence malignancies. To our knowledge, this study had the largest number of subjects among all studies conducted so far on the long-term clinical outcome of patients with malignant PB. Our results filled some previous gaps in the epidemiology of PB and added new evidence on currently controversial issues about the prognosis of PB patients. Another highlight of the study was that we used two different statistical methods to analyze the overall survival and disease-specific mortality of PB patients over various follow-up periods. Observational cohort studies often have multiple endpoint events, and if one event changes the probability of another or completely precludes its occurrence, the two are competing risks for each other. Standard Kaplan-Meier analyses estimate mortality from the event of interest without considering competing events. Treating failures from competing events as censored leads to an overestimation of the absolute risk of the event of interest and is less clinically relevant [26]. Therefore, we applied a competing-risk model, an analytical method designed for survival data with multiple potential outcomes, to calculate disease-specific mortality while retaining in the underlying risk set the patients who died of competing causes. As mentioned above, similar findings were observed in the competing-risk analysis, and the independent prognostic factors for PB identified by the two statistical models were the same, which further supported the robustness of our results.
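The overestimation described above can be made explicit with a short derivation. The notation is introduced here for illustration and does not appear in the original analysis: at each ordered event time $t_j$ there are $n_j$ patients at risk, $d_{1j}$ deaths from the cause of interest and $d_{2j}$ deaths from competing causes; $\hat S$ is the all-cause Kaplan-Meier estimate, and $\hat S_1$ is the Kaplan-Meier estimate obtained when competing deaths are treated as censored.

```latex
\begin{align*}
\hat S(t_{j-1})   &= \prod_{k<j}\left(1-\frac{d_{1k}+d_{2k}}{n_k}\right), &
\hat S_1(t_{j-1}) &= \prod_{k<j}\left(1-\frac{d_{1k}}{n_k}\right), \\[4pt]
\widehat{\mathrm{CIF}}_1(t) &= \sum_{t_j\le t}\hat S(t_{j-1})\,\frac{d_{1j}}{n_j}, &
1-\hat S_1(t) &= \sum_{t_j\le t}\hat S_1(t_{j-1})\,\frac{d_{1j}}{n_j}.
\end{align*}
% Each factor of \hat S_1 is at least as large as the matching factor of \hat S, so
\[
\hat S_1(t_{j-1}) \ge \hat S(t_{j-1})
\quad\Longrightarrow\quad
1-\hat S_1(t) \;\ge\; \widehat{\mathrm{CIF}}_1(t).
\]
```

The two quantities coincide only when no competing deaths occur before $t$; the more competing deaths there are, as in the oldest age group, the larger the gap becomes.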
Our study had several limitations. First, this was a retrospective study based on administrative information from the SEER database. Therefore, clinical variables such as tumor morphology, chemoradiotherapy information, complications and medication use were lacking. In addition, details of the surgical procedures and the preoperative TNM classification were not available. Second, as with any other retrospective study, we could not exclude the possibility of residual or unmeasured confounding. Third, although SEER is designed to approximate the national distribution of cancer characteristics by collecting cancer incidence data from population-based cancer registries in the USA, it is derived from 18 states and covers only 34.6% of the U.S. population, which may lead to over- or under-representation of certain hospital types and limit its generalizability to other populations. Another limitation of this study was its small sample size, a consequence of the low incidence of PB, which reduced the precision of the estimates. Nevertheless, the strengths of this study were the rigor of the statistical analyses and the long follow-up time, which partially increased the statistical power.

Conclusion
In conclusion, older age, biphasic tumors (CBPB), initial presence of metastasis (distant stage) and not receiving surgery were closely associated with an unfavorable prognosis of PB. The independent predictors of both all-cause death and disease-specific death in PB patients were age stratification, clinical stage and surgical treatment. In summary, our study filled some previous gaps in PB epidemiology, provided new evidence on currently controversial issues about the prognosis of this rare lung cancer, and may help guide prognosis estimation for PB patients.