Multivariable regression analysis of febrile neutropenia occurrence in early breast cancer patients receiving chemotherapy assessing patient-related, chemotherapy-related and genetic risk factors

Background: Febrile neutropenia (FN) is common in breast cancer patients undergoing chemotherapy. Risk factors for FN have been reported, but risk models that include genetic variability have yet to be described. This study aimed to evaluate the predictive value of patient-related, chemotherapy-related, and genetic risk factors.
Methods: Data from consecutive breast cancer patients receiving chemotherapy with 4–6 cycles of fluorouracil, epirubicin, and cyclophosphamide (FEC) or three cycles of FEC and docetaxel were retrospectively recorded. Multivariable logistic regression was carried out to assess risk of FN during FEC chemotherapy cycles.
Results: Overall, 166 (16.7%) out of 994 patients developed FN. Significant risk factors for FN in any cycle and the first cycle were lower platelet count (OR = 0.78 [0.65; 0.93]) and haemoglobin (OR = 0.81 [0.67; 0.98]) and homozygous carriers of the rs4148350 variant T-allele (OR = 6.7 [1.04; 43.17]) in MRP1. Other significant factors for FN in any cycle were higher alanine aminotransferase (OR = 1.02 [1.01; 1.03]), carriers of the rs246221 variant C-allele (OR = 2.0 [1.03; 3.86]) in MRP1 and the rs351855 variant C-allele (OR = 2.48 [1.13; 5.44]) in FGFR4. Lower height (OR = 0.62 [0.41; 0.92]) increased risk of FN in the first cycle.
Conclusions: Both established clinical risk factors and genetic factors predicted FN in breast cancer patients. Prediction was improved by adding genetic information but overall remained limited. Internal validity was satisfactory. Further independent validation is required to confirm these findings.


Background
Chemotherapy-induced neutropenia (CIN) and febrile neutropenia (FN) are serious and frequent complications in breast cancer patients receiving adjuvant chemotherapy, and they result in hospitalisations [1][2][3] and chemotherapy dose reductions or delays that impact on treatment outcome and short-term mortality [4]. Adjuvant fluorouracil, epirubicin, and cyclophosphamide (FEC) chemotherapy has an FN risk of between 9% and 14% (low-intermediate risk) [5].
Antibacterial or antifungal prophylaxis has recently been recommended for neutropenic patients expected to have a prolonged low neutrophil count or with other risk factors that favour complications [6]. Prophylaxis with granulocyte colony-stimulating factor (GCSF) in patients at high risk of FN (>20%) is recommended in international guidelines [5,7,8]. For chemotherapy regimens with an intermediate FN risk (10-20%), the European Organisation for Research and Treatment of Cancer (EORTC) GCSF guideline recommends that patient risk factors should also be considered to determine individual risk of FN [5] and the likely benefit of prophylactic GCSF. Therefore, it is important to identify patients at high risk of FN before the initiation of chemotherapy to provide them with appropriate prophylactic measures.
Risk models for the occurrence of CIN [9] and FN [10] in patients with breast cancer have been published. The risk factors identified include older age, lower weight, higher planned dose of chemotherapy, higher number of planned chemotherapy cycles, vascular comorbidity, lower baseline white blood cell count (WBC), lower platelet and neutrophil count, and higher baseline bilirubin. Prior chemotherapy, abnormal liver or renal function, low WBC, higher chemotherapy intensity, and planned delivery were identified as risk factors for neutropenic complications in a prospective US study of patients with different types of cancer [11]. Poor performance status and low lymphocyte and neutrophil counts were risk factors in a European study of solid tumour patients [12], as were tumour stage and number of comorbidities in elderly patients with solid tumours [13].
These risk models of CIN or FN that included patient- or chemotherapy-related factors were reported to be predictive. However, more refined models are necessary to achieve satisfactory performance in independent patient populations. Such models should include existing and emerging types of data, including stable genetic factors that are easily measurable, objective, and potentially independent from the inherent variability of clinical decision-making. Several studies have assessed the impact of genetic factors on haematological toxicity, but these studies were small in size or limited to only a few candidate genetic factors [14][15][16].
The objective of this study was to develop risk models for the occurrence of FN in breast cancer patients receiving FEC chemotherapy in any cycle and the first cycle based on a large set of patient-related, chemotherapy-related, and genetic characteristics.

Study population
We retrospectively studied early (i.e., no distant metastases; Stage I-IIIC) breast cancer patients treated between 2000 and 2010 at the Leuven Multidisciplinary Breast Cancer Center of the University Hospitals Leuven, Belgium. Consecutive patients were included if they received neoadjuvant or adjuvant combination chemotherapy consisting of either three cycles of FEC followed by three cycles of docetaxel or four to six cycles of FEC. Patient-related factors (genetics and tumour characteristics) and chemotherapy-related factors were retrospectively recorded in a clinical database. The haematological toxicities included were: FN (defined as an absolute neutrophil count (ANC) < 0.5 × 10⁹/L and a body temperature ≥ 38°C according to the Infectious Diseases Society of America), prolonged grade 4 neutropenia (≥ 5 days), deep neutropenia (< 100/μl), grade 3/4 thrombocytopenia, and grade 3/4 anaemia during FEC chemotherapy cycles. Haematological toxicities that occurred during chemotherapy cycles with docetaxel were not included in the model. Grade 3/4 non-haematological toxicities were also recorded (toxicity grade based on the Common Terminology Criteria for Adverse Events 3.0 [17]). During most of the study period, primary prevention with GCSF was reimbursed, and therefore used, only in selected patients aged 65 or over. Similarly, secondary use of GCSF was only reimbursed and used if patients had FN in the previous cycle or if deep neutropenia occurred for at least five days (although the latter was not systematically measured during the study period).
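The FN and deep neutropenia definitions above are simple threshold rules. The sketch below encodes them as two small Python predicates, purely for illustration; the function and parameter names are hypothetical and are not taken from the study database.

```python
def is_febrile_neutropenia(anc_10e9_per_l: float, temperature_c: float) -> bool:
    """True when both criteria quoted in the text are met:
    ANC < 0.5 x 10^9/L and body temperature >= 38 degrees C."""
    return anc_10e9_per_l < 0.5 and temperature_c >= 38.0


def is_deep_neutropenia(anc_per_ul: float) -> bool:
    """True when the ANC is below 100 cells per microlitre."""
    return anc_per_ul < 100.0


# Hypothetical measurements, not patient data from the study.
print(is_febrile_neutropenia(0.4, 38.2))  # both criteria met
print(is_febrile_neutropenia(0.4, 37.5))  # neutropenic but afebrile
```

Note that the two thresholds use different units in the text (10⁹/L for FN, cells/μl for deep neutropenia), which is why the two functions take differently named arguments.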
The study design and full analysis of single nucleotide polymorphisms (SNPs) have previously been described in detail [18]; however, in the previous analysis the association of SNPs with FN was only adjusted for age, growth factor use, BMI, and planned cycles of chemotherapy. Only those SNPs that have been reported to be associated with haematological toxicity or to play a role in the metabolism of FEC chemotherapy were included in the current study. Logistic regression was performed to describe the association of SNPs with haematological toxicity, adjusted for known predictors of FN risk such as age, growth factor use, and planned number of cycles of chemotherapy. The ethics committee of the University Hospitals Leuven approved the study and all patients included in the study had given written informed consent for collection of genetic samples and for further analyses using this material and associated data.

Endpoints and predictor variables
The primary endpoint of the study was FN in any cycle, and FN occurring in the first cycle (cycle 1) was the secondary endpoint. The following variables were considered as predictors of FN: planned doses of fluorouracil, epirubicin and cyclophosphamide (FC, 600 mg/m² until August 2004 and 500 mg/m² after this date; epirubicin 100 mg/m²), age at diagnosis, height, weight, body mass index (BMI), body surface area (BSA), chemotherapy setting (i.e. adjuvant or neoadjuvant), use of GCSF (information only available on primary or secondary use), planned cycles of FEC chemotherapy, selected SNPs [18], baseline WBC, ANC and platelet count, and other baseline laboratory parameters such as haemoglobin, bilirubin, alanine aminotransferase (ALT), aspartate aminotransferase (AST) and creatinine. Although information on the timing of and reasons for GCSF use was incomplete, its potential impact on the variables included in the final model was assessed in an exploratory analysis.

Statistical analysis
All analyses were performed using Stata/SE version 12.1 (StataCorp LP, College Station, TX, USA). All statistical tests were carried out two-sided at a 5% significance level and 95% confidence intervals (CIs) were obtained.

Descriptive and univariable analysis
Binary and categorical data were summarised using frequencies and percentages. Continuous data were reported using means and standard deviations. In the univariable analysis of SNPs, the impact of multiple testing was assessed by separately calculating the false discovery rate (FDR) for each endpoint [19]. Associations between the endpoints and binary or categorical variables were assessed using the chi-squared test or Fisher's exact test, as appropriate. Continuous variables and their associations with the endpoints were assessed using univariable logistic regression analysis. Variables were further assessed in multivariable logistic regression analysis if a trend was seen in the univariable analysis (p ≤ 0.25), as recommended [20]. Linear correlations between potential predictors were assessed by calculating Pearson's correlation coefficient and monotonic correlations were assessed using Spearman's rank correlation coefficient. Variables were regarded as being dependent if the correlation coefficient was ≥ 0.7 or the correlation p-value was ≤ 0.05.
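The text cites [19] for the FDR calculation without naming the procedure. As a minimal sketch, and assuming a Benjamini-Hochberg-style step-up adjustment (an assumption, not confirmed by the source), adjusted p-values for a set of univariable SNP tests could be obtained as follows. The p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical univariable p-values for four SNPs (not study results).
snp_p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(snp_p_values)
print(q_values)
```

Running the adjustment separately on the p-values for each endpoint mirrors the per-endpoint calculation described above.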

Multivariable analysis
Multivariable logistic regression analysis was used to assess the joint explanatory value of the candidate variables identified in univariable analysis; variables were included in the final multivariable models if their corresponding p-value was ≤ 0.05. Where simultaneous inclusion of dependent variables led to estimation problems (collinearity issues), the variable that explained more of the variability present in the endpoint was finally used. As patient-related and chemotherapy-related factors were already established as risk factors in several previous risk models, these variables were entered into the model first, ordered according to the p-value obtained in univariable analysis. SNPs were subsequently added. Interactions between variables were assessed. Model fit was assessed with the Hosmer-Lemeshow [21] goodness-of-fit test. Test characteristics such as specificity (proportion of negatives correctly identified as not having an event), sensitivity (proportion of positives correctly identified as having an event), positive predictive value (PPV, proportion of patients identified to have an event who had an event) and negative predictive value (NPV, proportion of patients identified not to have an event who did not have an event) were obtained. The predictive ability of the final models was assessed by calculating the area under the receiver operating characteristic (ROC; sensitivity over 1-specificity) curve.
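The area under the ROC curve can be read as the probability that a randomly chosen patient with FN receives a higher predicted probability than a randomly chosen patient without FN. The sketch below computes it directly from that definition in plain Python; it is an illustration with invented predictions, not the Stata routine used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical predicted probabilities and observed FN status (1 = FN).
predicted = [0.05, 0.10, 0.20, 0.30, 0.15, 0.40]
had_fn = [0, 0, 0, 1, 1, 1]
auc = roc_auc(predicted, had_fn)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of about a thousand; a rank-based formula would give the same value more efficiently.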
To test the internal validity of the final models, nonparametric bootstrapping was performed [22]. Bootstrap estimates of the 95% CIs of the multivariable models were obtained by resampling the data 200 times. The obtained 95% CI estimates of the bootstrap resampling were compared to the 95% CIs calculated by the multivariable logistic regression model.
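To illustrate the resampling idea, the sketch below draws 200 bootstrap samples and takes percentile limits for a single odds ratio. This is a deliberate simplification: the study bootstrapped the full multivariable logistic models, whereas the example uses one binary predictor, a 0.5 continuity correction, and invented counts, all of which are assumptions made for brevity.

```python
import random


def odds_ratio(records):
    """Odds ratio of the event for exposed vs unexposed patients.
    A 0.5 continuity correction keeps the ratio finite in sparse resamples."""
    a = b = c = d = 0.5
    for exposed, event in records:
        if exposed and event:
            a += 1
        elif exposed:
            b += 1
        elif event:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)


def bootstrap_percentile_ci(records, statistic, n_resamples=200, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval from resampling with replacement."""
    rng = random.Random(seed)
    n = len(records)
    estimates = sorted(
        statistic([records[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[round((alpha / 2) * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical cohort: (carrier, had_fn) pairs, not study data.
records = [(1, 1)] * 16 + [(1, 0)] * 24 + [(0, 1)] * 16 + [(0, 0)] * 144
point = odds_ratio(records)
ci_low, ci_high = bootstrap_percentile_ci(records, odds_ratio)
print(point, ci_low, ci_high)
```

Comparing the bootstrap interval with the model-based interval, as described above, gives an informal check on how sensitive the estimate is to the particular sample.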

Characteristics of the study group
Of 1,012 patients who received FEC chemotherapy between 2000 and 2010, 18 patients were excluded because they had received chemotherapy prior to FEC, which may have affected FN risk. The majority of the 994 eligible patients received adjuvant chemotherapy (n = 874, 88.0%); the remainder received neoadjuvant chemotherapy. Most patients received three cycles of combination chemotherapy with FEC followed by three cycles of docetaxel (n = 507, 51.0%) or six cycles of FEC (n = 405, 40.7%) (Table 1). The most common type of breast cancer was invasive ductal carcinoma (n = 823, 82.8%) and patients mostly had grade 2 (n = 334, 34.1%) or grade 3 (n = 606, 61.9%) tumours. FN occurred in any cycle in 166 (16.7%) patients, of whom 107 (10.8%) had FN in the first cycle of FEC chemotherapy. The most common haematological toxicity was prolonged grade 4 neutropenia (n = 345, 34.7%). Other haematological toxicities such as grade 3/4 thrombocytopenia and severe bleeding, and grade 3/4 non-haematological toxicities such as diarrhoea, mucositis, and neuropathy were rare (n < 10, <1%). Primary prophylactic GCSF (before a CIN or FN event occurred) was given to 15 (1.5%) patients and the majority received no GCSF (n = 654, 65.8%). Additional toxicities and other relevant characteristics such as planned number of chemotherapy cycles, tumour stage, and subtype are presented in Table 1. The list of SNPs included in the analyses is shown in Table 2.

Univariable analysis
All candidate predictors (p ≤ 0.25) for FN in any cycle and in cycle 1 are shown in Table 3. Patient-related factors (genetics, laboratory parameters, etc.) and chemotherapy-related factors fulfilled the inclusion criteria for the multivariable analysis. The number of planned FEC cycles, WBC, ANC, platelet count, and haemoglobin were significantly associated with FN in any cycle and cycle 1 (p ≤ 0.05). SNPs significantly associated with FN in any cycle and cycle 1 were the rs4148350, rs45511401, and rs246221 variants in MRP1 (multidrug resistance-associated protein 1). The FDR for associated SNPs was 0.47 for any cycle FN and 0.33 for cycle 1 FN. There were no correlations between SNPs included in the final model and patient-related or chemotherapy-related factors.

Risk factors of febrile neutropenia in any cycle
Multivariable regression identified the following factors to be significantly associated with a higher occurrence of FN: lower platelet count and lower haemoglobin at baseline [...]. For the area under the ROC curve shown in Figure 1a, a value of 1 would denote perfect discrimination and 0.5 discrimination no better than chance. Overall, 764 of 910 patients (84.0%) were correctly classified by the logistic regression model at a predicted probability cut-off of 0.5; six out of 150 having FN and 758 out of 760 not having FN. Sensitivity was very low (4.0%) compared to specificity (99.7%). NPV and PPV were similar; the proportion of patients correctly identified not to have FN was 84.0% and the proportion of patients correctly identified to have FN was 75.0%. When the optimal cut-off of the model was used (i.e., predicted probability of 0.1609, where sensitivity and specificity were almost identical at 61.3%), the model correctly classified 61.2% of the patients and PPV and NPV were 23.8% and 88.9%, respectively. Internal validity of the FN in any cycle model was satisfactory; the 95% CIs of the bootstrap resampling were similar to the 95% CIs calculated by the multivariable logistic regression model.
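The test characteristics at the 0.5 cut-off follow directly from the counts reported above (six of 150 patients with FN and 758 of 760 patients without FN correctly classified). The short sketch below recomputes them; the function name is illustrative only.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard test characteristics from the four cells of a confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts reported for the any-cycle model at a predicted probability cut-off of 0.5.
any_cycle = classification_metrics(tp=6, fn=150 - 6, tn=758, fp=760 - 758)
for name, value in any_cycle.items():
    print(f"{name}: {100 * value:.1f}%")
```

The recomputed values match the reported sensitivity (4.0%), specificity (99.7%), PPV (75.0%), and NPV (84.0%), and give an overall proportion of correct classifications of 84.0%.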

Risk factors of febrile neutropenia in cycle 1
Lower platelet count and haemoglobin at baseline and lower patient height were significantly associated with a higher risk of FN in cycle 1 (Table 4). The SNP found to be significantly associated with FN in cycle 1 was rs4148350 in MRP1. For rs4148350, homozygous carriers of the T-allele had a higher risk of FN in cycle 1 than homozygous or heterozygous carriers of the G-allele (FN risk of 40% versus 10% or 18%). We found a statistically significant interaction between haemoglobin and height that increased the protective effect of higher haemoglobin and increased height but did not affect the other main effects of the model.
The area under the ROC curve was 0.664 (CI 0.633-0.694), as presented in Figure 1b. At a probability cut-off of 0.5, one out of 98 patients was correctly classified as having FN in cycle 1 and all 839 patients without FN in cycle 1 were correctly classified as not having FN (overall, 89.7% correct classifications). Sensitivity was very low (1.0%); specificity was 100%, PPV was 100%, and NPV was 89.6%. At the optimal probability cut-off for the model (0.1041), 61.5% of the patients were correctly classified, sensitivity and specificity were 61%, PPV was 15.7%, and NPV was 93.1%. The 95% CIs of the bootstrap resampling were similar to the 95% CIs calculated by the multivariable logistic regression model, which supports the internal validity of the FN in the first cycle model.
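The "optimal" cut-off is described as the predicted probability at which sensitivity and specificity are almost identical. A minimal sketch of one way to find such a point is shown below. It assumes that a patient is classified as FN when the predicted probability is at or above the cut-off and that only observed predicted probabilities are tried as candidates; the data are invented.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are classified as FN."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_cutoff(scores, labels):
    """Observed score at which |sensitivity - specificity| is smallest."""
    def gap(cutoff):
        sensitivity, specificity = sens_spec(scores, labels, cutoff)
        return abs(sensitivity - specificity)
    return min(sorted(set(scores)), key=gap)


# Hypothetical predicted probabilities and observed FN status (1 = FN).
predicted = [0.02, 0.05, 0.08, 0.15, 0.06, 0.12, 0.20, 0.30]
had_fn = [0, 0, 0, 0, 1, 1, 1, 1]
cutoff = balanced_cutoff(predicted, had_fn)
print(cutoff, sens_spec(predicted, had_fn, cutoff))
print(sens_spec(predicted, had_fn, 0.5))  # the fixed 0.5 cut-off, for contrast
```

In this toy example, as in the study, the fixed 0.5 cut-off classifies almost everyone as not having FN because predicted probabilities rarely reach 0.5 when the event is uncommon, whereas the balanced cut-off trades specificity for sensitivity.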

Discussion
In this population of early breast cancer patients seen in routine clinical practice at a tertiary referral centre, we identified a set of genetic factors, in addition to patient-related and chemotherapy-related factors, that predict occurrence of FN in any cycle or the first cycle of chemotherapy. Significant predictors of a higher risk of FN in any cycle and in cycle one were: lower baseline platelet count, lower baseline haemoglobin, and homozygous carriers of the rs4148350 variant T-allele in MRP1 [...]. Although the predictive ability of the models was improved by including genetic factors, the overall predictive ability remained poor. Genetic effects were stable and FN occurrence was very high in patients with specific SNP allele variants. The observed effects of lower baseline platelet count and haemoglobin are consistent with previous reports. Baseline platelet count has been shown to differ between cancer patients with mild and severe haematological toxicity [16], and low haemoglobin has been mentioned as a possible risk factor for FN [27] and survival [28]. In the model of FN occurrence in any cycle, higher baseline ALT, but not baseline bilirubin [9,29], was significantly associated with FN. Both measures are indicators of liver function and since the liver detoxifies drugs like epirubicin [30], impaired liver function may be an important risk factor for FN occurrence in patients receiving chemotherapy with epirubicin. A predictive role for WBC or ANC in CIN and FN occurrence in cancer patients receiving chemotherapy has been described in other studies [9][10][11][12], but could not be confirmed in our models. Most SNPs previously associated with FN occurrence [18] and reported to be involved in anthracycline-induced cardiotoxicity [31][32][33] were confirmed in the multivariable analysis. The SNP rs45511401 was not included in the multivariable regression model as it was highly correlated with rs4148350, and the latter variant explained the model variability slightly better.
There were no correlations between SNPs included in the final model and patient- or chemotherapy-related factors.
International guidelines [5,7,8] and the literature [9,12] report age, planned dose intensity, and planned number of chemotherapy cycles to be important risk factors for CIN and FN during chemotherapy. These risk factors could not be confirmed in our models. Patient-specific approaches to clinical management were not recorded in detail in this study and might therefore have masked the effect of age on FN occurrence. In addition, the exact cycle of FN occurrence was not available after the first cycle. Factors previously reported to protect against CIN and FN in any cycle of chemotherapy, such as dose reductions, dose delays, or growth factor use before an event occurred, could not be investigated since the details, reasons, and timing information were not available and only 15 out of 994 patients received primary prophylaxis with GCSF, mainly due to reimbursement criteria.
The apparent predictive ability, i.e., the predictive ability assessed in the 'training' dataset used to develop the models, was lower than in previously published models of CIN or FN occurrence in other cancers [9,11,34]. In these models, sensitivity and specificity at the optimal predicted probability cut-off were about 70% or higher, but in this study they remained below 70%. As commonly seen in models of FN occurrence, the NPV (≥ 90%) was much higher than the PPV because FN incidence is often around 20%; this implies an NPV of around 80% for simply assuming that FN does not occur in any patient. The areas under the ROC curves were relatively low but significantly higher than 0.5, the value indicating no predictive ability. In other words, the models allowed partial discrimination of patients at low or high risk of FN. Including genetic risk factors improved the models but absolute predictive ability remained rather low. The effects of the SNPs were stable and FN occurrence was very high in patients with specific, sometimes rare, SNP allele variants. In terms of clinical implications, genetic testing might help to identify a small proportion of patients at very high risk of FN who can be targeted with prophylactic measures. For the majority of patients, the current models do not reliably identify patients who will develop FN, but they do delineate patients who are unlikely to develop FN. This is clinically relevant since patients at low risk of FN probably do not need primary GCSF prophylaxis or nadir assessment, while the high-risk group is unpredictable and might need more extensive preventive measures or follow-up.
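The baseline argument above (an incidence of about 20% implies an NPV of about 80% when no patient is predicted to develop FN) can be made concrete with the incidences observed in this cohort. The sketch below is a back-of-the-envelope check; it uses the whole-cohort counts, whereas the fitted models were evaluated on slightly smaller subsets, so the comparison is approximate.

```python
def trivial_npv(incidence):
    """NPV obtained by predicting 'no FN' for every patient."""
    return 1.0 - incidence


# Incidences reported for the cohort of 994 eligible patients.
any_cycle_baseline = trivial_npv(166 / 994)
cycle1_baseline = trivial_npv(107 / 994)
print(f"any cycle: {100 * any_cycle_baseline:.1f}%")
print(f"cycle 1:   {100 * cycle1_baseline:.1f}%")
```

Against these baselines of roughly 83% and 89%, the NPVs reported at the optimal cut-offs (88.9% for any cycle and 93.1% for cycle 1) represent a modest gain, which is in line with the limited discrimination described in the text.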
The performance of any model tends to be highest in the training dataset. The results obtained with bootstrap resampling supported the internal validity of the FN in any cycle and the FN in first cycle models. The predictive ability of the models has yet to be tested in an entirely independent population, where model performance is usually lower. Before risk models are put to clinical use, true external validation is essential [35,36]. Another limitation of this study is the retrospective design; no detailed information was available on patient management in clinical practice, which is known to influence the risk of FN occurrence, and the reasons and timing of dose reductions and dose delays were not available. FN occurrence was not assessed according to chemotherapy cycle beyond the first cycle. GCSF was only administered to 15 patients before an event occurred due to stringent reimbursement criteria. Hence, the impact of GCSF on FN occurrence was difficult to assess.
To the best of our knowledge, this is the first study of risk of FN in the first and any cycle of chemotherapy in patients with early breast cancer that combined a set of patient- and chemotherapy-related factors with a large set of SNPs. Further validation studies are needed to confirm our findings, which should ideally be prospectively designed, sufficiently powered, and measure all possible predictors of FN occurrence reported in the literature. Approaches to clinical management that are measurable and known to influence the risk of FN occurrence, such as dose modifications or growth factor use before an FN event occurred, should be included. Information on SNPs should be available for as many patients as possible and the frequencies of possible genotypes of one SNP should be similar. Validated genetic factors have the potential to become reliable predictors of FN occurrence. The specific SNPs that were assessed in this study are independent from clinical decision-making and therefore less likely to be confounded by clinical practice.

Figure 1 Receiver operating characteristic curve for febrile neutropenia occurrence in a) any cycle and b) cycle 1 of chemotherapy. ROC, receiver operating characteristic. *The bisecting line indicates a predictive ability that is no better than chance (area under the ROC curve = 0.5).

Conclusions
We have identified a set of chemotherapy-related, patient-related, and genetic risk factors that predict occurrence of FN in the first and any cycle of chemotherapy in a large cohort of early breast cancer patients. Genetic effects in the models improved the predictive ability, but the overall predictive ability of the models remained poor. FN occurrence was very high in patients with specific SNP allele variants. Up-front genetic testing might be helpful to identify a limited group of very high-risk patients. Further independent validation is required to develop risk models that include genetic predictors of FN occurrence and can be used to personalise care.