Pilot study on developing a decision support tool for guiding re-administration of chemotherapeutic agent after a serious adverse drug reaction
© Loke et al; licensee BioMed Central Ltd. 2011
- Received: 8 April 2011
- Accepted: 28 July 2011
- Published: 28 July 2011
Currently, there are no standard guidelines for recommending re-administration of a chemotherapeutic drug to a patient after a serious adverse drug reaction (ADR) incident. The decision on whether to rechallenge the patient is based on the experience of the clinician and is highly subjective. Thus the aim of this study is to develop a decision support tool to assist clinicians in this decision making process.
The inclusion criteria for patients in this study were: (1) had chemotherapy at National Cancer Centre Singapore between 2004 and 2009, (2) suffered from serious ADRs, and (3) were rechallenged. A total of 46 patients fulfilled the inclusion criteria. A genetic algorithm attribute selection method was used to identify clinical predictors of patients' rechallenge status. A Naïve Bayes model was then developed using 35 patients and externally validated using 11 patients.
Eight patient attributes (age, chemotherapeutic drug, albumin level, red blood cell level, platelet level, abnormal white blood cell level, abnormal alkaline phosphatase level and abnormal alanine aminotransferase level) were identified as clinical predictors of patients' rechallenge status. The Naïve Bayes model had an AUC of 0.767 and was found to be useful for assisting clinical decision making after clinicians had identified a group of patients for rechallenge. A platform independent version and an online version of the model are available to facilitate independent validation of the model.
Due to the limited size of the validation set, a more extensive validation of the model is necessary before it can be adopted for routine clinical use. Once validated, the model can be used to assist clinicians in deciding whether to rechallenge patients by determining if their initial assessment of rechallenge status of patients is accurate.
- Adverse Drug Reaction
- Clinical Predictor
- Attribute Subset
- Positive Rechallenge
- Spontaneous Adverse Drug Reaction
The chemotherapeutic drug class was identified as the most common class for adverse drug reactions (ADRs), accounting for 21.8% of 408 ADRs reported in an Indian hospital. Patients on chemotherapy may experience serious ADRs, which can be potentially fatal and may require costly interventions. Among 4075 women on chemotherapy, 10.5% had hospitalizations or emergency room visits for serious ADRs, resulting in an additional $1271 per person annually. However, re-introduction of chemotherapeutic agents may be required due to the lack of alternative treatments. Currently, there are no standard guidelines for recommending re-administration of the chemotherapeutic drug to the patient after a serious ADR incident. Thus, management decisions are based purely on the experience of the clinicians and are highly subjective. Although patients with mild ADRs can usually be re-administered the drug without much risk, there is still no consensus on whether patients with serious ADRs should be re-administered the drug. This is because there are currently no methods that can accurately identify patients who will have negative rechallenge (ADRs do not re-occur upon re-administration of the drug) or positive rechallenge (ADRs re-occur upon re-administration of the drug). Thus, it will be useful to have a method that can predict patients' rechallenge status in order to improve patient safety and assist in chemotherapy choices.
Data mining is the use of sophisticated data analysis tools to discover patterns and relationships in large data sets. It can build computational models from data sets by learning from past experiences. Data mining has been widely applied in diverse areas such as business, medical research and pharmacovigilance. For example, Nordyke et al. proposed the use of Naïve Bayes in the automated diagnosis of thyroid dysfunction. Hence, data mining methods could potentially be useful in analyzing an ADR database and identifying important clinical predictors of patients' rechallenge status.
The aim of this study is to develop a clinical decision support tool, using data mining methods, to assist in determining the appropriateness of re-introducing a chemotherapeutic agent after an association between the drug and the occurrence of a serious ADR in a patient has been confirmed. A Naïve Bayes method was used to analyze differences between the profiles of patients with negative rechallenge and those with positive rechallenge, and to develop a model to predict the patients' rechallenge status. Naïve Bayes was selected as it has been shown to produce relatively good classification performance and is straightforward to implement. A genetic algorithm (GA) attribute selection method was used to identify clinical predictors that could differentiate patients with negative rechallenge from those with positive rechallenge. GA can identify several combinations of clinical predictors that are equally good. This allows analysis of the results and selection of the most clinically relevant attributes as predictors.
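To make the Naïve Bayes idea concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier with add-one smoothing. It is not the implementation used in this study: the function names, the toy attribute names and the four toy cases are all hypothetical, and continuous predictors such as age or albumin level would need a different likelihood (for example, a Gaussian) that is not shown here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and per-class attribute values from the training cases."""
    class_counts = Counter(labels)
    # value_counts[(label, attribute)][value] -> number of training cases
    value_counts = defaultdict(Counter)
    attribute_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for attribute, value in row.items():
            value_counts[(label, attribute)][value] += 1
            attribute_values[attribute].add(value)
    return class_counts, value_counts, attribute_values


def predict_proba(model, row):
    """Return P(class | row) for every class, using add-one smoothing."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)  # class prior
        for attribute, value in row.items():
            seen = value_counts[(label, attribute)][value]
            n_values = len(attribute_values[attribute])
            # Attributes are treated as independent given the class.
            log_p += math.log((seen + 1) / (count + n_values))
        log_scores[label] = log_p
    # Normalise so that the class probabilities sum to 1.
    peak = max(log_scores.values())
    unnormalised = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(unnormalised.values())
    return {k: v / norm for k, v in unnormalised.items()}


# Hypothetical toy cases, not data from this study.
toy_rows = [
    {"elderly": "no", "abnormal_wbc": "no"},
    {"elderly": "no", "abnormal_wbc": "no"},
    {"elderly": "yes", "abnormal_wbc": "yes"},
    {"elderly": "yes", "abnormal_wbc": "no"},
]
toy_labels = ["negative", "negative", "positive", "positive"]
model = train_naive_bayes(toy_rows, toy_labels)
probs = predict_proba(model, {"elderly": "no", "abnormal_wbc": "no"})
```

In this toy example the probability of negative rechallenge works out to 9/11, because each weak per-attribute association is multiplied together with the class prior.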
A total of 854 patients who were treated at the National Cancer Centre Singapore (NCCS) during the period 2004-2009 and experienced ADRs from chemotherapeutic agents were identified using spontaneous ADR reporting forms. Among these patients, 81 experienced serious ADRs and 46 of them were rechallenged. These include 23 negative rechallenge cases and 23 positive rechallenge cases. Information about these 46 patients' demographics, relevant medical records and laboratory values was collected from the case notes and electronic medical systems.
The dataset was split into 3 sets: training, testing and validation. A total of 24 cases were randomly selected from the 35 cases that occurred between 2004 and 2008 to form the training set for developing the Naïve Bayes model. All 35 cases were used as the testing set to test the performance of the model during the GA attribute selection process. The 11 cases that occurred during 2009 were used as an independent validation set to validate the model. These 11 cases were not used during the development of the model or during attribute selection.
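The split described above can be sketched as follows. This is an illustration only: the record layout (`case_id`, `year`), the function name and the random seed are assumptions, and the placeholder records stand in for the real cases.

```python
import random


def split_cases(cases, seed=0, n_train=24):
    """Split cases into training, testing and validation sets by year."""
    development = [c for c in cases if 2004 <= c["year"] <= 2008]
    validation = [c for c in cases if c["year"] == 2009]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    training = rng.sample(development, n_train)
    testing = list(development)  # all 2004-2008 cases, including the training ones
    return training, testing, validation


# Hypothetical placeholder records: 35 cases from 2004-2008 and 11 from 2009.
cases = [{"case_id": i, "year": 2004 + i % 5} for i in range(35)]
cases += [{"case_id": 35 + i, "year": 2009} for i in range(11)]
training, testing, validation = split_cases(cases)
```

Splitting by calendar year rather than at random keeps the 2009 cases strictly outside model development, which is what makes them usable as an independent validation set.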
Patient's attributes that were collected in this study
- Drug allergy
- Cancer type
- Cancer malignancy
- Symptoms of the ADRs
- Hospitalization for prior
- ADRs affected organ systems
- Onset of ADRs
- Number of cycles
- Chemotherapeutic drug
- Chemotherapeutic drug class
- Number of doses
- Concurrent medications
- Dose reduction on rechallenge
- Rechallenge on same day
- White blood cell
- Red blood cell
- Serum creatinine
- Alkaline Phosphatase
- Alanine aminotransferase
- Aspartate aminotransferase
Discrete categories of attributes were derived from some of the continuous attributes. For example, age was expressed both as a continuous variable and as a discrete variable with 2 categories: elderly (65 years old and above) and non-elderly (less than 65 years old). The number of concurrent medications was also categorized into a binominal attribute to test for possible associations between polypharmacy and rechallenge status. The definition of "more than 5 concomitant drugs" for serious polypharmacy was adopted in this study. In addition to the presence of comorbidities, the number of comorbidities for each patient was also calculated. These comorbidities include hypertension, diabetes, hyperlipidemia, psoriasis, gastroesophageal reflux disease (GERD), asthma and other allergic disorders. Other additional attributes constructed include the class of chemotherapeutic agents, types and malignancy of cancer, and types of ADRs experienced.
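As a small illustration of the derived attributes described above, the sketch below applies the two cut-offs given in the text (65 years and above for elderly, more than 5 concomitant drugs for polypharmacy). The field names and the example patient are hypothetical.

```python
def derive_attributes(patient):
    """Add discrete attributes derived from the continuous ones."""
    derived = dict(patient)
    derived["elderly"] = patient["age"] >= 65  # 65 years old and above
    derived["polypharmacy"] = patient["n_concurrent_medications"] > 5
    derived["n_comorbidities"] = len(patient["comorbidities"])
    derived["has_comorbidity"] = derived["n_comorbidities"] > 0
    return derived


# Hypothetical patient record, not taken from the study data.
example = derive_attributes({
    "age": 67,
    "n_concurrent_medications": 5,
    "comorbidities": ["hypertension", "asthma"],
})
```

Note that a patient on exactly 5 concurrent medications is not counted as polypharmacy under the "more than 5" definition.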
Laboratory parameters used include full blood count (white blood cell, red blood cell, platelet, neutrophil, lymphocyte, monocyte, eosinophil and basophil levels), renal indices (serum creatinine) and liver indices (aspartate aminotransferase, alanine aminotransferase and alkaline phosphatase). These are routinely analyzed for all patients in NCCS and the laboratory panels were obtained from the most recent measurements before the reported ADR incident. Additional attributes representing the presence or absence of abnormal laboratory parameters were also constructed.
The total number of attributes used for data mining was 53. Detailed information on these 53 attributes is provided in Additional File 1: Appendices 1 to 3.
Genetic Algorithm Attribute Selection
The performance of the model was measured using AUC, which is frequently used to evaluate prediction models in the biomedical informatics field [10, 11]. In addition, the sensitivity and specificity of the model were calculated. Sensitivity refers to the proportion of patients with negative rechallenge who are predicted to have negative rechallenge. Specificity refers to the proportion of patients with positive rechallenge who are predicted to have positive rechallenge.
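The three measures can be written out as a short sketch. Following the definitions above, negative rechallenge is treated as the class of interest. The function names and example values are hypothetical, and the AUC is computed with the standard pairwise-ranking formulation, which may differ from the software used in the study.

```python
def sensitivity_specificity(actual, predicted):
    """Sensitivity and specificity with negative rechallenge as the class of interest."""
    pairs = list(zip(actual, predicted))
    true_neg_rechallenge = sum(a == "negative" and p == "negative" for a, p in pairs)
    true_pos_rechallenge = sum(a == "positive" and p == "positive" for a, p in pairs)
    n_negative = sum(a == "negative" for a, _ in pairs)
    n_positive = sum(a == "positive" for a, _ in pairs)
    sensitivity = true_neg_rechallenge / n_negative
    specificity = true_pos_rechallenge / n_positive
    return sensitivity, specificity


def auc(negative_case_scores, positive_case_scores):
    """Probability that a negative rechallenge case scores higher than a positive one.

    Ties between the two scores count as half a win.
    """
    wins = 0.0
    for s_neg in negative_case_scores:
        for s_pos in positive_case_scores:
            if s_neg > s_pos:
                wins += 1.0
            elif s_neg == s_pos:
                wins += 0.5
    return wins / (len(negative_case_scores) * len(positive_case_scores))


# Hypothetical example: a rule that predicts negative rechallenge for everyone.
actual = ["negative"] * 6 + ["positive"] * 5
predicted = ["negative"] * 11
sens, spec = sensitivity_specificity(actual, predicted)

# Hypothetical scores, where a higher score means negative rechallenge is more likely.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

A rule that predicts negative rechallenge for every patient reaches full sensitivity but zero specificity, which is why both measures are reported alongside the AUC.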
Descriptive statistics of clinical predictors in testing set (n = 35)

| Negative rechallenge (Mean +/- SD) | Positive rechallenge (Mean +/- SD) |
| --- | --- |
| 47.7 +/- 14.9 | 55.4 +/- 14.2 |
| 35.7 +/- 4.6 | 30.9 +/- 5.6 |
| 4.06 +/- 0.58 | 3.69 +/- 0.81 |
| 331.41 +/- 153.62 | 324.39 +/- 150.86 |

Categorical and binominal attributes (Negative rechallenge proportion, Positive rechallenge proportion)
- Abnormal WBC level*
- Abnormal ALT level*
- Abnormal AP level*
Naïve Bayes model
The Naïve Bayes model developed using the 35 cases that occurred between 2004 and 2008 and the selected clinical predictors had an AUC of 0.767 for the validation set.
A platform independent version and an online version of the model (PaDEL-Rechallenge) are available at http://padel.nus.edu.sg/software/padelrechallenge. This will facilitate independent validation of the model by clinicians.
It is important to note that the identified clinical predictors in this study were only found to be associated with rechallenge status. However, association does not imply causality. It is also essential to note that most of these were weak associations, but when considered together in the Naïve Bayes model, they were found to be useful for predicting patients' rechallenge status. Detailed discussion on the individual predictors can be found in Additional File 1: Appendix 4.
Since there are no similar studies that develop models for predicting patients' rechallenge status, we assess the performance of our model by making tentative comparisons with models developed in other biomedical fields. A Naïve Bayes and Radial Basis Function method for predicting implantation potentials of IVF embryos reported superior performance with an AUC of 0.712. Another model, which used an artificial neural network to differentiate between patients with and without prostate cancer, had an AUC range of 0.77 to 0.81. Thus, our Naïve Bayes model with an AUC of 0.767 can be considered to have acceptable prediction performance.
It is important to note that the rechallenge status of those patients who were not rechallenged will never be known. Since our model was not trained using this group of patients, it is not justifiable to replace clinical judgement with our model for predicting the rechallenge status for all patients. Instead, a more suitable application for our model will be to assist in subsequent clinical decision making after clinicians had identified a group of patients who are likely to have negative rechallenge.
The 13 serious ADR cases that occurred in 2009 will be used to illustrate the potential usefulness of our model for this type of application. Initial clinical judgement identified 11 cases as potentially negative rechallenge cases. Out of these, 6 were negative rechallenge cases and 5 were positive rechallenge cases. Thus, initial clinical judgement had a sensitivity of 100% and a specificity of 0%. Our Naïve Bayes model can be used to improve the prediction accuracy of the clinicians by providing a score for each case. A threshold for the score can be set, and patients with scores above or below this threshold will be predicted by the model as potential negative or positive rechallenge cases respectively. Clinicians can choose different thresholds for the score according to their treatment objectives for the patients. For example, a low threshold is more appropriate for patients who are undergoing curative treatment so that they are not deprived of a useful drug treatment. This is a significant issue in chemotherapeutic treatment because there are limited choices of effective chemotherapeutic drugs available to the patients. Thus, the benefits from using the first line drugs usually outweigh the potential risks caused by any serious ADRs. The sensitivity and specificity of our model using 0.01 as the threshold are 100% and 20% respectively. Conversely, a high threshold would be more useful for patients undergoing palliative chemotherapeutic treatment, as the key priority is to prevent them from experiencing unnecessary serious ADRs caused by rechallenge. A threshold of 0.8 would result in a sensitivity of 67% and a specificity of 80% for our model.
Despite analyzing 6 years' worth of data, the size of the dataset used to develop and validate our model is rather small. This is due to the limited number of ADR cases reported and collected at NCCS. Thus, this study is only a pilot study and there is a need for further validation of the accuracy and reproducibility of this model. A study is currently ongoing to validate the model using 2010 to 2014 data. In addition, a visual aid will be added to improve the interpretability of the results by clinicians. Additional discussion on other limitations of this study can be found in Additional File 1: Appendix 4.
Compared to clinical judgement, the Naïve Bayes model developed in this study is able to guide rechallenge decisions more consistently and thus allows clinicians to make more confident decisions on whether to rechallenge a patient with the same drug after a prior serious ADR. The proposed use of the model is to assist in subsequent clinical decision making after the clinicians have identified a group of patients for rechallenge. Thus, the model serves as a subsequent check to reinforce or discourage the initial decision for rechallenge. This would help to reduce serious ADRs and improve patients' treatment options.
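The effect of the threshold can be illustrated with a short sketch. The scores below are made up and are not the study's data: they were chosen only so that the arithmetic mirrors the reported percentages for 6 negative and 5 positive rechallenge cases (6/6 and 1/5 at a threshold of 0.01, 4/6 and 4/5 at a threshold of 0.8). The function names are likewise hypothetical.

```python
def classify(scores, threshold):
    """Scores above the threshold are predicted as negative rechallenge."""
    return ["negative" if s > threshold else "positive" for s in scores]


def evaluate(scores, actual, threshold):
    """Sensitivity and specificity of the thresholded scores."""
    predicted = classify(scores, threshold)
    pairs = list(zip(actual, predicted))
    sensitivity = sum(a == p == "negative" for a, p in pairs) / actual.count("negative")
    specificity = sum(a == p == "positive" for a, p in pairs) / actual.count("positive")
    return sensitivity, specificity


# Made-up scores for illustration only (NOT the study's data).
actual = ["negative"] * 6 + ["positive"] * 5
scores = [0.95, 0.90, 0.85, 0.82, 0.50, 0.30,   # negative rechallenge cases
          0.90, 0.60, 0.40, 0.20, 0.005]        # positive rechallenge cases

low = evaluate(scores, actual, threshold=0.01)   # favours offering rechallenge
high = evaluate(scores, actual, threshold=0.8)   # favours avoiding repeat ADRs
```

Raising the threshold moves cases from the predicted negative group to the predicted positive group, so specificity rises while sensitivity falls; the choice between the two settings depends on the treatment objective.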
This work was supported by the National University of Singapore (NUS) start-up grant (R-148-000-105-133) and Ministry of Education Academic Research Fund Tier 1 grant (R-148-000-136-112) to Chun Wei Yap, and NUS Department of Pharmacy Final Year Project grant (R-148-000-003-001) to Pei Yi Loke.
1. Jose J, Rao PG: Pattern of adverse drug reactions notified by spontaneous reporting in an Indian tertiary care teaching hospital. Pharmacol Res. 2006, 54 (3): 226-233. 10.1016/j.phrs.2006.05.003.
2. Hassett MJ, O'Malley AJ, Pakes JR, Newhouse JP, Earle CC: Frequency and cost of chemotherapy-related serious adverse effects in a population sample of women with breast cancer. J Natl Cancer Inst. 2006, 98 (16): 1108-1117. 10.1093/jnci/djj305.
3. Pieter A, Dolf Z: Introduction to Data Mining and Knowledge Discovery. 1999, New York: Two Crows Corporation, 3
4. Nordyke RA, Kulikowski CA, Kulikowski CW: A comparison of methods for the automated diagnosis of thyroid dysfunction. Comput Biomed Res. 1971, 4 (4): 374-389. 10.1016/0010-4809(71)90022-X.
5. Huang Y, McCullagh P, Black N, Harper R: Feature selection and classification model construction on type 2 diabetic patients' data. Artif Intell Med. 2007, 41 (3): 251-262. 10.1016/j.artmed.2007.07.002.
6. Population and Vital Statistics. [http://www.moh.gov.sg/mohcorp/statistics.aspx?id=5524]
7. Veehof L, Stewart R, Haaijer-Ruskamp F, Jong BM: The development of polypharmacy. A longitudinal study. Fam Pract. 2000, 17 (3): 261-267. 10.1093/fampra/17.3.261.
8. Langley P, Sage S: Induction of selected Bayesian classifiers. Proceedings of the Conference on Uncertainty in Artificial Intelligence. 1994, 399-406.
9. Minsky M: Steps toward artificial intelligence. Trans Instit Radio Engineers. 1961, 49: 8-30.
10. Fawcett T: An introduction to ROC analysis. Pattern Recogn Lett. 2006, 27 (8): 861-874. 10.1016/j.patrec.2005.10.010.
11. Lasko TA, Bhagwat JG, Zou KH, Ohno-Machado L: The use of receiver operating characteristic curves in biomedical informatics. J Biomed Inform. 2005, 38 (5): 404-415. 10.1016/j.jbi.2005.02.008.
12. Asli U, Ayse B, Ciray HN, Bahceci M: ROC Based Evaluation and Comparison of Classifiers for IVF Implantation Prediction. 2009, Springer Berlin Heidelberg, 27:
13. Stephan C, Cammann H, Deger S, Schrader M, Meyer HA, Miller K, Lein M, Jung K: Benign prostatic hyperplasia-associated free prostate-specific antigen improves detection of prostate cancer in an artificial neural network. Urology. 2009, 74 (4): 873-877. 10.1016/j.urology.2009.02.054.
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2407/11/319/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.