Our meta-analysis showed that fine-needle aspiration biopsy (FNAB) has a high level of diagnostic accuracy. In our first classification (in which C1 results were excluded, as in most studies), the sensitivity was 92.7% and the specificity was 94.8%. The SROC curve showed that the maximum joint sensitivity and specificity (i.e. the Q-value) was 0.948, and the area under the curve (AUC) was 0.986, indicating an excellent level of overall accuracy.
The diagnostic odds ratio (DOR) is a single indicator of test accuracy that combines sensitivity and specificity into one number. The DOR of a test is the ratio of the odds of a positive test result in patients with the disease to the odds of a positive test result in patients without the disease. The DOR ranges from 0 to infinity, with higher values indicating better discriminatory test performance (i.e. higher accuracy). A DOR of 1.0 indicates that a test does not discriminate between patients with the disorder and those without it. In the present meta-analysis, the mean DOR was 429.73, which also indicates a high level of overall accuracy.
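The definition above can be written out as a short calculation. The following is a minimal sketch for a single 2×2 table, not the pooling procedure used in this meta-analysis; the function names and the example values are illustrative assumptions. Plugging in the pooled sensitivity and specificity is not expected to reproduce the pooled DOR, which is estimated across studies rather than from one table.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Odds of a positive test in diseased patients divided by
    the odds of a positive test in non-diseased patients."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    odds_positive_diseased = sensitivity / (1.0 - sensitivity)
    odds_positive_non_diseased = (1.0 - specificity) / specificity
    return odds_positive_diseased / odds_positive_non_diseased


def dor_from_counts(tp: int, fp: int, fn: int, tn: int) -> float:
    """Same quantity computed directly from the cells of a 2x2 table."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when fp or fn is zero (no continuity correction here)")
    return (tp * tn) / (fp * fn)


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the scale of the DOR.
    print(diagnostic_odds_ratio(0.5, 0.5))   # uninformative test
    print(diagnostic_odds_ratio(0.9, 0.9))   # accurate test
    print(dor_from_counts(tp=90, fp=10, fn=10, tn=90))
```

An uninformative test (sensitivity and specificity both 0.5) gives a DOR of 1, in line with the interpretation given above.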
Since the SROC curve and the DOR are not easy to interpret and use in clinical practice, and since likelihood ratios are considered to be more clinically meaningful [78, 79], we also presented the positive likelihood ratio (PLR) and negative likelihood ratio (NLR) as measures of diagnostic accuracy. Likelihood ratios of > 10 or < 0.1 generate large and often conclusive shifts from pre-test to post-test probability (indicating high accuracy). In our first classification, the PLR of 25.72 suggests that patients with cancer of any grade are approximately 26 times more likely to have a positive FNAB result than patients with a benign breast lesion. A likelihood ratio of this size would be considered high enough to begin surgical treatment or other therapy. On the other hand, the NLR was 0.08 in our meta-analysis: a negative FNAB result is about 0.08 times as likely in a patient with breast carcinoma as in a patient with a benign lesion, so a negative result markedly lowers the probability of breast carcinoma, with the exact post-test probability depending on the pre-test probability.
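To make the shift from pre-test to post-test probability concrete, the sketch below applies the standard odds form of Bayes' theorem to the PLR and NLR reported above. The pre-test probabilities are hypothetical values chosen for illustration and are not estimates from the included studies; the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    post-test odds = pre-test odds * likelihood ratio
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


PLR = 25.72  # positive likelihood ratio reported in the text
NLR = 0.08   # negative likelihood ratio reported in the text

if __name__ == "__main__":
    # Hypothetical pre-test probabilities of breast cancer.
    for pre in (0.10, 0.30, 0.50):
        after_positive = post_test_probability(pre, PLR)
        after_negative = post_test_probability(pre, NLR)
        print(f"pre-test {pre:.0%}: positive FNAB -> {after_positive:.1%}, "
              f"negative FNAB -> {after_negative:.1%}")
```

The same NLR yields different post-test probabilities at different pre-test probabilities, which is why a likelihood ratio by itself does not fix the probability of disease after a negative result.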
It should be emphasized that we followed the approaches of Burapa Kanchanabat and Etta D. Pisano for evaluating the diagnostic performance of FNAB: (1) unsatisfactory samples were excluded; (2) unsatisfactory samples were classified as positive. In our first classification, unsatisfactory samples (C1) were excluded, as in most studies. In our second classification, inadequate cytological material was interpreted as "positive", because treating an unsatisfactory result as a negative outcome is a poor policy that can harm patients and delay the diagnosis of breast cancer. To minimize the chance of a missed diagnosis of breast cancer, certain discrepancies between FNAB and open biopsy (e.g. C3, C4 or C5 on FNAB with atypical hyperplasia or cancer of any grade on open biopsy) were counted as agreements that needed further management. The reclassified agreement rate is therefore a clinically relevant and pragmatic estimate of the concordance between FNAB and actual disease status.
Breast cancer was present in a proportion of the inadequate FNAB specimens. Since unsatisfactory samples (C1) play an important role in the diagnostic accuracy of FNAB, we also assessed the pooled sensitivity and specificity of FNAB under the second classification (unsatisfactory samples regarded as positive), as well as the underestimation rate of unsatisfactory samples. The pooled sensitivity (92.7%) was similar to the sensitivity (92.0%) reported above for our first classification (unsatisfactory samples excluded), whereas the pooled specificity (76.8%) was lower than the specificity (94.8%) reported above. This change may be due to the underestimation rate of inadequate samples, which we also assessed in our study. The pooled underestimation rate of unsatisfactory samples was 27.5%, which was higher than the value (8.5%) reported by H.C. Lee. However, we included more recent studies and more patients than H.C. Lee did. Our underestimation rate indicates that 27.5% of patients whose samples are inadequate for cytological analysis will prove to have breast cancer of some grade. This rate is not low enough to rule out breast cancer, so in most of these cases additional management, such as core biopsy or a surgical procedure, will be necessary.
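The effect of the two classifications on sensitivity and specificity can be illustrated with a single hypothetical 2×2 table. The sketch below is an illustration under assumed counts, not a reanalysis of the included studies: all counts, the class name `FnabCounts` and the function names are invented for the example, with the unsatisfactory samples chosen so that the underestimation rate equals 27.5%.

```python
from dataclasses import dataclass


@dataclass
class FnabCounts:
    tp: int            # positive FNAB, malignant on reference standard
    fn: int            # negative FNAB, malignant on reference standard
    fp: int            # positive FNAB, benign on reference standard
    tn: int            # negative FNAB, benign on reference standard
    c1_malignant: int  # unsatisfactory (C1) FNAB, malignant on reference standard
    c1_benign: int     # unsatisfactory (C1) FNAB, benign on reference standard


def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    return tp / (tp + fn), tn / (tn + fp)


def c1_excluded(c: FnabCounts) -> tuple[float, float]:
    """First classification: unsatisfactory samples are left out."""
    return sensitivity_specificity(c.tp, c.fn, c.fp, c.tn)


def c1_as_positive(c: FnabCounts) -> tuple[float, float]:
    """Second classification: unsatisfactory samples count as positive."""
    return sensitivity_specificity(c.tp + c.c1_malignant, c.fn,
                                   c.fp + c.c1_benign, c.tn)


def underestimation_rate(c: FnabCounts) -> float:
    """Share of unsatisfactory samples that prove to be malignant."""
    return c.c1_malignant / (c.c1_malignant + c.c1_benign)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    counts = FnabCounts(tp=90, fn=10, fp=5, tn=95, c1_malignant=11, c1_benign=29)
    print("C1 excluded:   sens=%.3f spec=%.3f" % c1_excluded(counts))
    print("C1 as positive: sens=%.3f spec=%.3f" % c1_as_positive(counts))
    print("underestimation rate: %.3f" % underestimation_rate(counts))
```

In this illustration, counting C1 as positive adds the benign unsatisfactory samples to the false positives, so specificity falls, while the malignant unsatisfactory samples are added to the true positives, so sensitivity does not decrease. This matches the direction of the change described above.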
On the whole, the quality of the included studies was above the median level according to QUADAS. Many studies did not satisfy item 11 (reference standard review bias), item 13 (uninterpretable test results) or item 14 (withdrawals): most did not state whether results were interpreted blind, did not report uninterpretable test results, or did not explain withdrawals. These biases could affect the estimated accuracy of FNAB.
An exploration of the reasons for heterogeneity, rather than the computation of a single summary measure, is an important goal of meta-analysis. In our meta-analysis, QUADAS scores were used in the meta-regression analysis to assess the effect of study quality on the relative DOR (RDOR). We did not observe that studies of relatively higher quality (QUADAS score of ≥10) had better test performance than those of lower quality.
Although we found significant heterogeneity for sensitivity, specificity, PLR, NLR and DOR among the studies analyzed, the meta-regression showed that three study characteristics across the 46 studies (needle size, study location and prospective/retrospective design) did not reach statistical significance, indicating that these characteristics did not substantially affect diagnostic accuracy. On the other hand, two characteristics, the guidance system (ultrasound or stereotactic guidance vs no imaging guidance) and the reference standard (histopathology only or not), affected diagnostic accuracy to a large extent. This may be due to the following reasons. First, FNAB without imaging guidance is not suitable for patients with ill-defined masses, because the aspiration cannot be performed at the exact position and the cytological result may not represent the true nature of the mass. In other words, imaging guidance allows the breast lesion to be localized precisely before FNAB is performed, so FNAB with imaging guidance can achieve favorable diagnostic accuracy. Second, some included studies combined two reference standards: surgical biopsy for suspicious lesions and imaging or clinical follow-up for benign cytological results in low-risk patients. Moreover, the duration of follow-up differed between studies (ranging from 6 to 24 months). As a result, misclassification may occur more easily when two different reference standards are used than when histopathology is the only reference standard.
Apart from having a comprehensive search strategy, our study assessed the diagnostic accuracy of FNAB using multiple measures: sensitivity, specificity, PLR, NLR, DOR, the SROC curve and the AUC. In addition, we assessed the influence of unsatisfactory samples on the diagnostic accuracy of FNAB. Heterogeneity and potential publication bias were also explored in accordance with published guidelines. However, our systematic review has some limitations. Including only English- and Chinese-language studies and omitting conference abstracts and letters to the editor might have led to publication bias.