Table 3 Performance of double reading with and without AI by region for the ten-year cohort

From: Multi-vendor evaluation of artificial intelligence as an independent reader for double reading in breast cancer screening on 275,900 mammograms

| Performance metric | Historical double reading | Double reading (DR) with AI | Test outcome for DR with AIᵃ |
| --- | --- | --- | --- |
| **Regional breakdown for UKᵇ** | | | |
| Recall rate | 3.8% (3.8, 3.9) | 3.8% (3.7, 3.9) | Non-inferior; 0.99 (0.98, 1.01) |
| CDR (3Y) | 8.8 per 1000 (8.6, 9.0) | 8.6 per 1000 (8.4, 8.7) | Non-inferior; 0.98 (0.97, 0.98) |
| Sensitivity (3Y) | 86.1% (84.5, 87.6) | 83.9% (82.3, 85.6) | Non-inferior; 0.98 (0.97, 0.98) |
| Specificity | 97.1% (96.9, 97.2) | 97.1% (97.0, 97.3) | Superior; 1.00 (1.00, 1.00) |
| PPV (3Y) | 24.5% (24.0, 25.0) | 24.0% (23.5, 24.4) | Non-inferior; 0.98 (0.97, 0.99) |
| **Regional breakdown for HUᶜ** | | | |
| Recall rate | 9.2% (9.0, 9.4) | 7.8% (7.7, 8.0) | Superior; 0.85 (0.85, 0.86) |
| CDR (2Y) | 7.7 per 1000 (7.1, 8.3) | 7.6 per 1000 (7.0, 8.2) | Non-inferior; 0.99 (0.98, 0.99) |
| Sensitivity (2Y) | 88.8% (86.2, 90.9) | 87.5% (84.9, 89.7) | Non-inferior; 0.99 (0.98, 0.99) |
| Specificity | 94.7% (94.3, 95.0) | 95.8% (95.4, 96.1) | Superior; 1.01 (1.01, 1.01) |
| PPV (2Y) | 8.3% (8.1, 8.6) | 9.6% (9.4, 9.9) | Superior; 1.16 (1.14, 1.16) |

  1. 95% confidence intervals are presented in parentheses.
  2. ᵃ The ratio of proportions and its 95% confidence interval, used to assess non-inferiority and superiority, are presented.
  3. ᵇ The positive pool for CDR, sensitivity, and PPV includes screen-detected positives and three-year ICs, which are relevant for the UK.
  4. ᶜ The positive pool for CDR, sensitivity, and PPV includes screen-detected positives and two-year ICs only, which are relevant for HU.
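
The ratio of proportions in the test outcome column can be approximated from the point estimates in the table. The following is a minimal sketch, assuming the ratio is simply the DR-with-AI value divided by the historical double reading value; the names `TABLE3_POINT_ESTIMATES` and `ratio_of_proportions` are hypothetical and introduced here for illustration only. Because the inputs are the rounded values shown in the table, a recomputed ratio can differ from the published one in the last digit, and the confidence intervals cannot be reproduced without the underlying case-level data.

```python
# Rounded point estimates copied from Table 3 as (historical DR, DR with AI).
# Percentages are given in percent; CDR is given per 1000 screens.
TABLE3_POINT_ESTIMATES = {
    ("UK", "Recall rate"): (3.8, 3.8),
    ("UK", "CDR (3Y)"): (8.8, 8.6),
    ("UK", "Sensitivity (3Y)"): (86.1, 83.9),
    ("UK", "Specificity"): (97.1, 97.1),
    ("UK", "PPV (3Y)"): (24.5, 24.0),
    ("HU", "Recall rate"): (9.2, 7.8),
    ("HU", "CDR (2Y)"): (7.7, 7.6),
    ("HU", "Sensitivity (2Y)"): (88.8, 87.5),
    ("HU", "Specificity"): (94.7, 95.8),
    ("HU", "PPV (2Y)"): (8.3, 9.6),
}


def ratio_of_proportions(historical: float, with_ai: float) -> float:
    """Return DR-with-AI divided by historical DR (assumed definition)."""
    if historical <= 0:
        raise ValueError("historical value must be positive")
    return with_ai / historical


# Print the recomputed ratio for every row, rounded to two decimals.
for (region, metric), (historical, with_ai) in TABLE3_POINT_ESTIMATES.items():
    ratio = ratio_of_proportions(historical, with_ai)
    print(f"{region:2s}  {metric:17s}  ratio = {ratio:.2f}")
```

For recall rate a ratio below 1 favours DR with AI (fewer recalls), whereas for CDR, sensitivity, specificity, and PPV a ratio above 1 favours it. The non-inferiority margins themselves are not stated in this table, so the sketch does not attempt to reproduce the test outcomes.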