Table 1 List of structured and extracted variables*

From: Predicting invasive breast cancer versus DCIS in different age groups

Structured

• Age
• Family history (of breast cancer)†
• Personal history (of breast cancer)
• Prior surgery‡
• Palpable lump
• Breast density
• BI-RADS assessment
• Principal mammography findingΨ

Variables extracted using NLP

• Calcification distribution
• Calcification morphology¥
• Mass margins
• Mass shape
• Architectural distortion
• Focal asymmetric density
• Indication for exam if diagnostic
• Mass size

  1. *These variables were used as input to the stepwise regression to produce the models for older and younger women.
  2. †Defined as family history of breast cancer (Minor = one or more relatives more distant than first-degree relatives; Strong = one first-degree relative with unilateral postmenopausal breast cancer; Very Strong = more than one first-degree relative with unilateral postmenopausal breast cancer, one first-degree relative with bilateral breast cancer, or one first-degree relative with premenopausal breast cancer).
  3. ‡Defined as prior breast surgery of any kind.
  4. ΨPrincipal mammographic finding: architectural distortion, calcifications, asymmetry (one view), focal asymmetry (two views), developing asymmetry, mass, single dilated duct, both calcifications and something else.
  5. ¥To overcome low-frequency categories, features are grouped into high probability of malignancy, intermediate, and typically benign categories, as described in the Breast Imaging Reporting and Data System (BI-RADS) lexicon [18].
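
The family history categories in footnote † amount to a small decision rule, which can be sketched in Python as follows. This is only an illustration, not the study's code: the `AffectedRelative` record, its field names, and the function name are hypothetical, and two points the footnote leaves open are assumptions here, namely that first-degree relatives take precedence over more distant ones and that a woman with no affected relatives gets no category (`None`).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AffectedRelative:
    """One relative with breast cancer (hypothetical record layout)."""
    first_degree: bool    # True for a first-degree relative
    bilateral: bool       # True if the cancer was bilateral
    premenopausal: bool   # True if the cancer was premenopausal


def family_history_category(relatives: List[AffectedRelative]) -> Optional[str]:
    """Map affected relatives to the footnote's Minor/Strong/Very Strong labels."""
    first = [r for r in relatives if r.first_degree]
    # Very Strong: any first-degree relative with bilateral or premenopausal
    # disease, or more than one affected first-degree relative.
    if any(r.bilateral or r.premenopausal for r in first) or len(first) > 1:
        return "Very Strong"
    # Strong: exactly one first-degree relative, unilateral and postmenopausal.
    if len(first) == 1:
        return "Strong"
    # Minor: only relatives more distant than first-degree.
    if relatives:
        return "Minor"
    # Assumption: no affected relatives means no family history category.
    return None
```

For example, a single affected aunt would map to "Minor", while a mother with premenopausal breast cancer would map to "Very Strong".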