Table 2 Attribute description for each data set. Numerical attributes are marked with "*", non-binary categorical attributes were converted into binary representations through dummy coding, and the classification label of each data set is shown in the last row.

From: Differentially private distributed logistic regression using private and public data

Data set 1

Hormonal therapy: 1. Yes, 2. No
Age*
Menopausal status: 1. Premenopausal, 2. Postmenopausal
Tumor size*
Tumor grade* (Levels I, II, III)
Number of positive nodes*
Recurrence-free survival time* (in days)
Progesterone receptor*
Estrogen receptor*
Status indicator: Pos: Alive, Neg: Died

Data set 2

Specimen: 1. Blood, 2. Urine, 3. Sputum, 4. CSF
Specific days*
Day of the week for collection: 1. Weekday, 2. Weekend
Age*
Day of the week for the final result: 1. Weekday, 2. Weekend
Gender: 1. Male, 2. Female
Insurance: 1. Medicare, 2. Medicaid, 3. Commercial, 4. Other
Race: 1. White, 2. Black, 3. Asian, 4. Hispanic, 5. Unknown/declined
Potential error: Pos: Not a potential follow-up error, Neg: A potential follow-up error

Data set 3

Race (25 categories)
Age*
Marital status (6 categories)
Histology*
Number of nodes examined*
Number of positive nodes*
Grade*
Tumor size*
ER status (4 categories)
Vital status recode: Pos: Alive, Neg: Died
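
The caption states that non-binary categorical attributes were converted into binary representations through dummy coding. The following is a minimal sketch of what that conversion could look like for the Insurance attribute of Data set 2. The helper `dummy_code`, its `drop_first` flag, and the sample records are hypothetical and only illustrative; the table does not say whether a reference level was dropped or which software was used.

```python
# Hypothetical helper, not taken from the paper: dummy-code one categorical attribute.
def dummy_code(values, categories, drop_first=False):
    """Return (column names, rows), with one 0/1 column per retained category."""
    # Optionally drop the first category so it acts as the reference level.
    cols = categories[1:] if drop_first else list(categories)
    rows = []
    for v in values:
        if v not in categories:
            raise ValueError(f"unknown category: {v!r}")
        rows.append([1 if v == c else 0 for c in cols])
    return cols, rows


# Insurance levels as listed for Data set 2.
INSURANCE = ["Medicare", "Medicaid", "Commercial", "Other"]

# Made-up records, for illustration only.
sample = ["Medicaid", "Other", "Medicare"]

columns, encoded = dummy_code(sample, INSURANCE)
for value, row in zip(sample, encoded):
    print(value, dict(zip(columns, row)))
```

With `drop_first=True` the first level (here Medicare) becomes an all-zero row, which avoids perfectly collinear columns when the encoded attributes are fed to a logistic regression with an intercept. Whether the authors did this is an assumption, not something the table confirms.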