tapas.attacks.synthinference.SyntheticPredictorAttack

class tapas.attacks.synthinference.SyntheticPredictorAttack(estimator: ClassifierMixin, criterion: tuple, label=None)

Bases: tapas.attacks.base_classes.TrainableThresholdAttack

Attribute Inference Attack that first trains a classifier C on the synthetic data to predict the sensitive value v of a record x, then uses C(target_record) as the prediction for the target record.

This is a common baseline, linked to CAP (Correct Attribution Probability). Whether it constitutes a privacy violation is controversial, since correlations in the data could reveal the sensitive attribute even if the user does not contribute their data. TAPAS circumvents this issue by randomising the sensitive attribute independently of all others. As such, this attack mostly aims at detecting overfitted models.

This attack is implemented exclusively for tabular data.
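The idea described above (fit a predictor on the synthetic data, then query it on the target record) can be illustrated with a minimal, dependency-free sketch. This is not the TAPAS implementation: the frequency table below is a toy stand-in for the scikit-learn estimator the class expects, and the names fit_predictor, score, and the example columns are hypothetical.

```python
from collections import Counter, defaultdict


def fit_predictor(synthetic_rows, sensitive_key):
    """Fit a toy classifier on synthetic rows (hypothetical helper).

    For each combination of non-sensitive values, count how often each
    sensitive value appears in the synthetic data.
    """
    counts = defaultdict(Counter)
    overall = Counter()

    def features_of(row):
        # Use every attribute except the sensitive one as features.
        return tuple(sorted((k, v) for k, v in row.items() if k != sensitive_key))

    for row in synthetic_rows:
        counts[features_of(row)][row[sensitive_key]] += 1
        overall[row[sensitive_key]] += 1

    def score(record, positive_value):
        """Estimated probability that the record's sensitive value is positive_value."""
        # Fall back to the marginal distribution for unseen feature combinations.
        seen = counts.get(features_of(record), overall)
        total = sum(seen.values())
        return seen[positive_value] / total if total else 0.5

    return score


# Illustrative data only: a tiny synthetic table and a target record
# whose sensitive value ("diagnosis") is unknown to the attacker.
synthetic = [
    {"age": "30-39", "zip": "A", "diagnosis": 1},
    {"age": "30-39", "zip": "A", "diagnosis": 1},
    {"age": "30-39", "zip": "A", "diagnosis": 0},
    {"age": "40-49", "zip": "B", "diagnosis": 0},
]
target = {"age": "30-39", "zip": "A"}

score = fit_predictor(synthetic, "diagnosis")
target_score = score(target, positive_value=1)
unseen_score = score({"age": "50-59", "zip": "C"}, positive_value=1)
```

In this sketch the score for the target is the fraction of matching synthetic rows that carry the positive value, and an unseen record falls back to the overall frequency of that value.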

__init__(estimator: ClassifierMixin, criterion: tuple, label=None)

Initialise this attack with a given threshold-selection criterion.

The criterion is a tuple with at least one entry. The first entry, criterion[0], is the target criterion (accuracy/tp/fp/threshold). Further entries give additional information on the target.

Acceptable criteria are:
  • (“accuracy”,): choose the threshold that yields maximum accuracy.

  • (“tp”, float): choose the threshold that yields a true positive (“tp”) rate as close as possible to the given value.

  • (“fp”, float): similarly, for the false positive rate (“fp”).

  • (“threshold”, float): manually specify the threshold.

For “tp”, “fp”, and “threshold”, you may also include a third entry (int), which is the label to consider as the positive value. If this is not provided, then True or 1 (depending on label type) is assumed to be the positive label.
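How a criterion tuple could translate into a threshold is sketched below. This is an assumption-laden illustration, not the library's code: select_threshold is a hypothetical helper, labels are assumed to be booleans (the optional third entry is ignored), and a record is predicted positive when its score is strictly higher than the threshold.

```python
def select_threshold(scores, labels, criterion):
    """Pick a decision threshold from scores and boolean labels (hypothetical helper)."""
    kind = criterion[0]
    if kind == "threshold":
        # Manually specified threshold: nothing to search.
        return criterion[1]

    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    # Candidate thresholds: one below every score (predict all positive),
    # then each distinct score in increasing order.
    candidates = [min(scores) - 1.0] + sorted(set(scores))

    def counts(t):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and not y)
        return tp, fp

    if kind == "accuracy":
        def accuracy(t):
            tp, fp = counts(t)
            return (tp + (negatives - fp)) / len(labels)
        # Ties are resolved in favour of the lowest candidate.
        return max(candidates, key=accuracy)
    if kind == "tp":
        target = criterion[1]
        return min(candidates, key=lambda t: abs(counts(t)[0] / max(positives, 1) - target))
    if kind == "fp":
        target = criterion[1]
        return min(candidates, key=lambda t: abs(counts(t)[1] / max(negatives, 1) - target))
    raise ValueError(f"unknown criterion: {kind!r}")


# Illustrative training scores and ground-truth labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [False, False, True, True]

acc_threshold = select_threshold(scores, labels, ("accuracy",))
tp_threshold = select_threshold(scores, labels, ("tp", 0.5))
fp_threshold = select_threshold(scores, labels, ("fp", 0.0))
manual_threshold = select_threshold(scores, labels, ("threshold", 0.5))
```

Searching only over the observed scores is enough here, because the predictions can change only when the threshold crosses one of them.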

Methods

__init__(estimator, criterion[, label])

Initialise this attack with a given threshold-selection criterion.

attack(datasets)

Make a prediction for each dataset.

attack_score(datasets)

Perform the attack on each dataset in a list, but return a confidence score (specifically for classification tasks).

train(threat_model[, num_samples])

Train this attack: train the score, then choose a threshold meeting the target criterion.

Attributes

label

A label to describe this attack in reports.

attack(datasets: list[Dataset])

Make a prediction for each dataset.

This computes attack_score for each dataset, then decides that the target user is in the training dataset if and only if the score is higher than self._threshold.

Parameters

datasets (list[Dataset]) – A list of synthetic datasets.

Returns

predictions

Return type

np.array of booleans.
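The decision rule described for attack can be summarised in a few lines. This is a simplified sketch, assuming the scores have already been computed (one per synthetic dataset); decide is a hypothetical name, and it returns a plain list rather than the np.array the method returns.

```python
def decide(scores, threshold):
    """Return one boolean per score: True only if the score is strictly higher than the threshold."""
    return [s > threshold for s in scores]


# Illustrative values: one confidence score per synthetic dataset.
predictions = decide([0.2, 0.7, 0.5], threshold=0.5)
```

A score exactly equal to the threshold yields False in this sketch, matching the "higher than" wording above.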

attack_score(datasets: list[Dataset])

Perform the attack on each dataset in a list, but return a confidence score (specifically for classification tasks).

property label

A label to describe this attack in reports.

train(threat_model: tapas.threat_models.attacker_knowledge.LabelInferenceThreatModel, num_samples: Optional[int] = None, **attack_score_kwargs)

Train this attack: train the score, then choose a threshold meeting the target criterion.

Parameters
  • threat_model (LabelInferenceThreatModel) – The threat model from which to generate labelled samples.

  • num_samples (int, default None) – Number of training samples to generate to select the threshold. If None, use all pre-generated training samples (only do this if you have already generated datasets).

  • **attack_score_kwargs (optional) – Additional keyword arguments, passed to _train_attack_score.