tapas.attacks.base_classes.TrainableThresholdAttack

class tapas.attacks.base_classes.TrainableThresholdAttack(criterion: tuple)

Bases: tapas.attacks.base_classes.Attack

Generic class for attacks that compute a score and compare it with a threshold chosen according to some (fairly generic) criterion. Many attacks fall under this pattern.

__init__(criterion: tuple)

Initialise this attack with a given threshold-selection criterion.

The criterion is a tuple with at least one entry. The first entry, criterion[0], is the target criterion (accuracy/tp/fp/threshold). Further entries give additional information on the target.

Acceptable criteria are:
  • (“accuracy”,): choose the threshold that yields maximum accuracy.

  • (“tp”, float): choose the threshold that yields a true positive (“tp”) rate as close as possible to the given value.

  • (“fp”, float): similarly, for the false positive rate (“fp”).

  • (“threshold”, float): manually specify the threshold.

For “tp”, “fp”, and “threshold”, you may also include a third entry (int): the label to treat as the positive value. If it is not provided, True or 1 (depending on the label type) is assumed to be the positive label.
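The tuple layout described above can be illustrated with a small stand-alone sketch. The helper parse_criterion is hypothetical and is not part of tapas; it only mirrors the documented rules (first entry names the target, second entry is the rate or threshold, optional third entry is the positive label) and assumes 1 as the default positive label, which also compares equal to True in Python.

```python
def parse_criterion(criterion):
    """Hypothetical helper (not a tapas function): split a criterion tuple
    into (name, value, positive_label) following the documented layout."""
    if not isinstance(criterion, tuple) or len(criterion) == 0:
        raise ValueError("criterion must be a tuple with at least one entry")
    name = criterion[0]
    if name == "accuracy":
        # No further entries are needed for the accuracy target.
        return name, None, None
    if name in ("tp", "fp", "threshold"):
        if len(criterion) < 2:
            raise ValueError(f"criterion '{name}' needs a float as second entry")
        value = float(criterion[1])
        # Assumed default: 1, which equals True for boolean labels.
        positive_label = criterion[2] if len(criterion) > 2 else 1
        return name, value, positive_label
    raise ValueError(f"unknown criterion: {name!r}")


print(parse_criterion(("accuracy",)))
print(parse_criterion(("fp", 0.05)))
print(parse_criterion(("tp", 0.9, 0)))
```

Under these assumptions, ("fp", 0.05) targets a false positive rate of 0.05 with the default positive label, while ("tp", 0.9, 0) targets a true positive rate of 0.9 with label 0 treated as positive.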

Methods

__init__(criterion)

Initialise this attack with a given threshold-selection criterion.

attack(datasets)

Make a prediction for each dataset.

attack_score(datasets)

Perform the attack on each dataset in a list, but return a confidence score (specifically for classification tasks).

train(threat_model[, num_samples])

Train this attack: train the score, then choose a threshold meeting the target criterion.

Attributes

label

A label to describe this attack in reports.

attack(datasets: list[Dataset])

Make a prediction for each dataset.

This computes attack_score for each dataset, then decides that the target user is in the training dataset if and only if the score is higher than self._threshold.

Parameters

datasets (list[Dataset]) – A list of synthetic datasets.

Returns

predictions

Return type

np.array of booleans.
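The decision rule described for attack can be sketched with NumPy. This is an illustration only, not the tapas implementation: predict_membership is a hypothetical function, the scores are made-up numbers standing in for the output of attack_score, and the comparison is assumed to be strict since the text says "higher than".

```python
import numpy as np


def predict_membership(scores, threshold):
    """Hypothetical sketch: predict positive when the score exceeds the threshold."""
    scores = np.asarray(scores, dtype=float)
    # Strict comparison: a score equal to the threshold is predicted negative.
    return scores > threshold


# Made-up scores for three synthetic datasets.
predictions = predict_membership([0.2, 0.75, 0.5], threshold=0.5)
print(predictions)
```

The result is a boolean array with one entry per dataset, matching the documented return type.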

abstract attack_score(datasets: list[Dataset])

Perform the attack on each dataset in a list, but return a confidence score (specifically for classification tasks).

property label

A label to describe this attack in reports.

train(threat_model: tapas.threat_models.attacker_knowledge.LabelInferenceThreatModel, num_samples: Optional[int] = None, **attack_score_kwargs)

Train this attack: train the score, then choose a threshold meeting the target criterion.

Parameters
  • threat_model (LabelInferenceThreatModel) – The threat model from which to generate labelled samples.

  • num_samples (int, default None) – Number of training samples to generate for selecting the threshold. If None, use all pre-generated training samples (only do this if you have already generated datasets).

  • **attack_score_kwargs – (Optional) Additional keyword arguments, passed to _train_attack_score.
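To make the threshold-selection step concrete, here is one possible way to choose a threshold from labelled scores for each documented criterion. It is a sketch under stated assumptions, not the method tapas uses: select_threshold is a hypothetical name, candidate thresholds are taken from the observed scores (plus one value below all of them), predictions use a strict "score > threshold" rule, and both positive and negative labels are assumed to be present.

```python
import numpy as np


def select_threshold(scores, labels, criterion):
    """Hypothetical helper: choose a threshold on `scores` meeting `criterion`."""
    scores = np.asarray(scores, dtype=float)
    name = criterion[0]
    if name == "threshold":
        # The threshold is given manually; nothing to fit.
        return float(criterion[1])
    positive_label = criterion[2] if len(criterion) > 2 else 1
    positives = np.asarray(labels) == positive_label
    # Candidates: one value below every score (everything predicted positive),
    # followed by each distinct observed score in increasing order.
    candidates = np.concatenate(([scores.min() - 1.0], np.unique(scores)))
    # predictions[i, j] is True when score j is strictly above candidate i.
    predictions = scores[None, :] > candidates[:, None]
    if name == "accuracy":
        accuracy = (predictions == positives[None, :]).mean(axis=1)
        return float(candidates[np.argmax(accuracy)])
    if name == "tp":
        rate = predictions[:, positives].mean(axis=1)
    elif name == "fp":
        rate = predictions[:, ~positives].mean(axis=1)
    else:
        raise ValueError(f"unknown criterion: {name!r}")
    # Pick the candidate whose rate is closest to the requested target.
    return float(candidates[np.argmin(np.abs(rate - criterion[1]))])


# Made-up training scores and labels.
train_scores = [0.1, 0.2, 0.7, 0.9]
train_labels = [0, 0, 1, 1]
print(select_threshold(train_scores, train_labels, ("accuracy",)))
print(select_threshold(train_scores, train_labels, ("fp", 0.5)))
```

When several candidates tie, this sketch returns the lowest one, because np.argmax and np.argmin return the first matching index; a real implementation may break ties differently.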