tapas.attacks.closest_distance.ClosestDistanceAIA
- class tapas.attacks.closest_distance.ClosestDistanceAIA(distance: tapas.attacks.distances.DistanceMetric = <tapas.attacks.distances.HammingDistance object>, criterion: tuple = 'accuracy', label: typing.Optional[str] = None)
Bases: tapas.attacks.closest_distance.ClosestDistanceMIA
Attack that finds the closest record to the target record, and uses the value of the sensitive attribute of that closest record as the answer to the attribute-inference attack.
The implementation is slightly more flexible than that: for each candidate value v, it returns the score -distance(r(v), D) / sum_{v'} distance(r(v'), D), where r(v) is the target record with its sensitive attribute set to v and D is the synthetic dataset.
This is a TrainableThresholdAttack, so it can automatically select the threshold that .attack applies to the output of .attack_score. To make the attack choose the v minimising distance(r(v), D), pass criterion = ("threshold", -0.5) to the constructor of this object.
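To see why a threshold of -0.5 amounts to "choose the closest value" when there are two candidate values, the scoring rule can be sketched in plain Python. This is only an illustration: `normalised_scores` and `predict_positive` are hypothetical helper names, not part of tapas, and the distances are assumed to have been computed already.

```python
def normalised_scores(distances):
    """Map per-value distances d(v) to scores -d(v) / sum_v' d(v').

    Assumes at least one distance is non-zero (no division-by-zero guard).
    """
    total = sum(distances)
    return [-d / total for d in distances]


def predict_positive(distances, threshold=-0.5):
    """Binary case: predict the second value iff its score exceeds the threshold."""
    return normalised_scores(distances)[1] > threshold


# The second value is closer (distance 1 versus 3), so it is predicted.
scores = normalised_scores([3, 1])
closer_is_second = predict_positive([3, 1])
closer_is_first = predict_positive([1, 3])
```

With two values, the score of the second value is above -0.5 exactly when its distance is strictly smaller than that of the first value, so in this sketch a tie resolves to the first value.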
- __init__(distance: tapas.attacks.distances.DistanceMetric = <tapas.attacks.distances.HammingDistance object>, criterion: tuple = 'accuracy', label: typing.Optional[str] = None)
Create the attack with chosen parameters.
- Parameters
distance (DistanceMetric) – Distance to use between records for the attack.
criterion (tuple) – Criterion to select the threshold (see TrainableThresholdAttack for details).
label (str, optional) – A label to describe this attack in reports.
Methods
__init__([distance, criterion, label]) – Create the attack with chosen parameters.
attack(datasets) – Make a prediction for each dataset.
attack_score(datasets) – Compute the decision score for this attack.
train(threat_model[, num_samples]) – Train this attack: train the score, then choose a threshold meeting the target criterion.
Attributes
label – A label to describe this attack in reports.
- attack(datasets: list[Dataset])
Make a prediction for each dataset.
This computes attack_score for each dataset, then decides that the target user is in the training dataset if and only if the score is higher than self._threshold.
- Parameters
datasets (list[Dataset]) – A list of synthetic datasets.
- Returns
predictions
- Return type
np.array of booleans.
- attack_score(datasets: list[Dataset])
Compute the decision score for this attack.
The target score is the minimal distance between the target record with a given value v for the sensitive attribute and records in the synthetic dataset, weighted such that the sum of scores is 1 for each dataset. If the distance is 0 for all values, this returns 1/num_values for each value.
- Parameters
datasets (list[Dataset]) – A list of synthetic datasets.
- Returns
scores – If there are exactly two possible values (threat_model.attribute_values), the scores array is 1-dimensional, and each entry contains the score for the second ("positive") value. Otherwise, this returns one score per value, where k = len(self.threat_model.attribute_values) is the number of possible values.
- Return type
array of size len(datasets) or len(datasets) x k
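The score computation and the two possible output shapes can be illustrated with a small, self-contained sketch. It uses plain tuples for records and a hand-written Hamming distance; none of the function names below come from tapas. The negative sign follows the class description and is an assumption here, as is the way the all-zero case is combined with it.

```python
def hamming(a, b):
    """Number of positions at which two equal-length records differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(record, dataset):
    """Distance from a record to the closest record in the dataset."""
    return min(hamming(record, row) for row in dataset)


def attack_scores(target, sensitive_index, values, datasets):
    """One list of per-value scores per dataset; flattened if there are two values."""
    rows = []
    for dataset in datasets:
        distances = []
        for v in values:
            candidate = list(target)
            candidate[sensitive_index] = v  # r(v): target with v as sensitive value
            distances.append(min_distance(candidate, dataset))
        total = sum(distances)
        if total == 0:
            # Every r(v) appears in the dataset: fall back to uniform scores.
            rows.append([1 / len(values)] * len(values))
        else:
            # Sign convention assumed from the class description.
            rows.append([-d / total for d in distances])
    if len(values) == 2:
        # Binary case: keep only the score of the second ("positive") value.
        return [row[1] for row in rows]
    return rows


target = ("a", "x", 0)
dataset_1 = [("a", "y", 1), ("b", "y", 0)]
dataset_2 = [("a", "x", 0), ("a", "x", 1)]  # contains r(0) and r(1)

binary = attack_scores(target, 2, [0, 1], [dataset_1, dataset_2])
multi = attack_scores(target, 2, [0, 1, 2], [dataset_1])
```

Here `binary` has one entry per dataset, while `multi` has one row per dataset and one column per value.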
- property label
A label to describe this attack in reports.
- train(threat_model: tapas.threat_models.attacker_knowledge.LabelInferenceThreatModel, num_samples: Optional[int] = None, **attack_score_kwargs)
Train this attack: train the score, then choose a threshold meeting the target criterion.
- Parameters
threat_model (LabelInferenceThreatModel) – The threat model from which to generate labelled samples.
num_samples (int, default None) – Number of training samples to generate for selecting the threshold. If None, use all pre-generated training samples (only do this if you have already generated datasets).
**attack_score_kwargs (optional) – Additional keyword arguments, passed to _train_attack_score.
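As a rough illustration of "choose a threshold meeting the target criterion", the sketch below selects a threshold that maximises accuracy on labelled training scores. It is a minimal stand-in under stated assumptions, not the procedure tapas uses: `select_threshold` is a hypothetical helper, the criterion is assumed to be accuracy, and the decision rule is assumed to be `score > threshold`.

```python
def select_threshold(scores, labels):
    """Pick the candidate threshold maximising the accuracy of `score > threshold`."""
    # Candidates: one value below every score, plus each observed score.
    candidates = [min(scores) - 1.0] + sorted(set(scores))

    def accuracy(threshold):
        hits = sum((s > threshold) == y for s, y in zip(scores, labels))
        return hits / len(scores)

    # max() returns the first candidate that reaches the best accuracy.
    return max(candidates, key=accuracy)


train_scores = [-0.8, -0.6, -0.3, -0.1]
train_labels = [False, False, True, True]
threshold = select_threshold(train_scores, train_labels)
```

Restricting candidates to observed scores is enough for this rule, because the predictions only change when the threshold crosses one of the training scores.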