tapas.threat_models.attacker_knowledge.AuxiliaryDataKnowledge

class tapas.threat_models.attacker_knowledge.AuxiliaryDataKnowledge(dataset: Dataset = None, auxiliary_split: float = 0.5, aux_data: Dataset = None, test_data: Dataset = None, num_training_records: int = 1000)

Bases: tapas.threat_models.attacker_knowledge.AttackerKnowledgeOnData

This attacker knowledge assumes access to an auxiliary dataset from which training datasets are sampled as random subsets. A distinct testing dataset, sampled from the same distribution, is used to generate testing samples.

__init__(dataset: Dataset = None, auxiliary_split: float = 0.5, aux_data: Dataset = None, test_data: Dataset = None, num_training_records: int = 1000)

Initialise this threat model with a given dataset. This threat model requires an auxiliary dataset available to the attacker and a test dataset to evaluate attacks. These can be specified by giving a dataset (dataset) to split between the two, with auxiliary_split giving the fraction assigned to the auxiliary dataset, and/or by explicitly specifying aux_data and test_data. If both are given, the resulting auxiliary and test datasets are obtained by concatenating each part of the split with the corresponding explicitly specified dataset.

Parameters

dataset (Dataset, optional) – Dataset to split between test and auxiliary data. The default is None.
auxiliary_split (float in [0,1], optional) – Fraction of dataset to use as auxiliary dataset. The rest of the dataset is used as test dataset. The default is 0.5.
aux_data (Dataset, optional) – Dataset that the adversary is assumed to have access to. Either this or dataset must be provided. The default is None.
test_data (Dataset, optional) – Dataset used to generate test datasets to evaluate the attack. The default is None.
num_training_records (int, optional) – Number of records used to train each copy of shadow_model when generating synthetic training datasets for the attack. The default is 1000.
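
The interaction between dataset, auxiliary_split, aux_data and test_data can be illustrated with a minimal sketch on plain Python lists. This is not the library's implementation: the helper split_auxiliary_test, the seed argument and the shuffling step are assumptions made for illustration, and the real class operates on Dataset objects.

```python
import random

def split_auxiliary_test(dataset=None, auxiliary_split=0.5,
                         aux_data=None, test_data=None, seed=0):
    """Hypothetical helper mirroring the behaviour described above, on lists."""
    aux, test = [], []
    if dataset is not None:
        records = list(dataset)
        # Assumption: the split is random; the library may split differently.
        random.Random(seed).shuffle(records)
        cut = int(len(records) * auxiliary_split)
        aux, test = records[:cut], records[cut:]
    # Explicitly provided data is concatenated with the result of the split.
    if aux_data is not None:
        aux = aux + list(aux_data)
    if test_data is not None:
        test = test + list(test_data)
    return aux, test

# 10 records split in half, plus 2 extra auxiliary and 1 extra test record.
aux, test = split_auxiliary_test(
    dataset=range(10), auxiliary_split=0.5,
    aux_data=[100, 101], test_data=[200],
)
print(len(aux), len(test))  # 7 6
```

In this sketch, passing only aux_data and test_data skips the split entirely, and passing only dataset yields the two parts of the split unchanged.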

Methods

`__init__`([dataset, auxiliary_split, ...])	Initialise this threat model with a given dataset.
`generate_datasets`(num_samples[, training])	Generate training/testing "real" datasets.

Attributes

label

A string to represent this knowledge.

generate_datasets(num_samples: int, training: bool = True) → list[Dataset]

Generate training/testing “real” datasets.

Parameters

num_samples (int) – Number of training dataset pairs to generate.
training (bool, optional) – If True, D’s will be sampled from the adversary’s data (self.adv_data). Otherwise, D’s will be sampled from the real data (self.datasets). The default is True.

Returns

List of generated synthetic datasets. List of labels.

Return type

tuple(list[Dataset], np.ndarray)
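
The role of the training flag can be sketched as follows, assuming that each generated dataset is a random subset, drawn without replacement, of either the auxiliary pool or the test pool. The names sample_subsets, generate, aux_pool and test_pool are hypothetical, the pools are plain lists rather than Dataset objects, and the labels returned by the real method are not modelled here.

```python
import random

def sample_subsets(pool, num_samples, num_records, seed=0):
    """Hypothetical stand-in: draw num_samples random subsets of pool."""
    rng = random.Random(seed)
    # Each subset is sampled without replacement from the same pool.
    return [rng.sample(pool, num_records) for _ in range(num_samples)]

aux_pool = list(range(0, 50))     # stands in for the attacker's auxiliary data
test_pool = list(range(50, 100))  # stands in for the held-out test data

def generate(num_samples, training=True, num_records=10):
    # training=True samples from the auxiliary pool, otherwise from the test pool.
    pool = aux_pool if training else test_pool
    return sample_subsets(pool, num_samples, num_records)

train_sets = generate(3, training=True)
eval_sets = generate(2, training=False)
print(len(train_sets), len(eval_sets))  # 3 2
```

Keeping the two pools disjoint, as in this sketch, means that datasets used to evaluate an attack never share records with those used to train it.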

property label: A string to represent this knowledge.