Prototype cell key perturbation to enhance disclosure control against difference attacks #251

tombisho · 2021-11-25T19:35:52Z

Stefan has demonstrated two difference attacks that retrieve individual-level data from DataSHIELD. In short:

(1) comparing the mean of a column over all rows with the mean over the same rows with one row removed
(2) comparing the mean of a column over all rows with the mean over the same rows with one row duplicated

This is hard to protect against because the attack only needs two subsets, and both generally contain large numbers of rows, so neither looks disclosive on its own.
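To make the arithmetic concrete, here is a minimal sketch in plain Python (not DataSHIELD code). The data values and the helper names `recover_removed` and `recover_duplicated` are made up for illustration. The attacker is assumed to know only the two means and the row count `n`.

```python
def recover_removed(mean_all, n, mean_without):
    """Value of the single row that was dropped, from two means and the row count."""
    return mean_all * n - mean_without * (n - 1)


def recover_duplicated(mean_all, n, mean_dup):
    """Value of the single row that was duplicated, from two means and the row count."""
    return mean_dup * (n + 1) - mean_all * n


# Hypothetical column held on the server; the analyst never sees these values.
values = [52.0, 61.5, 47.25, 70.0, 58.75]
n = len(values)
target = 3  # index of the row under attack

# The only things the analyst receives are aggregate means.
mean_all = sum(values) / n
mean_without = sum(values[:target] + values[target + 1:]) / (n - 1)
mean_dup = (sum(values) + values[target]) / (n + 1)

leaked_by_removal = recover_removed(mean_all, n, mean_without)
leaked_by_duplication = recover_duplicated(mean_all, n, mean_dup)
```

Both results equal the target row's value up to floating-point rounding, and the same arithmetic applies however many rows the subsets contain.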

Research indicates that the best protection against difference attacks is to add noise. The R package cellKey can add noise to a table, and it could be repackaged for DataSHIELD use.
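For readers unfamiliar with the idea, the following is a rough sketch of the general cell key approach, shown for counts only. It is not the cellKey package's API: the function names, the `PTABLE` lookup and its noise values are all hypothetical. Each record gets a fixed random record key, a cell's key is the fractional part of the sum of its members' record keys, and the noise is looked up from the cell key, so the same set of records always receives the same noise.

```python
import random


def assign_record_keys(n_records, seed=0):
    """Give every record a fixed pseudo-random key in [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_records)]


def cell_key(record_keys, members):
    """Fractional part of the summed record keys of the cell's members."""
    return sum(record_keys[i] for i in members) % 1.0


# Hypothetical perturbation table: (upper bound of cell key, noise to add).
PTABLE = [(0.10, -2), (0.30, -1), (0.70, 0), (0.90, 1), (1.00, 2)]


def perturbed_count(record_keys, members):
    """Count of the cell plus noise determined by the cell key."""
    members = sorted(set(members))  # fixed order keeps the float sum reproducible
    true_count = len(members)
    if true_count == 0:
        return 0  # empty cells are left unperturbed in this sketch
    ck = cell_key(record_keys, members)
    noise = next(value for upper, value in PTABLE if ck < upper)
    return max(0, true_count + noise)


keys = assign_record_keys(100, seed=42)
all_rows = list(range(100))
one_removed = [i for i in all_rows if i != 17]

count_all = perturbed_count(keys, all_rows)
count_all_again = perturbed_count(keys, list(reversed(all_rows)))
count_one_removed = perturbed_count(keys, one_removed)
```

Under these assumptions, repeating a query returns the same perturbed count, so the noise cannot be averaged away. Removing one row changes the cell key, so the two counts receive noise that is looked up separately and their difference is no longer guaranteed to be exactly one. How to carry this over from counts to means is left open in this sketch.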

The issues to address are:

When to apply the noise: on import of data into the session, or when the data are split into subsets?
The cell key process has been used for census data, and tends to be evaluated on a particular data set to see whether it is appropriate. How would that work for DataSHIELD, with its diverse datasets?

tombisho added the question label Nov 25, 2021

StuartWheater added this to the v6.4 milestone Sep 30, 2024
