Stefan has illustrated two methods of retrieving data from DataSHIELD with difference attacks. In short:
(1) by comparing the mean of a column over all rows with the mean over all rows except one
(2) by comparing the mean of a column over all rows with the mean over all rows plus one duplicated row
This is hard to protect against because the two subsets being compared each generally contain a large number of rows, so a check on subset size alone does not flag either query.
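The arithmetic behind both attacks can be sketched as follows. This is a minimal illustration in plain Python rather than DataSHIELD code: the column values are made up, the function names are hypothetical, and it assumes the attacker knows the row count `n` and receives exact means.

```python
def recover_by_removal(mean_all, mean_without, n):
    """Value of the removed row, from the mean over n rows and over the other n - 1 rows."""
    return n * mean_all - (n - 1) * mean_without


def recover_by_duplication(mean_all, mean_with_dup, n):
    """Value of the duplicated row, from the mean over n rows and over n + 1 rows."""
    return (n + 1) * mean_with_dup - n * mean_all


# Made-up column; the attacker targets the row at index 2.
values = [52.0, 61.0, 47.0, 58.0, 70.0]
n = len(values)
target = 2

# The three aggregate results the attacker is able to request.
mean_all = sum(values) / n
mean_without = (sum(values) - values[target]) / (n - 1)
mean_with_dup = (sum(values) + values[target]) / (n + 1)

recovered_removal = recover_by_removal(mean_all, mean_without, n)
recovered_duplication = recover_by_duplication(mean_all, mean_with_dup, n)
```

Both recovered values equal the target row's value up to floating-point rounding, even though every query was an aggregate over several rows.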
Research indicates that the best protection against difference attacks is to add noise. The R package cellKey can add noise to a table, and could be repackaged for DataSHIELD use.
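To make the idea of keyed noise concrete, here is a rough sketch in plain Python. It is not the cellKey package's API or algorithm: it only borrows the general cell-key idea that each record carries a fixed random key and that the noise for a query is derived from the keys of the records involved, so repeating the same query returns the same answer. The function names, the uniform noise and the `scale` parameter are all assumptions made for illustration; real cell-key implementations typically look the perturbation up in a designed table.

```python
import random


def make_record_keys(n, seed=0):
    """Assign each record a fixed random key in [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def keyed_noise(record_keys, rows, scale):
    """Noise in [-scale, scale) determined only by which rows are in the query."""
    key = sum(record_keys[i] for i in sorted(rows)) % 1.0
    return (2.0 * key - 1.0) * scale


def noisy_mean(values, record_keys, rows, scale=1.0):
    """Mean over the selected rows plus the keyed noise for that row set."""
    rows = sorted(rows)
    true_mean = sum(values[i] for i in rows) / len(rows)
    return true_mean + keyed_noise(record_keys, rows, scale)


# Made-up column; the attacker targets the row at index 2 (true value 47.0).
values = [52.0, 61.0, 47.0, 58.0, 70.0]
keys = make_record_keys(len(values), seed=42)
n = len(values)
all_rows = [0, 1, 2, 3, 4]
without_target = [0, 1, 3, 4]

released_all = noisy_mean(values, keys, all_rows)
released_without = noisy_mean(values, keys, without_target)

# The same differencing arithmetic now recovers the target value plus an error term.
attack_estimate = n * released_all - (n - 1) * released_without
```

In this sketch the attacker's estimate is off by `n` times the noise on the first mean minus `n - 1` times the noise on the second, so the error grows with the row count, and averaging repeated queries does not help because the noise for a given row set never changes.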
The issues to address are:
(1) When should the noise be applied: on import of data into the session, or when the data are split into subsets?
(2) The cell key process has been used for census data, and tends to be evaluated on a particular dataset to see if it is appropriate. How would that work for DataSHIELD with diverse datasets?