Privacy-preserving Data Mining


The relatively young field of privacy-preserving data mining seeks ways to run complex statistical analyses on large databases without fully disclosing the contents of the databases themselves. Two approaches pursue this goal: one is based on cryptographic (and thus mostly number-theoretic) techniques; the other uses statistical randomization methods. This paper gives a brief overview of both. A sketch of the privacy-preserving version of the decision-tree construction algorithm ID3 is presented as an illustration of the cryptographic technique.
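
To make the statistical randomization idea concrete, the following minimal sketch uses randomized response, one classic technique of that family; it is an illustrative assumption here and not necessarily the method the paper itself covers. Each record holder perturbs a sensitive bit before disclosing it, and the analyst recovers only the aggregate fraction. The names randomize_bit, estimate_fraction and p_truth are hypothetical and chosen for this example.

```python
import random


def randomize_bit(true_bit, p_truth, rng):
    """Report the true bit with probability p_truth, otherwise its negation."""
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_fraction(reported_bits, p_truth):
    """Estimate the true fraction of ones from randomized reports.

    If pi is the true fraction, the expected observed fraction is
    q = p_truth * pi + (1 - p_truth) * (1 - pi); solving for pi gives
    the estimator below.
    """
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 removes all information from the reports")
    observed = sum(reported_bits) / len(reported_bits)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


# Illustrative data (made up for this sketch): 30% of records hold the sensitive bit.
rng = random.Random(0)  # fixed seed so the run is repeatable
true_bits = [1] * 3000 + [0] * 7000
reported = [randomize_bit(b, 0.75, rng) for b in true_bits]
estimate = estimate_fraction(reported, 0.75)
print(f"estimated fraction of ones: {estimate:.3f}")
```

No individual reported bit can be trusted on its own, yet the estimate approaches the true fraction as the number of records grows. A p_truth closer to 0.5 gives stronger privacy at the cost of a noisier estimate.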


Page last updated: 30.11.2005