Document Type: Research Articles
Department of Computer Science and Engineering, PSG College of Technology, Coimbatore, Tamilnadu, India.
Objective: Privacy protection in the medical field means protecting individuals from being associated with
undesirable conditions, diagnoses or treatments (sensitive attributes). Knowledge discovery from health
care data by applying data mining algorithms works against the privacy of individuals. With the rapid,
large-scale growth of data, there is a demand to protect the sensitive data accessible from medical datasets. Methods:
This paper considers the problem of building a privacy-preserving association rule mining algorithm using the notion of
TF * IDF borrowed from the information retrieval domain. The highly sensitive transaction is chosen using the product
of Relative Item Frequency and Condensed Frequency. Finally, sensitive fuzzy data is perturbed to hide these refined
rules. Results: The number of non-sensitive rules lost as a side effect of hiding a sensitive rule is
20% lower, and the number of ghost rules is 30% lower, in the proposed work than in previous work using the Transactional Impact
Factor method. The execution time for hiding a rule is 26% lower on average in the proposed technique across various
values of the minimum confidence threshold. The number of modifications to the original dataset
after hiding three rules was reduced by 66% in the proposed method compared with previous work. Because fewer modifications
are made to the original data, the chance of generating false associations is also reduced. Conclusion: In this paper, a novel
method was presented to hide a sensitive rule in quantitative data by decreasing the support of the right-hand side (RHS) of the rule.
Experimental results demonstrate that the proposed approach is more efficient, as it hides rules more effectively and
minimizes the number of lost rules and ghost rules. It also makes minimal modifications to the dataset.
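
The idea of hiding a rule by lowering the support of its RHS can be illustrated with a minimal Python sketch. It is not the algorithm of this paper: it assumes fuzzy support is the mean of the minimum membership value across the items of a rule, it zeroes RHS memberships outright, and it picks the rows to modify by their contribution to the rule rather than by the product of Relative Item Frequency and Condensed Frequency, whose definitions are not given here. All function names (support, confidence, hide_rule, side_effects) are hypothetical.

```python
def support(rows, items):
    """Fuzzy support (assumed definition): mean over rows of the
    minimum membership value among `items`."""
    return sum(min(row[i] for i in items) for row in rows) / len(rows)


def confidence(rows, lhs, rhs):
    """Confidence of lhs -> rhs as support(lhs + rhs) / support(lhs)."""
    base = support(rows, lhs)
    return support(rows, lhs + rhs) / base if base else 0.0


def hide_rule(rows, lhs, rhs, min_conf):
    """Lower the RHS membership one row at a time until the rule's
    confidence falls below `min_conf`. Returns (sanitized rows, rows changed)."""
    rows = [dict(r) for r in rows]  # work on a copy, keep the input intact
    # Stand-in ordering (an assumption): rows that contribute most to the rule first.
    order = sorted(range(len(rows)),
                   key=lambda k: -min(rows[k][i] for i in lhs + rhs))
    changed = 0
    for k in order:
        if confidence(rows, lhs, rhs) < min_conf:
            break  # rule is already hidden
        for item in rhs:
            rows[k][item] = 0.0  # perturb the RHS value in this row
        changed += 1
    return rows, changed


def side_effects(original_rules, sanitized_rules, sensitive_rules):
    """Lost rules: non-sensitive rules that disappeared after sanitization.
    Ghost rules: rules that appear only after sanitization."""
    lost = (original_rules - sensitive_rules) - sanitized_rules
    ghost = sanitized_rules - original_rules
    return lost, ghost


# Toy fuzzy dataset: membership values of items A and B in four transactions.
data = [
    {"A": 0.9, "B": 0.8},
    {"A": 0.7, "B": 0.6},
    {"A": 0.2, "B": 0.9},
    {"A": 0.8, "B": 0.1},
]
sanitized, n_changed = hide_rule(data, ["A"], ["B"], min_conf=0.5)
```

In this sketch, each modified row lowers the support of the rule's combined itemset while leaving the LHS support untouched, so the confidence drops with every change. Counting the changed rows and comparing the rule sets mined before and after sanitization gives the three side-effect measures discussed above: modifications, lost rules, and ghost rules.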