Hiding sensitive rules using SIF-IDF to preserve privacy in extracting association rules



Data mining and privacy preservation are two important and fundamental issues for organizations, individuals, and data miners. Data mining discovers relations among the items of a database. Some of the discovered relations are private to organizations and individuals and must not be made available to others. This information is called sensitive information, and the database owner tries to hide it. Hiding sensitive information has side effects on the database and on non-sensitive information, including the loss of non-sensitive information, the creation of new information that does not exist in the original database (ghost rules), and dissimilarity between the original and sanitized databases. Privacy-preserving algorithms try to sanitize databases with the fewest side effects. In this paper, an algorithm based on the SIF-IDF algorithm is proposed for hiding sensitive rules. The proposed algorithm uses a heuristic technique and a support-based approach to sanitize databases. Its aim is to reduce runtime and the side effects of database sanitization, including lost rules and hiding failure. The proposed algorithm is compared with the 1.b, MDSRRC, and SIF-IDF algorithms, and the results show its efficacy.
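The side effects named in the abstract can be made concrete with a small sketch. It assumes rules are represented as plain strings and that the rule sets mined from the original and sanitized databases are already available; the function name `side_effects` and the sample rules are hypothetical and do not come from the paper.

```python
def side_effects(original_rules, sanitized_rules, sensitive_rules):
    """Compare rule sets mined before and after sanitization (illustrative only)."""
    original = set(original_rules)
    sanitized = set(sanitized_rules)
    sensitive = set(sensitive_rules)
    non_sensitive = original - sensitive
    return {
        # Sensitive rules that can still be mined after sanitization.
        "hiding_failure": sensitive & sanitized,
        # Non-sensitive rules that were mined before but are gone afterwards.
        "lost_rules": non_sensitive - sanitized,
        # Rules that appear only after sanitization.
        "ghost_rules": sanitized - original,
    }


# Hypothetical rule sets, chosen only to exercise each category.
original = {"A->B", "A->C", "B->C"}
sensitive = {"A->B"}
sanitized = {"A->C", "C->D"}

report = side_effects(original, sanitized, sensitive)
for name, rules in report.items():
    print(name, sorted(rules))
```

In this toy case the sensitive rule is hidden, one non-sensitive rule is lost, and one ghost rule is created. A sanitization algorithm of the kind described would try to keep all three sets as small as possible.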


Sensitive information, Data mining, Privacy preserving
