2024 : 10 : 31
Hossein Ghaffarian

Academic rank: Assistant Professor
ORCID: https://orcid.org/0000-0002-7998-8618
Education: PhD.
ScopusId: 24765997700
HIndex:
Faculty: Engineering
Address: Arak University
Phone:

Research

Title
Apache Flink and clustering-based framework for fast anonymization of IoT stream data
Type
Journal Paper
Keywords
Internet of Things, Data privacy, Streaming data, Data anonymity, Apache Flink, Data processing engine
Year
2023
Journal
Intelligent Systems with Applications
DOI
Researchers
Alireze Sadeghi Nasab, Hossein Ghaffarian, Mohsen Rahmani

Abstract

In this paper, we present a novel framework that takes the expiration time of Internet of Things (IoT) stream data into account when anonymizing it. The IoT is one of the fastest-growing technologies in the world, and anonymization is one of the safeguards used to protect data privacy. Because data streams are dynamic, vast, and rapidly changing, traditional approaches cannot be used to anonymize IoT data. The proposed framework operates using a new clustering method and the Apache Flink stream data processing engine. First, the received data are clustered. Then, if the size of a cluster does not meet the k-anonymity threshold, the framework goes on to suppress and delete it; otherwise, the data are anonymized and published. In this way, the framework handles both numerical and categorical data. At the end of the stream, the remaining data are merged and anonymized. An implementation and evaluation of the framework using Scala and Apache Flink shows that the proposed approach reduces data delay by 12.33–66.62% compared with other methods. Furthermore, merging the leftover clusters at the end of the stream avoids information loss: compared with similar methods, information loss is reduced by 5.68–18.26%. Overall, the evaluation results show better performance in terms of both data delay and information loss.
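
The cluster, check, publish-or-hold flow described in the abstract can be illustrated with a minimal batch sketch in plain Python. It is not the authors' Scala/Flink implementation, and it does not reproduce their clustering method or the expiration-time handling. Everything named here is an assumption made for illustration: the record fields `age` and `city`, the fixed-width age bucketing used as a stand-in for clustering, the `bucket_width` parameter, and the choice to suppress leftovers only if the merged group is still smaller than k.

```python
from collections import defaultdict


def generalize(cluster):
    """Give every record in a cluster the same generalized values."""
    ages = [r["age"] for r in cluster]
    cities = {r["city"] for r in cluster}
    age_range = (min(ages), max(ages))  # numerical: replace by the value range
    # categorical: keep the value if the cluster agrees on it, else mask it
    city = next(iter(cities)) if len(cities) == 1 else "*"
    return [{"age": age_range, "city": city} for _ in cluster]


def anonymize(records, k, bucket_width=10):
    """Return (published, suppressed) for a finite batch of records."""
    # Stand-in clustering (an assumption, NOT the paper's method):
    # group records into fixed-width age buckets.
    clusters = defaultdict(list)
    for r in records:
        clusters[r["age"] // bucket_width].append(r)

    published, leftover = [], []
    for key in sorted(clusters):
        cluster = clusters[key]
        if len(cluster) >= k:
            # Cluster meets the k-anonymity threshold: anonymize and publish.
            published.extend(generalize(cluster))
        else:
            # Too small to publish on its own: hold it back for now.
            leftover.extend(cluster)

    # End of stream: merge the leftover clusters and anonymize them together.
    if len(leftover) >= k:
        published.extend(generalize(leftover))
        leftover = []
    return published, leftover  # whatever is still left over is suppressed
```

Calling `anonymize(records, k=3)` on a list of `{"age": ..., "city": ...}` dictionaries returns the published, generalized records together with any records that had to be suppressed. In a real streaming deployment the hold-back step would be bounded by each record's expiration time rather than by the end of the input.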