Abstract
With the proliferation of location-based services enabled by a large number of mobile devices and applications, the quantity of location data, such as trajectories collected by service providers, is gigantic. If these datasets could be published, they would be valuable assets for service providers exploring business opportunities and for studying commuter behavior to improve transport management, which in turn benefits the general public in day-to-day commuting. However, two major concerns considerably limit the availability and usage of these trajectory datasets. The first is the threat to individual privacy, as users’ trajectories may be misused to discover sensitive information, such as home locations, their children’s school locations, or social information like habits or relationships. The second is the ability to analyze exabytes of location data in a timely manner. Although trajectory anonymization approaches have been proposed in the past to mitigate privacy concerns, none of these prior works addresses the scalability issue, since it is a newly emerging problem brought about by the rapidly increasing adoption of location-based services. In this article, we address these two challenges by designing a novel parallel trajectory anonymization algorithm that achieves scalability, strong privacy protection, and high utility of the anonymized trajectory datasets. We have conducted extensive experiments using MapReduce and Spark on real maps with different topologies, and our results demonstrate both effectiveness and efficiency when compared with centralized approaches.
A Parallel Algorithm For Anonymizing Large-scale Trajectory Data