ABSTRACT
Mobile crowdsourcing services (MCSs) enable fast and economical data acquisition at scale and find applications in a variety of domains. Prior work has shown that Foursquare and Waze (a location-based MCS and a navigation MCS, respectively) are vulnerable to different kinds of data poisoning attacks. Such attacks can be upsetting and even dangerous, especially when they are used to inject improper inputs that mislead users. However, to date, there is no comprehensive study of the extent of improper input validation (IIV) vulnerabilities and the feasibility of their exploits in MCSs across domains. In this work, we leverage the fact that MCSs interface with their participants through mobile apps to design tools and new methodologies, embodied in an end-to-end, feedback-driven analysis framework, which we use to study 10 popular and previously unexplored services in five different domains. Using our framework, we send tens of thousands of API requests with automatically generated input values to characterize the IIV attack surface of these services. Alarmingly, we found that most of them (8/10) suffer from grave IIV vulnerabilities that allow an adversary to launch data poisoning attacks at scale: 7,400 spoofed API requests were successful in faking online posts for robberies, gunshots, and other dangerous incidents, and in faking fitness activities with supernatural speeds and distances, among many others. Lastly, we discuss mitigation strategies that are easy to implement and deploy and that can greatly reduce the IIV attack surface, and we argue for their use as a necessary complementary measure toward trustworthy mobile crowdsourcing services.
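To make the notion of an IIV mitigation concrete, the following is a minimal sketch of a server-side plausibility check for a reported fitness activity. It is an illustration only and not the mitigation proposed in this work: the `RunActivity` type, the `validate_run` function, and the speed bound are hypothetical names and values chosen for the example.

```python
import math
from dataclasses import dataclass
from typing import List

# Hypothetical bound (an assumption for this sketch): slightly above the
# peak sprint speed of elite athletes, in meters per second.
MAX_RUNNING_SPEED_MPS = 12.5


@dataclass
class RunActivity:
    distance_m: float  # reported distance in meters
    duration_s: float  # reported duration in seconds


def validate_run(activity: RunActivity) -> List[str]:
    """Return validation errors; an empty list means the input passed."""
    errors = []
    # Reject NaN, infinite, zero, and negative values before any arithmetic.
    if not (math.isfinite(activity.distance_m) and activity.distance_m > 0):
        errors.append("distance must be a positive finite number")
    if not (math.isfinite(activity.duration_s) and activity.duration_s > 0):
        errors.append("duration must be a positive finite number")
    if not errors:
        # Cross-field check: the implied average speed must be plausible.
        speed = activity.distance_m / activity.duration_s
        if speed > MAX_RUNNING_SPEED_MPS:
            errors.append("implied speed exceeds the plausible bound")
    return errors
```

A check of this kind runs on the service side, so it still applies when requests bypass the mobile app and reach the API directly. For example, `validate_run(RunActivity(42195, 60))` reports an implausible speed, while a 5 km run in 25 minutes passes.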
Index Terms
- Characterizing Improper Input Validation Vulnerabilities of Mobile Crowdsourcing Services