Abstract
Web site defacement, the process of introducing unauthorized modifications to a Web site, is a very common form of attack. In this paper we describe and experimentally evaluate a framework that may constitute the basis for a defacement detection service capable of monitoring thousands of remote Web sites systematically and automatically.
In our framework, an organization may join the service simply by providing the URLs of the resources to be monitored along with the contact point of an administrator. The monitored organization may thus take advantage of the service with just a few mouse clicks, without installing any software locally or changing its daily operational processes. Our approach is based on anomaly detection and allows the integrity of many remote Web resources to be monitored automatically while remaining fully decoupled from them; in particular, it requires no prior knowledge about those resources.
We evaluated our approach over a selection of dynamic resources and a set of publicly available defacements. The results are very satisfactory: all attacks are detected while false positives are kept to a minimum. We also assessed the performance and scalability of our proposal and found that it may indeed constitute the basis for deploying the proposed service on a large scale.
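The abstract does not spell out the detection algorithm, so the following is only a minimal sketch of the general idea of anomaly-based monitoring of a remote resource: learn a profile from earlier readings of a page, then flag a new reading that deviates strongly from it. The feature set, the function names (`extract_features`, `learn_profile`, `is_anomalous`) and the threshold values are assumptions made for illustration and are not taken from the paper.

```python
import re
import statistics


def extract_features(html: str) -> list[float]:
    """Map one reading of a page to a small numeric vector (hypothetical features)."""
    return [
        float(len(html)),                                           # size in characters
        float(len(re.findall(r"<a\s", html, re.IGNORECASE))),       # anchor tags
        float(len(re.findall(r"<script\b", html, re.IGNORECASE))),  # script tags
        float(len(re.findall(r"<img\b", html, re.IGNORECASE))),     # image tags
    ]


def learn_profile(readings: list[str]) -> list[tuple[float, float]]:
    """Build a per-feature (mean, standard deviation) profile from earlier readings."""
    vectors = [extract_features(r) for r in readings]
    return [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*vectors)]


def is_anomalous(html: str, profile: list[tuple[float, float]],
                 threshold: float = 3.0, min_std: float = 1.0) -> bool:
    """Flag a reading if any feature lies more than `threshold` deviations from its mean.

    `min_std` is an assumed floor that keeps constant features from
    triggering an alert on a change of a single unit.
    """
    for value, (mean, std) in zip(extract_features(html), profile):
        if abs(value - mean) > threshold * max(std, min_std):
            return True
    return False
```

In such a scheme the monitoring side needs only the readings it has fetched itself, which is consistent with the decoupling described above: nothing is installed at the monitored site and no prior knowledge of the resource is supplied by its owner.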
A Framework for Large-Scale Detection of Web Site Defacements