Abstract
Visual complexities (VisComs) of Web pages significantly affect user experience, and evaluating them automatically can facilitate a large number of Web-based applications. Constructing a model for measuring the VisComs of Web pages requires extracting typical features and learning from labeled Web pages. However, as far as the authors are aware, little headway has been made on measuring VisCom in Web mining and machine learning. The present article provides a new approach that combines Web mining techniques and machine learning algorithms to measure the VisComs of Web pages. The structure of a Web page is first analyzed, and its layout is then extracted. The Web page is then treated as a semistructured image, from which three classes of features are extracted to construct a feature vector. The feature vector is fed into a learned measuring function to calculate the VisCom of the page.
In the proposed approach, the type of the measuring function and the way it is learned depend on the quantification strategy for VisCom. In addition to representing VisCom by a category or by a score, as existing work does, this study presents a new strategy that uses a distribution to quantify the VisCom of a Web page. Empirical evaluation suggests the effectiveness of the proposed approach in terms of both features and learning algorithms.
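The three quantification strategies named above (category, score, distribution) can be illustrated with a small sketch. The abstract does not say how labels are collected or how the distribution is defined, so everything here is an assumption: several raters label each page on a five-level scale, the category is the most frequent level, the score is the mean rating, and the distribution is the fraction of raters choosing each level. The function name `quantify` and its parameters are hypothetical, not the authors' method.

```python
from collections import Counter


def quantify(ratings, levels=(1, 2, 3, 4, 5)):
    """Summarize one page's rater labels in three ways (illustrative only).

    Assumes each rating is one of `levels`; the five-level scale is an
    assumption of this sketch, not something stated in the abstract.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in levels for r in ratings):
        raise ValueError("every rating must be one of the allowed levels")

    counts = Counter(ratings)
    n = len(ratings)

    # Category: the most frequently chosen level (ties go to the lower level).
    category = max(levels, key=lambda lv: (counts[lv], -lv))
    # Score: the mean rating across raters.
    score = sum(ratings) / n
    # Distribution: the fraction of raters who chose each level.
    distribution = [counts[lv] / n for lv in levels]
    return category, score, distribution


# Eight hypothetical raters labelling a single page.
category, score, distribution = quantify([2, 3, 3, 4, 3, 5, 3, 2])
print(category, score, distribution)
```

Under these assumptions, the distribution keeps the disagreement among raters that a single category or score discards, which is one plausible reading of why a distribution-based strategy is proposed.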
Index Terms
Measuring the Visual Complexities of Web Pages