research-article

A Sketch-Based Approach for Interactive Organization of Video Clips

Published: 04 September 2014

Abstract

With the rapid growth of video resources, techniques for efficiently organizing video clips are becoming increasingly appealing in the multimedia domain. In this article, a sketch-based approach is proposed to intuitively organize video clips by (1) enhancing their narrations with sketch annotations and (2) structuring the organization process through gesture-based free-form sketching on touch devices. This work makes two main contributions. The first is the sketch graph, a novel representation of the narrative structure of video clips that facilitates content organization. The second is a method for context-aware sketch recommendation that scales to large video collections, enabling common users to easily organize sketch annotations. A prototype system integrating the proposed approach was evaluated on five aspects of its performance and usability. Two sketch searching experiments showed that the proposed context-aware sketch recommendation outperforms two state-of-the-art sketch searching methods in both accuracy and scalability. In addition, three user studies were conducted. The first showed that the sketch graph is consistently preferred over traditional representations such as keywords and keyframes. The second showed that the proposed approach is applicable in scenarios where the video annotator and the organizer are the same person. The third showed that, for video content organization, users of the sketch graph took on average one-third less time than with the mass-market tool Movie Maker and one-quarter less time than with a state-of-the-art sketch alternative. These results demonstrate that the proposed sketch graph approach is a promising video organization tool.



  • Published in

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1
    August 2014
    151 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/2665935

    Copyright © 2014 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 4 September 2014
    • Accepted: 1 June 2014
    • Revised: 1 April 2014
    • Received: 1 September 2013


    Qualifiers

    • research-article
    • Research
    • Refereed
