Abstract
We present a set of tools designed to help editors place cuts and create transitions in interview video. To help place cuts, our interface links a text transcript of the video to the corresponding locations in the raw footage. It also visualizes the suitability of cut locations by analyzing the audio and visual features of the raw footage to find frames where the speaker is relatively quiet and still. With these tools, editors can directly highlight segments of text, check whether the endpoints are suitable cut locations and, if so, simply delete the text to make the edit. For each cut, our system generates both visible transitions (e.g., jump cuts and fades) and seamless, hidden transitions. We present a hierarchical, graph-based algorithm for efficiently generating hidden transitions that considers visual features specific to interview footage. We also describe a new data-driven technique for setting the timing of the hidden transition. Finally, our tools offer a one-click method for seamlessly removing 'ums' and repeated words, as well as for inserting natural-looking pauses to emphasize semantic content. We apply our tools to edit a variety of interviews and also show how they can be used to quickly compose multiple takes of an actor narrating a story.
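As a rough illustration of the cut-suitability idea (frames where the speaker is relatively quiet and still), the sketch below scores each frame from two per-frame signals. It is not the authors' method: the inputs `audio_rms` and `motion`, the equal weighting, the min-max normalization, and the 0.8 threshold are all assumptions made for this example.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cut_suitability(
    audio_rms: Sequence[float],
    motion: Sequence[float],
    audio_weight: float = 0.5,  # assumed weighting, not from the paper
) -> List[float]:
    """Return a per-frame score in [0, 1]; higher means quieter and stiller."""
    if len(audio_rms) != len(motion):
        raise ValueError("audio_rms and motion must have one value per frame")
    loudness = normalize(audio_rms)
    movement = normalize(motion)
    return [
        1.0 - (audio_weight * a + (1.0 - audio_weight) * m)
        for a, m in zip(loudness, movement)
    ]


def suitable_frames(scores: Sequence[float], threshold: float = 0.8) -> List[int]:
    """Indices of frames whose score meets the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Hypothetical four-frame clip: the speaker is loud and moving in frames 0 and 3.
scores = cut_suitability(
    audio_rms=[0.9, 0.1, 0.0, 0.8],
    motion=[5.0, 1.0, 0.0, 4.0],
)
candidates = suitable_frames(scores)
```

In an interface like the one described, such scores could be drawn alongside the transcript so that an editor can see whether the endpoints of a highlighted text segment fall on quiet, still frames.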
Index Terms
Tools for placing cuts and transitions in interview video