ABSTRACT

The performance of haptic interaction across communication networks critically depends on the successful reconstruction of the bidirectionally transmitted haptic signals, and hence on the quality of the communication channel. We propose a novel error-resilient data reduction scheme for haptic communication which exploits known limits of human haptic perception. In particular, we show that missing haptic information due to packet loss may strongly impair the user's experience during haptic interaction. We present and compare methods that eliminate the disturbing artifacts resulting from packet loss. Our approach keeps the estimated impact of packet losses below human perception thresholds. A tree of possible cases (packets received or not received) and their respective occurrence probabilities is maintained at the sender side, and the system predicts unacceptable error cases to decide whether extra packets should be sent. We introduce different criteria that can be employed to trigger additional packets. In our experiments, we evaluate both the objective data reduction performance and the subjective system transparency by performing extensive tests using packet loss probability and round trip time as parameters. The proposed scheme shows excellent performance in terms of data reduction while sustaining good subjective ratings for a wide range of packet loss values and round trip times.
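The sender-side idea described in the abstract (tracking every received/lost case with its probability and sending an extra packet when too much probability mass lands on a perceptible error) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the independent per-packet loss model, the hold-last-sample receiver, the relative deadband `k`, and the trigger limit `max_prob` are all assumptions chosen for illustration.

```python
from itertools import product


def receiver_cases(acked_value, pending, p_loss):
    """Enumerate every received/lost pattern for the unconfirmed packets.

    Assumptions (hypothetical, not from the paper): losses are independent
    with probability p_loss, and the receiver holds the last value it got.
    Returns a list of (probability, value_held_by_receiver) leaves.
    """
    cases = []
    for pattern in product((True, False), repeat=len(pending)):
        prob = 1.0
        value = acked_value  # fallback if every pending packet was lost
        for received, sample in zip(pattern, pending):
            prob *= (1.0 - p_loss) if received else p_loss
            if received:
                value = sample  # later packets overwrite earlier ones
        cases.append((prob, value))
    return cases


def violation_probability(current, cases, k):
    """Probability that the receiver's value is perceptibly wrong.

    A case counts as a violation when it deviates from the current sample
    by more than the relative deadband k * |current| (assumed threshold).
    """
    return sum(p for p, v in cases if abs(current - v) > k * abs(current))


def should_send_extra(current, acked_value, pending, p_loss,
                      k=0.1, max_prob=0.05):
    """Trigger an extra packet when the violation probability is too high."""
    cases = receiver_cases(acked_value, pending, p_loss)
    return violation_probability(current, cases, k) > max_prob
```

With one unconfirmed packet carrying the value 1.5, a confirmed value of 1.0, and a 10% loss rate, the lost case alone carries probability 0.1 and exceeds the assumed 0.05 limit, so `should_send_extra(1.5, 1.0, [1.5], 0.1)` asks for an extra packet. Full enumeration grows as 2^n in the number of unconfirmed packets, so a practical scheme would need to bound or prune the tree; how the paper does this is not stated in the abstract.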

AUTHORS



Fernanda Brandi

Bibliometrics: publication history
Publication years: 2009-2012
Publication count: 6
Citation Count: 12
Available for download: 3
Downloads (6 Weeks): 0
Downloads (12 Months): 18
Downloads (cumulative): 299
Average downloads per article: 99.67
Average citations per article: 2.00


Julius Kammerl

Bibliometrics: publication history
Publication years: 2007-2012
Publication count: 11
Citation Count: 20
Available for download: 3
Downloads (6 Weeks): 0
Downloads (12 Months): 15
Downloads (cumulative): 441
Average downloads per article: 147.00
Average citations per article: 1.82


Eckehard Steinbach

Bibliometrics: publication history
Publication years: 1996-2014
Publication count: 52
Citation Count: 124
Available for download: 22
Downloads (6 Weeks): 28
Downloads (12 Months): 429
Downloads (cumulative): 6,243
Average downloads per article: 283.77
Average citations per article: 2.38

PUBLICATION

Title: MM '10 Proceedings of the 18th ACM international conference on Multimedia
General Chairs: Alberto del Bimbo, University of Florence, Italy
Shih-Fu Chang, Columbia University, USA
Program Chairs: Arnold Smeulders, University of Amsterdam, NL
Pages: 351-360
Publication Date: 2010-10-25
Sponsor: SIGMULTIMEDIA, ACM Special Interest Group on Multimedia
Publisher: ACM, New York, NY, USA ©2010
ISBN: 978-1-60558-933-6
Order Number: 433107
doi>10.1145/1873951.1874000
Conference: MM, International Multimedia Conference
Paper Acceptance Rate: 396 of 974 submissions, 41%
Overall Acceptance Rate: 1,375 of 5,525 submissions, 25%
Year             Submitted  Accepted  Rate
MULTIMEDIA '97         142        40   28%
MULTIMEDIA '02         330        46   14%
MULTIMEDIA '03         255        43   17%
MULTIMEDIA '04         331        55   17%
MULTIMEDIA '05         312        49   16%
MULTIMEDIA '06         292        48   16%
MULTIMEDIA '07         298        57   19%
MM '08                 516       136   26%
MM '09                 305        50   16%
MM '10                 974       396   41%
MM '11                 666       230   35%
MM '12                 331        67   20%
MM '13                 235        47   20%
MM '14                 286        55   19%
MM '15                 252        56   22%
Overall              5,525     1,375   25%

APPEARS IN
Software
Applications
Interaction
Digital Content
Networking

Table of Contents

Proceedings of the 18th ACM international conference on Multimedia
Welcome to ACM MULTIMEDIA 2010
Alberto del Bimbo, Shih-Fu Chang, Arnold Smeulders
Pages: i
doi>10.1145/1873951.1913787
Full text: Mp4
SIGMM Award Presentation: Life - Experiences (Events) + Vision
Ramesh Jain
Pages: ii
doi>10.1145/1873951.1913788
Full text: Mp4
SIGMM Award Presentation: Geometry-aware analysis of high-dimensional visual information sets
Effrosyni Kokiopoulou
Pages: iii
doi>10.1145/1873951.1913789
Full text: Mp4
SESSION: Plenary -- P1
Shih-Fu Chang, Alberto del Bimbo
Using the web to do social science
Duncan Watts
Pages: 1-2
doi>10.1145/1873951.1873953
Full text: PDF
Other formats: Mp4

Social science is often concerned with the emergence of collective behavior out of the interactions of large numbers of individuals, but in this regard it has long suffered from a severe measurement problem - namely that individual-level behavior and ...
Visual crowd surveillance is like hydrodynamics
Mubarak Shah
Pages: 3-4
doi>10.1145/1873951.1873954
Full text: PDF
Other formats: Mp4

Video Surveillance and Monitoring is very active area of research in Computer Vision. However, most of the current approaches assume that the observed scene is not crowded, and that reliable tracks of objects are available over longer durations. Therefore, ...
SESSION: Full - F1/content track/automatic image tagging
Jiebo Luo
Leveraging loosely-tagged images and inter-object correlations for tag recommendation
Yi Shen, Jianping Fan
Pages: 5-14
doi>10.1145/1873951.1873956
Full text: PDF

Large-scale loosely-tagged images (i.e., multiple object tags are given loosely at the image level) are available on Internet, and it is very attractive to leverage such loosely-tagged images for automatic image annotation applications. In this paper, ...
Multi-label boosting for image annotation by structural grouping sparsity
Fei Wu, Yahong Han, Qi Tian, Yueting Zhuang
Pages: 15-24
doi>10.1145/1873951.1873957
Full text: PDF

We can obtain high-dimensional heterogenous features from real-world images to describe their various aspects of visual characteristics, such as color, texture and shape etc. Different kinds of heterogenous features have different intrinsic discriminative ...
Unified tag analysis with multi-edge graph
Dong Liu, Shuicheng Yan, Yong Rui, Hong-Jiang Zhang
Pages: 25-34
doi>10.1145/1873951.1873958
Full text: PDF

Image tags have become a key intermediate vehicle to organize, index and search the massive online image repositories. Extensive research has been conducted on different yet related tag analysis tasks, e.g., tag refinement, tag-to-region assignment, ...
Efficient large-scale image annotation by probabilistic collaborative multi-label propagation
Xiangyu Chen, Yadong Mu, Shuicheng Yan, Tat-Seng Chua
Pages: 35-44
doi>10.1145/1873951.1873959
Full text: PDF

Annotating large-scale image corpus requires huge amount of human efforts and is thus generally unaffordable, which directly motivates recent development of semi-supervised or active annotation methods. In this paper we revisit this notoriously challenging ...
SESSION: Full - F6/applications/human-centered multimedia track/user-adapted media access
Ralf Steinmetz
Sketch-based 3D model retrieval using diffusion tensor fields of suggestive contours
Sang Min Yoon, Maximilian Scherer, Tobias Schreck, Arjan Kuijper
Pages: 193-200
doi>10.1145/1873951.1873961
Full text: PDF

The number of available 3D models in various areas increase steadily. Effective methods to search for those 3D models by content, rather than textual annotations, are crucial. For this purpose, we propose a new approach for content based 3D model retrieval ...
Crowdsourced automatic zoom and scroll for video retargeting
Axel Carlier, Vincent Charvillat, Wei Tsang Ooi, Romulus Grigoras, Geraldine Morin
Pages: 201-210
doi>10.1145/1873951.1873962
Full text: PDF

Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining important or interesting regions within the video (called regions of interest, or ROIs) and displaying ...
Personalized photograph ranking and selection system
Che-Hua Yeh, Yuan-Chen Ho, Brian A. Barsky, Ming Ouhyoung
Pages: 211-220
doi>10.1145/1873951.1873963
Full text: PDF

In this paper, we propose a novel personalized ranking system for amateur photographs. Although some of the features used in our system are similar to previous work, new features, such as texture, RGB color, portrait (through face detection), and black-and-white, ...
SESSION: Full - F3/content track/classification of content elements
Alan Smeaton
Affective image classification using features inspired by psychology and art theory
Jana Machajdik, Allan Hanbury
Pages: 83-92
doi>10.1145/1873951.1873965
Full text: PDF

Images can affect people on an emotional level. Since the emotions that arise in the viewer of an image are highly subjective, they are rarely indexed. However there are situations when it would be helpful if images could be retrieved based on their ...
CO3 for ultra-fast and accurate interactive segmentation
Yibiao Zhao, Song-Chun Zhu, Siwei Luo
Pages: 93-102
doi>10.1145/1873951.1873966
Full text: PDF

This paper presents an interactive image segmentation framework which is ultra-fast and accurate. Our framework, termed "CO3", consists of three components: COupled representation, COnditional model and COnvex inference. (i) In representation, we pose ...
A generic framework for event detection in various video domains
Tianzhu Zhang, Changsheng Xu, Guangyu Zhu, Si Liu, Hanqing Lu
Pages: 103-112
doi>10.1145/1873951.1873967
Full text: PDF

Event detection is essential for the extensively studied video analysis and understanding area. Although various approaches have been proposed for event detection, there is a lack of a generic event detection framework that can be applied to various ...
Image segmentation with patch-pair density priors
Xiaobai Liu, Jiashi Feng, Shuicheng Yan, Hai Jin
Pages: 113-122
doi>10.1145/1873951.1873968
Full text: PDF

In this paper, we investigate how an unlabeled image corpus can facilitate the segmentation of any given image. A simple yet efficient multi-task joint sparse representation model is presented to augment the patch-pair similarities by harnessing the ...
SESSION: Full - F4/applications track/applications of geo-tagging
Touradj Ebrahimi
W2Go: a travel guidance system by automatic landmark ranking
Yue Gao, Jinhui Tang, Richang Hong, Qionghai Dai, Tat-Seng Chua, Ramesh Jain
Pages: 123-132
doi>10.1145/1873951.1873970
Full text: PDF

In this paper, we present a travel guidance system W2Go (Where to Go), which can automatically recognize and rank the landmarks for travellers. In this system, a novel Automatic Landmark Ranking (ALR) method is proposed by utilizing the tag and geo-tag ...
Mining people's trips from large scale geo-tagged photos
Yuki Arase, Xing Xie, Takahiro Hara, Shojiro Nishio
Pages: 133-142
doi>10.1145/1873951.1873971
Full text: PDF

Photo sharing is one of the most popular Web services. Photo sharing sites provide functions to add tags and geo-tags to photos to make photo organization easy. Considering that people take photos to record something that attracts them, geo-tagged photos ...
Photo2Trip: generating travel routes from geo-tagged photos for trip planning
Xin Lu, Changhu Wang, Jiang-Ming Yang, Yanwei Pang, Lei Zhang
Pages: 143-152
doi>10.1145/1873951.1873972
Full text: PDF

Travel route planning is an important step for a tourist to prepare his/her trip. As a common scenario, a tourist usually asks the following questions when he/she is planning his/her trip in an unfamiliar place: 1) Are there any travel route suggestions ...
Retrieving landmark and non-landmark images from community photo collections
Yannis Avrithis, Yannis Kalantidis, Giorgos Tolias, Evaggelos Spyrou
Pages: 153-162
doi>10.1145/1873951.1873973
Full text: PDF

State of the art data mining and image retrieval in community photo collections typically focus on popular subsets, e.g. images containing landmarks or associated to Wikipedia articles. We propose an image clustering scheme that, seen as vector quantization ...
SESSION: Full - F5/content track/learning concepts in images
Rita Cucchiara
S3MKL: scalable semi-supervised multiple kernel learning for image data mining
Shuhui Wang, Shuqiang Jiang, Qingming Huang, Qi Tian
Pages: 163-172
doi>10.1145/1873951.1873975
Full text: PDF

For large scale image data mining, a challenging problem is to design a method that could work efficiently under the situation of little ground-truth annotation and a mass of unlabeled or noisy data. As one of the major solutions, semi-supervised learning ...
Discriminative codeword selection for image representation
Lijun Zhang, Chun Chen, Jiajun Bu, Zhengguang Chen, Shulong Tan, Xiaofei He
Pages: 173-182
doi>10.1145/1873951.1873976
Full text: PDF

Bag of features (BoF) representation has attracted an increasing amount of attention in large scale image processing systems. BoF representation treats images as loose collections of local invariant descriptors extracted from them. The visual codebook ...
Supervised reranking for web image search
Linjun Yang, Alan Hanjalic
Pages: 183-192
doi>10.1145/1873951.1873977
Full text: PDF

Visual search reranking that aims to improve the text-based image search with the help from visual content analysis has rapidly grown into a hot research topic. The interestingness of the topic stems mainly from the fact that the search reranking is ...
SESSION: Full - F2/systems track/improving media delivery
Shervin Shirmohammadi
Tenor: making coding practical from servers to smartphones
Hassan Shojania, Baochun Li
Pages: 45-54
doi>10.1145/1873951.1873979
Full text: PDF

It has been theoretically shown that performing coding in networked systems, including Reed-Solomon codes, fountain codes, and random network coding, has a clear advantage with respect to simplifying the design of protocols. These coding techniques can ...
Improving online gaming quality using detour paths
Cong Ly, Cheng-Hsin Hsu, Mohamed Hefeeda
Pages: 55-64
doi>10.1145/1873951.1873980
Full text: PDF

We study the problem of improving the user perceived quality of online games in which multiple players form a game session and exchange game-state updates over an overlay network. We propose an Indirect Relay System (IRS) to forward game-state updates ...
Subjective evaluation of scalable video coding for content distribution
Jong-Seok Lee, Francesca De Simone, Naeem Ramzan, Zhijie Zhao, Engin Kurutepe, Thomas Sikora, Jörn Ostermann, Ebroul Izquierdo, Touradj Ebrahimi
Pages: 65-72
doi>10.1145/1873951.1873981
Full text: PDF

This paper investigates the influence of the combination of the scalability parameters in scalable video coding (SVC) schemes on the subjective visual quality. We aim at providing guidelines for an adaptation strategy of SVC that can select the optimal ...
Self-diagnostic peer-assisted video streaming through a learning framework
Di Niu, Baochun Li, Shuqiao Zhao
Pages: 73-82
doi>10.1145/1873951.1873982
Full text: PDF

Quality control and resource optimization are challenging problems in peer-assisted video streaming systems, due to their large scales and unreliable peer behavior. Such systems are also prone to performance degradation in the event of drastic demand ...
SESSION: Full - F7/applications/content track/multimodal image and video search
Bernard Merialdo
iLike: integrating visual and textual features for vertical search
Yuxin Chen, Nenghai Yu, Bo Luo, Xue-wen Chen
Pages: 221-230
doi>10.1145/1873951.1873984
Full text: PDF

Content-based image search on the Internet is a challenging problem, mostly due to the semantic gap between low-level visual features and high-level content, as well as the excessive computation brought by huge amount of images and high dimensional features. ...
Feature map hashing: sub-linear indexing of appearance and global geometry
Yannis Avrithis, Giorgos Tolias, Yannis Kalantidis
Pages: 231-240
doi>10.1145/1873951.1873985
Full text: PDF

We present a new approach to image indexing and retrieval, which integrates appearance with global image geometry in the indexing process, while enjoying robustness against viewpoint change, photometric variations, occlusion, and background clutter. ...
TalkMiner: a lecture webcast search engine
John Adcock, Matthew Cooper, Laurent Denoue, Hamed Pirsiavash, Lawrence A. Rowe
Pages: 241-250
doi>10.1145/1873951.1873986
Full text: PDF

The design and implementation of a search engine for lecture webcasts is described. A searchable text index is created allowing users to locate material within lecture videos found on a variety of websites such as YouTube and Berkeley webcasts. The index ...
A new approach to cross-modal multimedia retrieval
Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Gert R.G. Lanckriet, Roger Levy, Nuno Vasconcelos
Pages: 251-260
doi>10.1145/1873951.1873987
Full text: PDF

The problem of joint modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned with latent Dirichlet allocation, and images are represented as bags of visual ...
SESSION: Full - F8/applications track/assisted authoring of media content
Mohan Kankanhalli
Color and luminance compensation for mobile panorama construction
Yingen Xiong, Kari Pulli
Pages: 261-270
doi>10.1145/1873951.1873989
Full text: PDF

This paper addresses the problem of color and luminance compensation for sequences of overlapping images where the source images have very different colors and luminance. We apply the method for panoramic image construction on mobile phones. A simple ...
A framework for photo-quality assessment and enhancement based on visual aesthetics
Subhabrata Bhattacharya, Rahul Sukthankar, Mubarak Shah
Pages: 271-280
doi>10.1145/1873951.1873990
Full text: PDF
Other formats: Mp4

We present an interactive application that enables users to improve the visual aesthetics of their digital photographs using spatial recomposition. Unlike earlier work that focuses either on photo quality assessment or interactive tools for photo editing, ...
Automated aesthetic enhancement of videos
Yang Yang Xiang, Mohan S. Kankanhalli
Pages: 281-290
doi>10.1145/1873951.1873991
Full text: PDF

In this paper, we present a content based single-shot video editing scheme. We follow the classic long take directing and editing schemes. This system automatically adjusts the projection velocity of raw video clips to enhance the aesthetic interest. ...
Learning to photograph
Bin Cheng, Bingbing Ni, Shuicheng Yan, Qi Tian
Pages: 291-300
doi>10.1145/1873951.1873992
Full text: PDF

In this paper, we propose an intelligent photography system, which automatically and professionally generates/recommends user-favorite photo(s) from a wide view or a continuous view sequence. This task is quite challenging given that the evaluation of ...
SESSION: Full - F9/human-centered multimedia/systems track/enriched and extended media presentation
Maja Pantic
Ink jet olfactory display enabling instantaneous switches of scents
Sayumi Sugimoto, Daisuke Noguchi, Yuichi Bannai, Kenichi Okada
Pages: 301-310
doi>10.1145/1873951.1873994
Full text: PDF

Trials on transmitting olfactory information together with audio/visual information are currently being conducted in the field of multimedia. However, continuous emission of scents creates problems of olfactory adaptations and scents lingering in the ...
e-Fovea: a multi-resolution approach with steerable focus to large-scale and high-resolution monitoring
Kuan-Wen Chen, Chih-Wei Lin, Mike Y. Chen, Yi-Ping Hung
Pages: 311-320
doi>10.1145/1873951.1873995
Full text: PDF

This paper presents e-Fovea, a system that combines both multi-resolution camera input and multi-resolution steerable projector output to support large-scale and high-resolution visual monitoring. e-Fovea utilizes a design similar to the human eyes, ...
Impact of zooming and enhancing region of interests for optimizing user experience on mobile sports video
Wei Song, Dian W. Tjondronegoro, Shu-Hsien Wang, Michael J. Docherty
Pages: 321-330
doi>10.1145/1873951.1873996
Full text: PDF

In mobile videos, small viewing size and bitrate limitation often cause unpleasant viewing experiences, which is particularly important for fast-moving sports videos. For optimizing the overall user experience of viewing sports videos on mobile phones, ...
Webcams in context: web interfaces to create live 3D environments
Austin D. Abrams, Robert B. Pless
Pages: 331-340
doi>10.1145/1873951.1873997
Full text: PDF

Web services supporting deep integration between video data and geographic information systems (GIS) empower a large user base to build on popular tools such as Google Earth and Google Maps. Here we extend web interfaces designed explicitly for novice ...
SESSION: Full - F10/human-centered multimedia track/improved interactivity
Jie Yang
Toward more efficient user interfaces for mobile video browsing: an in-depth exploration of the design space
Jochen Huber, Jürgen Steimle, Max Mühlhäuser
Pages: 341-350
doi>10.1145/1873951.1873999
Full text: PDF

Increasingly powerful mobile devices enable users to access and watch videos in mobile settings. While some concepts for mobile video browsing have been presented, the field still lacks a general understanding of the design space and of the characteristics ...
Error-resilient perceptual coding for networked haptic interaction
Fernanda Brandi, Julius Kammerl, Eckehard Steinbach
Pages: 351-360
doi>10.1145/1873951.1874000
Full text: PDF

The performance of haptic interaction across communication networks critically depends on the successful reconstruction of the bidirectionally transmitted haptic signals, and hence on the quality of the communication channel. We propose a novel error-resilient ...
FACT: fine-grained cross-media interaction with documents via a portable hybrid paper-laptop interface
Chunyuan Liao, Hao Tang, Qiong Liu, Patrick Chiu, Francine Chen
Pages: 361-370
doi>10.1145/1873951.1874001
Full text: PDF

FACT is an interactive paper system for fine-grained interaction with documents across the boundary between paper and computers. It consists of a small camera-projector unit, a laptop, and ordinary paper documents. With the camera-projector unit pointing ...
An immersive system for browsing and visualizing surveillance video
Philip DeCamp, George Shaw, Rony Kubat, Deb Roy
Pages: 371-380
doi>10.1145/1873951.1874002
Full text: PDF

HouseFly is an interactive data browsing and visualization system that synthesizes audio-visual recordings from multiple sensors, as well as the meta-data derived from those recordings, into a unified viewing experience. The system is being applied to ...
SESSION: Full - F11/applications/content track/novel aids for music retrieval
Gerald Friedland
Combining multi-probe histogram and order-statistics based LSH for scalable audio content retrieval
Yi Yu, Michel Crucianu, Vincent Oria, Ernesto Damiani
Pages: 381-390
doi>10.1145/1873951.1874004
Full text: PDF

In order to improve the reliability and the scalability of content-based retrieval of variant audio tracks from large music databases, we suggest a new multi-stage LSH scheme that consists in (i) extracting compact but accurate representations from audio ...
Music recommendation by unified hypergraph: combining social media information and music content
Jiajun Bu, Shulong Tan, Chun Chen, Can Wang, Hao Wu, Lijun Zhang, Xiaofei He
Pages: 391-400
doi>10.1145/1873951.1874005
Full text: PDF
Other formats: Mp4

Acoustic-based music recommender systems have received increasing interest in recent years. Due to the semantic gap between low level acoustic features and high level music concepts, many researchers have explored collaborative filtering techniques in ...
Large-scale music tag recommendation with explicit multiple attributes
Zhendong Zhao, Xinxi Wang, Qiaoliang Xiang, Andy M. Sarroff, Zhonghua Li, Ye Wang
Pages: 401-410
doi>10.1145/1873951.1874006
Full text: PDF

Social tagging can provide rich semantic information for large-scale retrieval in music discovery. Such collaborative intelligence, however, also generates a high degree of tags unhelpful to discovery, some of which obfuscate critical information. Towards ...
Social audio features for advanced music retrieval interfaces
Michael Kuhn, Roger Wattenhofer, Samuel Welten
Pages: 411-420
doi>10.1145/1873951.1874007
Full text: PDF

The size of personal music collections has constantly increased over the past years. As a result, the traditional metadata based lists to browse these collections have reached their limits. Interfaces that are based on music similarity offer an alternative ...
SESSION: Full - F16/systems track/3D video
Wolfgang Effelsberg
A cognitive approach for effective coding and transmission of 3D video
Simone Milani, Giancarlo Calvagno
Pages: 581-590
doi>10.1145/1873951.1874009
Full text: PDF
Other formats: Mp4

Reliable delivery of 3D video contents to a wide set of users is expected to be the next big revolution in multimedia applications provided that it is possible to grant a certain level of Quality-of-Experience (QoE) to the end user. During the last years, ...
Modeling 3D facial expressions using geometry videos
Jiazhi Xia, Ying He, Dao P.T. Quynh, Xiaoming Chen, Steven C.H. Hoi
Pages: 591-600
doi>10.1145/1873951.1874010
Full text: PDF

The significant advances in developing high-speed shape acquisition devices make it possible to capture the moving and deforming objects at video speeds. However, due to its complicated nature, it is technically challenging to effectively model and store ...
A high-quality low-delay remote rendering system for 3D video
Shu Shi, Mahsa Kamali, Klara Nahrstedt, John C. Hart, Roy H. Campbell
Pages: 601-610
doi>10.1145/1873951.1874011
Full text: PDF

As an emerging technology, 3D video has shown a great potential to become the next generation media for tele-immersion. However, streaming and rendering this dynamic 3D data in real-time requires tremendous network bandwidth and computing resources. ...
SESSION: Full - F12/applications/human-centered multimedia track/narrowing the experience gap
Abed El Saddik
Dynamic captioning: video accessibility enhancement for hearing impairment
Richang Hong, Meng Wang, Mengdi Xu, Shuicheng Yan, Tat-Seng Chua
Pages: 421-430
doi>10.1145/1873951.1874013
Full text: PDF
Other formats: Mp4

There are more than 66 million people suffering from hearing impairment and this disability brings them difficulty in the video content understanding due to the loss of audio information. If scripts are available, captioning technology can help ...
The third eye: mining the visual cognition across multi-language communities
Chunxi Liu, Qingming Huang, Shuqiang Jiang, Changsheng Xu
Pages: 431-440
doi>10.1145/1873951.1874014
Full text: PDF

Existing research work in the multimedia domain mainly focuses on image/video indexing, retrieval, annotation, tagging, re-ranking, etc. However, little work has been contributed to people's visual cognition. In this paper, we propose a novel framework ...
Green multimedia: informing people of their carbon footprint through two simple sensors
Aiden R. Doherty, Zhengwei Qiu, Colum Foley, Hyowon Lee, Cathal Gurrin, Alan F. Smeaton
Pages: 441-450
doi>10.1145/1873951.1874015
Full text: PDF

In this work we discuss a new, but highly relevant, topic to the multimedia community; systems to inform individuals of their carbon footprint, which could ultimately effect change in community carbon footprint-related activities. The reduction of carbon ...
Bridging low-level features and high-level semantics via fMRI brain imaging for video classification
Xintao Hu, Fan Deng, Kaiming Li, Tuo Zhang, Hanbo Chen, Xi Jiang, Jinglei Lv, Dajiang Zhu, Carlos Faraco, Degang Zhang, Arsham Mesbah, Junwei Han, Xiansheng Hua, Li Xie, Stephen Miller, Lei Guo, Tianming Liu
Pages: 451-460
doi>10.1145/1873951.1874016
Full text: PDF

The multimedia content analysis community has made significant effort to bridge the gap between low-level features and high-level semantics perceived by human cognitive systems such as real-world objects and concepts. In the two fields of multimedia ...
SESSION: Full - F14/applications/content track/detection of near-duplicate content
Alan Hanjalic
Building contextual visual vocabulary for large-scale image applications
Shiliang Zhang, Qingming Huang, Gang Hua, Shuqiang Jiang, Wen Gao, Qi Tian
Pages: 501-510
doi>10.1145/1873951.1874018
Full text: PDF

Notwithstanding its great success and wide adoption in Bag-of-visual Words representation, visual vocabulary created from single image local features is often shown to be ineffective largely due to three reasons. First, many detected local features ...
Spatial coding for large scale partial-duplicate web image search
Wengang Zhou, Yijuan Lu, Houqiang Li, Yibing Song, Qi Tian
Pages: 511-520
doi>10.1145/1873951.1874019
Full text: PDF

The state-of-the-art image retrieval approaches represent images with a high dimensional vector of visual words by quantizing local features, such as SIFT, in the descriptor space. The geometric clues among visual words in an image is usually ignored ...
expand
Monitoring near duplicates over video streams
Xiangmin Zhou, Lei Chen
Pages: 521-530
doi>10.1145/1873951.1874020
Full text: PDF

Since near duplicates are ubiquitous over different data sources, increasing research effort has recently been devoted to near duplicate detection. Among all the near duplicate detection tasks, an important one is continuous near duplicate monitoring over ...
Real-time large scale near-duplicate web video retrieval
Lifeng Shang, Linjun Yang, Fei Wang, Kwok-Ping Chan, Xian-Sheng Hua
Pages: 531-540
doi>10.1145/1873951.1874021
Full text: PDF

Near-duplicate video retrieval is becoming more and more important with the exponential growth of the Web. Though various approaches have been proposed to address this problem, they mainly focus on retrieval accuracy while infeasible to query ...
SESSION: Full - F15/applications/human-centered multimedia track/automatic generation of media content
Mohamed M. Hefeeda
Automatic mashup generation from multiple-camera concert recordings
Prarthana Shrestha, Peter H.N. de With, Hans Weda, Mauro Barbieri, Emile H.L. Aarts
Pages: 541-550
doi>10.1145/1873951.1874023
Full text: PDF

A large number of videos are captured and shared by the audience from musical concerts. However, such recordings are typically perceived as boring mainly because of their limited view, poor visual quality and incomplete coverage. It is our objective ...
Toward an automatically generated soundtrack from low-level cross-modal correlations for automotive scenarios
Marco Cristani, Anna Pesarin, Carlo Drioli, Vittorio Murino, Antonio Rodà, Michele Grapulin, Nicu Sebe
Pages: 551-560
doi>10.1145/1873951.1874024
Full text: PDF

In this paper, we propose a novel recommendation policy for driving scenarios. While driving a car, listening to an audio track may enrich the atmosphere, conveying emotions that let the driver sense a more arousing experience. Here, we are introducing ...
Supporting personal photo storytelling for social albums
Pere Obrador, Rodrigo de Oliveira, Nuria Oliver
Pages: 561-570
doi>10.1145/1873951.1874025
Full text: PDF

Information overload is one of today's major concerns. As high-resolution digital cameras become increasingly pervasive, unprecedented amounts of social media are being uploaded to online social networks on a daily basis. In order to support users on ...
Multimedia content creation using societal-scale ubiquitous camera networks and human-centric wearable sensing
Mathew Laibowitz, Nan-wei Gong, Joseph A. Paradiso
Pages: 571-580
doi>10.1145/1873951.1874026
Full text: PDF

We present a novel approach to the creation of user-generated, documentary video using a distributed network of sensor-enabled video cameras and wearable on-body sensor devices. The wearable sensors are used to identify the subjects in view of the camera ...
SESSION: Full - F13/applications/content/human-centered multimedia track/processing of social media
Yong Rui
Image tag refinement towards low-rank, content-tag prior and error sparsity
Guangyu Zhu, Shuicheng Yan, Yi Ma
Pages: 461-470
doi>10.1145/1873951.1874028
Full text: PDF

The vast user-provided image tags on the popular photo sharing websites may greatly facilitate image retrieval and management. However, these tags are often imprecise and/or incomplete, resulting in unsatisfactory performance in tag related applications. ...
Quantifying tag representativeness of visual content of social images
Aixin Sun, Sourav S. Bhowmick
Pages: 471-480
doi>10.1145/1873951.1874029
Full text: PDF

Social tags describe images from many aspects including the visual content observable from the images, the context and usage of images, user opinions and others. Not all tags are therefore useful for image search and are appropriate for tag recommendation ...
Social pixels: genesis and evaluation
Vivek K. Singh, Mingyan Gao, Ramesh Jain
Pages: 481-490
doi>10.1145/1873951.1874030
Full text: PDF

Huge amounts of social multimedia are being created daily by a combination of globally distributed disparate sensors, including human-sensors (e.g. tweets) and video cameras. Taken together, this represents information about multiple aspects of the evolving ...
Image retagging
Dong Liu, Xian-Sheng Hua, Meng Wang, Hong-Jiang Zhang
Pages: 491-500
doi>10.1145/1873951.1874031
Full text: PDF

Online social media repositories such as Flickr and Zooomr allow users to manually annotate their images with freely-chosen tags, which are then used as indexing keywords to facilitate image search and other applications. However, these tags are frequently ...
SESSION: Short - S1/applications/human-centered multimedia track
Marcel Worring
Movie2Comics: a feast of multimedia artwork
Richang Hong, Xiao-Tong Yuan, Mengdi Xu, Meng Wang, Shuicheng Yan, Tat-Seng Chua
Pages: 611-614
doi>10.1145/1873951.1874033
Full text: PDF

As a type of artwork, comics are prevalent and popular around the world. However, although several assistive software tools are available, the creation of comics is still a tedious and labor-intensive process. This paper proposes a scheme that ...
NudgeCam: toward targeted, higher quality media capture
Scott Carter, John Adcock, John Doherty, Stacy Branham
Pages: 615-618
doi>10.1145/1873951.1874034
Full text: PDF

NudgeCam is a mobile application that can help users capture more relevant, higher quality media. To guide users to capture media more relevant to a particular project, third-party template creators can show users media that demonstrates relevant content ...
Tagging tags
Kuiyuan Yang, Xian-Sheng Hua, Meng Wang, Hong-Jiang Zhang
Pages: 619-622
doi>10.1145/1873951.1874035
Full text: PDF

Social image sharing websites like Flickr have successfully motivated users around the world to annotate images with tags, which greatly facilitate search and organization of social image content. However, these manually-input tags are far from a comprehensive ...
i-m-Space: interactive multimedia-enhanced space for rehabilitation of breast cancer patients
Ju-Chun Ko, Wei-Han Chen, Meng-Chieh Yu, Han-Hung Lin, Jin-Yao Lin, Szu-Wei Wu, Yi-Yu Chung, I-Ling Hu, Wei-Ting Peng, Shih-Yao Lin, Chia Han Chang, Pei-Hsuan Chou, King-Jen Chang, Mei-Lan Chang, Sue-huei Chen, Jin-Shing Chen, Ming-Sui Lee, Mike Y. Chen, Yi-Ping Hung
Pages: 623-626
doi>10.1145/1873951.1874036
Full text: PDF

This paper presents i-m-Space, an interactive multimedia rehabilitation space that helps the post-surgery recovery of breast cancer patients. Our goal is to improve patients' physical therapy and psychological relaxation experience through careful applications ...
A music search engine for therapeutic gait training
Zhonghua Li, Qiaoliang Xiang, Jason Hockman, Jianqing Yang, Yu Yi, Ichiro Fujinaga, Ye Wang
Pages: 627-630
doi>10.1145/1873951.1874037
Full text: PDF

A music retrieval system is introduced that incorporates tempo, cultural, and beat strength features to help music therapists provide appropriate music for gait training for Parkinson's patients. Unlike current methods available to music therapists (e.g., ...
Beyond GPS: determining the camera viewing direction of a geotagged image
Minwoo Park, Jiebo Luo, Robert T. Collins, Yanxi Liu
Pages: 631-634
doi>10.1145/1873951.1874038
Full text: PDF

Increasingly, geographic information is being associated with personal photos. Recent research results have shown that the additional global positioning system (GPS) information helps visual recognition for geotagged photos by providing location context. ...
Real-world trajectory extraction for attack pattern analysis in soccer video
Zhenxing Niu, Qi Tian, Xinbo Gao
Pages: 635-638
doi>10.1145/1873951.1874039
Full text: PDF

Most existing approaches to tactic analysis of soccer video are based on mosaic trajectory analysis, which loses much semantic information compared to the real-world trajectory. Without effective extraction of real-world trajectory, the tactic of soccer ...
Tag transformer
Yicheng Song, Juan Cao, Zhineng Chen, Yongdong Zhang, Jintao Li
Pages: 639-642
doi>10.1145/1873951.1874040
Full text: PDF

Human annotations (titles and tags) of web videos facilitate most web video applications. However, the raw tags are noisy, sparse and structureless, which limits their effectiveness. In this paper, we propose a tag transformer schema to solve these ...
Gaze awareness and interaction support in presentations
Kar-Han Tan, Dan Gelb, Ramin Samadani, Ian Robinson, Bruce Culbertson, John Apostolopoulos
Pages: 643-646
doi>10.1145/1873951.1874041
Full text: PDF

Modern digital presentation systems use rich media to bring highly sophisticated information visualization and highly effective storytelling capabilities to classrooms and corporate boardrooms. In this paper we address a number of issues that arise when ...
Digesting omni-video along routes for navigation
Hongyuan Cai, Jiang Yu Zheng
Pages: 647-650
doi>10.1145/1873951.1874042
Full text: PDF

Omni-directional video records complete visual information along a route. Though replaying an omni-video presents reality, it requires a significant amount of memory and communication bandwidth. This work extracts distinct views from an omni-video to form ...
Building book inventories using smartphones
David M. Chen, Sam S. Tsai, Bernd Girod, Cheng-Hsin Hsu, Kyu-Han Kim, Jatinder Pal Singh
Pages: 651-654
doi>10.1145/1873951.1874043
Full text: PDF

Manual generation of a book inventory is time-consuming and tedious, while deployment of barcode and radio-frequency identification (RFID) management systems is costly and affordable only to large institutions. In this paper, we design and implement ...
Templated recursive image composition
C. Brian Atkins, Nicholas P. Lyons, Xuemei Zhang, Daniel R. Tretter
Pages: 655-658
doi>10.1145/1873951.1874044
Full text: PDF

With the proliferation of image acquisition and consumption, there is an increasing need for solutions that help ordinary people create high quality image composites. In most solutions today, image layouts are provided as fixed templates, which offer ...
Putting the pieces together: multimodal analysis of social attention in meetings
Ramanathan Subramanian, Jacopo Staiano, Kyriaki Kalimeri, Nicu Sebe, Fabio Pianesi
Pages: 659-662
doi>10.1145/1873951.1874045
Full text: PDF

This paper presents a multimodal framework employing eye-gaze, head-pose and speech cues to explain observed social attention patterns in meeting scenes. We first investigate a few hypotheses concerning social attention and characterize meetings and ...
AIR conferencing: accelerated instant replay for in-meeting multimodal review
Kori Inkpen, Rajesh Hegde, Sasa Junuzovic, Christopher Brooks, John C. Tang, Zhengyou Zhang
Pages: 663-666
doi>10.1145/1873951.1874046
Full text: PDF

When people attend meetings they may miss parts of the discussion if they, for example, step out to take a phone call, go to the bathroom, or have a momentary lapse in concentration. As a result, they may need to catch up on what they missed upon returning ...
Making computers look the way we look: exploiting visual attention for image understanding
Harish Katti, Ramanathan Subramanian, Mohan Kankanhalli, Nicu Sebe, Tat-Seng Chua, Kalpathi R. Ramakrishnan
Pages: 667-670
doi>10.1145/1873951.1874047
Full text: PDF

Human Visual attention (HVA) is an important strategy to focus on specific information while observing and understanding visual stimuli. HVA involves making a series of fixations on select locations while performing tasks such as object recognition, ...
MOGCLASS: a collaborative system of mobile devices for classroom music education
Yinsheng Zhou, Graham Percival, Xinxi Wang, Ye Wang, Shengdong Zhao
Pages: 671-674
doi>10.1145/1873951.1874048
Full text: PDF

We introduce MOGCLASS: a system of networked mobile devices to amplify and extend children's capabilities to perceive, perform and produce music collaboratively in a classroom context. MOGCLASS includes various features for students to enhance their motivation, ...
Adaptive combination of tag and link-based user similarity in flickr
Nhat Hai Phan, Van Duc Thong Hoang, Hyoseop Shin
Pages: 675-678
doi>10.1145/1873951.1874049
Full text: PDF

Finding similar users is one of the probable applications in social media. The similarity between users can be measured using two different approaches: the semantic similarity and the similarity in terms of social relations. These two approaches can be ...
Multi-display map touring with tangible widget
Marco Piovesana, Ying-Jui Chen, Neng-Hao Yu, Hsiang-Tao Wu, Li-Wei Chan, Yi-Ping Hung
Pages: 679-682
doi>10.1145/1873951.1874050
Full text: PDF

Many map systems are created to help the user find a place or define a route to follow. Google Map extends the concept of "surfing the map" by adding a street view that allows the user to explore a place from real pictures, creating the same feeling ...
"Stray": a new multimedia music composition using the andantephone
Ryan Janzen, Steve Mann
Pages: 683-686
doi>10.1145/1873951.1874051
Full text: PDF

The andantephone is an instrument that allows a performer to physically step through a piece of music by walking. Each note or chord of the piece is assigned to one footstep, so expressively varying velocity varies the tempo in turn. A new, more flexible ...
A user study of visual versus sonically-enhanced interfaces for use while walking
Yaohua Yu, Zhengjie Liu
Pages: 687-690
doi>10.1145/1873951.1874052
Full text: PDF

This paper presents a user study on interaction with a mobile device. We investigate the use of non-speech sound in mobile interfaces and design a sonically-enhanced interface. The sonically-enhanced interface is compared to a visual interface when users ...
Fast image rearrangement via multi-scale patch copying
Jiayao Hu, Shifeng Chen, Jianzhuang Liu, Xiaoou Tang
Pages: 691-694
doi>10.1145/1873951.1874053
Full text: PDF

In this paper, we propose a simple interactive way for a novel type of image synthesis called image rearrangement whose goal is to construct a new image based on some objects cropped from source images. The synthesis results are obtained by copying patches ...
Learning parts-based representation for face transition
Xiong Li, Liwei Wang, Huanxi Liu, Yuncai Liu
Pages: 695-698
doi>10.1145/1873951.1874054
Full text: PDF

This paper proposes to learn parts-based face representation from real face samples and then applies it to face transition. It differs from previous works in two aspects. First, we learn flexible face decomposition from real faces unsupervisedly instead ...
Gesture and touch controlled video player interface for mobile devices
Shelley Buchinger, Ewald Hotop, Helmut Hlavacs, Francesca De Simone, Touradj Ebrahimi
Pages: 699-702
doi>10.1145/1873951.1874055
Full text: PDF

Today, mobile communication devices allow users to access a wide variety of multimedia contents and services. In order to improve user experience and device usability, the design of interfaces and interaction techniques for mobile devices has focused ...
Eyes do not lie: spontaneous versus posed smiles
Hamdi Dibeklioglu, Roberto Valenti, Albert Ali Salah, Theo Gevers
Pages: 703-706
doi>10.1145/1873951.1874056
Full text: PDF

Automatic detection of spontaneous versus posed facial expressions has received a lot of attention in recent years. However, almost all published work in this area uses complex facial features or multiple modalities, such as head pose and body movements with ...
SESSION: Short - S2/content/systems track
Zhengyou Zhang
Integrating web 2.0 resources by wikipedia
Chen Liu, Bing Cui, Anthony K.H. Tung
Pages: 707-710
doi>10.1145/1873951.1874058
Full text: PDF

The concept of Web 2.0 has become prevalent and popular in the past few years. People are able to share and manage their own resources in Web 2.0 systems. The abundance of Web 2.0 resources in various media formats calls for better resource integration, ...
Vicept: link visual features to concepts for large-scale image understanding
Zhipeng Wu, Shuqiang Jiang, Liang Li, Peng Cui, Qingming Huang, Wen Gao
Pages: 711-714
doi>10.1145/1873951.1874059
Full text: PDF

On noticing the paradox of visual polysemia and concept polymorphism, this paper proposes a new perspective called "Vicept" to associate elementary visual features and cognitive concepts. Firstly, a carefully prepared large image dataset and associate ...
Analyzing and predicting sentiment of images on the social web
Stefan Siersdorfer, Enrico Minack, Fan Deng, Jonathon Hare
Pages: 715-718
doi>10.1145/1873951.1874060
Full text: PDF

In this paper we study the connection between sentiment of images expressed in metadata and their visual content in the social photo sharing environment Flickr. To this end, we consider the bag-of-visual words representation as well as the color distribution ...
Landmark image classification using 3D point clouds
Xian Xiao, Changsheng Xu, Jinqiao Wang
Pages: 719-722
doi>10.1145/1873951.1874061
Full text: PDF

Most of the existing approaches for landmark image classification utilize either holistic features or interest points in the whole image to train the classification model, which may lead to unsatisfactory results due to involvement of much information ...
Portfolio theory of multimedia fusion
Xiangyu Wang, Mohan Kankanhalli
Pages: 723-726
doi>10.1145/1873951.1874062
Full text: PDF

The number of multimedia applications has been increasing over the past two decades. Multimedia information fusion has therefore attracted significant attention with many techniques having been proposed. However, the uncertainty and correlation among ...
Exploiting noisy visual concept detection to improve spoken content based video retrieval
Stevan Rudinac, Martha Larson, Alan Hanjalic
Pages: 727-730
doi>10.1145/1873951.1874063
Full text: PDF

In this paper, we present a technique for unsupervised construction of concept vectors, concept-based representations of complete video units, from the noisy shot-level output of a set of visual concept detectors. We deploy these vectors to improve spoken-content-based ...
End-to-end stochastic scheduling of scalable video over time-varying channels
Nesrine Changuel, Nicholas Mastronarde, Mihaela Van der Schaar, Bessem Sayadi, Michel Kieffer
Pages: 731-734
doi>10.1145/1873951.1874064
Full text: PDF

This paper addresses the problem of video on demand delivery over a time-varying wireless channel. Packet scheduling and buffer management are jointly considered for scalable video transmission to adapt to the changing channel conditions. A proxy-based ...
Context dependent SVMs for interconnected image network annotation
Hichem Sahbi, Xi Li
Pages: 735-738
doi>10.1145/1873951.1874065
Full text: PDF

The exponential growth of interconnected networks, such as Flickr, currently makes them the standard way to share and explore data where users put contents and refer to others. These interconnections create valuable information in order to enhance the ...
A novel video hash algorithm
Li Weng, Bart Preneel
Pages: 739-742
doi>10.1145/1873951.1874066
Full text: PDF

Perceptual hashing is an emerging solution for identification and authentication of multimedia content. In this work, a video hash algorithm is proposed. This algorithm computes a 180-bit hash value for videos of arbitrary lengths. The hash value can ...
Age classification for pose variant and occluded faces
Wei-Ta Chu, Wen-Long Liu, Jen-Yu Yu
Pages: 743-746
doi>10.1145/1873951.1874067
Full text: PDF

We extend the object class invariant (OCI) model to age classification, for pose variant and occluded faces. With the OCI model, we first localize faces from images captured in arbitrary views, and then determine the most distinctive features. Relationships ...
Movie genre classification via scene categorization
Howard Zhou, Tucker Hermans, Asmita V. Karandikar, James M. Rehg
Pages: 747-750
doi>10.1145/1873951.1874068
Full text: PDF

This paper presents a method for movie genre categorization of movie trailers, based on scene categorization. We view our approach as a step forward from using only low-level visual feature cues, towards the eventual goal of high-level semantic understanding ...
Unsupervised summarization of rushes videos
Yang Liu, Feng Zhou, Wei Liu, Fernando De la Torre, Yan Liu
Pages: 751-754
doi>10.1145/1873951.1874069
Full text: PDF

This paper proposes a new framework to formulate summarization of rushes video as an unsupervised learning problem. We pose the problem of video summarization as one of time-series clustering, and propose Constrained Aligned Cluster Analysis (CACA). ...
Negotiating multimedia advertising with attention owners
Yue Zhang, Nadeem Jamali
Pages: 755-758
doi>10.1145/1873951.1874070
Full text: PDF

Advertising is increasingly an integral part of multimedia delivery over the Internet. Traditionally, brokers -- intermediaries between content providers, advertisers, and viewers -- have determined the fine balance between the content desired by viewers ...
ReDi: an interactive virtual display system for ubiquitous devices
Wen Sun, Yan Lu, Shipeng Li
Pages: 759-762
doi>10.1145/1873951.1874071
Full text: PDF

In this paper, we present an interactive virtual display system to facilitate the ubiquitous user interaction with heterogeneous devices. By using small-size programmable hardware and wearable sensors, any display device (referred to as display surface) ...
A proxy-based mobile web browser
Huifeng Shen, Zhaotai Pan, Haicheng Sun, Yan Lu, Shipeng Li
Pages: 763-766
doi>10.1145/1873951.1874072
Full text: PDF

In this paper, we present a proxy-based mobile web browser with rich experiences. We use the server-side web parsing and rendering to leverage the browser computing logic. We use a composite screen format to represent the display of the web content, ...
Optimal collusion attack for digital fingerprinting
Hui Feng, Hefei Ling, Fuhao Zou, Weiqi Yan, Zhengding Lu
Pages: 767-770
doi>10.1145/1873951.1874073
Full text: PDF

The collusion attack is a cost-efficient attack against digital fingerprinting where classes of users combine their fingerprinted content for the purpose of attenuating or removing the fingerprints. A recently introduced gradient attack which appeared ...
Novel framework for single/multi-frame super-resolution using sequential Monte Carlo method
Toshie Misu, Yasutaka Matsuo, Shinichi Sakaida, Yoshiaki Shishikui
Pages: 771-774
doi>10.1145/1873951.1874074
Full text: PDF

We propose a novel super-resolution (SR) framework based on a sequential Monte Carlo (SMC) method, which is capable of robust optimization, for solving the inverse problem of degradation processes of imagery and sampling. The SR image is estimated from ...
Similarity content search in content centric networks
Petros Daras, Theodoros Semertzidis, Lambros Makris, Michael G. Strintzis
Pages: 775-778
doi>10.1145/1873951.1874075
Full text: PDF

Content searching and downloading are the two dominant actions of the Internet users today, despite the fact that the Internet was not originally architected to serve such actions. Content Centric Networking is the new trend in the research community ...
Accelerated IPTV channel change with transcoded unicast bursting
Zhi Li, Ali C. Begen, Xiaoqing Zhu, Bernd Girod
Pages: 779-782
doi>10.1145/1873951.1874076
Full text: PDF

We study video transcoding for accelerated channel changes in IPTV systems. Video transcoding at the Retransmission Server not only reduces the channel change latency, but also reduces the duration and data size of the unicast burst stream used for rapid ...
QoE-based rate adaptation scheme selection for resource-constrained wireless video transmission
Srisakul Thakolsri, Wolfgang Kellerer, Eckehard Steinbach
Pages: 783-786
doi>10.1145/1873951.1874077
Full text: PDF

This paper proposes a Quality of Experience (QoE) based rate adaptation scheme selection approach for multi-user wireless video delivery. Transcoding and packet dropping are used as examples of rate adaptation schemes, and we investigate their impact ...
Precise indoor localization using smart phones
Eladio Martin, Oriol Vinyals, Gerald Friedland, Ruzena Bajcsy
Pages: 787-790
doi>10.1145/1873951.1874078
Full text: PDF

We present an indoor localization application leveraging the sensing capabilities of current state of the art smart phones. To the best of our knowledge, our application is the first one to be implemented in smart phones and integrating both offline ...
Trading bandwidth for playback lag: can active peers help?
Dongbo Huang, Jin Zhao, Xin Wang
Pages: 791-794
doi>10.1145/1873951.1874079
Full text: PDF

P2P live streaming systems suffer a lot from long playback lag in lag-sensitive scenarios. In this paper, we propose a new approach to reducing the playback lag in P2P live streaming systems. According to measurement studies, there exist a certain amount ...
3D video transcoding for virtual views
Shujie Liu, Chang Wen Chen
Pages: 795-798
doi>10.1145/1873951.1874080
Full text: PDF

Recent emerging development of three dimensional video (3DV) has been vigorously driving the Multiview Video Coding (MVC) standard developed by Joint Video Team as an amendment to H.264/AVC and the new 3DV standard developed by MPEG. It is expected that ...
Pull-patching: a combination of multicast and adaptive segmented HTTP streaming
Espen Jacobsen, Carsten Griwodz, Pål Halvorsen
Pages: 799-802
doi>10.1145/1873951.1874081
Full text: PDF

Multicast delivery for video streaming gains credibility with the introduction of commercial IPTV. We therefore revisit patching, a video-on-demand idea from the 1990s. We have built Pull-Patching, an approach that combines the patching ideas ...
SESSION: Short - S3/applications/content track
Max Muehlhaeuser
K-way min-max cut for image clustering and junk images filtering from Google images
Feng Xie, Yi Shen, Xiaofei He
Pages: 803-806
doi>10.1145/1873951.1874083
Full text: PDF

Currently most existing image search engines such as Google Images index web images mainly using text keywords extracted from the context, which may return a large amount of junk information. We propose a novel clustering based filtering method to filter ...
Smart video systems in police cars
Amirali Jazayeri, Hongyuan Cai, Mihran Tuceryan, Jiang Yu Zheng
Pages: 807-810
doi>10.1145/1873951.1874084
Full text: PDF

The use of video cameras in police cars has been found to have significant value and the number of such installed systems has been increasing. In addition to recording the events in routine traffic stops for later use in legal settings, in-car video ...
Interactive inquiry for object of interest in video playback by motion-augmented graph cut
Po-Nung Tseng, Yen-Liang Lin, Winston H. Hsu
Pages: 811-814
doi>10.1145/1873951.1874085
Full text: PDF

Touch-based displays (devices) have entailed rich interactions between videos and users. Users are often interested in the objects appearing in videos and want to know related information about them. In this paper, we propose a video playback system ...
GPS, compass, or camera?: investigating effective mobile sensors for automatic search-based image annotation
An-Jung Cheng, Fang-Erh Lin, Yin-Hsi Kuo, Winston H. Hsu
Pages: 815-818
doi>10.1145/1873951.1874086
Full text: PDF

Recently, smart phones are being equipped with more and more types of sensors, which bring different aspects into consideration. When a user takes a photo, the information it provides like the image content, the location and even the direction the ...
TwitterSigns: microblogging on the walls
Markus Buzeck, Jörg Müller
Pages: 819-822
doi>10.1145/1873951.1874087
Full text: PDF

In this paper we present TwitterSigns, an approach to display microblogs on public displays. Two different kinds of microblog entries (tweets) are selected for display: Tweets that were posted in the immediate environment of the display, and tweets that ...
Multi-exposure imaging on mobile devices
Natasha Gelfand, Andrew Adams, Sung Hee Park, Kari Pulli
Pages: 823-826
doi>10.1145/1873951.1874088
Full text: PDF

Many natural scenes have a dynamic range that is larger than the dynamic range of a camera's image sensor. A popular approach to producing an image without under- and over-exposed areas is to capture several input images with varying exposure settings, ...
Towards aesthetics: a photo quality assessment and photo selection system
Congcong Li, Alexander C. Loui, Tsuhan Chen
Pages: 827-830
doi>10.1145/1873951.1874089
Full text: PDF

Automatic photo quality assessment and selection systems are helpful for managing the large amount of consumer photos. In this paper, we present such a system based on evaluating the aesthetic quality of consumer photos. The proposed system focuses on ...
Cast2Face: character identification in movie with actor-character correspondence
Mengdi Xu, Xiaotong Yuan, Jialie Shen, Shuicheng Yan
Pages: 831-834
doi>10.1145/1873951.1874090
Full text: PDF

We investigate the problem of automatically identifying characters in a movie with the supervision of actor-character name correspondence provided by the movie cast. Our proposed framework, namely Cast2Face, is featured by: (i) we restrict the names ...
Visual security evaluation for video encryption
Lingling Tong, Feng Dai, Yongdong Zhang, Jintao Li
Pages: 835-838
doi>10.1145/1873951.1874091
Full text: PDF

Video encryption plays an important role in data security guarantee, which is increasingly important with the development of multimedia technology. A great deal of effort has been made in recent years to develop video encryption methods. However, few ...
Automatic trailer generation
Go Irie, Takashi Satou, Akira Kojima, Toshihiko Yamasaki, Kiyoharu Aizawa
Pages: 839-842
doi>10.1145/1873951.1874092
Full text: PDF

This paper presents a content-based movie trailer generation method, named Vid2Trailer (V2T). Since trailers are intended to advertise movies, they must show specific symbols such as the title logo and the main theme music. Moreover, it is expected to ...
Extracting captions from videos using temporal feature
Xiaoqian Liu, Weiqiang Wang
Pages: 843-846
doi>10.1145/1873951.1874093
Full text: PDF

Captions in videos provide much useful semantic information for indexing and retrieving video contents. In this paper, we present an effective approach to extracting captions from videos. Its novelty comes from exploiting the temporal information in ...
Automatic role recognition based on conversational and prosodic behaviour
Hugues Salamin, Alessandro Vinciarelli, Khiet Truong, Gelareh Mohammadi
Pages: 847-850
doi>10.1145/1873951.1874094
Full text: PDF

This paper proposes an approach for the automatic recognition of roles in settings like news and talk-shows, where roles correspond to specific functions like Anchorman, Guest or Interview Participant. The approach is based on purely nonverbal vocal ...
VERT: automatic evaluation of video summaries
Yingbo Li, Bernard Merialdo
Pages: 851-854
doi>10.1145/1873951.1874095
Full text: PDF

Video Summarization has become an important tool for multimedia information processing, but the automatic evaluation of a video summarization system remains a challenge. A major issue is that an ideal "best" summary does not exist, although people can ...
Character-based movie summarization
Jitao Sang, Changsheng Xu
Pages: 855-858
doi>10.1145/1873951.1874096
Full text: PDF

A decent movie summary is helpful for movie producers to promote the movie as well as for audiences to capture the theme of the movie before watching the whole movie. Most existing automatic movie summarization approaches heavily rely on video content only, ...
Supervised manifold learning for image and video classification
Yang Liu, Yan Liu, Keith C.C. Chan
Pages: 859-862
doi>10.1145/1873951.1874097
Full text: PDF

This paper presents a supervised manifold learning model for dimensionality reduction in image and video classification tasks. Unlike most manifold learning models that emphasize the distance preserving, we propose a novel algorithm called maximum distance ...
Unsupervised object category discovery via information bottleneck method
Zhengzheng Lou, Yangdong Ye, Dong Liu
Pages: 863-866
doi>10.1145/1873951.1874098
Full text: PDF

We present a novel approach to automatically discover object categories from a collection of unlabeled images. This is achieved by the Information Bottleneck method, which finds the optimal partitioning of the image collection by maximally preserving ...
Probabilistic visual concept trees
Lexing Xie, Rong Yan, Jelena Tešić, Apostol Natsev, John R. Smith
Pages: 867-870
doi>10.1145/1873951.1874099
Full text: PDF

This paper presents probabilistic visual concept trees, a model for large visual semantic taxonomy structures and its use in visual concept detection. Organizing visual semantic knowledge systematically is one of the key challenges towards large-scale ...
A conditional random field viewpoint of symbolic audio-to-score matching
Cyril Joder, Slim Essid, Gaël Richard
Pages: 871-874
doi>10.1145/1873951.1874100
Full text: PDF

We present a new approach of symbolic audio-to-score alignment, with the use of Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow the calculation of state conditional probabilities to be made on the basis of ...
Bilingual query translation and expansion for supporting more effective cross-language image retrieval
Yuejie Zhang, Lei Cen, Cheng Jin, Xiangyang Xue, Ning Zhou
Pages: 875-878
doi>10.1145/1873951.1874101
Full text: PDF

To support more effective Cross-Language Image Retrieval (ImageCLIR), a novel algorithm is developed by integrating a bilingual semantic network to achieve more precise bilingual query translation and expansion. An English-Chinese bilingual parallel ...
The idiap wolf corpus: exploring group behaviour in a competitive role-playing game
Hayley Hung, Gokul Chittaranjan
Pages: 879-882
doi>10.1145/1873951.1874102
Full text: PDF

In this paper we present the Idiap Wolf Database. This is an audio-visual corpus containing natural conversational data of volunteers who took part in a competitive role-playing game. Four groups of 8-12 people were recorded. In total, just over 7 hours ...
Face hallucination with shape parameters projection constraint
Chengdong Lan, Ruimin Hu, Kebin Huang, Zhen Han
Pages: 883-886
doi>10.1145/1873951.1874103
Full text: PDF

In real surveillance scenarios, a variety of factors have an impact on the quality of images, which leads to pixel distortion and aliasing. Traditional face super-resolution algorithms only use the difference of image pixel values as similarity criterion, ...
Automatic detection of malicious sound using segmental two-dimensional mel-frequency cepstral coefficients and histograms of oriented gradients
Myung Jong Kim, Younggwan Kim, JaeDeok Lim, Hoirin Kim
Pages: 887-890
doi>10.1145/1873951.1874104
Full text: PDF

This paper addresses the problem of recognizing malicious sounds, such as sexual scream or moan, to detect and block the objectionable multimedia contents. The malicious sounds show the distinct characteristics that have large temporal variations and ...
Automatic interesting object extraction from images using complementary saliency maps
Haonan Yu, Jia Li, Yonghong Tian, Tiejun Huang
Pages: 891-894
doi>10.1145/1873951.1874105
Full text: PDF

Automatic interesting object extraction is widely used in many image applications. Among various extraction approaches, saliency-based ones usually have a better performance since they well accord with human visual perception. However, nearly all existing ...
Interactive retrieval of targets for wide area surveillance
Saad Ali, Omar Javed, Neils Haering, Takeo Kanade
Pages: 895-898
doi>10.1145/1873951.1874106
Full text: PDF

We address the problem of interactive search for a target of interest in surveillance imagery. Our solution consists of iteratively learning a distance metric for retrieval, based on user feedback. The approach employs (retrieval) rank based constraints ...
SESSION: Short - S4/applications/content track
Lexing Xie
Restoration of out-of-focus lecture video by automatic slide matching
Ngai-Man Cheung, David Chen, Vijay Chandrasekhar, Sam S. Tsai, Gabriel Takacs, Sherif A. Halawa, Bernd Girod
Pages: 899-902
doi>10.1145/1873951.1874108
Full text: PDF

Restoring the fine detail in the slide area of a defocused lecture video is a challenging task. In this work, we propose to use clean images of slides available along with the defocused lecture video to help the restoration. Our proposed method uses ...
Increasing interactivity in street view web navigation systems
Alexandre Devaux, Nicolas Paparoditis
Pages: 903-906
doi>10.1145/1873951.1874109
Full text: PDF

This paper presents some interactive features we have added to our street-view web navigation application. Our system allows users to navigate through a huge amount of data (panoramas and laser clouds) and also to interact with it. We will detail 4 aspects ...
Improving face clustering using social context
Peng Wu, Feng Tang
Pages: 907-910
doi>10.1145/1873951.1874110
Full text: PDF

In this paper we describe an algorithm to improve the performance of face clustering using the social relationship of people. One common challenge in face clustering techniques is that very often the faces of the same person are clustered into different ...
Region categorization with mobile applications
Jiang Gao
Pages: 911-914
doi>10.1145/1873951.1874111
Full text: PDF

We explore how to optimally categorize regions for faster and more reliable image matching and registration. We propose using the entropy of histogram of oriented gradients (HOG) features to characterize image regions, and propose a region-sensitive feature ...
Topic discovery of web video using star-structured K-partite graph
Jian Shao, Wentao Yin, Shuai Ma, Yueting Zhuang
Pages: 915-918
doi>10.1145/1873951.1874112
Full text: PDF

With the explosive growth of web videos on video-sharing sites like YouTube, the discovery of video topics has become a hot research area. In order to utilize all kinds of characteristics in web video such as visual features (SIFT, shape or color) and contextual ...
Video retargeting for aesthetic enhancement
Yang-Yang Xiang, Mohan S. Kankanhalli
Pages: 919-922
doi>10.1145/1873951.1874113
Full text: PDF

In this paper, we present a post-editing scheme for camera-work. It is based on video retargeting, but aims to enhance the aesthetic interest of home produced video sequences. The essential part of video clips are emphasized by automatically zooming ...
FireVolleyball: multi-player interactive game providing a sense of touching fire
Sei Ikeda, Yuki Uranishi, Yoshitsugu Manabe, Kunihiro Chihara
Pages: 923-926
doi>10.1145/1873951.1874114
Full text: PDF

This paper describes a novel game system which provides multiple players with a sense of touching fire with their own hands. Players in this game are divided into two teams in front of a wall-type flat display and try to score points by grounding a fireball ...
Memory matrix: a novel user experience for home video
Qianqian Xu, Zhipeng Wu, Guorong Li, Lei Qin, Shuqiang Jiang, Qingming Huang
Pages: 927-930
doi>10.1145/1873951.1874115
Full text: PDF

Nowadays, various efforts have sprung up aiming to automatically analyze home videos and provide users satisfactory experiences. In this paper, we present a novel user experience for home video called Memory Matrix, which could facilitate users to re-experience ...
Artistic paper-cut of human portraits
Meng Meng, Mingtian Zhao, Song-Chun Zhu
Pages: 931-934
doi>10.1145/1873951.1874116
Full text: PDF

This paper presents a method to render artistic paper-cut of human portraits. Rendering paper-cut images from photographs can be considered as an inhomogeneous image binarization problem, to which ideal solutions should reproduce vivid image details ...
Robust hashing for music copyright protection by combining beat segmentation and chroma
Wei Li, Zhurong Wang, Bilei Zhu, Xiangyang Xue
Pages: 935-938
doi>10.1145/1873951.1874117
Full text: PDF

Time-scale modification and pitch shifting are two recognized challenging attacks on music copyright protection. To resist them simultaneously, a novel robust hashing method is proposed by combining the strength of music beat segmentation and chroma-based ...
Explicit and implicit concept-based video retrieval with bipartite graph propagation model
Lei Bao, Juan Cao, Yongdong Zhang, Jintao Li, Ming-yu Chen, Alexander G. Hauptmann
Pages: 939-942
doi>10.1145/1873951.1874118
Full text: PDF

The major scientific problem for content-based video retrieval is the semantic gap. Generally speaking, there are two appropriate ways to bridge the semantic gap: the first one is from human perspective (top-down) and the other one is from computer perspective ...
Lightweight TV logo recognition based on image moment
Masaru Sugano, Shigeyuki Sakazawa
Pages: 943-946
doi>10.1145/1873951.1874119
Full text: PDF

TV logo recognition is one of the suggested solutions for preventing unauthorized duplication and redistribution. The major problem of the previous logo recognition is that the matching process against the reference logos requires much time. Since millions ...
Representative views re-ranking for 3D model retrieval with multi-bipartite graph reinforcement model
Yue Gao, You Yang, Qionghai Dai, Naiyao Zhang
Pages: 947-950
doi>10.1145/1873951.1874120
Full text: PDF

In this paper, we propose a multi-bipartite graph reinforcement model for representative views re-ranking in 3D model retrieval. Given the views of one query 3D model, all query views are grouped into clusters to generate representative views and corresponding ...
Sorted label classifier chains for learning images with multi-label
Xi Liu, Zhiping Shi, Zhixin Li, Xishun Wang, Zhongzhi Shi
Pages: 951-954
doi>10.1145/1873951.1874121
Full text: PDF

In the real world, images always have several visual objects instead of only one, which makes it difficult for conventional object recognition methods to deal with them. In this paper, we present a topologically sorted classifier chain method for learning ...
3D object retrieval with bag-of-region-words
Yue Gao, You Yang, Qionghai Dai, Naiyao Zhang
Pages: 955-958
doi>10.1145/1873951.1874122
Full text: PDF

The view-based method has become an essential approach to 3D object retrieval in recent years. In the view-based 3D object retrieval framework, each object is described by a set of views and representative features are extracted from these views to match the ...
3D object search through semantic component
Chunjing Xu, Zhengwu Zhang, Jianzhuang Liu, Xiaoou Tang
Pages: 959-962
doi>10.1145/1873951.1874123
Full text: PDF

In this paper, we present a novel concept named semantic component for 3D object search which describes a key component that semantically defines a 3D object. In most cases, the semantic component is intra-category stable and therefore can be used to ...
Keep moving!: revisiting thumbnails for mobile video retrieval
Wolfgang Hürst, Cees G.M. Snoek, Willem-Jan Spoel, Mate Tomin
Pages: 963-966
doi>10.1145/1873951.1874124
Full text: PDF

Motivated by the increasing popularity of video on handheld devices and the resulting importance for effective video retrieval, this paper revisits the relevance of thumbnails in a mobile video retrieval setting. Our study indicates that users are quite ...
Semantic video indexing by fusing explicit and implicit context spaces
Yingbin Zheng, Renzhong Wei, Hong Lu, Xiangyang Xue
Pages: 967-970
doi>10.1145/1873951.1874125
Full text: PDF

This paper addresses the problem of context-based concept fusion (CBCF) for concept detection and semantic video indexing. We introduce a novel framework based on constructing context spaces of concepts, such that the contextual correlations are used ...
Effective logo retrieval with adaptive local feature selection
Jianlong Fu, Jinqiao Wang, Hanqing Lu
Pages: 971-974
doi>10.1145/1873951.1874126
Full text: PDF

Towards building a practical large-scale logo retrieval system, we propose a novel approach to extract and combine local features for effective logo retrieval. Instead of global feature extraction by modeling the web logo as a whole, we extract the local ...
Mining and cropping common objects from images
Gangqiang Zhao, Junsong Yuan
Pages: 975-978
doi>10.1145/1873951.1874127
Full text: PDF

Discovering common objects that appear frequently in a number of images is a challenging problem, due to (1) the appearance variations of the same common object and (2) the enormous computational cost involved in exploring the huge solution space, including ...
Saliency detection based on 2D log-gabor wavelets and center bias
Min Wang, Jia Li, Tiejun Huang, Yonghong Tian, Lingyu Duan, Guochen Jia
Pages: 979-982
doi>10.1145/1873951.1874128
Full text: PDF

Visual saliency can be a useful tool for image content analysis such as automatic image cropping and image compression. In existing methods on visual saliency detection, most of them are related to the model of receptive field. In this paper, we propose ...
Heterogeneous feature selection by group lasso with logistic regression
Fei Wu, Ying Yuan, Yueting Zhuang
Pages: 983-986
doi>10.1145/1873951.1874129
Full text: PDF

The selection of groups of discriminative features is critical for image understanding since the irrelevant features could deteriorate the performance of image understanding. This paper formulates the selection of groups of discriminative features by ...
A novel audio fingerprinting method robust to time scale modification and pitch shifting
Bilei Zhu, Wei Li, Zhurong Wang, Xiangyang Xue
Pages: 987-990
doi>10.1145/1873951.1874130
Full text: PDF

A novel audio fingerprinting method that is highly robust to Time Scale Modification (TSM) and pitch shifting is proposed. Instead of simply employing spectral or tempo-related features, our system is based on computer-vision techniques. We transform ...
Image classification using the web graph
Dhruv Kumar Mahajan, Malcolm Slaney
Pages: 991-994
doi>10.1145/1873951.1874131
Full text: PDF

Image classification is a well-studied and hard problem in computer vision. We extend a proven solution for classifying web spam to handle images. We exploit the link structure of the web graph: a web page related to a given category is normally linked ...
SESSION: Short - S5/content/human-centered multimedia track
Marc Cavazza
Interactive learning of heterogeneous visual concepts with local features
Wajih Ouertani, Michel Crucianu, Nozha Boujemaa
Pages: 995-998
doi>10.1145/1873951.1874133
Full text: PDF

In the context of computer-assisted plant identification we are facing challenging information retrieval problems because of the very high within-class variability and of the limited number of training examples. To address these problems, we suggest ...
Index support for content-based multimedia exploration
Christian Beecks, Philip Driessen, Thomas Seidl
Pages: 999-1002
doi>10.1145/1873951.1874134
Full text: PDF

Content-based multimedia exploration systems support users in browsing and searching voluminous multimedia databases in an interactive and playful way. Guiding the user navigation and exploration process through the database contents, similarity-based ...
Hybrid active learning for cross-domain video concept detection
Huan Li, Yuan Shi, Ming-yu Chen, Alexander G. Hauptmann, Zhang Xiong
Pages: 1003-1006
doi>10.1145/1873951.1874135
Full text: PDF

Cross-domain video concept detection is a challenging task due to the distribution difference between the source domain and target domain. In order to avoid expensive labeling of the target-domain data, Active Learning can be used to incrementally learn ...
Hierarchical image feature extraction and classification
Min-Hsuan Tsai, Shen-Fu Tsai, Thomas S. Huang
Pages: 1007-1010
doi>10.1145/1873951.1874136
Full text: PDF

In the field of machine learning and pattern recognition, an alternative to conventional classification is hierarchical classification that exploits hierarchical relations between concepts of interest. To the best of our knowledge, all hierarchical classification ...
Revealing real quality of double compressed MP3 audio
Mengyu Qiao, Andrew H. Sung, Qingzhong Liu
Pages: 1011-1014
doi>10.1145/1873951.1874137
Full text: PDF

MP3 is the most popular format for audio storage and a de facto standard of digital audio compression for the transfer and playback. The flexibility of compression ratio of MP3 coding enables users to choose their customized configuration in the trade-off ...
Prediction of favourite photos using social, visual, and textual signals
Roelof van Zwol, Adam Rae, Lluis Garcia Pueyo
Pages: 1015-1018
doi>10.1145/1873951.1874138
Full text: PDF

This paper focuses on the prediction of users' favourite photos in Flickr. We propose a multi-modal, machine learned approach that combines social, visual and textual signals into a single prediction system. Although each individual user has different ...
One person labels one million images
Jinhui Tang, Qiang Chen, Shuicheng Yan, Tat-Seng Chua, Ramesh Jain
Pages: 1019-1022
doi>10.1145/1873951.1874139
Full text: PDF

Targeting the same objective of alleviating the manual work as automatic annotation, in this paper, we propose a novel framework with minimal human effort to manually annotate a large-scale image corpus. In this framework, a dynamic multi-scale cluster ...
A novel virtual world based HCI paradigm for multimedia scholarly communication
Arturo Nakasone, Tiago da Silva, Andreas Budde, Kugamoorthy Gajananan, Tri T. Truong, Helmut Prendinger
Pages: 1023-1026
doi>10.1145/1873951.1874140
Full text: PDF

The sharing of academic knowledge through printed publications has been widely and successfully utilized for more than a hundred years. However, the need to process huge amounts of data in scientific analysis and communicate its results to the scientific ...
Kodak moments and Flickr diamonds: how users shape large-scale media
Radu Andrei Negoescu, Alexander C. Loui, Daniel Gatica-Perez
Pages: 1027-1030
doi>10.1145/1873951.1874141
Full text: PDF

In today's age of digital multimedia deluge, a clear understanding of the dynamics of online communities is capital. Users have abandoned their role of passive consumers and are now the driving force behind large-scale media repositories, whose dynamics ...
Inter-ACT: an affective and contextually rich multimodal video corpus for studying interaction with robots
Ginevra Castellano, Iolanda Leite, Andre Pereira, Carlos Martinho, Ana Paiva, Peter W. McOwan
Pages: 1031-1034
doi>10.1145/1873951.1874142
Full text: PDF

The Inter-ACT (INTEracting with Robots - Affect Context Task) corpus is an affective and contextually rich multimodal video corpus containing affective expressions of children playing chess with an iCat robot. It contains videos that capture the interaction ...
Multi-scale entropy analysis of dominance in social creative activities
Donald Glowinski, Paolo Coletta, Gualtiero Volpe, Antonio Camurri, Carlo Chiorri, Andrea Schenone
Pages: 1035-1038
doi>10.1145/1873951.1874143
Full text: PDF

Our research focused on ensemble musical performance, an ideal test-bed for the development of models and techniques for measuring creative social interaction in an ecologically valid framework. Starting from expressive behavioral data of a string quartet, ...
Evaluation of digital games using QOL measurements
Yukari Hori, Akira Baba
Pages: 1039-1042
doi>10.1145/1873951.1874144
Full text: PDF

Digital Games have become part of everyday life all over the world. In this article, we suggest using Quality of Life (QOL) to investigate the characteristics of game play and to evaluate games. We measured emotional changes caused by game play using ...
MuVis: an application for interactive exploration of large music collections
Ricardo Dias, Manuel J. Fonseca
Pages: 1043-1046
doi>10.1145/1873951.1874145
Full text: PDF

In this paper we present MuVis, an interactive visualization and exploration tool for large music collections, based on music content and metadata. We combined a user-centered design with three main components: information visualization techniques (based ...
From photo networks to social networks, creation and use of a social network derived with photos
Michel Plantié, Michel Crampes
Pages: 1047-1050
doi>10.1145/1873951.1874146
Full text: PDF

With the new possibilities in communication and information management, social networks and photos have received plenty of attention in the digital age. In this paper, we show how social photos, captured during family events, representing individuals ...
Enriching audio-visual chat with conversation-based image retrieval and display
Jeroen Vanattenhoven, Christof van Nimwegen, Matthias Strobbe, Olivier Van Laere, Bart Dhoedt
Pages: 1051-1054
doi>10.1145/1873951.1874147
Full text: PDF

This paper presents the results of a user study carried out to evaluate an application prototype in which an audio-visual chat conversation between two users is augmented by pictures related to the topics of that conversation. The prototype analyses ...
A shape-free, designable 6-DoF marker tracking method for camera-based interaction in mobile environment
Hiroki Nishino
Pages: 1055-1058
doi>10.1145/1873951.1874148
Full text: PDF

We developed a novel marker tracking method with shape-free designable markers, which can be visually meaningful to users. The method can work fast enough to provide real-time camera-based interaction even on low performance CPUs such as one used in ...
iWalk: a tool for interacting with geo-located data through movement and gesture
Visruth Premraj, Margaret Schedel, Tamara L. Berg
Pages: 1059-1062
doi>10.1145/1873951.1874149
Full text: PDF

In this work, we present iWalk, a multimedia exploration tool that provides an interactive virtual environment for physically exploring geo-tagged data. This tool is flexible enough for users to easily explore their own collections, or existing collections ...
The colour of life: novel visualisations of population lifestyles
Philip Kelly, Aiden R. Doherty, Alan F. Smeaton, Cathal Gurrin, Noel E. O'Connor
Pages: 1063-1066
doi>10.1145/1873951.1874150
Full text: PDF

Colour permeates our daily lives, yet we rarely take notice of it. In this work we utilise the SenseCam (a visual lifelogging tool), to investigate the predominant colours in one million minutes of human life that a group of 20 individuals encounter ...
AudioFeeds: a mobile auditory application for monitoring online activities
Tilman Dingler, Stephen Brewster
Pages: 1067-1070
doi>10.1145/1873951.1874151
Full text: PDF

User participation has transformed the way news travels the globe. With the rise of the 'Web 2.0' phenomenon, users have been empowered with the means of creating and distributing informational items, which we call social feeds. Platforms like Twitter ...
User driven audio content navigation for spoken web
Ketki A. Dhanesha, Nitendra Rajput, Kundan Srivastava
Pages: 1071-1074
doi>10.1145/1873951.1874152
Full text: PDF

It is a common practice for us to skim textual content on a web page. While skimming, we usually skip words or phrases that are not of interest to us and we slow down our speed when the content seems to be of relevance to us. But when we listen to audio ...
Structuring ordered nominal data for event sequence discovery
Chreston A. Miller, Francis Quek, Naren Ramakrishnan
Pages: 1075-1078
doi>10.1145/1873951.1874153
Full text: PDF

This work investigates using n-gram processing and a temporal relation encoding to provide relational information about events extracted from media streams. The event information is temporal and nominal in nature being categorized by a descriptive ...
Automated sleep quality measurement using EEG signal: first step towards a domain specific music recommendation system
Wei Zhao, Xinxi Wang, Ye Wang
Pages: 1079-1082
doi>10.1145/1873951.1874154
Full text: PDF

With the rapid pace of modern life, millions of people suffer from sleep problems. Music therapy, as a non-medication approach to mitigating sleep problems, has attracted increasing attention recently. However the adaptability of music therapy is limited ...
Deducing user's fatigue from haptic data
Abdelwahab Hamam, Nicolas D. Georganas, Fawaz Alsulaiman, Abdulmotaleb El Saddik
Pages: 1083-1086
doi>10.1145/1873951.1874155
Full text: PDF

Undesired physical fatigue reduces the overall Quality of Experience (QoE) of virtual reality haptics applications. Detecting fatigue is the first step in rectifying this problem. Fatigue in usability analysis is usually detected through conducting questionnaires ...
Context-based indoor object detection as an aid to blind persons accessing unfamiliar environments
Xiaodong Yang, YingLi Tian, Chucai Yi, Aries Arditi
Pages: 1087-1090
doi>10.1145/1873951.1874156
Full text: PDF

Independent travel is a well known challenge for blind or visually impaired persons. In this paper, we propose a computer vision-based indoor wayfinding system for assisting blind people to independently access unfamiliar buildings. In order to find ...
SESSION: Short - S6/applications/content track
Wei Tsang Ooi
Multimedia cross-platform content distribution for mobile peer-to-peer networks using network coding
Morten Videbæk Pedersen, Janus Heide, Péter Vingelmann, László Blázovics, Frank H.P. Fitzek
Pages: 1091-1094
doi>10.1145/1873951.1874158
Full text: PDF

This paper looks into the possibility of multimedia content distribution over multiple mobile platforms forming wireless peer-to-peer networks. State of the art mobile networks are centralized and base station or access point oriented. Current ...
Topical summarization of web videos by visual-text time-dependent alignment
Song Tan, Hung-Khoon Tan, Chong-Wah Ngo
Pages: 1095-1098
doi>10.1145/1873951.1874159
Full text: PDF

Search engines are used to return a long list of hundreds or even thousands of videos in response to a query topic. Efficient navigation of videos becomes difficult and users often need to painstakingly explore the search list for a gist of the search ...
Improved saliency detection based on superpixel clustering and saliency propagation
Zhixiang Ren, Yiqun Hu, Liang-Tien Chia, Deepu Rajan
Pages: 1099-1102
doi>10.1145/1873951.1874160
Full text: PDF

Saliency detection is useful for high level applications such as adaptive compression, image retargeting, object recognition, etc. In this paper, we introduce an effective region-based solution for saliency detection. We first use the adaptive mean shift ...
Refining video annotation by exploiting inter-shot context
Jian Yi, Yuxin Peng, Jianguo Xiao
Pages: 1103-1106
doi>10.1145/1873951.1874161
Full text: PDF

This paper proposes a new approach to refine video annotation by exploiting the inter-shot context. Our method is mainly novel in two ways. On one hand, to refine annotation result of the target concept, we model the sequence of shots in video as a conditional ...
Web video categorization based on Wikipedia categories and content-duplicated open resources
Zhineng Chen, Juan Cao, Yicheng Song, Yongdong Zhang, Jintao Li
Pages: 1107-1110
doi>10.1145/1873951.1874162
Full text: PDF

This paper presents a novel approach for web video categorization by leveraging Wikipedia categories (WikiCs) and open resources describing the same content as the video, i.e., content-duplicated open resources (CDORs). Note that current approaches only ...
Supporting children's social communication skills through interactive narratives with virtual characters
Mary Ellen Foster, Katerina Avramides, Sara Bernardini, Jingying Chen, Christopher Frauenberger, Oliver Lemon, Kaska Porayska-Pomsta
Pages: 1111-1114
doi>10.1145/1873951.1874163
Full text: PDF

The development of social communication skills in children relies on multimodal aspects of communication such as gaze, facial expression, and gesture. We introduce a multimodal learning environment for social skills which uses computer vision to estimate ...
Automatic image tagging via category label and web data
Shenghua Gao, Zhengxiang Wang, Liang-Tien Chia, Ivor Wai-Hung Tsang
Pages: 1115-1118
doi>10.1145/1873951.1874164
Full text: PDF

Image tagging is an important technique for the image content understanding and text based image processing. Given a selection of images, how to tag these images efficiently and effectively is an interesting problem. In this paper, a novel semi-auto ...
Auto-tagging of images in non-english languages using tag language conversion
Keiichiro Hoashi, Hiromi Ishizaki, Hjalmar Wennerstrom, Yasuhiro Takishima
Pages: 1119-1122
doi>10.1145/1873951.1874165
Full text: PDF

Utilization of web images with social tags as training data has been a major trend for the development of automatic image tagging/classification systems. While the amount of information available on web sites such as Flickr is abundant, the majority ...
Landmark image retrieval using visual synonyms
Efstratios Gavves, Cees G.M. Snoek
Pages: 1123-1126
doi>10.1145/1873951.1874166
Full text: PDF

In this paper, we consider the incoherence problem of the visual words in bag-of-words vocabularies. Different from existing work, which performs assignment of words based solely on closeness in descriptor space, we focus on identifying pairs of independent, ...
Approximate image color correlograms
Claudio Taranto, Nicola Di Mauro, Stefano Ferilli, Floriana Esposito
Pages: 1127-1130
doi>10.1145/1873951.1874167
Full text: PDF

The recent explosion in Internet usage and the growing amount of digital images caused by the more and more ubiquitous presence of digital cameras has created a demand for effective and flexible techniques for automatic image retrieval. As the volume ...
Data-oriented locality sensitive hashing
Wei Zhang, Ke Gao, Yong-dong Zhang, Jin-tao Li
Pages: 1131-1134
doi>10.1145/1873951.1874168
Full text: PDF

Locality Sensitive Hashing (LSH) has been proposed as a scalable and high-dimensional index for approximate similarity search. Euclidean LSH is a variation of LSH and has been successfully used in many multimedia applications. However, hash functions ...
Automatically protecting privacy in consumer generated videos using intended human object detector
Yuta Nakashima, Noboru Babaguchi, Jianping Fan
Pages: 1135-1138
doi>10.1145/1873951.1874169
Full text: PDF

The growing popularity of video sharing services such as YouTube enables us to upload and share consumer generated videos (CGVs) easily, resulting in disclosure of the privacy sensitive information (PSI) of persons, i.e., their appearances. Therefore, ...
Non-parametric anomaly detection exploiting space-time features
Lorenzo Seidenari, Marco Bertini
Pages: 1139-1142
doi>10.1145/1873951.1874170
Full text: PDF

In this paper a real-time anomaly detection system for video streams is proposed. Spatio-temporal features are exploited to capture scene dynamic statistics together with appearance. Anomaly detection is performed in a non-parametric fashion, evaluating ...
A scalable cover identification engine
Emanuele Di Buccio, Nicola Montecchio, Nicola Orio
Pages: 1143-1146
doi>10.1145/1873951.1874171
Full text: PDF

This paper describes the implementation of a content-based cover song identification system which has been released under an open source license. The system is centered around the Apache Lucene text search engine library, and proves how classic techniques ...
Interactive visual object search through mutual information maximization
Jingjing Meng, Junsong Yuan, Yuning Jiang, Nitya Narasimhan, Venu Vasudevan, Ying Wu
Pages: 1147-1150
doi>10.1145/1873951.1874172
Full text: PDF

Searching for small objects (e.g., logos) in images is a critical yet challenging problem. It becomes more difficult when target objects differ significantly from the query object due to changes in scale, viewpoint or style, not to mention partial occlusion ...
Nearest-neighbor classification using unlabeled data for real world image application
Shuhui Wang, Qingming Huang, Shuqiang Jiang, Qi Tian
Pages: 1151-1154
doi>10.1145/1873951.1874173
Full text: PDF

Currently, Nearest-Neighbor approaches (NN) have been widely applied to real world image data mining. These approaches have the following three disadvantages: (i) the performance is inferior on small datasets; (ii) the performance of approximated nearest ...
Behavior and properties of spatio-temporal local features under visual transformations
Julian Stöttinger, Bogdan Tudor Goras, Nicu Sebe, Allan Hanbury
Pages: 1155-1158
doi>10.1145/1873951.1874174
Full text: PDF

Successful state-of-the-art video retrieval and classification applications are predominantly carried out by means of spatio-temporal features. Typically, the evaluation of these tasks is exclusively done based on their final performance but no systematic ...
Boosting-based multiple kernel learning for image re-ranking
I-Hong Jhuo, D. T. Lee
Pages: 1159-1162
doi>10.1145/1873951.1874175
Full text: PDF

Re-ranking the returned images from a query relies on two important steps to improve its effectiveness: the estimation of the image relevance and the enhancement of the similarity function. However, attaining an effective visual similarity and an accurate ...
SESSION: Short - S7/content/systems track
Pascal Frossard
Multi-layer stereo video matting: video matting
M. Jiang, Danny Crookes, Min Chen
Pages: 1163-1166
doi>10.1145/1873951.1874177
Full text: PDF

In this paper, an unsupervised scheme for stereo video matting is presented, where stereo motion analysis is combined to provide an automatic multi-layer clustering scheme of alpha components. With this multi-layer matting scheme, objects in both foreground ...
GPU acceleration of Eff2 descriptors using CUDA
Kristleifur Daðason, Herwig Lejsek, Ársæll Þ. Jóhansson, Björn Þór Jónsson, Laurent Amsaleg
Pages: 1167-1170
doi>10.1145/1873951.1874178
Full text: PDF

Video analysis using local descriptors requires a high-throughput descriptor creation process. This speed can be obtained from modern GPUs. In this paper, we adapt the computation of the Eff2 descriptors, a SIFT variant, to the GPU. We compare our GPU-Eff ...
Dynamic multi-cue tracking with detection responses association
Guochen Jia, Yonghong Tian, Yaowei Wang, Tiejun Huang, Min Wang
Pages: 1171-1174
doi>10.1145/1873951.1874179
Full text: PDF

Multi-cue integration has proved successful at increasing the robustness of tracking algorithms and overcoming the failure cases of individual cues. But considering the dynamic appearance of objects or cluttered backgrounds, the integration based on constant ...
KPB-SIFT: a compact local feature descriptor
Gangqiang Zhao, Ling Chen, Gencai Chen, Junsong Yuan
Pages: 1175-1178
doi>10.1145/1873951.1874180
Full text: PDF

Invariant feature descriptors such as SIFT and GLOH have been demonstrated to be very robust for image matching and object recognition. However, such descriptors are typically of high dimensionality, e.g. 128-dimension in the case of SIFT. This limits ...
Fast feature selection and training for AdaBoost-based concept detection with large scale datasets
Shi Chen, Jinqiao Wang, Yang Liu, Changsheng Xu, Hanqing Lu
Pages: 1179-1182
doi>10.1145/1873951.1874181
Full text: PDF

AdaBoost has proved to be a successful statistical learning method for concept detection, with high discrimination and generalization performance. However, it is computationally expensive to train a concept detector using boosting, especially on large ...
Large-scale robust visual codebook construction
Darui Li, Linjun Yang, Xian-Sheng Hua, Hong-Jiang Zhang
Pages: 1183-1186
doi>10.1145/1873951.1874182
Full text: PDF

A web-scale image retrieval system demands a large-scale visual codebook, which is difficult to generate with the commonly adopted K-means vector quantization due to the applicability issue. While approximate K-means is proposed to scale up the visual ...
Image annotation using multi-correlation probabilistic matrix factorization
Zechao Li, Jing Liu, Xiaobin Zhu, Tinglin Liu, Hanqing Lu
Pages: 1187-1190
doi>10.1145/1873951.1874183
Full text: PDF

The image-word correlation estimation is an essential issue in image annotation. In this paper, we propose a multi-correlation probabilistic matrix factorization (MPMF) algorithm for the correlation estimation. Different from the traditional solutions ...
Interactive panoramic video streaming system over restricted bandwidth network
Masayuki Inoue, Hideaki Kimata, Katsuhiko Fukazawa, Norihiko Matsuura
Pages: 1191-1194
doi>10.1145/1873951.1874184
Full text: PDF

Many new applications are being created around the panoramic video service. The typical system divides the high resolution panoramic video into tiles and the sender transmits a set of tiles, the partial panoramic video. Coding each tile at a uniform ...
Understanding the security and robustness of SIFT
Thanh-Toan Do, Ewa Kijak, Teddy Furon, Laurent Amsaleg
Pages: 1195-1198
doi>10.1145/1873951.1874185
Full text: PDF

Many content-based retrieval systems (CBIRS) describe images using the SIFT local features because of their very robust recognition capabilities. While SIFT features proved to cope with a wide spectrum of general purpose image distortions, its security ...
Implementation and demonstration of a credit-based home access point
Choong-Soo Lee, Mark Claypool, Robert Kinicki
Pages: 1199-1202
doi>10.1145/1873951.1874186
Full text: PDF

The increasing availability of high speed Internet access and the decreasing cost of wireless technologies has increased the number of devices in the home that wirelessly connect to the Internet. While home user applications often have different network ...
Spatially refined inter-sequence error concealment for a multi-broadcast receiver using frequency selective approximation
Tobias Tröger, Jürgen Seiler, André Kaup
Pages: 1203-1206
doi>10.1145/1873951.1874187
Full text: PDF

Mobile reception of digital TV often suffers from severe signal degradations. Inter-sequence error concealment reconstructs lost image blocks of a distorted high-resolution TV signal by inserting corresponding error-free blocks from a low-resolution ...
Performance improvement of distributed video coding by using block mode selection
Bo-Ruei Chiou, Yun-Chung Shen, Han-Ping Cheng, Ja-Ling Wu
Pages: 1207-1210
doi>10.1145/1873951.1874188
Full text: PDF

Block mode selection is a new way to improve the performance of distributed video coding (DVC). Since many factors influence the correctness of block mode selection, deciding the block mode is not an easy task. In this paper, a low ...
Fast decoding for LDPC based distributed video coding
Yu-Shan Pai, Han-Ping Cheng, Yun-Chung Shen, Ja-Ling Wu
Pages: 1211-1214
doi>10.1145/1873951.1874189
Full text: PDF

Distributed video coding (DVC) is a new coding paradigm targeting applications with the need for low-complexity encoding at the cost of a higher decoding complexity. In the DVC architecture based on a feedback channel, the high decoding complexity is ...
Shape-stable region boundary extraction via affine morphological scale space (AMSS)
Petros Kapsalas, Stefanos Kollias
Pages: 1215-1218
doi>10.1145/1873951.1874190
Full text: PDF

In this paper we present a new approach towards the extraction of affine image regions based on detecting shape-stable boundaries from a multi-scale image representation. We construct an affine morphological scale space (AMSS) representation [1], which ...
Robust digital watermarking in videos based on geometric transformations
Philipp Schaber, Stephan Kopf, Fabian Bauer, Wolfgang Effelsberg
Pages: 1219-1222
doi>10.1145/1873951.1874191
Full text: PDF

In the efforts to fight piracy of high-valued media content, forensic digital watermarking as a passive content security scheme is a potential alternative to current, restrictive approaches like DRM. In this paper, we present a novel watermarking scheme ...
Joint layered video and digital fountain coding for multi-channel video broadcasting
Wen Ji, Zhu Li
Pages: 1223-1226
doi>10.1145/1873951.1874192
Full text: PDF

In this paper, we consider a scenario where multiple video content channels are broadcast to a set of heterogeneous mobile users with diverse display devices and different channel conditions. The objective is to design a joint coding and rate allocation ...
A novel P2P and cloud computing hybrid architecture for multimedia streaming with QoS cost functions
Irena Trajkovska, Joaquin Salvachua Rodriguez, Alberto Mozo Velasco
Pages: 1227-1230
doi>10.1145/1873951.1874193
Full text: PDF

Since its appearance, peer-to-peer technology has given rise to various multimedia streaming applications. Today, cloud computing offers different service models as a base for successful end user applications. In this paper we propose joining peer-to-peer ...
Hybrid load balancing for online games
Rynson W.H. Lau
Pages: 1231-1234
doi>10.1145/1873951.1874194
Full text: PDF

As massively multiplayer online games are becoming very popular, how to support a large number of concurrent users while maintaining the game performance has become an important research topic. There are two main research directions based on the multi-server ...
SESSION: Brave new ideas - BNI1 track
Nozha Boujemaa
The wisdom of social multimedia: using flickr for prediction and forecast
Xin Jin, Andrew Gallagher, Liangliang Cao, Jiebo Luo, Jiawei Han
Pages: 1235-1244
doi>10.1145/1873951.1874196
Full text: PDF

Social multimedia hosting and sharing websites, such as Flickr, Facebook, Youtube, Picasa, ImageShack and Photobucket, are increasingly popular around the globe. A major trend in the current studies on social multimedia is using the social media sites ...
Multimodal location estimation
Gerald Friedland, Oriol Vinyals, Trevor Darrell
Pages: 1245-1252
doi>10.1145/1873951.1874197
Full text: PDF

In this article we define a multimedia content analysis problem, which we call multimodal location estimation: Given a video/image/audio file, the task is to determine where it was recorded. A single indication, such as a unique landmark, might already ...
Video genetics: a case study from YouTube
John R. Kender, Matthew L. Hill, Apostol (Paul) Natsev, John R. Smith, Lexing Xie
Pages: 1253-1258
doi>10.1145/1873951.1874198
Full text: PDF

We explore in a single but large case study how videos within YouTube, competing for view counts, are like organisms within an ecology, competing for survival. We develop this analogy, whose core idea shows that short video clips, best detected across ...
Content without context is meaningless
Ramesh Jain, Pinaki Sinha
Pages: 1259-1268
doi>10.1145/1873951.1874199
Full text: PDF

We revisit one of the most fundamental problems in multimedia that is receiving enormous attention from researchers without making much progress in solving it: the problem of bridging the semantic gap. Research in this area has focused on developing ...
SESSION: Brave new ideas - BNI2 track
Alejandro Jaimes
Human animal machine interaction: animal behavior awareness and digital experience
Karin Fahlquist, Johannes Karlsson, Haibo Li, Li Liu, Keni Ren, Shafiq ur Réhman, Tim Wark
Pages: 1269-1274
doi>10.1145/1873951.1874201
Full text: PDF

This paper proposes an intuitive wireless sensor/actuator based communication network for human animal interaction for a digital zoo. In order to enhance effective observation and control over wildlife, we have built a wireless sensor network. 25 video ...
Enriching social situational awareness in remote interactions: insights and inspirations from disability focused research
Sreekar Krishna, Vineeth Balasubramanian, Sethuraman Panchanathan
Pages: 1275-1284
doi>10.1145/1873951.1874202
Full text: PDF

In this paper we present a new perspective into developing technologies for enriching social presence among remote interaction partners. Inspired by the abilities and limitations faced by people who are disabled during their everyday social interactions, ...
Requirements and design space for interactive public displays
Jörg Müller, Florian Alt, Daniel Michelis, Albrecht Schmidt
Pages: 1285-1294
doi>10.1145/1873951.1874203
Full text: PDF

Digital immersion is moving into public space. Interactive screens and public displays are deployed in urban environments, malls, and shop windows. Inner city areas, airports, train stations and stadiums are experiencing a transformation from traditional ...
SESSION: Video - VID1 track
Shin'ichi Satoh, Jenny Benois Pineau
Acqua vellutata sospesa: interactive video painting
Laurel L. Johannesson
Pages: 1295-1298
doi>10.1145/1873951.1874205
Full text: PDF
Other formats: Mov

In this paper I present the interactive video painting artwork "Acqua Vellutata Sospesa". I will describe the viewer interface for the interactive component as well as the conceptual approach to the project. Additionally, a comprehensive survey ...
The IMMED project: wearable video monitoring of people with age dementia
Rémi Mégret, Vladislavs Dovgalecs, Hazem Wannous, Svebor Karaman, Jenny Benois-Pineau, Elie El Khoury, Julien Pinquier, Philippe Joly, Régine André-Obrecht, Yann Gaëstel, Jean-François Dartigues
Pages: 1299-1302
doi>10.1145/1873951.1874206
Full text: PDF
Other formats: M4v

In this paper, we describe a new application for multimedia indexing, using a system that monitors the instrumental activities of daily living to assess the cognitive decline caused by dementia. The system is composed of a wearable camera device designed ...
A multimodal virtual environment for interacting with 3d deformable models
Ziying Tang, Anant Patel, Xiaohu Guo, Balakrishnan Prabhakaran
Pages: 1303-1306
doi>10.1145/1873951.1874207
Full text: PDF
Other formats: Mov

In this video presentation, we introduce an immersive multimodal virtual environment which supports real-time interactions with 3D deformable models through a haptic device. We include a system called "FakeSpace" to imitate a 3D environment, and a PHANTOM ...
Real-time detection of unusual regions in image streams
Rene Schuster, Roland Mörzinger, Werner Haas, Helmut Grabner, Luc Van Gool
Pages: 1307-1310
doi>10.1145/1873951.1874208
Full text: PDF

Automatic and real-time identification of unusual incidents is important for event detection and alarm systems. In today's camera surveillance solutions video streams are displayed on-screen for human operators, e.g. in large multi-screen control centers. ...
Video exploration: from multimedia content analysis to interactive visualization
Marie-luce Viaud, Olivier Buisson, Agnes Saulnier, Clement Guenais
Pages: 1311-1314
doi>10.1145/1873951.1874209
Full text: PDF
Other formats: Mp4

This paper presents 3 interfaces for accessing video content. The stream explorer allows users to explore and segment video streams. The video explorer shows a synthetic view of structured TV programmes. The collection explorer proposes cartographies of large ...
A 3d data intensive tele-immersive grid
Benjamin Petit, Thomas Dupeux, Benoit Bossavit, Joeffrey Legaux, Bruno Raffin, Emmanuel Melin, Jean-Sébastien Franco, Ingo Assenmacher, Edmond Boyer
Pages: 1315-1318
doi>10.1145/1873951.1874210
Full text: PDF
Other formats: Mov

Networked virtual environments like Second Life enable distant people to meet for leisure as well as work. But users are represented through avatars controlled by keyboards and mice, leading to a low sense of presence, especially regarding body language. ...
Real-time soccer player tracking method by utilizing shadow regions
Nozomu Kasuya, Itaru Kitahara, Yoshinari Kameda, Yuichi Ohta
Pages: 1319-1322
doi>10.1145/1873951.1874211
Full text: PDF
Other formats: Mov

Our research aims to generate a player's view video stream by using a 3D free-viewpoint video technique. Since player trajectories are necessary to generate the video, we propose a real-time player trajectory estimation method by utilizing the shadow ...
The mediamill search engine video
Cees G.M. Snoek
Pages: 1323-1324
doi>10.1145/1873951.1874212
Full text: PDF
Other formats: Mov

In this video demonstration, we advertise the MediaMill video search engine, a system that facilitates semantic access to video based on a large lexicon of visual concept detectors and interactive video browsers. With an ultimate aim to disseminate video ...
SESSION: Interactive art -- IA1/cultural heritage tools track
James Wang
Determining the sexual identities of prehistoric cave artists using digitized handprints: a machine learning approach
James Z. Wang, Weina Ge, Dean R. Snow, Prasenjit Mitra, C. Lee Giles
Pages: 1325-1332
doi>10.1145/1873951.1874214
Full text: PDF

The sexual identities of human handprints inform hypotheses regarding the roles of males and females in prehistoric contexts. Sexual identity has previously been manually determined by measuring the ratios of the lengths of the individual's fingers as ...
Enhanced exploration of oral history archives through processed video and synchronized text transcripts
Michael G. Christel, Scott M. Stevens, Bryan S. Maher, Julieanna Richardson
Pages: 1333-1342
doi>10.1145/1873951.1874215
Full text: PDF

A digital video library of over 900 hours of video and 18000 stories from The HistoryMakers was used by 266 students, faculty, librarians, and life-long learners interacting with a system providing multiple search and viewing capabilities over a trial ...
Surfing on artistic documents with visually assisted tagging
Daniele Borghesani, Costantino Grana, Rita Cucchiara
Pages: 1343-1352
doi>10.1145/1873951.1874216
Full text: PDF

This paper describes a complete architecture for the interactive exploration and annotation of artistic collections. In particular the focus is on Renaissance illuminated manuscripts, which typically contain thousands of pictures, used to comment or ...
SESSION: Interactive art -- IA2/art and multimedia track
Tiziana Catarci
Bateau ivre: an artistic markerless outdoor mobile augmented reality installation on a riverboat
Christian Jacquemin, Wai Kit Chan, Mathieu Courgeon
Pages: 1353-1362
doi>10.1145/1873951.1874218
Full text: PDF

Bateau Ivre is a project presented on the Seine River to make a large audience aware of the possible developments of Augmented Reality through an artistic installation in a mobile outdoor environment. The installation could be viewed from a ship ...
Sonify your face: facial expressions for sound generation
Roberto Valenti, Alejandro Jaimes, Nicu Sebe
Pages: 1363-1372
doi>10.1145/1873951.1874219
Full text: PDF

We present a novel visual creativity tool that automatically recognizes facial expressions and tracks facial muscle movements in real time to produce sounds. The facial expression recognition module detects and tracks a face and outputs a feature vector ...
Flow: an interactive public artwork
Fiona Bowie, Sidney Fels, Morgan Hibbert
Pages: 1373-1382
doi>10.1145/1873951.1874220
Full text: PDF

This paper describes the conceptual, aesthetic, hardware, and software design of Flow, a photo/media-based permanent public interactive artwork in Vancouver, Canada. The work is located at street level in a new local community centre at one of the city's ...
Ozone: continuous state-based media choreography system for live performance
Xin Wei Sha, Michael Fortin, Navid Navab, Timothy Sutton
Pages: 1383-1392
doi>10.1145/1873951.1874221
Full text: PDF

This paper describes Ozone, a new media choreography system based on layered, continuous physical models, designed for building a diverse range of interactive spaces that coordinate arbitrary streams of video and audio synthesized in real-time ...
SESSION: Interactive art exhibit track
Luca Farulli, Frank Nack, Andruid Kerne
Tempo universale
Giovanna Bianco, Pino Valente
Pages: 1395-1396
doi>10.1145/1873951.1874223
Full text: PDF

This paper describes the artwork 'Tempo Universale', a video installation composed of 3 projections and a 10-channel audio track, which is performed as an endless loop. Tempo Universale is presented at the ACM MM 2010 art exhibition.
Liquid views: memory stage activating perception
Monika Fleischmann, Wolfgang Strauss
Pages: 1397-1398
doi>10.1145/1873951.1874224
Full text: PDF

The central theme of the interactive media art installation Liquid Views is the well in which Narcissus discovers his reflection. The work from 1992-93 was first exhibited at Siggraph 1993 in Simon Penny's Machine Culture show. It was exhibited in over 50 ...
Exploring touch and breath in networked wearable installation design
Thecla Schiphorst, Jinsil Seo, Norm Jaffe
Pages: 1399-1400
doi>10.1145/1873951.1874225
Full text: PDF

This paper describes the artistic design concepts for the interactive wearable artworks tendrils and exhale exhibited at the ACM Multimedia 2010 Interactive Art Program in Firenze, Italy, at the Palazzo Medici-Riccardi from 25 October ...
Living wall: programmable wallpaper for interactive spaces
Leah Buechley, David Mellis, Hannah Perner-Wilson, Emily Lovell, Bonifaz Kaufmann
Pages: 1401-1402
doi>10.1145/1873951.1874226
Full text: PDF

The Living Wall project explores the construction and application of interactive wallpaper. Using conductive, resistive, and magnetic paints we produced wallpaper that enables us to create dynamic, reconfigurable, programmable spaces. The wallpaper consists ...
Blue morph: metaphor and metamorphosis
Victoria Vesna, James K. Gimzewski
Pages: 1403-1404
doi>10.1145/1873951.1874227
Full text: PDF

The authors describe the Blue Morph installation they developed and produced in full collaboration as an art/science hybrid. Together, Vesna and Gimzewski created an art | science project that uses nano-scale images and sounds derived from the metamorphosis ...
Chromatic perspectives... scaling my art
Franz Fischnaller
Pages: 1405-1406
doi>10.1145/1873951.1874228
Full text: PDF

This paper attempts to describe Chromatic Perspectives... Scaling my Art, which addresses the results of a trans-medial exploration departing from an "unframed" process of creativity and multi-layered convergence within traditional media Art and virtual ...
CCC trilogy: the italian garden
Davide Venturini, Francesco Gandi
Pages: 1407-1408
doi>10.1145/1873951.1874229
Full text: PDF

In this paper we outline the interactive installation 'The Italian garden', which is based on our CCC [children cheering carpet] technology (created with Max/Msp Jitter). The installation, which invites users to play in a typical Italian Renaissance ...
SESSION: Interactive art short -- IA3 track
Sethuraman Panchanathan
Building with a memory: responsive color interventions
Andreea Danielescu, Ryan Spicer, David Tinapple, Aisling Kelliher, Ellen Campana
Pages: 1409-1412
doi>10.1145/1873951.1874231
Full text: PDF

Building with a Memory is a subtle responsive intervention that aims to provide cohesion and community awareness through the use of light and color. The installation delivers thought-provoking information by capturing, analyzing and rendering real-time ...
The rumentarium project
Andrea Valle
Pages: 1413-1416
doi>10.1145/1873951.1874232
Full text: PDF

The paper describes the design, production and usage of the "Rumentarium", a computer-based sound generating system involving physical objects as sound sources. The Rumentarium is a set of handmade resonators, acoustically excited by DC motors, interfaced ...
Alan01: slivers of color, media and a soul
Mika Tuomola, Teemu Korpilahti, Jaakko Pesonen
Pages: 1417-1420
doi>10.1145/1873951.1874233
Full text: PDF

This paper introduces the interactive art installation Alan01, which wakes up the 1952 criminally convicted Alan Turing as a piece of code within the art work - thus fulfilling Turing's own vision of preserving human consciousness in a computer.
Encounter (resonances)
Hayley Hung, Christian Jacquemin
Pages: 1421-1424
doi>10.1145/1873951.1874234
Full text: PDF

This work is about the remediation of one of Mark Rothko's Seagram murals through the composition of several online sources and additional digital rendering. Based on reproductions of Rothko's "Red on Maroon" found on the Internet, and using computer ...
Thrii
Nicole Lehrer, David Tinapple, Tatyana Koziupa, Meng Chen, Assegid Kidane, Stjepan Rajko, Isaac Wallis, Michael Baran, David Lorig, Diana Siwiak, Loren Olson
Pages: 1425-1428
doi>10.1145/1873951.1874235
Full text: PDF

Thrii is a multimodal interactive installation that explores levels of movement similarity among its participants. Each of the three participants manipulates a large spherical object whose movement is tracked via an embedded accelerometer. An ...
HUM, an interactive and collaborative art installation
Jean-Julien Filatriau, François Zajéga
Pages: 1429-1432
doi>10.1145/1873951.1874236
Full text: PDF

This paper describes HUM, an interactive art installation which interprets the behavior of the visitors on different time scales to render visual and sonic artwork in real-time. HUM was presented at BRASS cultural center (Brussels, Belgium) ...
Coming together: composition by negotiation
Arne Eigenfeldt
Pages: 1433-1436
doi>10.1145/1873951.1874237
Full text: PDF

In this paper, we describe a software system that generates unique musical compositions in realtime, created by four autonomous multi-agents. Given no explicit musical data, agents explore their environment, building beliefs through interactions with ...
RTiVISS: real-time video interactive systems for sustainability
Mónica Mendes, Nuno Correia
Pages: 1437-1440
doi>10.1145/1873951.1874238
Full text: PDF

RTiVISS is an exploratory project that proposes to investigate innovative concepts and design methods regarding environmental and sustainability issues. It is concerned with natural resources, especially forests, and their preservation, through critical ...
Chroma space: affective colors in interactive 3d world
Wendy Ann C. Mansilla, Jordi Puig, Andrew Perkis, Touradj Ebrahimi
Pages: 1441-1444
doi>10.1145/1873951.1874239
Full text: PDF

We have developed an installation called Chroma Space to serve as a platform for experimenting with the novel usage of affective colors in an interactive synthetic scenario. Chroma Space demonstrates the effective impacts of using a stylistic ...
An interactive multimedia framework for digital heritage narratives
Neeharika Adabala, Naren Datha, Joseph Joy, Chinmay Kulkarni, Ajay Manchepalli, Aditya Sankar, Rebecca Walton
Pages: 1445-1448
doi>10.1145/1873951.1874240
Full text: PDF

The cultural heritage of a region is conveyed by both tangible physical artifacts and intangible aspects in the form of stories, dance styles, rituals, etc. Hitherto, the task of creating digital representations for each of these aspects has been addressed ...
Natural interaction for cultural heritage: the archaeological site of Shawbak
Thomas Matteo Alisi, Gianpaolo D'Amico, Andrea Ferracani, Lea Landucci, Nicola Torpei
Pages: 1449-1452
doi>10.1145/1873951.1874241
Full text: PDF

One of the most interesting issues in the field of cultural heritage is the adoption of multimedia systems for the visualization and organization of information. In this paper we present a natural interaction based system designed to represent multimedia ...
Yongzheng emperor's interactive tabletop: seamless multimedia system in a museum context
Chun-Ko Hsieh, I-Ling Liu, Neng-Hao Yu, Yueh-Hsuan Chiang, Hsiang-Tao Wu, Ying-Jui Chen, Yi-Ping Hung
Pages: 1453-1456
doi>10.1145/1873951.1874242
Full text: PDF

In this paper, we propose the seamless multimedia system Yongzheng Emperor's interactive tabletop, which has been incorporated into the special exhibition "Harmony and Integrity: The Yongzheng Emperor and His Times" at the National Palace Museum ...
SESSION: Interactive art open workshop: interactive multimedia computing for creativity and expression track
Andruid Kerne
Interactive multimedia computing for creativity and expression
Andruid Kerne, Frank Nack, Luca Farulli
Pages: 1457-1458
doi>10.1145/1873951.1874244
Full text: PDF

In this paper we outline the aims and organization of the ACM MM 10 workshop on 'Interactive Multimedia Computing for Creativity and Expression'.
SESSION: Open source software competition -- OS1 track
Nicu Sebe
Opensmile: the munich versatile and fast open-source audio feature extractor
Florian Eyben, Martin Wöllmer, Björn Schuller
Pages: 1459-1462
doi>10.1145/1873951.1874246
Full text: PDF

We introduce the openSMILE feature extraction toolkit, which unites feature extraction algorithms from the speech processing and the Music Information Retrieval communities. Audio low-level descriptors such as CHROMA and CENS features, loudness, Mel-frequency ...
Open SVC decoder: a flexible SVC library
Médéric Blestel, Mickaël Raulet
Pages: 1463-1466
doi>10.1145/1873951.1874247
Full text: PDF

This paper describes the Open SVC Decoder project, an open source library which implements the Scalable Video Coding (SVC) standard, the latest standardized by the Joint Video Team (JVT). This library has been integrated into open source players The ...
Sonic visualiser: an open source application for viewing, analysing, and annotating music audio files
Chris Cannam, Christian Landone, Mark Sandler
Pages: 1467-1468
doi>10.1145/1873951.1874248
Full text: PDF

Sonic Visualiser is a friendly and flexible end-user desktop application for analysis, visualisation, and annotation of music audio files. Its stated goal is to be "the first program you reach for when you want to study a musical recording rather than simply ...
Vlfeat: an open and portable library of computer vision algorithms
Andrea Vedaldi, Brian Fulkerson
Pages: 1469-1472
doi>10.1145/1873951.1874249
Full text: PDF

VLFeat is an open and portable library of computer vision algorithms. It aims at facilitating fast prototyping and reproducible research for computer vision scientists and students. It includes rigorous implementations of common building blocks such ...
TOP-SURF: a visual words toolkit
Bart Thomee, Erwin M. Bakker, Michael S. Lew
Pages: 1473-1476
doi>10.1145/1873951.1874250
Full text: PDF

TOP-SURF is an image descriptor that combines interest points with visual words, resulting in a high performance yet compact descriptor that is designed with a wide range of content-based image retrieval applications in mind. TOP-SURF offers the flexibility ...
SESSION: Open source software competition -- OS2 track
Marco Bertini
FALCON: FAst Lucene-based Cover sOng identification
Emanuele Di Buccio, Nicola Montecchio, Nicola Orio
Pages: 1477-1480
doi>10.1145/1873951.1874252
Full text: PDF

We present FALCON, an open-source engine for content-based cover song identification written in Java. The popular Lucene search engine library is used as the core of the software, proving that textual methods in information retrieval can be successfully ...
The python computer vision framework
Bertrand Nouvel, Shin'Ichi Satoh
Pages: 1481-1484
doi>10.1145/1873951.1874253
Full text: PDF

PyCVF is an open source framework for computer vision and video-mining. It allows rapid development of applications and it provides standardized tools for common operations such as: browsing datasets, applying transformations to one dataset on-the-fly, ...
Torchvision the machine-vision package of torch
Sébastien Marcel, Yann Rodriguez
Pages: 1485-1488
doi>10.1145/1873951.1874254
Full text: PDF

This paper presents Torchvision, an open source machine vision package for Torch. Torch is a machine learning library providing a series of state-of-the-art algorithms such as Neural Networks, Support Vector Machines, Gaussian Mixture Models, Hidden ...
The openip open source image processing library
György Kovács, János István Iván, Árpád Pányik, Attila Fazekas
Pages: 1489-1492
doi>10.1145/1873951.1874255
Full text: PDF

The openIP open source image processing library is a set of C++ libraries providing tools for education, research and industrial purposes. The aim of the development is to fill in the gap between the academic and commercial utilization of image ...
An open-source SIFT library
Rob Hess
Pages: 1493-1496
doi>10.1145/1873951.1874256
Full text: PDF

Recent years have seen an explosion in the use of invariant keypoint methods across nearly every area of computer vision research. Since its introduction, the scale-invariant feature transform (SIFT) has been one of the most effective and widely-used ...
SESSION: Industrial exhibit -- IE1 track
Berna Erol
Virtual environment for surprises
Lara Oliveti, Marcella Albiero, Paolo Giordano
Pages: 1497-1498
doi>10.1145/1873951.1874258
Full text: PDF

Creation of a virtual interactive and highly evolved environment with Surprises characters.
RAPID: a reliable protocol for improving delay
Sanjeev Mehrotra, Jin Li, Cheng Huang
Pages: 1499-1500
doi>10.1145/1873951.1874259
Full text: PDF

Recently, there has been a dramatic increase in interactive cloud based software applications (e.g. working on remote machines, online games, interactive websites such as financial, web search) and other soft real-time applications (traffic within data ...
A location based reminder system for advertisement
Yiqun Li, Aiyuan Guo, Siying Liu, Yan Gao, Yan-Tao Zheng
Pages: 1501-1502
doi>10.1145/1873951.1874260
Full text: PDF

In this paper, we propose a location based reminder system with image recognition technology. With this system, mobile phone users can actively capture pictures from their favorite product or event promotional materials. After the phone user sends the ...
Embedded media marker: linking multimedia to paper
Qiong Liu, Chunyuan Liao, Lynn Wilcox, Anthony Dunnigan, Bee Liew
Pages: 1503-1504
doi>10.1145/1873951.1874261
Full text: PDF

An Embedded Media Marker (EMM) is a transparent mark printed on a paper document that signifies the availability of additional media associated with that part of the document. Users take a picture of the EMM using a camera phone, and the media associated ...
The virtual chocolate factory: mixed reality industrial collaboration and control
Maribeth Back, Don Kimber, Eleanor Rieffel, Anthony Dunnigan, Bee Liew, Sagar Gattepally, Jonathan Foote, Jun Shingu, James Vaughan
Pages: 1505-1506
doi>10.1145/1873951.1874262
Full text: PDF

We show several aspects of a complex mixed reality system that we have built and deployed in a real-world factory setting. In our system, virtual worlds, augmented realities, and social and mobile applications are all fed from the same infrastructure. ...
TalkMiner: a search engine for online lecture video
John Adcock, Matthew Cooper, Laurent Denoue, Hamed Pirsiavash, Lawrence A. Rowe
Pages: 1507-1508
doi>10.1145/1873951.1874263
Full text: PDF

TalkMiner is a search engine for lecture webcasts. Lecture videos are processed to recover a set of distinct slide images and OCR is used to generate a list of indexable terms from the slides. On our prototype system, users can search and browse lists ...
Multi-sensor fusion for interactive visual computing in mixed environment
Peng Patricia Wang, Tao Wang, Dayong Ding, Yimin Zhang, Kai Miao, Cynthia K. Pickering, Phil Tian, Jinxue Zhang
Pages: 1509-1510
doi>10.1145/1873951.1874264
Full text: PDF

Mobile Augmented Reality, as an emerging application for handheld devices, explores more natural interactions in real and virtual environments. For the purpose of accurate system response and manipulating objects in real-time, extensive efforts have ...
Uffizi touch®: a new experience with art
Marco Cappellini, Paolo De Rocco, Leonardo Serni
Pages: 1511-1512
doi>10.1145/1873951.1874265
Full text: PDF

Centrica (www.centrica.it) has developed Uffizi Touch®, the most famous art Gallery worldwide in an interactive digital signage solution you can place in your personal space. Demo videos at www.uffizitouch.com.
Social recommendation and visual analysis on the TV
Cathal Gurrin, Hyowon Lee, Paul Ferguson, Alan F. Smeaton, Noel E. O'Connor, Yoonhee Choi, Heeseon Park
Pages: 1513-1514
doi>10.1145/1873951.1874266
Full text: PDF

In this paper, we present prototype interactive TV software that incorporates visual content analysis tools and social networking in the home TV. We present the challenges of working with the living room TV environment and outline how we have utilized ...
Multimedia security technologies for movie protection
Michael Arnold, Séverine Baudry, Peter Baum, Xiao-Ming Chen, Bertrand Chupeau, Olivier Courtay, Gwenaël Doërr, Ulrich Gries, Frédéric Lefèbvre, Michel Morvan, Antoine Robert, Charles Salmon-Legagneur, Christophe Vincent, Mario de Vito
Pages: 1515-1516
doi>10.1145/1873951.1874267
Full text: PDF

In this industrial exhibit, Technicolor will showcase various technologies developed to protect digital audio-visual material all along the media value chain, from production to distribution, including forensics services in case pirated content is detected. ...
Efficient and robust near-duplicate detection in large and growing image data-sets
Thomas Pönitz, Julian Stöttinger
Pages: 1517-1518
doi>10.1145/1873951.1874268
Full text: PDF

Due to the increasing flood of digital images and the overall increase of storage capacity, large scale image databases are common these days. This work deals with the problem of finding replicas in image databases containing more than 100000 images. ...
File-based media workflows using ltfs tapes
Arnon Amir, David Pease, Rainer Richter, Brian Biskeborn, Michael Richmond, Lucas Villa Real
Pages: 1519-1520
doi>10.1145/1873951.1874269
Full text: PDF

While digital video cameras have existed for over two decades, digital video cassettes are still the primary storage medium in professional video archives. One of the major inhibitors in the transition to file-based workflows and media archives ...
PopApp: the first 3D popular application for non-linear presentations
Claudio Mazzanti
Pages: 1521-1522
doi>10.1145/1873951.1874270
Full text: PDF

PopApp is an application that aims to organize and publish multimedia content in a 3D graphical interface with a non-linear hierarchical structure. Presentations can be quickly and easily created and edited using the PopApp Editor. A step forward from ...
Large scale partially duplicated web image retrieval
Wengang Zhou, Yijuan Lu, Houqiang Li, Yibing Song, Qi Tian
Pages: 1523-1524
doi>10.1145/1873951.1874271
Full text: PDF

The state-of-the-art image retrieval approaches represent images with a high dimensional vector of visual words by quantizing local features, such as SIFT, in the descriptor space. The geometric clues among visual words in an image are usually ignored ...
Visual search applications for connecting published works to digital material
Jamey Graham, Jorge Moraleda, Jonathan J. Hull, Timothee Bailloeul, Xu Liu, Andrea Mariotta
Pages: 1525-1526
doi>10.1145/1873951.1874272
Full text: PDF

Visual search connects physical (offline) objects with (online) digital media. Using objects from the environment, like newspapers, magazines, books and posters, we can retrieve supplemental information from the online world. In this demonstration, ...
SCOTT: set cover tracing technology
Dulce Ponceleon, Jeff Lostpiech, Hongxia Jin, Eric Wilcox
Pages: 1527-1528
doi>10.1145/1873951.1874273
Full text: PDF

In this paper, we describe SCOTT: a demonstration system that uses the Set Cover Tracing algorithm for determining the source of pirate content. This algorithm is very efficient in dealing with collusion attacks - the performance is close to linear in ...
Physical hyperlinks for citizen interaction
Dustin Haisler, Phil Tate
Pages: 1529-1530
doi>10.1145/1873951.1874274
Full text: PDF

In 2008, the City of Manor deployed Quick Response barcodes, also known as QR-codes, throughout the community of 6,500 people. The QR-codes were initially intended as a document management solution, but eventually turned into a powerful and engaging ...
Mobile document scanning and copying
Jian Fan, Qian Lin, Jerry Liu
Pages: 1531-1532
doi>10.1145/1873951.1874275
Full text: PDF

In this paper, we show a multimedia system for processing mobile camera captured documents. Using a client application on a mobile phone, a user can capture a document image, and send the image to a processing server so that the document image can be ...
Data-driven behavioural algorithms for online advertising
Antonio Tomarchio, Francesco Bellacci, Filippo Privitera
Pages: 1533-1534
doi>10.1145/1873951.1874276
Full text: PDF

In this paper, we describe an innovative data-driven behavioural approach that we developed for the optimization of performance online advertising on Simply, the new international adnetwork developed by Dada spa.
DEMONSTRATION SESSION: Demo - D1 track
Daniel Gatica-Perez
Crowdsourcing rock n' roll multimedia retrieval
Cees G.M. Snoek, Bauke Freiburg, Johan Oomen, Roeland Ordelman
Pages: 1535-1538
doi>10.1145/1873951.1874278
Full text: PDF

In this technical demonstration, we showcase a multimedia search engine that facilitates semantic access to archival rock n' roll concert video. The key novelty is the crowdsourcing mechanism, which relies on online users to improve, extend, and share, ...
Media distribution over 2D communication sheet
Youiti Kado, Bing Zhang, Jiang Yu Zheng
Pages: 1539-1542
doi>10.1145/1873951.1874279
Full text: PDF

This paper will demonstrate a media infrastructure that can distribute multimedia signals and power via a two dimensional communication sheet. Small and low power devices placed on top of the sheet will be able to receive audio and video signals transmitted ...
Melog
Hongzhi Li, Xian-Sheng Hua, Xijia Liu
Pages: 1543-1546
doi>10.1145/1873951.1874280
Full text: PDF

We demonstrate Melog, a "mobile + cloud" multimedia system enabling efficient and near-realtime experience sharing through automatic blogging and micro-blogging, which are based on multi-modal media content analyses and syntheses. Unlike existing mobile ...
Color and luminance compensation for mobile panorama construction
Yingen Xiong, Kari Pulli
Pages: 1547-1550
doi>10.1145/1873951.1874281
Full text: PDF

We provide an efficient technique of color and luminance compensation for sequences of overlapping images. It can be used in construction of high-resolution and high-quality panoramic images even when the input images have very different colors and luminance. ...
iPhotobook: creating photo books on mobile devices
Jun Xiao, Nic Lyons, C. Brian Atkins, Yuli Gao, Hui Chao, Xuemei Zhang
Pages: 1551-1554
doi>10.1145/1873951.1874282
Full text: PDF

The number of photos captured and stored with mobile devices is growing rapidly. We regularly see traditional desktop multimedia applications being ported to mobile devices. However, less often do we see novel interaction mechanisms being developed ...
Blog2Book: transforming blogs into photo books employing aesthetic principles
Philipp Sandhaus, Mohammad Rabbath, Ilja Erbis, Susanne Boll
Pages: 1555-1556
doi>10.1145/1873951.1874283
Full text: PDF

For many people, web blogs are the preferred means to document important moments of their lives, e.g. a holiday trip or the year abroad. Such blogs contain photos and textual descriptions of events in a well-structured form. However, while being a perfect ...
TagCaptcha: annotating images with CAPTCHAs
Donn Morrison, Stéphane Marchand-Maillet, Éric Bruno
Pages: 1557-1558
doi>10.1145/1873951.1874284
Full text: PDF

We demonstrate our TagCaptcha image annotation system. TagCaptcha presents the user with a number of images that must be correctly labelled in order to pass a human verification test on the web. The images are divided into two subsets: a control or verification ...
DEMONSTRATION SESSION: Demo - D2 track
James Lynch
A technical demonstration of large-scale image object retrieval by efficient query evaluation and effective auxiliary visual feature discovery
Yin-Hsi Kuo, Yi-Lun Wu, Kuan-Ting Chen, Yi-Hsuan Yang, Tzu-Hsuan Chiu, Winston H. Hsu
Pages: 1559-1562
doi>10.1145/1873951.1874286
Full text: PDF

In this demonstration, we present a real-time system that addresses three essential issues of large-scale image object retrieval: 1) image object retrieval-facilitating pseudo-objects in inverted indexing and novel object-level pseudo-relevance feedback ...
SIVA suite: authoring system and player for interactive non-linear videos
Britta Meixner, Beate Siegel, Günther Hölbling, Franz Lehner, Harald Kosch
Pages: 1563-1566
doi>10.1145/1873951.1874287
Full text: PDF

In this paper, an intuitive authoring system and player for interactive non-linear video called SIVA Suite is presented for demonstration. Such videos are enriched by additional content. Possible forms of additional content are plaintext, richtext, images ...
Integrated mobile visualization and interaction of events and POIs
Daniel Schmeiß, Ansgar Scherp, Steffen Staab
Pages: 1567-1570
doi>10.1145/1873951.1874288
Full text: PDF

We propose a new approach for mobile visualization and interaction of temporal information by integrating support for time with today's most prevalent visualization of spatial information, the map. Our approach allows for an easy and precise selection ...
3D ancient mosaics
Sebastiano Battiato, Giovanni Puglisi
Pages: 1571-1574
doi>10.1145/1873951.1874289
Full text: PDF

Digital 3D mosaic generation is a current trend in the NPR (Non-Photorealistic Rendering) field; in this demo we present an interactive system implemented in Java where the user can simulate an ancient mosaic in a 3D environment starting from any input image. ...
Training data collection system for a learning-based photographic aesthetic quality inference engine
Razvan Orendovici, James Z. Wang
Pages: 1575-1578
doi>10.1145/1873951.1874290
Full text: PDF

We present a novel data collection system deployed for the ACQUINE - Aesthetic Quality Inference Engine. The goal of the system is to collect online user opinions, both structured and unstructured, for training future generation learning-based aesthetic ...
Photo2Trip: an interactive trip planning system based on geo-tagged photos
Huagang Yin, Xin Lu, Changhu Wang, Nenghai Yu, Lei Zhang
Pages: 1579-1582
doi>10.1145/1873951.1874291
Full text: PDF

In this technical demonstration, we present a novel interactive trip planning system, i.e. Photo2Trip, by leveraging existing travel clues recovered from 20 million geo-tagged photos. Compared with the most common ways of trip planning, such as surveying ...
Coming together: negotiated content by multi-agents
Arne Eigenfeldt
Pages: 1583-1586
doi>10.1145/1873951.1874292
Full text: PDF

In this paper, we describe a software system that generates unique musical compositions in realtime, created by four autonomous multi-agents. Given no explicit musical data, agents explore their environment, building beliefs through interactions with ...
Mobile product recognition
Sam S. Tsai, David Chen, Vijay Chandrasekhar, Gabriel Takacs, Ngai-Man Cheung, Ramakrishna Vedantham, Radek Grzeszczuk, Bernd Girod
Pages: 1587-1590
doi>10.1145/1873951.1874293
Full text: PDF

We present a mobile product recognition system for the camera-phone. By snapping a picture of a product with a camera-phone, the user can retrieve online information of the product. The product is recognized by an image-based retrieval system located ...
DEMONSTRATION SESSION: Demo - D3 track
Kiyoharu Aizawa
Joke-o-Mat HD: browsing sitcoms with human derived transcripts
Adam Janin, Luke Gottlieb, Gerald Friedland
Pages: 1591-1594
doi>10.1145/1873951.1874295
Full text: PDF

Joke-o-mat HD is a system that allows a user to navigate sitcoms (such as Seinfeld) by "narrative themes", including scenes, punchlines, and dialog segments. The themes can be filtered by the main actors and by keyword. For example, the user can ...
Multi-exposure imaging on mobile devices (demo)
Natasha Gelfand, Andrew Adams, Sung Hee Park, Kari Pulli
Pages: 1595-1598
doi>10.1145/1873951.1874296
Full text: PDF

Many natural scenes have a dynamic range that is larger than the dynamic range of a camera's image sensor. A popular approach to producing an image without under- and over-exposed areas is to capture several input images with varying exposure settings, ...
iComics: automatic conversion of movie into comics
Richang Hong, Meng Wang, Guangda Li, Xiao-Tong Yuan, Shuicheng Yan, Tat-Seng Chua
Pages: 1599-1602
doi>10.1145/1873951.1874297
Full text: PDF

This demonstration presents a system, named iComics, for automatic conversion of movie into comics. We design three components to realize the system: script-face mapping, key-scene extraction, and cartoonization. Script-face mapping utilizes face recognition ...
vESP: enriching enterprise document search results with aligned video summarization
Pål Halvorsen, Dag Johansen, Bjørn Olstad, Tomas Kupka, Sverre Tennøe
Pages: 1603-1604
doi>10.1145/1873951.1874298
Full text: PDF

In this demo, we present a video-enabled enterprise search platform (vESP), an application prototype that enhances a widely deployed commercial enterprise search engine with video streaming. The idea is that for example in a large enterprise, like Microsoft, ...
MindFinder: interactive sketch-based image search on millions of images
Yang Cao, Hai Wang, Changhu Wang, Zhiwei Li, Liqing Zhang, Lei Zhang
Pages: 1605-1608
doi>10.1145/1873951.1874299
Full text: PDF

In this paper, we showcase the MindFinder system, which is an interactive sketch-based image search engine. Different from existing work, most of which is limited to a small scale database or only enables single modality input, MindFinder is a sketch-based ...
Facilitating interactive search and navigation in videos
Klaus Schoeffmann
Pages: 1609-1612
doi>10.1145/1873951.1874300
Full text: PDF

We present a tool that can efficiently facilitate interactive navigation and search in videos. In addition to browsing a video by shots it also allows a user to navigate through a video with extended seeker bars showing time-related content abstractions. ...
BIOFACE: a biometric face demonstrator
Mourad Ouaret, Antitza Dantcheva, Rui Min, Lionel Daniel, Jean Luc Dugelay
Pages: 1613-1616
doi>10.1145/1873951.1874301
Full text: PDF

In this paper, a demonstrator called BIOFACE incorporating several facial biometric techniques is described. It includes the well established Eigenfaces and the recently published Tomofaces techniques, which perform face recognition based on facial appearance ...
ClustTour: city exploration by use of hybrid photo clustering
Symeon Papadopoulos, Christos Zigkolis, Stefanos Kapiris, Yiannis Kompatsiaris, Athena Vakali
Pages: 1617-1620
doi>10.1145/1873951.1874302
Full text: PDF

We present a technical demonstration of an online city exploration application that helps users identify interesting spots in a city by use of photo clusters corresponding to landmarks and events. Our application, called ClustTour, is based on an efficient ...
DEMONSTRATION SESSION: Demo - D4 track
Winston Hsu
Rerum novarum: interactive exploration of illuminated manuscripts
Daniele Borghesani, Costantino Grana, Rita Cucchiara
Pages: 1621-1624
doi>10.1145/1873951.1874304
Full text: PDF

This paper describes an interactive application for the exploration and annotation of illuminated manuscripts, which typically contain thousands of pictures, used to comment or embellish the manuscript Gothic text. The system is composed of a modern ...
Sirio, orione and pan: an integrated web system for ontology-based video search and annotation
Marco Bertini, Gianpaolo D'Amico, Andrea Ferracani, Marco Meoni, Giuseppe Serra
Pages: 1625-1628
doi>10.1145/1873951.1874305
Full text: PDF

In this technical demonstration we show an integrated web system for video search and annotation based on ontologies. The system is composed of three components: the Orione ontology-based search engine, the Sirio\footnote{Sirio was the hound of Orione. ...
Web-based semantic browsing of video collections using multimedia ontologies
Marco Bertini, Gianpaolo D'Amico, Andrea Ferracani, Marco Meoni, Giuseppe Serra
Pages: 1629-1632
doi>10.1145/1873951.1874306
Full text: PDF

In this technical demonstration we present a novel web-based tool that allows user-friendly semantic browsing of video collections, based on ontologies, concepts, concept relations and concept clouds. The system is developed as a Rich Internet Application ...
MediaTable: a tool for categorizing multimedia collections
Ork de Rooij, Marcel Worring
Pages: 1633-1636
doi>10.1145/1873951.1874307
Full text: PDF

In this technical demonstration, we present MediaTable, our interactive multimedia collection search and categorization tool. MediaTable allows users to search through, and categorize a multimedia collection with ease by employing several familiar interface ...
Interactive person-retrieval in TV series and distributed surveillance video
Martin Bäuml, Mika Fischer, Keni Bernardin, Hazim K. Ekenel, Rainer Stiefelhagen
Pages: 1637-1638
doi>10.1145/1873951.1874308
Full text: PDF

Tracking and identifying persons in videos are important building blocks in many applications. For browsing of multimedia data or interactive investigation of surveillance footage it is not even necessary to uniquely identify a person. Rather it often ...
Trajectory-based visualization of web video topics
Juan Cao, Chong-Wah Ngo, YongDong Zhang, DongMing Zhang, Liang Ma
Pages: 1639-1642
doi>10.1145/1873951.1874309
Full text: PDF

While there have been research efforts in organizing large scale web videos into clusters or topics, efficient browsing of web video topics remains a challenging problem not yet addressed. The related issues include how to efficiently browse and track ...
Adding haptic feature to YouTube
Md. Abdur Rahman, Abdulmajeed Alkhaldi, Jongeun Cha, Abdulmotaleb El Saddik
Pages: 1643-1646
doi>10.1145/1873951.1874310
Full text: PDF

In this paper, we present a web-based framework in which users can annotate tactile feeling to a YouTube video and experience the tactile feeling by wearing a tactile device while watching/annotating the video. The tactile device is embedded into a wearable ...
Assisted news reading with automated illustration
Diogo Delgado, Joao Magalhaes, Nuno Correia
Pages: 1647-1650
doi>10.1145/1873951.1874311
Full text: PDF

We have all had the problem of forgetting what we read just a few sentences before. This comes from the problem of attention and is more common with children and the elderly. People feel either bored or distracted by something more interesting. This paper ...
MediaPick: tangible semantic media retrieval system
Gianpaolo D'Amico, Andrea Ferracani, Lea Landucci, Matteo Mancini, Daniele Pezzatini, Nicola Torpei
Pages: 1651-1654
doi>10.1145/1873951.1874312
Full text: PDF

This paper addresses the design and development of MediaPick [1], an interactive multi-touch system for semantic search of multimedia contents. Our solution provides an intuitive, easy-to-use way to select concepts organized according to an ontological ...
Effects of environmental colour on mood: a wearable LifeColour capture device
Aiden R. Doherty, Philip Kelly, Brendan O'Flynn, Padraig Curran, Alan F. Smeaton, Cian O'Mathuna, Noel E. O'Connor
Pages: 1655-1658
doi>10.1145/1873951.1874313
Full text: PDF

Colour is everywhere in our daily lives and impacts things like our mood, yet we rarely take notice of it. One method of capturing and analysing the predominant colours that we encounter is through visual lifelogging devices such as the SenseCam. However ...
DEMONSTRATION SESSION: Demo - D5 track
Paul Natsev
Mobile video browsing and retrieval with the OVIDIUS platform
Andrei Bursuc, Titus Zaharia, Françoise Prêteux
Pages: 1659-1662
doi>10.1145/1873951.1874315
Full text: PDF

This paper describes a mobile video browsing and retrieval approach, based on the so-called OVIDIUS (On-line VIDeo Indexing Universal System) platform. In contrast with traditional and commercial video retrieval platforms, where video content is treated ...
Serious games for health: personalized exergames
Stefan Göbel, Sandro Hardy, Viktor Wendel, Florian Mehm, Ralf Steinmetz
Pages: 1663-1666
doi>10.1145/1873951.1874316
Full text: PDF

In this paper, we describe a set of personalized exergames which combine methods and concepts of serious games, adaptation and personalization, authoring and sensor technologies. Compared to existing systems, the set of games does not only keep track ...
A GPU-accelerated face annotation system for smartphones
Yi-Chu Wang, Sydney Pang, Kwang-Ting Cheng
Pages: 1667-1668
doi>10.1145/1873951.1874317
Full text: PDF

Face annotation makes it easy to share and manage digital photos and videos. While state-of-the-art face recognition algorithms can achieve high accuracy to support automatic face annotation, their implementations on an embedded platform cannot achieve ...
Crew: cross-modal resource searching by exploiting wikipedia
Chen Liu, Beng Chin Ooi, Anthony K.H. Tung, Dongxiang Zhang
Pages: 1669-1672
doi>10.1145/1873951.1874318
Full text: PDF

In Web 2.0, users have generated and shared massive amounts of resources in various media formats, such as news, blogs, audios, photos and videos. The abundance and diversity of the resources call for better integration to improve the accessibility. ...
Construction of image retrieval systems focused on user knowledge interaction
Tomoko Kajiyama, Shin'ichi Satoh
Pages: 1673-1676
doi>10.1145/1873951.1874319
Full text: PDF

Our objective was to apply different kinds of database with our proposed graphical search interface, and to verify the effectiveness focused on user knowledge structure in searching because it allowed users to easily modify received information to suitable ...
Visualization of concurrent tones in music with colours
Peter Ciuha, Bojan Klemenc, Franc Solina
Pages: 1677-1680
doi>10.1145/1873951.1874320
Full text: PDF

Visualizing music in a meaningful and intuitive way is a challenge. Our aim is to visualize music by interconnecting similar aspects in music and in visual perception. We focus on visualizing harmonic relationships between tones and colours. Related ...
Changing characters' point of view in interactive storytelling
Fred Charles, Julie Porteous, Marc Cavazza
Pages: 1681-1684
doi>10.1145/1873951.1874321
Full text: PDF

Virtual characters are at the epicentre of Interactive Storytelling systems and in recent years multiple AI planning approaches have been described to specify their autonomous behaviour. This demonstrator provides an overview of our novel approach to ...
Speeding up mobile multimedia applications
Jiang Gao
Pages: 1685-1688
doi>10.1145/1873951.1874322
Full text: PDF

Mobile devices are becoming ubiquitous multimedia computing platforms. However, due to limited computational power on these devices, a good mobile application requires far more considerations in algorithm design and optimization than for desktop systems. ...
A multimedia approach to visualize and interact with large scale mobile LiDAR data
James D. Lynch, Xin Chen, Roger B. Hui
Pages: 1689-1692
doi>10.1145/1873951.1874323
Full text: PDF

This paper presents a multimedia visualization tool for large-scale mobile LIDAR, panoramic imagery, high-resolution view targeted camera imagery, and GPS/IMU geo-location. A first of its kind system joins all sensor data providing a powerful tool for ...
Automatic skin enhancement with visible and near-infrared image fusion
Sabine Süsstrunk, Clément Fredembach, Daniel Tamburrino
Pages: 1693-1696
doi>10.1145/1873951.1874324
Full text: PDF

Skin tones, portraits in particular, are of critical importance in photography and video, but a number of factors, such as pigmentation irregularities (e.g., moles, freckles), irritation, roughness, or wrinkles can reduce their appeal. Moreover, such ...
SESSION: Doctoral symposium - DS1 track
Susanne Boll, Carlo Colombo
Free-hand sketch based image and video retrieval
Rui Hu
Pages: 1697-1698
doi>10.1145/1873951.1874326
Full text: PDF

We present an overview of our work to date on sketch-based retrieval of images and video. We present a fast technique for extracting motion trajectories from videos and a Viterbi matching approach for retrieving video clips using free-hand sketched ...
Automatic and manual processes in end-user multimedia authoring tools: where is the balance?
Rodrigo Laiola Guimarães
Pages: 1699-1700
doi>10.1145/1873951.1874327
Full text: PDF

This thesis aims to analyze, model, and develop a framework for next-generation multimedia authoring tools targeted to end-users. In particular, I concentrate on the combination of automatic and manual processes for the realization of such a framework. ...
Analysis and classification of conversational interactions
Anna Pesarin
Pages: 1701-1702
doi>10.1145/1873951.1874328
Full text: PDF
Flashboost: design of flash memory buffer cache mechanism for video-on-demand
Moonkyung Ryu
Pages: 1703-1704
doi>10.1145/1873951.1874329
Full text: PDF

A magnetic disk is a serious bottleneck which limits the scalability of a video server due to its head seek overhead. For a video server, Interval Caching is a state-of-the-art caching mechanism that addresses the problem utilizing RAM as a buffer ...
Interoperable and unified multimedia retrieval in distributed and heterogeneous environments
Florian Stegmaier
Pages: 1705-1706
doi>10.1145/1873951.1874330

In this abstract, the research topics of my doctoral thesis will be introduced. These emerged within THESEUS, in which I work as a third-party funded researcher. The overall aim of my work is to provide unified and interoperable multimedia ...
SESSION: Discussion room - DR1 track
Ramesh Jain
Towards a universal detector by mining concepts with small semantic gaps
Jiashi Feng, Yan-tao Zheng, Shuicheng Yan
Pages: 1707-1710
doi>10.1145/1873951.1874332

Can we have a universal detector that could recognize unseen objects with no training exemplars available? Such a detector is so desirable, as there are hundreds of thousands of object concepts in human vocabulary but few available labeled image examples. ...
Intelligent query: open another door to 3d object retrieval
Yue Gao, Meng Wang, Jialie Shen, Qionghai Dai, Naiyao Zhang
Pages: 1711-1714
doi>10.1145/1873951.1874333

The increasing number of available 3D objects makes their efficient retrieval technology highly desired. Extensive research has been dedicated to view-based 3D object retrieval because of its advantage of 2D views for 3D object content representation. ...
Interactive storytelling via video content recombination
Julie Porteous, Sergio Benini, Luca Canini, Fred Charles, Marc Cavazza, Riccardo Leonardi
Pages: 1715-1718
doi>10.1145/1873951.1874334

In the paper we present a prototype of video-based storytelling that is able to generate multiple story variants from a baseline video. The video content for the system is generated by an adaptation of forefront video summarisation techniques that decompose ...
PANEL SESSION: Panel - PA1
Ed Chang, Tat-Seng Chua
The use of non-conventional methods for content analysis and understanding: panel overview
Nicu Sebe, Qi Tian
Pages: 1719-1720
doi>10.1145/1873951.1874336

This panel will enable the participants to understand key concepts, state-of-the-art techniques, and open issues in content analysis and understanding that make use of non-conventional methods. As such we will cover aspects such as (1) eye gaze for multimodal ...
PANEL SESSION: Panel - PA2
Ed Chang, Tat-Seng Chua
All things mobile: the present and future of mobile phone computing
Daniel Gatica-Perez
Pages: 1721-1722
doi>10.1145/1873951.1874338

This is the summary of the panel All Things Mobile: The Present and Future of Mobile Phone Computing.
PANEL SESSION: Panel - PA3
Ed Chang, Tat-Seng Chua
"Disputatio" on the Use of Ontologies in Multimedia
Simone Santini, Amarnath Gupta
Pages: 1723-1728
doi>10.1145/1873951.1874340
WORKSHOP SESSION: Workshop overviews track
eHeritage 2010: 2nd ACM workshop on eHeritage and digital art preservation
Olga Pereira Bellon, Ilan Shimshoni, Matteo Dellepiane
Pages: 1729-1730
doi>10.1145/1873951.1874342
ACM workshop on 3d object retrieval: 3DOR'10 chair's welcome
Mohamed Daoudi, Michela Spagnuolo, Remco Veltkamp
Pages: 1731-1732
doi>10.1145/1873951.1874343

3D media has emerged rapidly as a new type of content within the multimedia domain. The recent acceleration of 3D content production, witnessed across all fields up to user-generated content, is causing a huge amount of traffic and data stored and transmitted ...
MML 2010: international workshop on machine learning and music
Rafael Ramirez, Darrell Conklin, Christina Anagnostopoulou, José M. Iñesta
Pages: 1733-1734
doi>10.1145/1873951.1874344

MML 2010, the International Workshop on Machine Learning and Music, continues a series of workshops related to artificial intelligence and machine learning in music. In this short article the Programme Chairs summarize the content of the workshop.
ACM workshop on mobile video delivery
Mainak Chatterjee, Samrat Ganguly
Pages: 1735-1736
doi>10.1145/1873951.1874345
WSM'10: 2nd ACM workshop on social media
Susanne Boll, Steven C.H. Hoi, Roelof van Zwol, Jiebo Luo
Pages: 1737-1738
doi>10.1145/1873951.1874346

The ACM SIGMM International Workshop on Social Media (WSM'10) is the second workshop held in conjunction with the ACM International Multimedia Conference (MM'10) at Firenze, Italy, 2010. This workshop provides a forum for researchers and practitioners ...
Modeling, detecting, and processing events in multimedia
Ansgar Scherp, Ramesh Jain, Mohan Kankanhalli, Vasileios Mezaris
Pages: 1739-1740
doi>10.1145/1873951.1874347
Second ACM international workshop on multimedia in forensics, security and intelligence (MiFor 2010)
Sebastiano Battiato, Sabu Emmanuel, Adrian Ulges, Marcel Worring
Pages: 1741-1742
doi>10.1145/1873951.1874348

This paper introduces the context of the workshop and the associated papers.
The second ACM international workshop on multimedia technologies for distance learning (MTDL 2010)
Timothy K. Shih, Rynson Lau, Nadia Magnenat-Thalmann, Marc Spaniol, Baltasar Fernández-Manjón
Pages: 1743-1744
doi>10.1145/1873951.1874349

The MTDL 2010 workshop in its second edition aims to continue in the contribution and evaluation of the impact of multimedia technologies to e-Learning. This workshop is held in conjunction with the ACM Multimedia 2010 Conference in Firenze (Italy). ...
ACM multimedia 2010 workshop on 3D video processing
Oliver Schreer, Adrian Hilton, Emanuele Trucco
Pages: 1745-1746
doi>10.1145/1873951.1874350

Research on 3D video processing has gained a tremendous amount of momentum due to advances in video communications, broadcasting and entertainment technology (e.g., animation blockbusters like Avatar and Up). There is an increasing need for reliable ...
Multimedia content with a speech track: ACM multimedia 2010 workshop on searching spontaneous conversational speech
Martha Larson, Roeland Ordelman, Florian Metze, Wessel Kraaij, Franciska de Jong
Pages: 1747-1748
doi>10.1145/1873951.1874351
First ACM international workshop on analysis and retrieval of tracked events and motion in imagery streams (ARTEMIS 2010)
Anastasios Doulamis, Jordi Gonzàlez
Pages: 1749-1750
doi>10.1145/1873951.1874352

The advancement of novel capabilities for video understanding does increase the cross-fertilization between multiple computer vision and pattern recognition research topics. ARTEMIS2010 provides the forum for discussing a holistic view on the interpretation ...
3rd international workshop on automated information extraction in media production
Alberto Messina, Robbie De Sutter, Jean-Pierre Evain, Masanori Sano, Gerald Friedland
Pages: 1751-1752
doi>10.1145/1873951.1874353

The third Workshop on Automated Information Extraction in Media Production (AIEMPro10) aims at fostering exchange of ideas and of practices between leading experts in research and leading actors in the media community, in order to catalyze the migration ...
Pervasive video analysis: workshop overview
Hamid Aghajan, Marco Cristani, Vittorio Murino, Nicu Sebe
Pages: 1753-1754
doi>10.1145/1873951.1874354

This workshop aims at tackling the novel challenging scenarios in pervasive video analysis which require not only to address specific problems (e.g., tracking, recognition) on a single view, but to deal with a set of distributed observations, eventually ...
ACM workshop on mobile cloud media computing
Xian-Sheng Hua, Gang Hua, Chang Wen Chen
Pages: 1755-1756
doi>10.1145/1873951.1874355

Smart mobile devices such as camera phones typically will be carried by people all the time. These devices are true "multimedia" devices that acquire, process, transmit and present text, image, video and audio data. However, due to the limitations in ...
ACM workshop on advanced video streaming techniques for peer-to-peer networks and social networking
Gabriella Olmo, Christian Timmerer, Pascal Frossard, Keith Mitchell
Pages: 1757-1758
doi>10.1145/1873951.1874356

This paper provides a summary and overview of the ACM workshop on advanced video streaming techniques for peer-to-peer networks and social networking.
3rd international workshop on affective interaction in natural environments (AFFINE)
Ginevra Castellano, Kostas Karpouzis, Jean-Claude Martin, Louis-Philippe Morency, Christopher Peters, Laurel D. Riek
Pages: 1759-1760
doi>10.1145/1873951.1874357

The 3rd International Workshop on Affective Interaction in Natural Environments, AFFINE, follows a number of successful AFFINE workshops and events commencing in 2008. A key aim of AFFINE is the identification and investigation of significant open issues ...
ACM international workshop on social, adaptive and personalized multimedia interaction and access (SAPMIA 2010)
David Vallet, Naeem Ramzan, Martin Halvey, Charalampos Z. Patrikakis
Pages: 1761-1762
doi>10.1145/1873951.1874358

In an effort to address and overcome some of the open issues that hinder effective access and interaction of multimedia content, this workshop will bring together individuals from a number of research communities, including but not limited to Multimedia ...
Overview of ACM international workshop on connected multimedia
Zhongfei (Mark) Zhang, Zhengyou Zhang, Ramesh Jain, Yueting Zhuang
Pages: 1763-1764
doi>10.1145/1873951.1874359

Following the very first international workshop on connected multimedia held in Hangzhou, China, in October of 2009 jointly sponsored by US National Science Foundation and Zhejiang University of China, this is the very first ACM International Workshop ...
MM'10 workshop summary for SSPW: ACM workshop on social signal processing 2010
Maja Pantic, Alessandro Vinciarelli, Alex Pentland
Pages: 1765-1766
doi>10.1145/1873951.1874360

The Workshop on Social Signal Processing (SSPW) is the yearly event of the Social Signal Processing Network (EU-FP7 SSPNet project). This year's workshop programme consists of 4 premium Key Note Talks by Jeff Cohn, Alex Pentland, Justine Cassell, and ...
ACM workshop on surreal media and virtual cloning
Ebroul Izquierdo, Yang Cai, Qianni Zhang, Manuel García-Herranz
Pages: 1767-1768
doi>10.1145/1873951.1874361

This paper gives an overview of ACM Multimedia 2010 Workshop on Surreal Media and Virtual Cloning, including research work towards the creation of surreal media and realistic 3D virtual environments where virtual humans and objects can interact remotely. ...
ACM international workshop on very-large-scale multimedia corpus, mining and retrieval (VLS-MCMR'10)
Benoit Huet, Tat-Seng Chua, Alexander Hauptmann
Pages: 1769-1770
doi>10.1145/1873951.1874362

The purpose of this workshop is to bring together researchers interested in the construction and analysis of Very Large Scale Multimedia Corpus, as well as the methodologies to Mine and Retrieve information from them. The Workshop will provide a forum ...
TUTORIAL SESSION: Tutorials track
Processing web-scale multimedia data
Malcolm Slaney, Edward Y. Chang
Pages: 1771-1772
doi>10.1145/1873951.1874364

The Internet brings us access to multimedia databases with billions of data instances. The massive amount of data available to researchers and application developers brings both opportunities and challenges. In particular, massive amount of data makes ...
Advances in multimedia retrieval, part i: frontiers in multimedia search
Alan Hanjalic, Martha Larson
Pages: 1773-1774
doi>10.1145/1873951.1874365
Video search engines: advances in multimedia retrieval, part ii [1]
Cees G.M. Snoek, Arnold W.M. Smeulders
Pages: 1775-1776
doi>10.1145/1873951.1874366

In this tutorial, we focus on the challenges in video search, present methods how to achieve state-of-the-art performance, and indicate how to obtain improvements in the near future. Moreover, we give an overview of the latest developments and future ...
Understanding multimedia content using web scale social media data
Dong Xu, Lei Zhang, Jiebo Luo
Pages: 1777-1778
doi>10.1145/1873951.1874367

Nowadays, increasingly rich and massive social media data (such as texts, images, audios, videos, blogs, and so on) are being posted to the web, including social networking websites (e.g., MySpace, Facebook), photo and video sharing websites (e.g., Flickr, ...
Mobile video streaming in modern wireless networks
Mohamed Hefeeda, Cheng-Hsin Hsu
Pages: 1779-1780
doi>10.1145/1873951.1874368

Increasingly more users use mobile devices to watch videos streamed over wireless networks, and they demand more content at better quality. For example, market forecasts reveal that mobile video streaming, such as mobile TV, will catch up with gaming ...
Immersive future media technologies: from 3D video to sensory experiences
Christian Timmerer, Karsten Müller
Pages: 1781-1782
doi>10.1145/1873951.1874369

In this tutorial we present immersive future media technologies ranging from 3D video to sensory experiences. The former targets stereo and multi-view video technologies whereas the latter aims at stimulating other senses than vision or audition enabling ...
Modeling human behavior with mobile phones
Daniel Gatica-Perez
Pages: 1783-1784
doi>10.1145/1873951.1874370

In just a few years, mobile phones have emerged as the ultimate multimedia device. This is the summary of a proposed tutorial on Modeling Human Behavior with Mobile Phones, which aims to present the scientific and technological state-of-the-art in mobile ...
Human-centered multimedia systems: tutorial overview
Nicu Sebe, Alejandro Jaimes, Hamid Aghajan
Pages: 1785-1786
doi>10.1145/1873951.1874371

This tutorial will focus on technical analysis and interaction techniques formulated from the perspective of key human factors in a user-centered approach to developing multimedia systems. The tutorial will take a holistic view on the research issues ...
Designing and optimizing large-scale multimedia mining applications in distributed processing environments
Deepak S. Turaga, Mihaela van der Schaar
Pages: 1787-1788
doi>10.1145/1873951.1874372

In this tutorial, we will present the fundamental principles of large-scale adaptive multimedia stream mining, describe state-of-the-art in terms of systems and algorithms, and include recent theoretical and experimental results. We will also discuss ...