Searched for keywords.author.keyword:"multimedia and multimodal retrieval" OR acmdlCCS:"multimedia and multimodal retrieval"
Searched The ACM Full-Text Collection: 569,850 records (The ACM Guide to Computing Literature: 2,871,578 records)
2,048 results found
Result 1 – 20 of 2,048

1 published by ACM
November 2014 MAED '14: Proceedings of the 3rd ACM International Workshop on Multimedia Analysis for Ecological Data
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 2,   Downloads (12 Months): 19,   Downloads (Overall): 80

Full text available: PDF
To build a detailed knowledge of the biodiversity, the geographical distribution and the evolution of the alive species is essential for a sustainable development and the preservation of this biodiversity. Massive databases of underwater video surveillance have been recently made available for supporting designing algorithms targeting the identification of fishes. ...
Keywords: specialized information retrieval, video search, multimedia and multimodal retrieval

2 published by ACM
October 2017 MM '17: Proceedings of the 25th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 12,   Downloads (12 Months): 72,   Downloads (Overall): 185

Full text available: PDF
Subspace representations have been widely applied for videos in many tasks. In particular, the subspace-based query-by-image video retrieval (QBIVR), facing high challenges on similarity-preserving measurements and efficient retrieval schemes, urgently needs considerable research attention. In this paper, we propose a novel subspace-based QBIVR framework to enable efficient video search. We ...
Keywords: asymmetric hashing, geometry-preserving distance metric, query-by-image, video retrieval

3 published by ACM
October 2016 MM '16: Proceedings of the 24th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 5,   Downloads (12 Months): 17,   Downloads (Overall): 160

Full text available: PDF
While successful on broadcast news, meetings or telephone conversation, state-of-the-art speaker diarization techniques tend to perform poorly on TV series or movies. In this paper, we propose to rely on state-of-the-art face clustering techniques to guide acoustic speaker diarization. Two approaches are tested and evaluated on the first season of ...
Keywords: face clustering, speaker diarization, talking-face detection

4 published by ACM
October 2016 MM '16: Proceedings of the 24th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 23
Downloads (6 Weeks): 19,   Downloads (12 Months): 326,   Downloads (Overall): 1,033

Full text available: PDF
Delivering wide-angle and high-resolution spherical panoramic video content entails a high streaming bitrate. This imposes challenges when panorama clips are consumed in virtual reality (VR) head-mounted displays (HMD). The reason is that the HMDs typically require high spatial and temporal fidelity contents and strict low-latency in order to guarantee the ...
Keywords: head-mounted display (hmd), hevc, panoramic video streaming, tiles, video coding, virtual reality

5 published by ACM
December 2016 MANPU '16: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 9,   Downloads (Overall): 55

Full text available: PDF
Despite the widespread research interest given in the recent years in analyzing the structure and content of comic books, the question of how to effectively query and retrieve comic images stays a challenge, due to the substantial differences between them and naturalistic images. In this paper, we present a scheme ...
Keywords: CBIR, query-by-example, attributed region adjacency graph, structural pattern recognition, comics

6 published by ACM
October 2016 MM '16: Proceedings of the 2016 ACM on Multimedia Conference
Publisher: ACM
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 2,   Downloads (12 Months): 41,   Downloads (Overall): 342

Full text available: PDF
Cross-modal retrieval has been attracting increasing attention because of the explosion of multi-modal data, e.g., texts and images. Most supervised cross-modal retrieval methods learn discriminant common subspaces minimizing the heterogeneity of different modalities by exploiting the label information. However, these methods neglect the fact that, in practice, the given labels ...
Keywords: cross-modal retrieval, label completion

7 published by ACM
June 2017 ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 5,   Downloads (12 Months): 25,   Downloads (Overall): 118

Full text available: PDF
Due to the increasing availability of image and multimedia collections, unsupervised post-processing methods, which are capable of improving the effectiveness of retrieval results without the need of user intervention, have become indispensable. This paper presents the Unsupervised Distance Learning Framework (UDLF), a software which enables an easy use and evaluation ...
Keywords: content-based image retrieval, rank-aggregation, re-ranking, unsupervised learning

8 published by ACM
July 2019 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) - Special Section on Cross-Media Analysis for Visual Question Answering, Special Section on Big Data, Machine Learning and AI Technologies for Art and Design and Special Section on MMSys/NOSSDAV 2018: Volume 15 Issue 2s, August 2019
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 17,   Downloads (12 Months): 56,   Downloads (Overall): 56

Full text available: HTML, PDF
As an indispensable process of cross-media analyzing, comprehending heterogeneous data faces challenges in the fields of visual question answering (VQA), visual captioning, and cross-modality retrieval. Bridging the semantic gap between the two modalities is still difficult. In this article, to address the problem in cross-modality retrieval, we propose a cross-modal ...
Keywords: Cross-modality retrieval, MLP, auto-encoder, joint loss

9 published by ACM
May 2016 MMSys '16: Proceedings of the 7th International Conference on Multimedia Systems
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 4,   Downloads (12 Months): 16,   Downloads (Overall): 149

Full text available: PDF
In this paper, we present Heimdallr, a dataset that aims to serve two different purposes. The first purpose is action recognition and pose estimation, which requires a dataset of annotated sequences of athlete skeletons. We employed a crowdsourcing platform where people around the world were asked to annotate frames and ...
Keywords: data set, interactive, soccer, crowdsourcing, multimedia

10 published by ACM
October 2016 ICMI '16: Proceedings of the 18th ACM International Conference on Multimodal Interaction
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 3,   Downloads (12 Months): 53,   Downloads (Overall): 171

Full text available: PDF
Research has demonstrated that humans require different amounts of information, over time, to accurately perceive emotion expressions. This varies as a function of emotion classes. For example, recognition of happiness requires a longer stimulus than recognition of anger. However, previous automatic emotion recognition systems have often overlooked these differences. In ...
Keywords: Audio-Visual, Emotion, Emotion Classification, Emotion Spotting, Temporal Evidence

11 published by ACM
August 2017 SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 8,   Downloads (12 Months): 80,   Downloads (Overall): 323

Full text available: PDF
Fine-grained Sketch-based Image Retrieval (Fine-grained SBIR), which uses hand-drawn sketches to search the target object images, has been an emerging topic over the last few years. The difficulties of this task not only come from the ambiguous and abstract characteristics of sketches with less useful information, but also the cross-modal ...
Keywords: deep multimodal embedding, fine-grained sketch-based image retrieval (fine-grained sbir), multimodal ranking loss

12 published by ACM
June 2018 ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 9,   Downloads (12 Months): 129,   Downloads (Overall): 178

Full text available: PDF
Current hashing methods for cross-modal retrieval generally attempt to learn the separate modality-specific transformation matrices to embed multi-modality data into a latent common subspace, and usually ignore the fact that respecting the diversity of multi-modality features in the latent subspace could be beneficial for retrieval improvements. To this, we propose ...
Keywords: cross-modal correlation, cross-modal hashing, cross-modal retrieval, graph hashing, subspace learning

13 published by ACM
June 2019 MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 12,   Downloads (12 Months): 32,   Downloads (Overall): 32

Full text available: PDF
In this demo, we demonstrate an in-device conversational photo sharing service, termed meChat, which helps users share in-device photos easily in messaging applications by searching conversation-related photos automatically. In particular, meChat understands the semantics of on-going conversation and in-device photos by projecting both of them into a single semantic space. ...
Keywords: on-device intelligence, personal service

14 published by ACM
October 2018 MM '18: Proceedings of the 26th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 8,   Downloads (12 Months): 110,   Downloads (Overall): 110

Full text available: PDF
Cross-modal retrieval between visual data and natural language description remains a long-standing challenge in multimedia. While recent image-text retrieval methods offer great promise by learning deep representations aligned across modalities, most of these methods are plagued by the issue of training with small-scale datasets covering a limited number of images ...
Keywords: image-text retrieval, joint embedding, webly supervised learning

15 published by ACM
June 2017 CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 0,   Downloads (12 Months): 9,   Downloads (Overall): 29

Full text available: PDF
The evolution of technologies to store and share images has made imperative the need for methods to index and retrieve multimedia information based on visual content. The CBIR (Content-Based Image Retrieval) systems are the main solution in this scenario. Originally, these systems were solely based on the use of low-level ...
Keywords: combination, content-based image retrieval, correlation, effectiveness, rank-aggregation, re-ranking, unsupervised learning

16 published by ACM
October 2016 MM '16: Proceedings of the 24th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 2,   Downloads (12 Months): 20,   Downloads (Overall): 222

Full text available: PDF
In the literature of cross-modal search, most methods employ linear models to pursue hash codes that preserve data similarity, in terms of Euclidean distance, both within-modal and across-modal. However, data dimensionality can be quite different across modalities. It is known that the behavior of Euclidean distance/similarity between datapoints can be ...
Keywords: cross-modal, hashing, neighborhood-preserving

17 published by ACM
October 2016 MM '16: Proceedings of the 24th ACM international conference on Multimedia
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 3,   Downloads (12 Months): 13,   Downloads (Overall): 123

Full text available: PDF
Person discovery in the absence of prior identity knowledge requires accurate association of visual and auditory cues. In broadcast data, multimodal analysis faces additional challenges due to narrated voices over muted scenes or dubbing in different languages. To address these challenges, we define and analyze the problem of dubbing detection ...
Keywords: multimodal, person diarization, recurrent neural networks

18 published by ACM
October 2017 VSCC '17: Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 2,   Downloads (12 Months): 39,   Downloads (Overall): 80

Full text available: PDF
Most of existing cross-modal retrieval approaches only exploit labeled data to train coupled projection matrices for supporting retrieval tasks across heterogeneous modalities. However, the valuable information involved in unlabeled data is unfortunately ignored. In this paper, we propose a novel Semi-Supervised Distance Consistent method (SSDC) to solve the problem. Our ...
Keywords: pseudo label, semi-supervised, cross-modal retrieval

19 published by ACM
June 2018 ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 3,   Downloads (12 Months): 115,   Downloads (Overall): 178

Full text available: PDF
Multimedia data available in various disciplines are usually heterogeneous, containing representations in multi-views, where the cross-modal search techniques become necessary and useful. It is a challenging problem due to the heterogeneity of data with multiple modalities, multi-views in each modality and the diverse data categories. In this paper, we propose ...
Keywords: cross-modal hashing, metric learning, multi-view learning, tensor factorization

20 published by ACM
June 2019 ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 5,   Downloads (12 Months): 47,   Downloads (Overall): 47

Full text available: PDF
Despite the major advances on feature development for low and mid-level representations, a single visual feature is often insufficient to achieve effective retrieval results in different scenarios. Since diverse visual properties provide distinct and often complementary information for a same query, the combination of different features, including handcrafted and learned ...
Keywords: content-based image retrieval, effectiveness estimation, genetic algorithm, rank-aggregation, re-ranking, unsupervised learning

The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2019 ACM, Inc.