research-article

Selecting vantage objects for similarity indexing

Published: 02 September 2011

Abstract

Indexing has become a key element in the pipeline of a multimedia retrieval system, due to continuous increases in database size, data complexity, and complexity of similarity measures. The primary goal of any indexing algorithm is to overcome high computational costs involved with comparing the query to every object in the database. This is achieved by efficient pruning in order to select only a small set of candidate matches. Vantage indexing is an indexing technique that belongs to the category of embedding or mapping approaches, because it maps a dissimilarity space onto a vector space such that traditional access methods can be used for querying. Each object is represented by a vector of dissimilarities to a small set of m reference objects, called vantage objects. Querying takes place within this vector space. The retrieval performance of a system based on this technique can be improved significantly through a proper choice of vantage objects. We propose a new technique for selecting vantage objects that addresses the retrieval performance directly, and present extensive experimental results based on three data sets of different size and modality, including a comparison with other selection strategies. The results clearly demonstrate both the efficacy and scalability of the proposed approach.
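The mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it does not reproduce the proposed selection technique: the vantage objects are simply passed in by the caller. The function names (`embed`, `build_index`, `linf`, `range_query`) are hypothetical, a linear scan stands in for the "traditional access methods" mentioned above, and the dissimilarity is assumed to be a metric, so that by the triangle inequality the L-infinity distance between two vantage vectors never exceeds the true dissimilarity and filtering cannot discard a true match.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def embed(obj: T, vantage_objects: Sequence[T],
          dist: Callable[[T, T], float]) -> List[float]:
    """Represent an object by its dissimilarities to the m vantage objects."""
    return [dist(obj, v) for v in vantage_objects]


def build_index(database: Sequence[T], vantage_objects: Sequence[T],
                dist: Callable[[T, T], float]) -> List[List[float]]:
    """Precompute the vantage vector of every database object (done offline)."""
    return [embed(obj, vantage_objects, dist) for obj in database]


def linf(a: Sequence[float], b: Sequence[float]) -> float:
    """L-infinity distance between two vantage vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def range_query(query: T, eps: float, database: Sequence[T],
                index: Sequence[Sequence[float]], vantage_objects: Sequence[T],
                dist: Callable[[T, T], float]) -> Tuple[List[int], List[int]]:
    """Return (candidate ids, verified match ids) for a range query.

    Filtering costs only m dissimilarity computations (query vs. vantage
    objects); the full dissimilarity is evaluated on the candidates alone.
    Assumption: `dist` is a metric, so the filter yields no false dismissals.
    """
    q_vec = embed(query, vantage_objects, dist)
    # Linear scan for simplicity; a spatial access method could replace it.
    candidates = [i for i, vec in enumerate(index) if linf(q_vec, vec) <= eps]
    matches = [i for i in candidates if dist(query, database[i]) <= eps]
    return candidates, matches


def euclid(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Toy dissimilarity for the usage example: Euclidean distance in 2D."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    vantage = [points[0], points[3]]  # arbitrary choice, not a selection method
    index = build_index(points, vantage, euclid)
    cands, hits = range_query((0.5, 0.0), 1.0, points, index, vantage, euclid)
    print("candidates:", cands, "matches:", hits)
```

In this toy run the filter step keeps one object that the verification step then rejects. How many such false positives survive depends on which vantage objects are used, which is why their selection affects retrieval performance.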

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 7, Issue 3
August 2011
117 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2000486

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 2 September 2011
• Accepted: 1 January 2010
• Revised: 1 September 2009
• Received: 1 July 2008

Qualifiers

• research-article
• Research
• Refereed
