DOI: 10.1145/1143844.1143857
ARTICLE

Cover trees for nearest neighbor

ABSTRACT

We present a tree data structure for fast nearest neighbor operations in general n-point metric spaces (where the data set consists of n points). The data structure requires O(n) space regardless of the metric's structure yet maintains all performance properties of a navigating net (Krauthgamer & Lee, 2004b). If the point set has a bounded expansion constant c, which is a measure of the intrinsic dimensionality, as defined in (Karger & Ruhl, 2002), the cover tree data structure can be constructed in O(c^6 n log n) time. Furthermore, nearest neighbor queries require time only logarithmic in n, in particular O(c^12 log n) time. Our experimental results show speedups over the brute force search varying between one and several orders of magnitude on natural machine learning datasets.
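
To make the two quantities in the abstract concrete, the following is a minimal Python sketch, not the cover tree itself and not the authors' code. It assumes the usual Karger and Ruhl style definition of the expansion constant: the smallest c >= 2 such that |B(p, 2r)| <= c |B(p, r)| for every point p in the data set and every r > 0, where B(p, r) is the closed ball of radius r around p restricted to the data set. The function names `ball_size`, `expansion_constant`, and `brute_force_nearest` are hypothetical, chosen only for illustration. The second function is the linear-scan baseline that the reported speedups are measured against.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")
Metric = Callable[[P, P], float]


def ball_size(points: Sequence[P], center: P, radius: float, dist: Metric) -> int:
    """Number of data points within `radius` of `center` (closed ball)."""
    return sum(1 for q in points if dist(center, q) <= radius)


def expansion_constant(points: Sequence[P], dist: Metric) -> float:
    """Brute-force expansion constant of a finite point set (assumed definition).

    For a fixed p, the ratio |B(p, 2r)| / |B(p, r)| can only reach a new
    maximum at a radius where the numerator jumps, i.e. at r = d(p, q) / 2
    for some other point q, so only those radii are checked.
    This takes O(n^3) distance evaluations and is meant for small examples.
    """
    c = 2.0  # the definition assumed here restricts c to be at least 2
    for p in points:
        for q in points:
            d = dist(p, q)
            if d == 0:
                continue  # q coincides with p; no radius to test
            # The inner ball always contains p itself, so its size is >= 1.
            ratio = ball_size(points, p, d, dist) / ball_size(points, p, d / 2, dist)
            c = max(c, ratio)
    return c


def brute_force_nearest(points: Sequence[P], query: P, dist: Metric) -> P:
    """Linear-scan nearest neighbor: one distance evaluation per data point."""
    return min(points, key=lambda p: dist(query, p))
```

Any function satisfying the metric axioms can be passed as `dist`, for example `lambda a, b: abs(a - b)` for points on the real line. The sketch only illustrates the definitions; the O(c^6 n log n) construction and O(c^12 log n) query bounds stated above apply to the cover tree, which is not implemented here.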

References

  1. Beygelzimer, A., Kakade, S., & Langford, J. (2005). Cover trees for nearest neighbor. Available at http://hunch.net/~jl/projects/cover_tree.
  2. Clarkson, K. (1999). Nearest neighbor queries in metric spaces. Discrete and Computational Geometry, 22, 63--93.
  3. Clarkson, K. (2002). Nearest neighbor searching in metric spaces: Experimental results for sb(s). http://cm.bell-labs.com/who/clarkson/Msb/readme.html.
  4. Friedman, J., Bentley, J., & Finkel, R. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3, 209--226.
  5. Gray, A., & Moore, A. (2000). N-body problems in statistical learning. Advances in Neural Information Processing Systems, 13, 521--527.
  6. Gupta, A., Krauthgamer, R., & Lee, J. (2003). Bounded geometries, fractals, and low-distortion embeddings. Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (pp. 534--543).
  7. Har-Peled, S., & Mendel, M. (2006). Fast constructions of nets in low dimensional metrics and their applications. SIAM Journal on Computing, 35, 1148--1184.
  8. Karger, D., & Ruhl, M. (2002). Finding nearest neighbors in growth restricted metrics. Proceedings of the 34th Annual ACM Symposium on Theory of Computing (pp. 741--750).
  9. Krauthgamer, R., & Lee, J. (2004a). The black-box complexity of nearest neighbor search. Proceedings of the 31st International Colloquium on Automata, Languages and Programming (pp. 858--869).
  10. Krauthgamer, R., & Lee, J. (2004b). Navigating nets: Simple algorithms for proximity search. Proceedings of the 15th Annual Symposium on Discrete Algorithms (pp. 791--801).
  11. Laviolette, F., Marchand, M., & Shah, M. (2005). A PAC-bayes approach to the set covering machine. Advances in Neural Information Processing Systems, 18.
  12. Omohundro, S. (1987). Efficient algorithms with neural network behavior. Journal of Complex Systems, 1, 273--347.
  13. Uhlmann, J. (1991). Satisfying general proximity/similarity queries with metric trees. Information Processing Letters, 40, 175--179.
