Learning Hurdles for Sleeping Experts

Published: 01 July 2014

Abstract

We study the online decision problem in which the set of available actions varies over time, also called the sleeping experts problem. We consider the setting in which the performance comparison is made with respect to the best ordering of actions in hindsight. In this article, both the payoff function and the availability of actions are adversarial. Kleinberg et al. [2010] gave a computationally efficient no-regret algorithm in the setting in which payoffs are stochastic. Kanade et al. [2009] gave an efficient no-regret algorithm in the setting in which action availability is stochastic.

However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. [2010]. We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing open problem. We also consider the setting in which the number of available actions is restricted, and study its relation to agnostically learning monotone disjunctions over examples with bounded Hamming weight.
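To make the setting concrete, here is a minimal sketch of the sleeping experts problem: on each round an adversary wakes a subset of actions, the learner picks one awake action, and payoffs are revealed. The baseline below runs multiplicative weights restricted to the awake set. Note that this is a hypothetical illustration of the problem interface only; it targets per-action regret, not the harder best-ordering benchmark studied in this article, and the function and parameter names are our own.

```python
import random

def sleeping_mw(payoffs, availability, eta=0.1):
    """Multiplicative-weights baseline for the sleeping experts setting.

    payoffs[t][i]   - payoff of action i at round t, assumed in [0, 1]
    availability[t] - set of actions awake at round t (always nonempty)

    Samples an awake action proportionally to its current weight, then
    applies a multiplicative update to every awake action. This is an
    illustrative sketch, not the algorithmic benchmark of the article.
    """
    n = len(payoffs[0])
    w = [1.0] * n
    total = 0.0
    for t, awake in enumerate(availability):
        # Restrict the weight distribution to the awake actions.
        probs = [w[i] if i in awake else 0.0 for i in range(n)]
        z = sum(probs)
        r = random.random() * z
        acc = 0.0
        choice = next(iter(awake))
        for i in range(n):
            acc += probs[i]
            if r <= acc:
                choice = i
                break
        total += payoffs[t][choice]
        # Update only awake actions; sleeping actions keep their weight.
        for i in awake:
            w[i] *= (1.0 + eta) ** payoffs[t][i]
    return total
```

The key modeling point, visible in the update loop, is that a sleeping action is neither charged nor rewarded for rounds it is unavailable, which is why the natural comparator becomes the best ordering of actions in hindsight rather than the single best action.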

References

  1. J. Abernethy. 2010. Can we learn to gamble efficiently? (open problem). In Proceedings of the 23rd Annual Conference on Learning Theory. 318--319.
  2. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. 1998. Proof verification and the hardness of approximation problems. J. ACM 45, 501--555.
  3. S. Ben-David, D. Pál, and S. Shalev-Shwartz. 2009. Agnostic online learning. In Proceedings of the 22nd Annual Conference on Learning Theory.
  4. A. Beygelzimer, J. Langford, L. Li, L. Reyzin, and R. E. Schapire. 2011. Contextual bandit algorithms with supervised learning guarantees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 19--26.
  5. A. Blum and Y. Mansour. 2007. From external to internal regret. J. Machine Learn. Res. 8, 1307--1324.
  6. N. Cesa-Bianchi, A. Conconi, and C. Gentile. 2004. On the generalization ability of on-line learning algorithms. IEEE Trans. Inform. Theory 50, 9, 2050--2057.
  7. N. Cesa-Bianchi and G. Lugosi. 2006. Prediction, Learning, and Games. Cambridge University Press.
  8. D. P. Dubhashi and A. Panconesi. 2009. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press.
  9. M. Dudik, D. Hsu, S. Kale, N. Karampatziakis, J. Langford, L. Reyzin, and T. Zhang. 2011. Efficient optimal learning for contextual bandits. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. 169--178.
  10. Y. Freund and R. E. Schapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the 2nd European Conference on Computational Learning Theory. 23--37.
  11. Y. Freund, R. E. Schapire, Y. Singer, and M. K. Warmuth. 1997. Using and combining predictors that specialize. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 334--343.
  12. J. Håstad. 2001. Some optimal inapproximability results. J. ACM 48, 798--859.
  13. D. Haussler. 1992. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inform. Computat. 100, 78--150.
  14. E. Hazan, S. Kale, and S. Shalev-Shwartz. 2012. Near-optimal algorithms for online matrix prediction. In Proceedings of the 25th Annual Conference on Learning Theory, Vol. 23. JMLR Proceedings Track, 38.1--38.13.
  15. A. T. Kalai, V. Kanade, and Y. Mansour. 2009. Reliable agnostic learning. In Proceedings of the 22nd Annual Conference on Learning Theory.
  16. A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. 2005. Agnostically learning halfspaces. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. IEEE.
  17. V. Kanade, B. McMahan, and B. Bryan. 2009. Sleeping experts and bandits with stochastic action availability and adversarial rewards. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 272--279.
  18. M. J. Kearns, R. E. Schapire, and L. M. Sellie. 1994. Toward efficient agnostic learning. Machine Learn. 17, 2--3, 115--141.
  19. R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. 2010. Regret bounds for sleeping experts and bandits. Machine Learn. 80, 2--3, 245--272.
  20. A. R. Klivans and R. A. Servedio. 2001. Learning DNF in time 2^{Õ(n^{1/3})}. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 258--265.
  21. A. R. Klivans and A. Sherstov. 2007. A lower bound for agnostically learning disjunctions. In Proceedings of the 20th Annual Conference on Learning Theory. 409--423.
  22. J. Langford and T. Zhang. 2007. The epoch-greedy algorithm for contextual multi-armed bandits. In Proceedings of the 23rd Annual Conference on Neural Information Processing Systems.
  23. N. Littlestone. 1989. From on-line to batch learning. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory. 269--284.
  24. C. H. Papadimitriou and M. Yannakakis. 1991. Optimization, approximation, and complexity classes. J. Comput. System Sci. 43, 3, 425--440.
  25. S. Shalev-Shwartz, O. Shamir, and K. Sridharan. 2010. Learning kernel-based halfspaces with the zero-one loss. In Proceedings of the 23rd Annual Conference on Learning Theory. 441--450.
  26. L. G. Valiant. 1984. A theory of the learnable. Commun. ACM 27, 11, 1134--1142.


Published in

ACM Transactions on Computation Theory, Volume 6, Issue 3
Special issue on Innovations in Theoretical Computer Science 2012 - Part II
July 2014, 107 pages
ISSN: 1942-3454
EISSN: 1942-3462
DOI: 10.1145/2663945

Copyright © 2014 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 September 2012
• Revised: 1 June 2013
• Accepted: 1 July 2013
• Published: 1 July 2014
