Yoram Singer
111 results found (showing results 1–20 of 111)

1. December 2016 — NIPS'16: Proceedings of the 30th International Conference on Neural Information Processing Systems. Publisher: Curran Associates Inc.
Citation Count: 0. Downloads (6 Weeks): 1, (12 Months): 1, (Overall): 1. Full text available: PDF.
We develop a general duality between neural networks and compositional kernel Hilbert spaces. We introduce the notion of a computation skeleton, an acyclic graph that succinctly describes both a family of neural networks and a kernel space. Random neural networks are generated from a skeleton through node replication followed by ...
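The abstract above ties random neural networks to kernel spaces. As an illustration of that classical connection — the well-known random Fourier features view of Rahimi and Recht, not the paper's computation-skeleton construction — here is a sketch that approximates the RBF kernel with a one-layer random network:

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=0):
    """Approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)
    with random features, so that z(x) . z(y) ~= k(x, y).

    For this kernel the spectral density is N(0, 2*gamma*I), so the
    random weights are drawn with standard deviation sqrt(2*gamma).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # One random layer: affine map followed by a cosine nonlinearity.
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

The approximation error decays like O(1/sqrt(n_features)), so a few thousand features already track the exact kernel closely.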
2. June 2016 — ICML'16: Proceedings of the 33rd International Conference on Machine Learning - Volume 48. Publisher: JMLR.org.
Citation Count: 6.
We show that parametric models trained by a stochastic gradient method (SGM) with few iterations have vanishing generalization error. We prove our results by arguing that SGM is algorithmically stable in the sense of Bousquet and Elisseeff. Our analysis employs only elementary tools from convex and continuous optimization. We derive ...

3. January 2016 — The Journal of Machine Learning Research: Volume 17, Issue 1, January 2016. Publisher: JMLR.org.
Citation Count: 1. Downloads (6 Weeks): 3, (12 Months): 13, (Overall): 25. Full text available: PDF.
Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is low-rank. In this paper, we propose, analyze, and experiment with two procedures, one parallel and the other global, for constructing local matrix ...
Keywords: kernel smoothing, recommender systems, matrix approximation, collaborative filtering, non-parametric methods

4. April 2014 — WWW '14: Proceedings of the 23rd International Conference on World Wide Web. Publisher: ACM.
Citation Count: 27. Downloads (6 Weeks): 13, (12 Months): 127, (Overall): 1,079. Full text available: PDF.
Personalized recommendation systems are used in a wide variety of applications such as electronic commerce, social networks, web search, and more. Collaborative filtering approaches to recommendation systems typically assume that the rating matrix (e.g., movie ratings by viewers) is low-rank. In this paper, we examine an alternative approach in which ...
Keywords: collaborative filtering, ranking, recommender systems

5. September 2013 — ECMLPKDD'13: Proceedings of the 2013 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part III. Publisher: Springer-Verlag.
Citation Count: 0.
We describe a new, simplified, and general analysis of a fusion of Nesterov's accelerated gradient with parallel coordinate descent. The resulting algorithm, which we call BOOM, for boosting with momentum, enjoys the merits of both techniques. Namely, BOOM retains the momentum and convergence properties of the accelerated gradient ...
Keywords: accelerated gradient, coordinate descent, boosting

6. June 2013 — ICML'13: Proceedings of the 30th International Conference on Machine Learning - Volume 28. Publisher: JMLR.org.
Citation Count: 5.
Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is of low rank. We propose a new matrix approximation model where we assume instead that the matrix is locally of low rank, leading to ...

7. July 2011 — EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Publisher: Association for Computational Linguistics.
Citation Count: 0. Downloads (6 Weeks): 1, (12 Months): 6, (Overall): 56. Full text available: PDF.
We discuss and analyze the problem of finding a distribution that minimizes the relative entropy to a prior distribution while satisfying max-norm constraints with respect to an observed distribution. This setting generalizes the classical maximum entropy problem as it relaxes the standard constraints on the observed values. We tackle the ...
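The EMNLP '11 abstract above poses a relative-entropy minimization under max-norm (box) constraints. A minimal numerical sketch of that problem statement — not the paper's algorithm — follows from the KKT conditions: the optimum has the form p_i = clip(c * q_i, lo_i, hi_i), with the scalar c found by bisection so the result sums to one:

```python
import numpy as np

def kl_projection_box(q, lo, hi, iters=100):
    """Sketch: min_p KL(p || q)  s.t.  lo <= p <= hi,  sum(p) = 1.

    Stationarity gives p_i = c * q_i away from the box boundary, so the
    solution is a clipped rescaling of the prior q; bisect on c until
    the clipped vector sums to one. Assumes sum(lo) <= 1 <= sum(hi).
    """
    def total(c):
        return np.clip(c * q, lo, hi).sum()

    c_lo, c_hi = 0.0, 1.0
    while total(c_hi) < 1.0:        # grow the bracket until feasible
        c_hi *= 2.0
    for _ in range(iters):          # bisection on the scaling constant
        c = 0.5 * (c_lo + c_hi)
        if total(c) < 1.0:
            c_lo = c
        else:
            c_hi = c
    return np.clip(c * q, lo, hi)
```

With an observed distribution o and tolerance eps, the box is lo = max(o - eps, 0) and hi = o + eps, matching the max-norm constraint |p_i - o_i| <= eps.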
8. July 2011 — The Journal of Machine Learning Research: Volume 12, 2/1/2011. Publisher: JMLR.org.
Citation Count: 329. Downloads (6 Weeks): 20, (12 Months): 172, (Overall): 1,670. Full text available: PDF.
We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm ...

9. March 2011 — Mathematical Programming: Series A and B: Volume 127, Issue 1, March 2011. Publisher: Springer-Verlag New York, Inc.
Citation Count: 0.
We describe and analyze a simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy $\epsilon$ is $\tilde{O}(1/\epsilon)$, where each iteration operates on a single training ...
Keywords: SVM, Stochastic gradient descent

10. March 2011 — Mathematical Programming: Series A and B - Special Issue on "Optimization and Machine Learning" (eds. Alexandre d'Aspremont, Francis Bach, Inderjit S. Dhillon, Bin Yu): Volume 127, Issue 1, March 2011. Publisher: Springer-Verlag New York, Inc.
Citation Count: 125.
We describe and analyze a simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy $\epsilon$ is $\tilde{O}(1/\epsilon)$, where each iteration operates on a single training ...
Keywords: Stochastic gradient descent

11. September 2010 — Machine Learning: Volume 80, Issue 2-3, September 2010. Publisher: Kluwer Academic Publishers.
Citation Count: 5.
Boosting algorithms build highly accurate prediction mechanisms from a collection of low-accuracy predictors. To do so, they employ the notion of weak learnability. The starting point of this paper is a proof which shows that weak learnability is equivalent to linear separability with ℓ1 margin. The equivalence is a direct ...
Keywords: Boosting, Linear separability, Margin, Minimax theorem

12. December 2009 — NIPS'09: Proceedings of the 22nd International Conference on Neural Information Processing Systems. Publisher: Curran Associates Inc.
Citation Count: 8.
We describe, analyze, and experiment with a new framework for empirical loss minimization with regularization. Our algorithmic framework alternates between two phases. On each iteration we first perform an unconstrained gradient descent step. We then cast and solve an instantaneous optimization problem that trades off minimization of a regularization term ...

13. December 2009 — NIPS'09: Proceedings of the 22nd International Conference on Neural Information Processing Systems. Publisher: Curran Associates Inc.
Citation Count: 20.
Bag-of-words document representations are often used in text, image, and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a ...

14. December 2009 — The Journal of Machine Learning Research: Volume 10, 12/1/2009. Publisher: JMLR.org.
Citation Count: 106. Downloads (6 Weeks): 0, (12 Months): 19, (Overall): 417. Full text available: PDF.
We describe, analyze, and experiment with a framework for empirical loss minimization with regularization.
Our algorithmic framework alternates between two phases. On each iteration we first perform an unconstrained gradient descent step. We then cast and solve an instantaneous optimization problem that trades off minimization of a regularization term while ...

15. November 2009 — IEEE Transactions on Information Theory: Volume 55, Issue 11, November 2009. Publisher: IEEE Press.
Citation Count: 3.
Context trees are a popular and effective tool for tasks such as compression, sequential prediction, and language modeling. We present an algebraic perspective of context trees for the task of individual sequence prediction. Our approach stems from a generalization of the notion of margin used for linear predictors. By exporting ...
Keywords: online learning, context trees, shifting bounds, perceptron

16. June 2009 — ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning. Publisher: ACM.
Citation Count: 24. Downloads (6 Weeks): 2, (12 Months): 16, (Overall): 424. Full text available: PDF.
We derive generalizations of AdaBoost and related gradient-based coordinate descent methods that incorporate sparsity-promoting penalties for the norm of the predictor that is being learned. The end result is a family of coordinate descent algorithms that integrate forward feature induction and back-pruning through regularization and give an automatic stopping criterion ...

17. July 2008 — ICML '08: Proceedings of the 25th International Conference on Machine Learning. Publisher: ACM.
Citation Count: 169. Downloads (6 Weeks): 13, (12 Months): 148, (Overall): 1,257. Full text available: PDF.
We describe efficient algorithms for projecting a vector onto the ℓ1-ball. We present two methods for projection. The first performs exact projection in O(n) expected time, where n is the dimension of the space. The second works on vectors, k of whose elements are perturbed outside the ℓ ...
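Entry 17's first method achieves exact projection in O(n) expected time via randomized pivoting. A simpler sort-based O(n log n) variant of the same Euclidean projection onto the ℓ1-ball — a sketch, not the paper's linear-time routine — looks like this:

```python
import numpy as np

def project_l1_ball(v, z=1.0):
    """Euclidean projection of v onto the l1-ball of radius z.

    Sort the magnitudes, find the largest prefix whose entries stay
    positive after a common shrinkage, then soft-threshold.
    """
    if np.abs(v).sum() <= z:
        return v.copy()                      # already inside the ball
    u = np.sort(np.abs(v))[::-1]             # magnitudes, descending
    css = np.cumsum(u)                       # prefix sums of u
    j = np.arange(1, len(u) + 1)
    # Largest index rho with u_rho - (css_rho - z)/rho > 0.
    rho = np.nonzero(u - (css - z) / j > 0)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)     # shrinkage threshold
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)
```

Replacing the full sort with randomized pivoting on the threshold index is what brings the cost down to the O(n) expected time stated in the abstract.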
18. June 2008 — The Journal of Machine Learning Research: Volume 9, 6/1/2008. Publisher: JMLR.org.
Citation Count: 1. Downloads (6 Weeks): 1, (12 Months): 2, (Overall): 119. Full text available: PDF.
We describe and analyze an algorithmic framework for online classification where each online trial consists of multiple prediction tasks that are tied together. We tackle the problem of updating the online predictor by defining a projection problem in which each prediction task corresponds to a single linear constraint. These constraints ...

19. January 2008 — SIAM Journal on Computing: Volume 37, Issue 5, January 2008. Publisher: Society for Industrial and Applied Mathematics.
Citation Count: 56.
The Perceptron algorithm, despite its simplicity, often performs well in online classification tasks. The Perceptron becomes especially effective when it is used in conjunction with kernel functions. However, a common difficulty encountered when implementing kernel-based online algorithms is the amount of memory required to store the online hypothesis, which may ...
Keywords: kernel methods, learning theory, the Perceptron algorithm, online classification

20. December 2007 — Machine Learning: Volume 69, Issue 2-3, December 2007. Publisher: Kluwer Academic Publishers.
Citation Count: 27.
We describe a novel framework for the design and analysis of online learning algorithms based on the notion of duality in constrained optimization. We cast a sub-family of universal online bounds as an optimization problem. Using the weak duality theorem we reduce the process of online learning to the task ...
Keywords: Mistake bounds, Regret bounds, Online learning, Duality
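Entries 19 and 20 both revolve around Perceptron-style online learning. As a minimal illustration of the memory issue entry 19 describes — this is the plain kernel Perceptron that stores every mistake, not that paper's memory-bounded algorithm — consider:

```python
import numpy as np

def kernel_perceptron(X, y, kernel, epochs=5):
    """Online kernel Perceptron over labels y in {-1, +1}.

    The hypothesis is a kernel expansion over the set of examples on
    which a mistake was made; alpha[i] counts mistakes on example i.
    This support set can grow without bound, which is exactly the
    memory problem that bounded-memory variants address.
    """
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        for t in range(len(X)):
            f = sum(alpha[i] * y[i] * kernel(X[i], X[t])
                    for i in range(len(X)) if alpha[i] > 0)
            if y[t] * f <= 0:        # mistake: add example to the expansion
                alpha[t] += 1.0
    return alpha

# Example kernel (hypothetical choice): a Gaussian RBF.
rbf = lambda a, b: float(np.exp(-np.sum((a - b) ** 2)))
```

Each prediction costs one kernel evaluation per stored mistake, so the online hypothesis grows with the mistake count unless it is explicitly budgeted.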