ABSTRACT
We discuss a new paradigm for supervised learning, called active learning, that aims to improve the efficiency of neural network training procedures. The starting point for active learning is the observation that the traditional approach of selecting training samples at random leads to large, highly redundant training sets. This redundancy is not always desirable. Especially when the acquisition of training data is expensive, small, informative training sets are preferable. Such training sets can be obtained if the learner is allowed to select the training data that it expects to be most informative. In this case, the learner is no longer a passive recipient of information but takes an active role in the selection of the training data.
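To make the idea of a learner choosing its own training data concrete, the following is a minimal sketch of one common selection rule, pool-based uncertainty sampling. It is an illustration under stated assumptions, not the method of the paper: the logistic-regression learner, the `oracle` labeling function, and all function and parameter names are hypothetical choices made for this example.

```python
import numpy as np

def _with_bias(X):
    # Append a constant column so the model can learn an offset.
    return np.hstack([X, np.ones((len(X), 1))])

def train_logistic(X, y, steps=500, lr=0.5):
    """Fit logistic regression by plain gradient descent (assumed learner)."""
    Xb = _with_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    return 1.0 / (1.0 + np.exp(-_with_bias(X) @ w))

def active_learning(X_pool, oracle, n_initial=4, n_queries=10, seed=0):
    """Query labels only for the pool points the current model is least sure about."""
    rng = np.random.default_rng(seed)
    # Start from a small random labeled set.
    labeled = [int(i) for i in rng.choice(len(X_pool), size=n_initial, replace=False)]
    labels = {i: oracle(X_pool[i]) for i in labeled}

    for _ in range(n_queries):
        idx = np.array(labeled)
        w = train_logistic(X_pool[idx], np.array([labels[i] for i in labeled]))
        # Informativeness proxy: predictions closest to 0.5 are most uncertain.
        score = -np.abs(predict_proba(w, X_pool) - 0.5)
        score[idx] = -np.inf          # never re-query a labeled point
        q = int(np.argmax(score))
        labels[q] = oracle(X_pool[q])  # the (possibly expensive) label request
        labeled.append(q)

    # Final fit on everything that was queried.
    idx = np.array(labeled)
    w = train_logistic(X_pool[idx], np.array([labels[i] for i in labeled]))
    return w, labeled
```

For example, with a pool `X` of 2-D points and `oracle = lambda x: float(x[0] + x[1] > 0)`, calling `active_learning(X, oracle)` labels only 14 of the pool points, with the queried points chosen near the current decision boundary rather than at random.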