Research frontier: deep machine learning--a new frontier in artificial intelligence research

Abstract

This article provides an overview of the mainstream deep learning approaches and research directions proposed over the past decade. It is important to emphasize that each approach has strengths and weaknesses, depending on the application and context in which it is being used. Thus, this article presents a summary of the current state of the deep machine learning field and some perspective on how it may evolve. Convolutional Neural Networks (CNNs) and Deep Belief Networks (DBNs), along with their respective variations, are the primary focus because they are well established in the deep learning field and show great promise for future work.

