research-article

Pedestrian Trajectory Prediction in Heterogeneous Traffic using Facial Keypoints-based Convolutional Encoder-decoder Network

Published: 14 November 2022

Abstract

Pedestrian trajectory prediction has many practical applications, such as unmanned vehicles, building evacuation design, and robotic path planning. Most existing methods focus on social interaction among pedestrians but ignore the significant influence that heterogeneous traffic objects (cars, dogs, bicycles, motorcycles, etc.) have on the future trajectory of a subject pedestrian. In addition, a pedestrian's intended walking direction may be inferred from his/her facial keypoints. This work therefore proposes to predict a pedestrian's future trajectory by jointly using information about neighboring heterogeneous traffic and the pedestrian's facial keypoints. To this end, an end-to-end facial keypoints-based convolutional encoder-decoder network (FK-CEN) is designed, which takes the heterogeneous traffic information and facial keypoints as input. After training, FK-CEN is evaluated on five crowded video sequences collected from the public datasets MOT-16 and MOT-17. Experimental results show that it outperforms state-of-the-art approaches in terms of prediction error.
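
The abstract reports results only as "prediction errors" and does not name the metrics. As a rough illustration of how such errors are commonly measured in trajectory prediction, the sketch below computes the average and final Euclidean displacement between a predicted and a ground-truth trajectory. The choice of these two metrics, the function name `displacement_errors`, and the sample coordinates are assumptions made for illustration and are not taken from the article.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position of a pedestrian in one frame


def displacement_errors(predicted: List[Point], actual: List[Point]) -> Tuple[float, float]:
    """Return (average, final) Euclidean displacement error for one trajectory.

    Assumption: both trajectories are sampled at the same frames, so the
    i-th predicted point is compared with the i-th ground-truth point.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-frame distance between the predicted and the true position.
    dists = [math.dist(p, a) for p, a in zip(predicted, actual)]
    average_error = sum(dists) / len(dists)
    final_error = dists[-1]  # error at the last predicted frame
    return average_error, final_error


if __name__ == "__main__":
    # Hypothetical three-frame trajectories, for illustration only.
    pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    ade, fde = displacement_errors(pred, truth)
    print(f"average displacement error: {ade:.2f}")
    print(f"final displacement error:   {fde:.2f}")
```

Averaging these per-trajectory values over all pedestrians in a sequence would give one number per sequence, which is one plausible way to compare methods across the five evaluation sequences.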

    • Published in

      ACM Transactions on Internet Technology, Volume 22, Issue 4
      November 2022
      642 pages
      ISSN: 1533-5399
      EISSN: 1557-6051
      DOI: 10.1145/3561988

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 14 November 2022
      • Online AM: 22 March 2022
      • Accepted: 10 July 2020
      • Revised: 9 June 2020
      • Received: 27 March 2020

      Qualifiers

      • research-article
      • Refereed