Multimodal Hand and Foot Gesture Interaction for Handheld Devices

Published: 01 October 2014

Abstract

We present a hand-and-foot-based multimodal interaction approach for handheld devices. Our method combines two input modalities (hand and foot) and provides coordinated output to both, along with audio and video. Foot gestures are detected and tracked using contour-based template detection (CTD) and the Tracking-Learning-Detection (TLD) algorithm. The 3D foot pose is estimated from the passive homography matrix of the camera. 3D stereoscopic visualization and vibrotactile feedback are used to enhance the feeling of immersion. As a proof of concept, we developed a multimodal football game based on this approach. A user study confirms user satisfaction with our system.

