Research article · Open Access

DeepFormableTag: end-to-end generation and recognition of deformable fiducial markers

Published: 19 July 2021

Abstract

Fiducial markers have been broadly used to identify objects or embed messages that can be detected by a camera. Existing detection methods typically assume that markers are printed on ideally planar surfaces; the size of a message or identification code is limited by the spatial resolution of the binary patterns in a marker; and markers often fail to be recognized due to imaging artifacts such as optical/perspective distortion and motion blur. To overcome these limitations, we propose a novel deformable fiducial marker system that consists of three main parts. First, a fiducial marker generator creates a set of free-form color patterns that encode significantly larger-scale information in unique visual codes. Second, a differentiable image simulator renders a training dataset of photorealistic scene images containing the deformed markers during optimization; the rendered images include realistic shading with specular reflection, optical distortion, defocus and motion blur, color alteration, imaging noise, and shape deformation of the markers. Lastly, a trained marker detector seeks regions of interest and recognizes multiple marker patterns simultaneously via an inverse deformation transformation. The marker generator and detector networks are jointly optimized through the differentiable photorealistic renderer in an end-to-end manner, allowing us to recognize a wide range of deformed markers robustly and with high accuracy. Our deformable marker system decodes 36-bit messages successfully at ~29 fps under severe shape deformation. Results validate that our system significantly outperforms both traditional and data-driven marker methods.
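To make the message-capacity idea concrete, here is a minimal toy sketch of encoding a 36-bit payload into a color grid and decoding it under imaging noise. This is purely illustrative: the paper's markers are learned free-form color patterns, not the hypothetical binary red/blue grid used below, and the real decoder is a neural network rather than a per-cell color comparison.

```python
import numpy as np

rng = np.random.default_rng(7)

def encode(bits, cell=8):
    """Toy encoder: each bit becomes one cell of a square grid,
    colored red for 1 and blue for 0 (NOT the paper's learned patterns)."""
    side = int(np.sqrt(len(bits)))
    img = np.zeros((side * cell, side * cell, 3))
    for i, b in enumerate(bits):
        r, c = divmod(i, side)
        img[r*cell:(r+1)*cell, c*cell:(c+1)*cell, 0 if b else 2] = 1.0
    return img

def decode(img, n_bits, cell=8):
    """Toy decoder: average each cell and compare red vs. blue energy."""
    side = int(np.sqrt(n_bits))
    bits = []
    for i in range(n_bits):
        r, c = divmod(i, side)
        patch = img[r*cell:(r+1)*cell, c*cell:(c+1)*cell]
        bits.append(1 if patch[..., 0].mean() > patch[..., 2].mean() else 0)
    return bits

message = [int(b) for b in rng.integers(0, 2, 36)]   # a 36-bit payload
marker = encode(message)
# Simulate imaging noise, one of the artifacts the paper's simulator models:
noisy = np.clip(marker + rng.normal(0, 0.25, marker.shape), 0, 1)
assert decode(noisy, 36) == message                  # all 36 bits recovered
```

Averaging over each cell before thresholding is what makes the toy decoder noise-tolerant; the paper's end-to-end training plays the analogous role of making the learned patterns and detector jointly robust to the full set of simulated imaging artifacts.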
Our learning-based marker system opens up interesting new applications of fiducial markers, including cost-effective motion capture of the human body, active 3D scanning that uses arrays of our fiducial markers as structured-light patterns, and robust augmented-reality rendering of virtual objects on dynamic surfaces.
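The "inverse deformation transformation" step can be pictured, in its simplest rigid-planar special case, as estimating a homography from a detected marker's corners and inverting it to map image points back to the marker's canonical frame. The sketch below uses the classic direct linear transform on four hypothetical corner correspondences; the paper's system handles general non-rigid deformation with a learned transformation, which this planar example does not capture.

```python
import numpy as np

def homography(src, dst):
    """Direct linear transform: solve for the 3x3 H mapping four
    src points to four dst points (h33 fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u*x, -u*y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v*x, -v*y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pts):
    """Apply H to 2D points in homogeneous coordinates."""
    p = H @ np.vstack([np.array(pts, float).T, np.ones(len(pts))])
    return (p[:2] / p[2]).T

# Canonical unit-square marker corners and their (hypothetical)
# skewed positions as observed in an image:
canon = [(0, 0), (1, 0), (1, 1), (0, 1)]
observed = [(12.0, 30.5), (80.2, 25.0), (95.0, 90.3), (5.0, 82.0)]

H = homography(canon, observed)        # canonical -> image
H_inv = np.linalg.inv(H)               # image -> canonical ("unwarp")
back = apply_h(H_inv, observed)
assert np.allclose(back, canon, atol=1e-6)   # corners map back to the unit square
```

Once points are mapped back to the canonical frame, decoding can proceed as if the marker were observed fronto-parallel; replacing the homography with a deformation model is what extends this idea to non-planar, dynamic surfaces.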


Supplemental Material

a67-yaldiz.mp4
3450626.3459762.mp4



Published in

ACM Transactions on Graphics, Volume 40, Issue 4
August 2021, 2170 pages
ISSN: 0730-0301 · EISSN: 1557-7368
DOI: 10.1145/3450626

Copyright © 2021 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher: Association for Computing Machinery, New York, NY, United States
