Micro perceptual human computation for visual tasks

Published: 07 September 2012

Abstract

Human Computation (HC) utilizes humans to solve problems or carry out tasks that are hard for pure computational algorithms. Many graphics and vision problems have such tasks. Previous HC approaches mainly focus on generating data in batch, to gather benchmarks, or perform surveys demanding nontrivial interactions. We advocate a tighter integration of human computation into online, interactive algorithms. We aim to distill the differences between humans and computers and maximize the advantages of both in one algorithm. Our key idea is to decompose such a problem into a massive number of very simple, carefully designed, human micro-tasks that are based on perception, and whose answers can be combined algorithmically to solve the original problem. Our approach is inspired by previous work on micro-tasks and perception experiments. We present three specific examples for the design of micro perceptual human computation algorithms to extract depth layers and image normals from a single photograph, and to augment an image with high-level semantic information such as symmetry.
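To make the idea of combining many simple perceptual answers algorithmically more concrete, here is a minimal sketch for the depth-layer case. It is an illustration under stated assumptions, not the algorithm from the paper: it assumes each micro-task shows a worker two image regions and asks which one appears in front, that repeated answers are merged by majority vote, and that layers are then read off the resulting ordering. The function names `majority_vote` and `depth_layers`, the tuple format of the answers, and the region labels are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Merge repeated worker answers into one consensus per region pair.

    answers: iterable of (region_a, region_b, front) tuples, where `front`
    is whichever of the two regions the worker judged to be closer.
    Pairs whose votes are tied are dropped (an assumption of this sketch).
    """
    tallies = defaultdict(Counter)
    for a, b, front in answers:
        tallies[tuple(sorted((a, b)))][front] += 1

    consensus = {}
    for pair, counts in tallies.items():
        ranked = counts.most_common()
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # tie: no consensus for this pair
        consensus[pair] = ranked[0][0]
    return consensus


def depth_layers(consensus):
    """Assign each region a layer index (0 = frontmost).

    A region's layer is the length of the longest chain of regions judged
    to be in front of it, computed with a topological sweep.
    """
    behind = defaultdict(set)   # front region -> regions behind it
    indegree = defaultdict(int)
    regions = set()
    for (a, b), front in consensus.items():
        back = b if front == a else a
        regions.update((a, b))
        behind[front].add(back)
        indegree[back] += 1

    layer = {r: 0 for r in regions}
    ready = [r for r in regions if indegree[r] == 0]
    visited = 0
    while ready:
        r = ready.pop()
        visited += 1
        for back in behind[r]:
            layer[back] = max(layer[back], layer[r] + 1)
            indegree[back] -= 1
            if indegree[back] == 0:
                ready.append(back)

    if visited != len(regions):
        raise ValueError("consensus answers contain a cyclic ordering")
    return layer


# Hypothetical worker answers for three regions of one photograph.
answers = [
    ("sky", "tree", "tree"), ("sky", "tree", "tree"),
    ("sky", "tree", "tree"), ("sky", "tree", "sky"),
    ("tree", "person", "person"), ("tree", "person", "person"),
    ("sky", "person", "person"),
]
layers = depth_layers(majority_vote(answers))
```

In this sketch a single dissenting answer is outvoted, so no individual worker has to be reliable. Contradictory consensus answers surface as a cycle and raise an error instead of being silently resolved.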

Supplemental Material

tp189_12.mp4

      • Published in

        ACM Transactions on Graphics  Volume 31, Issue 5
        August 2012
        107 pages
        ISSN:0730-0301
        EISSN:1557-7368
        DOI:10.1145/2231816

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 7 September 2012
        • Accepted: 1 January 2012
        • Revised: 1 December 2011
        • Received: 1 November 2011

        Qualifiers

        • research-article
        • Research
        • Refereed
