Research Article (Refereed)

Mining and Quality Assessment of Mashup Model Patterns with the Crowd: A Feasibility Study

Published: 25 June 2016

Abstract

Pattern mining, that is, the automated discovery of patterns from data, is a mathematically complex and computationally demanding problem that is generally considered beyond the reach of manual human effort. In this article, we focus on small datasets and study whether patterns can instead be mined with the help of the crowd, by means of a set of controlled experiments on a common crowdsourcing platform. We specifically concentrate on mining model patterns from a dataset of real mashup models taken from Yahoo! Pipes and cover the entire pattern mining process, including pattern identification and quality assessment. The results of our experiments show that a sensible design of crowdsourcing tasks may indeed enable the crowd to identify patterns from small datasets (40 models). The results also show, however, that designing tasks for assessing the quality of patterns, that is, for deciding which patterns to retain for further processing and use, is much harder: our experiments fail to elicit assessments from the crowd that are similar to those of an expert. The problem is relevant to model-driven development in general (e.g., UML, business processes, scientific workflows), in that reusable model patterns encode valuable modeling and domain knowledge, such as best practices, organizational conventions, or technical choices, that modelers can benefit from when designing their own models.
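As an illustration of the aggregation step such crowd-based mining relies on (the function and data below are hypothetical, not taken from the article), individual crowd judgments about candidate patterns are commonly combined by majority voting before deciding which patterns to retain. A minimal sketch:

```python
from collections import Counter

def aggregate_votes(votes_per_pattern, min_agreement=0.7):
    """Majority-vote aggregation of binary crowd judgments.

    votes_per_pattern: dict mapping a candidate pattern id to a list of
    boolean votes (True = "this is a meaningful pattern").
    Returns the ids of patterns whose approval ratio meets the threshold.
    """
    retained = []
    for pattern_id, votes in votes_per_pattern.items():
        counts = Counter(votes)
        approval = counts[True] / len(votes)
        if approval >= min_agreement:
            retained.append(pattern_id)
    return retained

# Hypothetical votes from five workers on two candidate mashup patterns:
votes = {
    "fetch-filter-sort": [True, True, True, False, True],    # 0.8 approval
    "loop-union":        [True, False, False, False, True],  # 0.4 approval
}
print(aggregate_votes(votes))  # → ['fetch-filter-sort']
```

More elaborate schemes (e.g., weighting votes by worker reliability) follow the same shape; the threshold `min_agreement` is the knob that trades pattern recall against quality.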




Published in

ACM Transactions on Internet Technology, Volume 16, Issue 3, August 2016, 156 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/2926746
Editor: Munindar P. Singh
Copyright © 2016 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 March 2016
• Accepted: 1 March 2016
• Published: 25 June 2016
