research-article

The Efficient Mining of Skyline Patterns from a Volunteer Computing Network

Published: 16 July 2021

Abstract

High-utility Itemset Mining (HUIM) and Frequent Itemset Mining (FIM) are fundamental tasks in knowledge discovery, and several algorithms have been designed for them successfully. However, these algorithms use only one factor to evaluate an itemset. Skyline pattern mining, which considers frequency and utility together, has been extensively discussed in prior work. In most cases, however, people tend to focus on the purchase quantities of itemsets rather than on their frequencies. In this article, we propose a new type of knowledge called the skyline quantity-utility pattern (SQUP), which considers quantity and utility together to provide better estimations in the decision-making process. Two algorithms, called SQU-Miner and SKYQUP, are presented to efficiently mine the set of SQUPs. Moreover, the use of volunteer computing is proposed to show the potential in real supermarket applications. Two new efficient utility-max structures, used in SQU-Miner and SKYQUP respectively, are also introduced to reduce the number of candidate itemsets. Both structures store the upper bound of utility for itemsets under a quantity constraint instead of a frequency constraint, and the second structure additionally applies a recursive update process to obtain a tighter upper bound of utility. Our in-depth experimental results show that SKYQUP outperforms SQU-Miner in terms of memory usage, runtime, and the number of candidates.
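To make the notion of a skyline concrete, the following minimal sketch filters a small set of patterns down to those that are not dominated in both quantity and utility. It is a brute-force illustration of the dominance test only, not SQU-Miner or SKYQUP. The pattern names, the (quantity, utility) values, and the function names `dominates` and `skyline` are hypothetical, and the sketch assumes that each pattern's quantity and utility have already been computed.

```python
from typing import Dict, FrozenSet, Tuple

Pattern = FrozenSet[str]
Score = Tuple[int, int]  # (quantity, utility), both to be maximized


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is at least as good as b in both measures
    and strictly better in at least one of them."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def skyline(scores: Dict[Pattern, Score]) -> Dict[Pattern, Score]:
    """Keep only the patterns whose score is not dominated by any other score.

    Brute-force O(n^2) comparison, for illustration only.
    """
    return {
        pattern: score
        for pattern, score in scores.items()
        if not any(dominates(other, score) for other in scores.values())
    }


# Hypothetical, precomputed (quantity, utility) values for five itemsets.
scores: Dict[Pattern, Score] = {
    frozenset({"A"}): (10, 40),
    frozenset({"B"}): (7, 55),
    frozenset({"A", "B"}): (6, 90),
    frozenset({"C"}): (8, 30),
    frozenset({"B", "C"}): (5, 60),
}

for pattern, (quantity, utility) in skyline(scores).items():
    print(sorted(pattern), quantity, utility)
```

With these illustrative values, {C} is dominated by {A} and {B, C} is dominated by {A, B}, so three patterns remain on the skyline.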


Published in

ACM Transactions on Internet Technology, Volume 21, Issue 4
November 2021
520 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3472282
Editor: Ling Lu

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 16 July 2021
• Accepted: 1 September 2020
• Revised: 1 August 2020
• Received: 1 June 2020

          Qualifiers

          • research-article
          • Refereed
