Abstract
High-utility itemset mining (HUIM) and frequent itemset mining (FIM) are fundamental tasks in knowledge discovery, and many algorithms have been designed for them. However, these algorithms use only one factor to evaluate an itemset. Skyline pattern mining, which considers both frequency and utility, has been extensively discussed in prior work. In most cases, however, people tend to focus on the purchase quantities of itemsets rather than their frequencies. In this article, we propose a new type of knowledge called the skyline quantity-utility pattern (SQUP), which supports better estimations in the decision-making process by considering quantity and utility together. Two algorithms, SQU-Miner and SKYQUP, are presented to efficiently mine the set of SQUPs. Moreover, the use of volunteer computing is proposed to show the potential in real supermarket applications. Two new efficient utility-max structures, used in SQU-Miner and SKYQUP respectively, are also introduced to reduce the number of candidate itemsets. These structures store the upper bound on utility for itemsets under a quantity constraint instead of a frequency constraint, and the second structure additionally applies a recursive update process to obtain a stricter upper bound on utility. Our in-depth experimental results show that SKYQUP outperforms SQU-Miner in terms of memory usage, runtime, and the number of candidates.
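To make the notion of a skyline over quantity and utility concrete, the following is a minimal brute-force sketch in Python. It assumes the standard skyline dominance rule (one itemset dominates another if it is at least as good in both quantity and utility and differs in at least one), which the abstract does not spell out. The names `dominates`, `skyline`, and `example_scores`, as well as the sample values, are hypothetical. This is not SQU-Miner or SKYQUP and uses no utility-max structure; it only illustrates what the resulting pattern set looks like.

```python
from typing import Dict, FrozenSet, Tuple

Itemset = FrozenSet[str]
Score = Tuple[int, int]  # (total purchase quantity, total utility)


def dominates(a: Score, b: Score) -> bool:
    """Return True if score a dominates score b.

    Assumed rule: a is no worse than b in both dimensions
    and the two scores are not identical.
    """
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def skyline(scores: Dict[Itemset, Score]) -> Dict[Itemset, Score]:
    """Keep only the itemsets whose score is not dominated by any other score."""
    return {
        itemset: score
        for itemset, score in scores.items()
        if not any(dominates(other, score) for other in scores.values())
    }


# Hypothetical (quantity, utility) values for four itemsets.
example_scores: Dict[Itemset, Score] = {
    frozenset({"A"}): (10, 30),
    frozenset({"B"}): (6, 45),
    frozenset({"A", "B"}): (5, 40),
    frozenset({"C"}): (10, 25),
}

example_skyline = skyline(example_scores)
```

With these sample values, `{A, B}` is dominated by `{B}` and `{C}` is dominated by `{A}`, so only `{A}` and `{B}` remain. The pairwise check is quadratic in the number of itemsets, which is the cost that candidate-pruning structures aim to avoid.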
Index Terms
The Efficient Mining of Skyline Patterns from a Volunteer Computing Network