Abstract
Utility-oriented pattern mining is an emerging topic because it can reveal high-utility patterns in different types of data, providing more information than traditional frequency- or confidence-based pattern mining models. In realistic situations the utilities of items or objects are not equal; each has its own utility or importance. In general, the user sets a single uniform minimum utility (minutil) threshold to identify the set of high-utility sequential patterns (HUSPs). A uniform threshold cannot find the interesting patterns when minutil is set extremely high or extremely low. We design a new utility mining framework, named USPT, for mining high-Utility Sequential Patterns across multi-sequences with individualized Thresholds. Each item in the framework has its own specified minimum utility threshold. USPT builds on the lexicographic-sequential tree and the utility-array structure to discover HUSPs efficiently. Using upper bounds on utility, several pruning strategies are developed to prune unpromising candidates early in the search space. Experiments on both real-life and synthetic datasets show that USPT achieves good effectiveness and efficiency for mining HUSPs with individualized minimum utility thresholds.
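To make the idea of individualized thresholds concrete, here is a minimal Python sketch. It is not the USPT algorithm: it uses no lexicographic-sequential tree, utility array, or pruning. It rests on three assumptions the abstract does not state. First, each element of a sequence is a single (item, quantity) pair. Second, a pattern's utility in a sequence is the maximum over all of its occurrences. Third, a pattern's threshold is the smallest threshold among its items. All names and data (`pattern_utility`, `evaluate`, `unit_utility`, `item_minutil`) are hypothetical.

```python
from typing import Dict, List, Tuple

QSequence = List[Tuple[str, int]]  # ordered (item, quantity) events


def pattern_utility(pattern: List[str], qseq: QSequence,
                    unit_utility: Dict[str, int]) -> int:
    """Utility of `pattern` in one sequence: the maximum over all of its
    occurrences (assumed convention), or 0 if the pattern does not occur."""
    # best[k] = highest utility found so far for the first k pattern items
    best: List[object] = [0] + [None] * len(pattern)
    for item, qty in qseq:
        u = qty * unit_utility[item]
        # Walk backwards so one event extends only shorter, earlier matches.
        for k in range(len(pattern), 0, -1):
            if pattern[k - 1] == item and best[k - 1] is not None:
                cand = best[k - 1] + u
                if best[k] is None or cand > best[k]:
                    best[k] = cand
    return best[-1] if best[-1] is not None else 0


def evaluate(pattern: List[str], database: List[QSequence],
             unit_utility: Dict[str, int],
             item_minutil: Dict[str, int]) -> Tuple[int, int, bool]:
    """Return (total utility, pattern threshold, is high-utility)."""
    total = sum(pattern_utility(pattern, s, unit_utility) for s in database)
    # Assumption: the pattern threshold is the least of its items' thresholds.
    threshold = min(item_minutil[i] for i in pattern)
    return total, threshold, total >= threshold


# Hypothetical toy data.
unit_utility = {"a": 2, "b": 5, "c": 1}
item_minutil = {"a": 30, "b": 20, "c": 40}
database = [
    [("a", 1), ("b", 2), ("a", 3), ("b", 1)],
    [("b", 1), ("a", 2), ("c", 4)],
    [("a", 2), ("c", 1), ("b", 3)],
]

print(evaluate(["a", "b"], database, unit_utility, item_minutil))
print(evaluate(["a", "c"], database, unit_utility, item_minutil))
```

Under these assumptions, the pattern a, b is judged against the threshold of b (20) and the pattern a, c against that of a (30), so patterns are no longer all measured against one global minutil.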
Index Terms
Utility Mining across Multi-Sequences with Individualized Thresholds