
Utility Mining across Multi-Sequences with Individualized Thresholds

Published: 30 May 2020

Abstract

Utility-oriented pattern mining is an emerging topic because it can reveal high-utility patterns in different types of data, providing more information than traditional frequency/confidence-based pattern mining models. In realistic situations the utilities of items/objects are not equal; each item/object has its own utility or importance. In general, the user applies a single uniform minimum utility (minutil) threshold to identify the set of high-utility sequential patterns (HUSPs). A uniform threshold fails to find the interesting patterns when minutil is set extremely high or extremely low. We design a new utility mining framework, named USPT, for mining high-Utility Sequential Patterns across multi-sequences with individualized Thresholds. Each item in the framework has its own specified minimum utility threshold. The USPT framework builds on the lexicographic-sequential tree and the utility-array structure to discover HUSPs efficiently. Using upper bounds on utility, several pruning strategies are developed to discard unpromising candidates early in the search space. Experiments on both real-life and synthetic datasets evaluate the performance of the USPT algorithm, and the results show that USPT achieves good effectiveness and efficiency for mining HUSPs with individualized minimum utility thresholds.
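To make the idea of individualized thresholds concrete, the following minimal Python sketch checks whether a sequential pattern is high-utility when every item carries its own minutil. It is not the USPT algorithm: it uses no lexicographic-sequential tree, utility-array, or pruning. Two points are assumptions rather than statements from the abstract: the utility of a pattern in a sequence is taken as the maximum over its occurrences (the usual HUSP convention), and the threshold of a pattern is taken as the smallest minutil among its items. The toy database, the item names, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional

Itemset = Dict[str, int]        # item -> utility of that item in this itemset
Sequence = List[Itemset]        # ordered list of itemsets
Pattern = List[FrozenSet[str]]  # ordered list of item groups


def utility_in_sequence(pattern: Pattern, seq: Sequence) -> Optional[int]:
    """Maximum utility over all occurrences of pattern in seq, None if absent."""
    # best[j]: best utility found so far for matching the first j pattern elements.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in seq:
        # Right-to-left, so one itemset matches at most one pattern element.
        for j in range(len(pattern), 0, -1):
            element = pattern[j - 1]
            if best[j - 1] is None or not element.issubset(itemset):
                continue
            cand = best[j - 1] + sum(itemset[i] for i in element)
            if best[j] is None or cand > best[j]:
                best[j] = cand
    return best[len(pattern)]


def pattern_utility(pattern: Pattern, db: List[Sequence]) -> int:
    """Sum of the per-sequence utilities over sequences containing the pattern."""
    total = 0
    for seq in db:
        u = utility_in_sequence(pattern, seq)
        if u is not None:
            total += u
    return total


def pattern_threshold(pattern: Pattern, minutil: Dict[str, int]) -> int:
    """ASSUMPTION: a pattern's threshold is the least minutil of its items."""
    return min(minutil[i] for element in pattern for i in element)


def is_high_utility(pattern: Pattern, db: List[Sequence],
                    minutil: Dict[str, int]) -> bool:
    return pattern_utility(pattern, db) >= pattern_threshold(pattern, minutil)


# Hypothetical toy database: utilities are already quantity * unit profit.
db: List[Sequence] = [
    [{"a": 4, "b": 2}, {"c": 6}, {"a": 1}],
    [{"a": 3}, {"b": 5, "c": 2}],
    [{"b": 1}, {"a": 2}, {"c": 9}],
]
minutil = {"a": 20, "b": 12, "c": 15}  # hypothetical per-item thresholds

a_then_c: Pattern = [frozenset({"a"}), frozenset({"c"})]
b_then_c: Pattern = [frozenset({"b"}), frozenset({"c"})]
b_with_c: Pattern = [frozenset({"b", "c"})]

for p in (a_then_c, b_then_c, b_with_c):
    print(p, pattern_utility(p, db), pattern_threshold(p, minutil),
          is_high_utility(p, db, minutil))
```

In this toy data, a single uniform threshold of 20 would reject the pattern "b then c", whereas its item-derived threshold admits it. That difference is the kind of effect individualized thresholds are meant to capture.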

