Editorial Notes
The editors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected VoR was published on February 9, 2021. For reference purposes the VoR may still be accessed via the Supplemental Material section on this page.
Abstract
The explosion of news text and the development of artificial intelligence present both a new opportunity and a new challenge for providing high-quality media monitoring services. In this article, we propose a semantic analysis approach based on Latent Dirichlet Allocation (LDA) and the Apriori algorithm, and we implement an application that improves media monitoring reports by mining large-scale news text. First, we use the LDA model to mine topic words from news text and to reduce the dimensionality of the news data. Then, we use the Apriori algorithm to discover the relationships among topic words. Finally, we determine the relevance of the topic words and show the strength of association and the dependencies among them graphically. The application extracts news topics and discovers the correlations and dependencies among them in massive news text. The results show that the method based on LDA and Apriori can help media monitoring staff better understand the hidden knowledge in news text and improve media analysis reports.
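The second step of the approach, mining relationships among topic words with Apriori, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the LDA step has already reduced each news article to a small set of topic words, and the function names (`apriori`, `association_rules`), the sample topic words, and the thresholds are all hypothetical. Support is the fraction of articles containing an itemset, and the confidence of a rule A -> B is support(A and B) / support(A).

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    `transactions` is an iterable of sets of topic words (assumed here to be
    the per-article output of a separate LDA step).
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single topic words.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        prev = list(current)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in `frequent`.
                conf = supp / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, conf))
    return rules


if __name__ == "__main__":
    # Hypothetical topic-word sets, one per news article.
    docs = [
        {"economy", "trade", "tariff"},
        {"economy", "trade"},
        {"economy", "market"},
        {"trade", "tariff"},
        {"economy", "trade", "market"},
    ]
    frequent = apriori(docs, min_support=0.4)
    for antecedent, consequent, supp, conf in association_rules(frequent, 0.9):
        print(sorted(antecedent), "->", sorted(consequent), supp, conf)
```

The rules returned this way are directed (antecedent to consequent), which is one plausible basis for drawing the dependency relationships among topic words that the abstract mentions; the confidence value could serve as an edge weight in such a drawing.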
Supplemental Material
Available for Download
Version of Record for "Knowledge Discovery of News Text Based on Artificial Intelligence" by Guangce et al., ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 1 (TALLIP 20:1).