Abstract
We construct a Chinese Economic Event Treebank (CEETB) that focuses on revealing economic and financial events and the relations between them. Investigating economic event relations benefits academic research and practice not only in economics but also in many other scientific areas. The characteristics of economic texts (e.g., abundant long enterprise names and terms) and the particularities of the Chinese language (e.g., component ellipsis in long sentences) make event relation extraction challenging. Existing Chinese corpora that contain economic event relations focus mainly on finance (e.g., the equity market) and cover only a few event types. To support research involving economic text analysis in Chinese, we construct CEETB following a carefully designed process. First, based on practical and research requirements, we summarize nine types of event relations and four types of component ellipsis in economic texts. Then, we present an annotation scheme that makes the annotation model, strategy, and process explicit, followed by statistical analysis and quality evaluation of the CEETB corpus. Finally, to demonstrate the strengths of the constructed corpus in practical applications, we conduct experiments with five state-of-the-art (SOTA) models for event relation extraction.