Abstract
Legal judgment prediction (LJP) aims to predict judgment results from the description of an individual legal case. To better fit real application scenarios, in which a case may cite multiple law articles and carry multiple charges, we formulate LJP as a multi-label learning problem and present a deep learning model that encodes the content of each legal case via a multi-residual convolutional neural network and the semantics of law articles via an article encoder. An article-wise attention mechanism is proposed to fuse the two types of encoded information. Experimental results on the CAIL2018 datasets show that our model provides a significant performance improvement over existing neural models in predicting relevant law articles and charges.
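The fusion step and the multi-label output can be illustrated with a small sketch. This is not the model's actual formulation, which the abstract does not specify. It assumes a generic label-wise dot-product attention in which each article vector attends over the case's token features, followed by an independent sigmoid per article. The names `article_wise_attention` and `predict_articles`, the toy vectors, and the 0.5 threshold are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def article_wise_attention(token_feats, article_vecs):
    """Return one attended case vector per article (assumed dot-product scoring)."""
    dim = len(token_feats[0])
    contexts = []
    for art in article_vecs:
        # Each article scores every token of the case, then pools the tokens.
        weights = softmax([dot(tok, art) for tok in token_feats])
        ctx = [sum(w * tok[k] for w, tok in zip(weights, token_feats))
               for k in range(dim)]
        contexts.append(ctx)
    return contexts


def predict_articles(token_feats, article_vecs, threshold=0.5):
    """Multi-label prediction: an independent sigmoid per article (assumption)."""
    contexts = article_wise_attention(token_feats, article_vecs)
    probs = [1.0 / (1.0 + math.exp(-dot(ctx, art)))
             for ctx, art in zip(contexts, article_vecs)]
    labels = [j for j, p in enumerate(probs) if p >= threshold]
    return probs, labels


# Toy inputs: three case tokens and three articles, all in a 2-d feature space.
token_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
article_vecs = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
probs, labels = predict_articles(token_feats, article_vecs)
print(probs, labels)
```

Because each article is scored independently, a case can be assigned several articles at once. In this toy input the first two articles pass the threshold and the third does not. In the model described above, the token features would come from the multi-residual convolutional encoder and the article vectors from the article encoder.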
Mulan: A Multiple Residual Article-Wise Attention Network for Legal Judgment Prediction