TY - GEN
T1 - BERT-Based Ensemble Model for Statute Law Retrieval and Legal Information Entailment
AU - Shao, Hsuan Lei
AU - Chen, Yi Chia
AU - Huang, Sieh Chuen
N1 - Funding Information:
Acknowledgments. This work was financially supported by: Hsuan-Lei Shao, “From Knowledge Genealogy to Knowledge Map-China Studies in Big Data and Machine Learning” (MOST 107-2410-H-003-058-MY3), Sieh-Chuen Huang, Center for Research in Econometric Theory and Applications (Grant no. NTU-110L900203) from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) and Ministry of Science and Technology (MOST 109-2634-F-002-045) in Taiwan.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
AB - The Competition on Legal Information Extraction/Entailment (COLIEE) is an international information processing and retrieval competition. As an aid to future participants as well as question designers, this article describes how to connect legal questions taken from past Japanese bar exams to relevant statutes (articles of the Japanese Civil Code, Task 3) and how to construct a Yes/No question answering system for legal queries (Task 4) incorporating background materials on Japanese law. We restructured the given data into a dataset containing all possible combinations of queries and articles, each combination forming one continuous string that serves as a sample. In this way, the difficult pairing task was turned into a simpler classification task, and the number of training samples became sufficient. Next, we used three BERT-based models to solve the binary questions in order to achieve stable performance. As a result, the model achieved an F2-score of 0.6587 in Task 3 (ranked 1st) and an accuracy of 0.6161 in Task 4.
KW - BERT-based ensemble model
KW - COLIEE 2020
KW - Information retrieval
KW - Legal AI
KW - Legal analytics
KW - Textual entailment
UR - http://www.scopus.com/inward/record.url?scp=85112189820&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112189820&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-79942-7_15
DO - 10.1007/978-3-030-79942-7_15
M3 - Conference contribution
AN - SCOPUS:85112189820
SN - 9783030799410
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 226
EP - 239
BT - New Frontiers in Artificial Intelligence - JSAI-isAI 2020 Workshops, JURISIN, LENLS 2020 Workshops, 2020, Revised Selected Papers
A2 - Okazaki, Naoaki
A2 - Yada, Katsutoshi
A2 - Satoh, Ken
A2 - Mineshima, Koji
PB - Springer Science and Business Media Deutschland GmbH
T2 - 12th International Symposium on Artificial Intelligence supported by the Japanese Society for Artificial Intelligence, JSAI-isAI 2020, International Workshop on Logic and Engineering of Natural Language Semantics, LENLS 2020, 14th International Workshop on Juris-informatics, JURISIN 2020
Y2 - 15 November 2020 through 17 November 2020
ER -