TY - GEN
T1 - LawLLM: Law Large Language Model for the US Legal System
T2 - 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
AU - Shu, Dong
AU - Zhao, Haoran
AU - Liu, Xukun
AU - Demeter, David
AU - Du, Mengnan
AU - Zhang, Yongfeng
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/10/21
Y1 - 2024/10/21
AB - In the rapidly evolving field of legal analytics, finding relevant cases and accurately predicting judicial outcomes are challenging because legal language often involves specialized terminology, intricate syntax, and historical context. Moreover, the subtle distinctions between similar and precedent cases require a deep understanding of legal knowledge. Researchers often conflate these concepts, making it difficult to develop specialized techniques that effectively address these nuanced tasks. In this paper, we introduce the Law Large Language Model (LawLLM), a multi-task model specifically designed for the US legal domain to address these challenges. LawLLM excels at Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP). By clearly distinguishing between precedent and similar cases, we provide essential clarity, guiding future research in developing specialized strategies for these tasks. We propose customized data preprocessing techniques for each task that transform raw legal data into a trainable format. Furthermore, we use techniques such as in-context learning (ICL) and advanced information retrieval methods in LawLLM. The evaluation results demonstrate that LawLLM consistently outperforms existing baselines in both zero-shot and few-shot scenarios, offering unparalleled multi-task capabilities and filling critical gaps in the legal domain. Code and data are available at https://github.com/Tizzzzy/Law_LLM.
KW - large language models
KW - legal system
KW - multitask learning
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85210024413&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210024413&partnerID=8YFLogxK
U2 - 10.1145/3627673.3680020
DO - 10.1145/3627673.3680020
M3 - Conference contribution
AN - SCOPUS:85210024413
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 4882
EP - 4889
BT - CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 21 October 2024 through 25 October 2024
ER -