---
language:
- en
license: mit
tags:
- legal
datasets:
- ricdomolm/lawma-all-tasks
---
Lawma 8B
Lawma 8B is a fine-tune of Llama 3 8B Instruct on 260 legal classification tasks derived from the Supreme Court and Songer Court of Appeals databases. Lawma was fine-tuned on over 500k task examples, totalling 2B tokens. As a result, Lawma 8B outperforms GPT-4 on 95% of these legal classification tasks, on average by over 17 accuracy points. See our arXiv preprint and GitHub repository for more details.
Evaluations
We report mean classification accuracy across the 260 legal classification tasks that we consider. We use the standard MMLU multiple-choice prompt, and evaluate models zero-shot. You can find our evaluation code here.
Model | All tasks | Supreme Court tasks | Court of Appeals tasks |
---|---|---|---|
Lawma 70B | 81.9 | 84.1 | 81.5 |
Lawma 8B | 80.3 | 82.4 | 79.9 |
GPT-4 | 62.9 | 59.8 | 63.4 |
Llama 3 70B Inst | 58.4 | 47.1 | 60.3 |
Mixtral 8x7B Inst | 43.2 | 24.4 | 46.4 |
Llama 3 8B Inst | 42.6 | 32.8 | 44.2 |
Majority classifier | 41.7 | 31.5 | 43.5 |
Mistral 7B Inst | 39.9 | 19.5 | 43.4 |
Saul 7B Inst | 34.4 | 20.2 | 36.8 |
LegalBert | 24.6 | 13.6 | 26.4 |
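
To make the evaluation setup above more concrete, here is a minimal sketch of an MMLU-style multiple-choice prompt, a per-task accuracy averaged across tasks, and a majority-class baseline. It is illustrative only and is not our evaluation code: the exact prompt wording, the function names, the toy inputs, and the choice of an unweighted mean over tasks are all assumptions made for this example. The majority baseline here is also computed from the evaluation labels themselves, which may differ from how the "Majority classifier" row was obtained.

```python
from collections import Counter

# Answer letters used to label the options, as in MMLU-style prompts.
CHOICE_LABELS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def format_mmlu_prompt(question, choices):
    """Render a question and its options in an MMLU-style layout (assumed template)."""
    lines = [question.strip()]
    for label, choice in zip(CHOICE_LABELS, choices):
        lines.append(f"{label}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def task_accuracy(predictions, labels):
    """Fraction of examples in one task where the predicted letter matches the label."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def mean_task_accuracy(per_task):
    """Unweighted mean of per-task accuracies (assumption: every task counts equally).

    per_task maps a task name to a (predictions, labels) pair.
    """
    accuracies = [task_accuracy(p, y) for p, y in per_task.values()]
    return sum(accuracies) / len(accuracies)


def majority_predictions(labels):
    """Baseline that always predicts the most frequent label of the task."""
    most_common_label = Counter(labels).most_common(1)[0][0]
    return [most_common_label] * len(labels)


if __name__ == "__main__":
    # Hypothetical toy question, not taken from the Lawma dataset.
    print(format_mmlu_prompt(
        "What is the ideological direction of the decision?",
        ["Conservative", "Liberal", "Unspecifiable"],
    ))
    toy_results = {
        "task_1": (["A", "B"], ["A", "A"]),
        "task_2": (["C"], ["C"]),
    }
    print(mean_task_accuracy(toy_results))
```

Averaging per task rather than per example keeps tasks with many examples from dominating the headline number, which is one plausible reading of "mean classification accuracy across the 260 tasks".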
FAQ
What are the Lawma models useful for? We recommend using the Lawma models only for the legal classification tasks that the models were fine-tuned on. The main takeaway of our paper is that specializing models leads to large improvements in performance. Therefore, we strongly recommend that practitioners further fine-tune Lawma on the actual tasks that the models will be used for. Relatively few examples (i.e., dozens or hundreds) may already lead to large gains in performance.
What legal classification tasks is Lawma fine-tuned on? We consider almost all of the variables of the Supreme Court and Songer Court of Appeals databases. Our reasons to study these legal classification tasks are both technical and substantive. From a technical machine learning perspective, these tasks provide highly non-trivial classification problems where even the best models leave much room for improvement. From a substantive legal perspective, efficient solutions to such classification problems have rich and important applications in legal research.
Citation
This model was trained for the project
Lawma: The Power of Specialization for Legal Tasks. Ricardo Dominguez-Olmedo and Vedant Nanda and Rediet Abebe and Stefan Bechtold and Christoph Engel and Jens Frankenreiter and Krishna Gummadi and Moritz Hardt and Michael Livermore. 2024
Please cite as:
@misc{dominguezolmedo2024lawmapowerspecializationlegal,
title={Lawma: The Power of Specialization for Legal Tasks},
author={Ricardo Dominguez-Olmedo and Vedant Nanda and Rediet Abebe and Stefan Bechtold and Christoph Engel and Jens Frankenreiter and Krishna Gummadi and Moritz Hardt and Michael Livermore},
year={2024},
eprint={2407.16615},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.16615},
}