You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

BART Legal Spanish ⚖️

BART Legal Spanish (base) is a BART-like model trained on a collection of Spanish legal-domain corpora.

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function and (2) learning a model to reconstruct the original text.
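To make the corrupt-then-reconstruct idea concrete, here is a minimal sketch of one noising function described for BART, text infilling, where a span of tokens is replaced by a single mask token. The card does not say which noising function was used for this model, so treat this as an illustration only; `infill_span` is a hypothetical helper and whitespace splitting stands in for the real tokenizer.

```python
def infill_span(tokens, start, length, mask_token="<mask>"):
    """Replace tokens[start:start+length] with a single mask token.

    Hypothetical helper for illustration; not taken from the training script.
    """
    if start < 0 or length < 0 or start + length > len(tokens):
        raise ValueError("span out of range")
    return tokens[:start] + [mask_token] + tokens[start + length:]


# Whitespace tokens stand in for real subword tokens (an assumption).
original = "Los españoles son iguales ante la ley .".split()
corrupted = infill_span(original, start=3, length=1)

# A pre-training pair would be (corrupted input, original target).
print(" ".join(corrupted))
print(" ".join(original))
```

During pre-training the encoder would see the corrupted sequence and the decoder would be trained to emit the original one.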

This model is particularly effective when fine-tuned for text generation tasks (e.g., summarization, translation) but also works well for comprehension tasks (e.g., text classification, question answering).

Training details

  • Dataset: Spanish-legal-corpora - 90% for training / 10% for validation.
  • Training script: see here
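The 90% / 10% split can be sketched as follows. This is only an assumed way of producing such a split (shuffle, then slice); the actual training script may do it differently, and `train_val_split`, the seed, and the placeholder documents are hypothetical.

```python
import random


def train_val_split(docs, val_fraction=0.1, seed=0):
    """Shuffle documents and hold out a fraction for validation."""
    docs = list(docs)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(docs)
    n_val = int(round(len(docs) * val_fraction))
    return docs[n_val:], docs[:n_val]


# Placeholder documents; the real corpus would be loaded from disk.
corpus = [f"doc-{i}" for i in range(100)]
train, val = train_val_split(corpus)
print(len(train), len(val))
```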

Evaluation metrics

Metric     Value
Accuracy   0.86
Loss       0.68
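If the reported loss is the mean token-level cross-entropy in nats (an assumption; the card does not say how it was computed), it can be converted to a perplexity with a one-line calculation:

```python
import math

eval_loss = 0.68  # value reported above; assumed to be mean cross-entropy in nats

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 1.97
```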

Benchmarks 🔨


How to use with transformers

from transformers import BartForConditionalGeneration, BartTokenizer

model_id = "mrm8488/bart-legal-base-es"

model = BartForConditionalGeneration.from_pretrained(model_id, forced_bos_token_id=0)
tokenizer = BartTokenizer.from_pretrained(model_id)

def fill_mask_span(text):
    # Tokenize the masked sentence and let the model regenerate the full text.
    batch = tokenizer(text, return_tensors="pt")
    generated_ids = model.generate(batch["input_ids"])
    print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))

text = "Los españoles son <mask> ante la ley."
fill_mask_span(text)
# Output: ['Los españoles son iguales ante la ley.1.ª y 2.ª ante la']

text = "Los proyectos de reforma Constitucional deberán <mask> por una mayoría de tres quintos de cada una de las Cámaras."
fill_mask_span(text)
# Output: ['Los proyectos de reforma Constitucional deberán ser aprobados por una mayoría de tres quintos de cada']
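The sample outputs above either run past the end of the sentence or stop before it, presumably because of the generation length limit. If only the text that replaced `<mask>` is needed, it can be recovered by aligning the generated string with the input template. The sketch below is one possible approach in plain Python; `extract_fill` and its `min_anchor` parameter are hypothetical and not part of transformers.

```python
def extract_fill(template, generated, mask_token="<mask>", min_anchor=5):
    """Return the text the model put in place of the mask, or None.

    Hypothetical helper: aligns the generated string with the text on
    either side of the mask in the input template.
    """
    prefix, found, suffix = template.partition(mask_token)
    if not found or not generated.startswith(prefix):
        return None
    rest = generated[len(prefix):]
    if not suffix:
        return rest.strip()
    # Try the full right-hand context first, then shorter prefixes of it,
    # because generation may be cut off before the sentence ends.
    shortest = min(min_anchor, len(suffix))
    for n in range(len(suffix), shortest - 1, -1):
        pos = rest.find(suffix[:n])
        if pos != -1:
            return rest[:pos].strip()
    return None


t1 = "Los españoles son <mask> ante la ley."
g1 = "Los españoles son iguales ante la ley.1.ª y 2.ª ante la"
print(extract_fill(t1, g1))

t2 = ("Los proyectos de reforma Constitucional deberán <mask> por una "
      "mayoría de tres quintos de cada una de las Cámaras.")
g2 = ("Los proyectos de reforma Constitucional deberán ser aprobados por una "
      "mayoría de tres quintos de cada")
print(extract_fill(t2, g2))
```

The helper returns `None` when the generated text does not start with the left-hand context, so callers can fall back to the raw output in that case.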



If you want to cite this model, you can use this:

@misc {manuel_romero_2023,
    author       = { {Manuel Romero} },
    title        = { bart-legal-base-es (Revision c33ed22) },
    year         = 2023,
    url          = { https://huggingface.co/mrm8488/bart-legal-base-es },
    doi          = { 10.57967/hf/0472 },
    publisher    = { Hugging Face }
}

Created by Manuel Romero/@mrm8488

Made with ♥ in Spain
