RoBERTa-base trained with a linearly increasing alpha for alpha-entmax attention, annealed from 1.0 (softmax) to 2.0 (sparsemax) over the course of training.
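
For intuition, alpha controls how sparse the attention distribution is. A minimal sketch using the standalone `entmax` package (an assumed dependency for illustration, not necessarily what this repo uses internally):

```python
# pip install torch entmax
import torch
from entmax import entmax_bisect

scores = torch.tensor([[2.0, 1.0, 0.2, -1.0]])

# alpha slightly above 1.0 behaves like softmax (dense weights);
# alpha = 2.0 is sparsemax (some weights become exactly zero).
for alpha in (1.001, 1.5, 2.0):
    probs = entmax_bisect(scores, alpha=alpha, dim=-1)
    print(f"alpha={alpha}: {probs}")
```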

To load the tokenizer and the model:

```python
from transformers import AutoTokenizer
from sparse_roberta import get_custom_model

# Load the tokenizer (shared with the original roberta-base)
tokenizer = AutoTokenizer.from_pretrained('roberta-base')

# Load the model with the final alpha value (2.0 = sparsemax)
model = get_custom_model(
    'mtreviso/sparsemax-roberta',
    initial_alpha=2.0,
    use_triton_entmax=False,
    from_scratch=False,
)
```
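
Once loaded, the model can be called like a standard transformers module. A minimal forward-pass sketch (the exact output layout is an assumption about how `get_custom_model` wraps the checkpoint):

```python
import torch

inputs = tokenizer("Sparse attention can drop irrelevant tokens.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Assumption: the wrapper follows the usual transformers convention,
# so the first output holds the last-layer hidden states.
print(outputs[0].shape)
```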

To fine-tune on GLUE tasks, you can use the `run_glue.py` script. For example:

```bash
python run_glue.py \
  --model_name_or_path mtreviso/sparsemax-roberta \
  --config_name roberta-base \
  --tokenizer_name roberta-base \
  --task_name rte \
  --output_dir output-rte \
  --do_train \
  --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 32 \
  --learning_rate 3e-5 \
  --num_train_epochs 3 \
  --save_steps 1000 \
  --logging_steps 100 \
  --save_total_limit 1 \
  --overwrite_output_dir
```