---
license: apache-2.0
widget:
- text: "Las [MASK] son adictivas."
---

<img src="o.svg" align="left" alt="logo" width="40" style="margin-right: 5px;" />

LudoBETO is a domain adaptation of the [Spanish BERT](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) language model. <br clear="left"/> It was adapted to the pathological gambling domain with a corpus extracted from a specialised [forum](https://www.ludopatia.org/web/index_es.htm). Using an LLM, we automatically compiled a lexical resource that guides the masking process during adaptation, encouraging the model to pay more attention to words related to pathological gambling.
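The masking code itself is not part of this card, but the idea can be sketched as follows: tokens belonging to the lexicon receive a higher masking probability than ordinary tokens. Everything below (the sample lexicon entries, the probability values, the `lexicon_guided_mask` name) is illustrative, not the actual implementation.

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")

# Hypothetical excerpt of the LLM-compiled lexicon; the real resource is not shown here.
LEXICON = {"apuestas", "casino", "ludopatía", "tragaperras", "juego"}

def lexicon_guided_mask(text, base_prob=0.15, lexicon_prob=0.5):
    """Mask domain-lexicon tokens with a higher probability than other tokens."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"][0].clone()
    labels = input_ids.clone()

    tokens = tokenizer.convert_ids_to_tokens(input_ids.tolist())
    probs = torch.full((len(tokens),), base_prob)
    for i, tok in enumerate(tokens):
        if tok.lstrip("#").lower() in LEXICON:
            probs[i] = lexicon_prob  # boost masking odds for lexicon words

    # Never mask special tokens such as [CLS] and [SEP].
    special = tokenizer.get_special_tokens_mask(
        input_ids.tolist(), already_has_special_tokens=True
    )
    probs[torch.tensor(special, dtype=torch.bool)] = 0.0

    masked = torch.bernoulli(probs).bool()
    input_ids[masked] = tokenizer.mask_token_id
    labels[~masked] = -100  # loss is computed only on masked positions
    return input_ids, labels
```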


The model was trained with a batch size of 8 and the Adam optimizer with a learning rate of 2e-5, using cross-entropy as the loss function. Training ran for four epochs on an NVIDIA GeForce RTX 4070 GPU (12 GB).
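A minimal training sketch with these hyperparameters might look like the following. Note that `forum_dataset` is a hypothetical placeholder for the tokenized forum corpus, and that `Trainer` defaults to AdamW, a close variant of the Adam optimizer reported above.

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")
model = AutoModelForMaskedLM.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")

args = TrainingArguments(
    output_dir="ludoBETO",
    per_device_train_batch_size=8,  # batch size reported above
    learning_rate=2e-5,             # reported learning rate
    num_train_epochs=4,             # four epochs, as reported
)

# Standard MLM collator; the lexicon-guided masking described above would
# replace this component. Cross-entropy is the default loss of masked
# language models in transformers, so no explicit loss function is needed.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=forum_dataset,  # hypothetical: the tokenized forum corpus
    data_collator=collator,
)
trainer.train()
```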

## Usage

```python
from transformers import pipeline

pipe = pipeline("fill-mask", model="citiusLTL/ludoBETO")

text = pipe("Las [MASK] son adictivas.")

print(text)
```

## Load model directly

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("citiusLTL/ludoBETO")
model = AutoModelForMaskedLM.from_pretrained("citiusLTL/ludoBETO")
```
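
With the model loaded directly, masked tokens can be predicted without the pipeline. A minimal sketch, continuing from the snippet above:

```python
import torch

text = "Las [MASK] son adictivas."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and list the five highest-scoring candidates.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top5_ids = logits[0, mask_pos].topk(5).indices[0].tolist()
print(tokenizer.convert_ids_to_tokens(top5_ids))
```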