File size: 3,099 Bytes
bac1d54
23211e7
bac1d54
 
 
23211e7
bac1d54
 
de0a40a
23211e7
 
 
 
 
 
 
 
 
de0a40a
23211e7
 
 
 
 
 
 
 
 
 
 
 
 
 
bac1d54
 
cc7f44f
bac1d54
c7d342a
bac1d54
5edd065
86a16c7
801db38
 
 
 
 
 
8fc3e5b
5edd065
bac1d54
f492d8a
103f08f
 
 
 
bac1d54
5edd065
bac1d54
 
bafffc8
 
 
 
 
 
 
 
 
 
 
 
 
bac1d54
 
5edd065
bac1d54
f6a0b2a
3457354
f6a0b2a
3457354
bafffc8
3520491
3457354
 
f6a0b2a
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
---
language:
- en
- es
- eu
- multilingual
datasets:
- squad
widget:
- text: When was Florence Nightingale born?
  context: Florence Nightingale, known for being the founder of modern nursing, was
    born in Florence, Italy, in 1820.
  example_title: English
- text: �Por qu� provincias pasa el Tajo?
  context: 'El Tajo es el r�o m�s largo de la pen�nsula ib�rica, a la que atraviesa
    en su parte central, siguiendo un rumbo este-oeste, con una leve inclinaci�n hacia
    el suroeste, que se acent�a cuando llega a Portugal, donde recibe el nombre de
    Tejo.

    Nace en los montes Universales, en la sierra de Albarrac�n, sobre la rama occidental
    del sistema Ib�rico y, despu�s de recorrer 1007 km, llega al oc�ano Atl�ntico
    en la ciudad de Lisboa. En su desembocadura forma el estuario del mar de la Paja,
    en el que vierte un caudal medio de 456 m�/s. En sus primeros 816 km atraviesa
    Espa�a, donde discurre por cuatro comunidades aut�nomas (Arag�n, Castilla-La Mancha,
    Madrid y Extremadura) y un total de seis provincias (Teruel, Guadalajara, Cuenca,
    Madrid, Toledo y C�ceres).'
  example_title: Espa�ol
- text: Zer beste izenak ditu Tartalo?
  context: 'Tartalo euskal mitologiako izaki begibakar artzain erraldoia da. Tartalo
    izena zenbait euskal hizkeratan herskari-bustidurarekin ahoskatu ohi denez, horrelaxe
    ere idazten da batzuetan: Ttarttalo. Euskal Herriko zenbait tokitan, Torto edo
    Anxo ere esaten diote.'
  example_title: Euskara
---

# ixambert-base-cased finetuned for QA

This is a basic implementation of the multilingual model ["ixambert-base-cased"](https://huggingface.co/ixa-ehu/ixambert-base-cased), fine-tuned on SQuAD v1.1, that is able to answer basic factual questions in English, Spanish and Basque.

## Overview

* **Language model:** ixambert-base-cased
* **Languages:** English, Spanish and Basque
* **Downstream task:** Extractive QA
* **Training data:** SQuAD v1.1
* **Eval data:** SQuAD v1.1
* **Infrastructure:** 1x GeForce RTX 2080

## Outputs

The model outputs the answer to the question, the start and end positions of the answer in the original context, and a score for the probability for that span of text to be the correct answer. For example:

```python
{'score': 0.9667195081710815, 'start': 101, 'end': 105, 'answer': '1820'}
```

## How to use

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "MarcBrun/ixambert-finetuned-squad"

# To get predictions
context = "Florence Nightingale, known for being the founder of modern nursing, was born in Florence, Italy, in 1820"
question = "When was Florence Nightingale born?"
qa = pipeline("question-answering", model=model_name, tokenizer=model_name)
pred = qa(question=question,context=context)

# To load the model and tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

## Hyperparameters

```
batch_size = 8
n_epochs = 3
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = linear
max_seq_len = 384
doc_stride = 128
```