metadata

language:
  - en
  - es
  - eu
datasets:
  - squad

Description

This model is the multilingual "ixambert-base-cased" model fine-tuned on SQuAD version 1.1 for extractive question answering. It answers basic factual questions in English, Spanish and Basque by extracting the span of the context in which the answer is found.

Outputs

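The model predicts a span of text from the context together with a score, the probability that the span is the correct answer. The question-answering pipeline returns a dictionary with four fields:

score: the probability assigned to the predicted span.
start: the character offset in the context where the answer begins.
end: the character offset in the context where the answer ends (exclusive).
answer: the extracted answer text.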
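As a minimal sketch of how the predicted offsets relate to the context, the snippet below slices the context with the `start` and `end` values from the example output shown in this card. The helper name `extract_span` is hypothetical and is not part of the transformers library; the prediction dictionary is copied from the card rather than produced by running the model.

```python
def extract_span(context: str, prediction: dict) -> str:
    """Return the text selected by the predicted character offsets."""
    # `end` is exclusive, so an ordinary Python slice recovers the answer.
    return context[prediction["start"]:prediction["end"]]


context = (
    "Florence Nightingale, known for being the founder of modern nursing, "
    "was born in Florence, Italy, in 1820"
)

# Prediction copied from the example output in this card (not recomputed here).
prediction = {"score": 0.9667195081710815, "start": 101, "end": 105, "answer": "1820"}

span = extract_span(context, prediction)
print(span)                          # 1820
print(span == prediction["answer"])  # True
```

Keeping the offsets is useful when the answer has to be highlighted in the original text rather than only displayed on its own.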

How to use

The model can be used directly with a question-answering pipeline:

>>> from transformers import pipeline
>>> context = "Florence Nightingale, known for being the founder of modern nursing, was born in Florence, Italy, in 1820"
>>> question = "When was Florence Nightingale born?"
>>> qa = pipeline("question-answering", model="MarcBrun/ixambert-finetuned-squad")
>>> qa(question=question, context=context)
{'score': 0.9667195081710815, 'start': 101, 'end': 105, 'answer': '1820'}
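Because the model covers three languages, it can be convenient to ask several questions against the same context. The following is a sketch under stated assumptions, not part of the original card: `load_qa`, `ask_all` and the `min_score` threshold of 0.5 are hypothetical names and values chosen for illustration. The only library call used is the same `pipeline("question-answering", ...)` call shown above.

```python
from typing import Callable, Dict, List, Optional


def load_qa():
    """Build the question-answering pipeline from this card (downloads the model on first use)."""
    # Imported lazily so that ask_all() below can be used with any compatible callable.
    from transformers import pipeline
    return pipeline("question-answering", model="MarcBrun/ixambert-finetuned-squad")


def ask_all(
    qa: Callable[..., Dict],
    context: str,
    questions: List[str],
    min_score: float = 0.5,  # illustrative threshold, not a recommended value
) -> Dict[str, Optional[str]]:
    """Ask every question against the same context.

    Answers whose score falls below `min_score` are reported as None
    instead of returning a low-confidence span.
    """
    answers: Dict[str, Optional[str]] = {}
    for question in questions:
        result = qa(question=question, context=context)
        answers[question] = result["answer"] if result["score"] >= min_score else None
    return answers
```

A typical call would be `ask_all(load_qa(), context, ["When was Florence Nightingale born?", "¿Dónde nació Florence Nightingale?"])`. The scores depend on the model, so a suitable threshold should be chosen on held-out data.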

Training procedure

The pre-trained model was fine-tuned for question answering using the following hyperparameters, which were selected on a validation set:

* Batch size = 32
* Learning rate = 2e-5
* Epochs = 3

The optimizer used was AdamW.