# Spanish T5 (small) fine-tuned on SQAC for Spanish QA 📖❓
spanish-T5-small fine-tuned on SQAC for the Q&A downstream task.
## Details of Spanish T5 (small)
A T5 (small)-like architecture trained from scratch on the large_spanish_corpus during the HuggingFace Flax/JAX Community Week.
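For reference, the base (not yet fine-tuned) checkpoint can be loaded the same way as any Hub model; a minimal sketch, assuming the Flax/JAX Week model was published as `flax-community/spanish-t5-small` (this Hub id is an assumption, not stated in this card):

```python
from transformers import T5ForConditionalGeneration, AutoTokenizer

# Assumed Hub id for the base model from the Flax/JAX Community Week
base_ckpt = 'flax-community/spanish-t5-small'
tokenizer = AutoTokenizer.from_pretrained(base_ckpt)
model = T5ForConditionalGeneration.from_pretrained(base_ckpt)
```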
## Details of the dataset 📚
This dataset contains 6,247 contexts and 18,817 questions with their answers (1 to 5 per fragment). The sources of the contexts are:
- Encyclopedic articles from Wikipedia in Spanish, used under the CC BY-SA licence.
- News from Wikinews in Spanish, used under the CC BY licence.
- Text from the Spanish corpus AnCora, which is a mix of different newswire and literature sources, used under the CC BY licence.

This dataset can be used to build extractive QA systems; a loading sketch follows the list.
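A minimal sketch for loading SQAC with the `datasets` library. The Hub id `PlanTL-GOB-ES/SQAC` and the field names are assumptions; adjust them if the dataset is hosted under a different name:

```python
from datasets import load_dataset

# Hub id assumed here, not stated in this card
sqac = load_dataset('PlanTL-GOB-ES/SQAC')
sample = sqac['train'][0]
print(sample['question'])
print(sample['answers'])  # assumed to hold the answer text plus its start offset in the context
```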
## Results on the test dataset 📝
| Metric | Value |
|--------|-------|
| BLEU   | 41.94 |
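The card does not say which BLEU implementation produced this score. A minimal sketch using the `sacrebleu` metric from the `evaluate` library, comparing generated answers against gold answers (the strings below are illustrative only):

```python
import evaluate

# sacreBLEU expects plain-text predictions and one or more references per example
bleu = evaluate.load('sacrebleu')
predictions = ['ayudar a que los algoritmos de ia sean más justos']
references = [['ayudar a que los algoritmos de IA sean más justos']]
print(bleu.compute(predictions=predictions, references=references)['score'])
```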
## Model in Action 🚀
```python
from transformers import T5ForConditionalGeneration, AutoTokenizer
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

ckpt = 'mrm8488/spanish-t5-small-sqac-for-qa'
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = T5ForConditionalGeneration.from_pretrained(ckpt).to(device)

def get_answer(question, context):
    # Build the input in the 'question: ... context: ...' format used during fine-tuning
    input_text = 'question: %s context: %s' % (question, context)
    features = tokenizer([input_text], padding='max_length', truncation=True,
                         max_length=512, return_tensors='pt')
    output = model.generate(input_ids=features['input_ids'].to(device),
                            attention_mask=features['attention_mask'].to(device))
    return tokenizer.decode(output[0], skip_special_tokens=True)

# English translation of the context: "The former co-lead of Google's ethical AI research
# group, Margaret Mitchell, who was fired in February after a controversy over a critical
# paper she co-authored, will join HuggingFace to help make AI algorithms fairer."
context = '''
La ex codirectora del grupo de investigación de IA ética de Google, Margaret Mitchell,
quien fue despedida en febrero después de una controversia sobre un artículo crítico del que fue coautora,
se unirá a HuggingFace para ayudar a que los algoritmos de IA sean más justos.
'''
question = '¿Qué hará Margaret Mitchell en HuggingFace?'

print(get_answer(question, context))
# ayudar a que los algoritmos de ia sean más justos
```
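As an aside (not from the original card), the same checkpoint can also be driven through the generic `text2text-generation` pipeline, as long as the input keeps the `question: ... context: ...` format. This sketch reuses `ckpt`, `question`, and `context` from the snippet above:

```python
from transformers import pipeline
import torch

# device=0 selects the first GPU; -1 falls back to CPU
qa = pipeline('text2text-generation', model=ckpt,
              device=0 if torch.cuda.is_available() else -1)
result = qa('question: %s context: %s' % (question, context))
print(result[0]['generated_text'])
```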
Created by Manuel Romero/@mrm8488 with the support of Narrativa
Made with ♥ in Spain