---
language:
- en
tags:
- t5
- qa
- askscience
- lfqa
- information retrieval
datasets:
- eli5
metrics:
- rouge
widget:
- text: "why aren't there more planets in our solar system?"
  example_title: "solar system"
- text: "question: what is a probability distribution? context: I am just learning about statistics."
  example_title: "probability distribution"
- text: "question: What are the underlying physical processes by which exercise helps us lose weight? context: I started working out two weeks ago and already feel a lot better, and started to think about it and became deeply confused."
  example_title: "pumpen"
- text: "what is a neural network?"
  example_title: "deep learning"
- text: "What are the primary mechanisms that computers use to understand human language?"
  example_title: "NLP"
inference:
  parameters:
    max_length: 96
    no_repeat_ngram_size: 2
    encoder_no_repeat_ngram_size: 4
    repetition_penalty: 3.51
    length_penalty: 0.8
    num_beams: 4
    early_stopping: True
---
# t5-base-askscience
- [t5-v1_1](https://huggingface.co/google/t5-v1_1-base) trained on the entirety of the _askscience_ sub-section of the eli5 dataset for one epoch.
- compare to bart on eli5 [here](https://huggingface.co/yjernite/bart_eli5)
- note that the inference API restricts the model to 96 output tokens; running the model in Python with the transformers library lets you generate longer outputs.
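
To get past the 96-token cap, something like the following should work with the transformers library. This is a rough sketch, not the exact script used for this model: the generation settings are copied from the `inference.parameters` block of this card, `max_length=256` is an arbitrary larger value, and `answer` and `GENERATION_KWARGS` are illustrative names. The model id is left as an argument; pass this repository's full Hub id.

```python
# Settings mirror the card's `inference.parameters`, except max_length,
# which is raised here. 256 is an arbitrary choice, not a recommendation.
GENERATION_KWARGS = {
    "max_length": 256,
    "no_repeat_ngram_size": 2,
    "encoder_no_repeat_ngram_size": 4,
    "repetition_penalty": 3.51,
    "length_penalty": 0.8,
    "num_beams": 4,
    "early_stopping": True,
}


def answer(prompt: str, model_id: str, **overrides) -> str:
    """Generate an answer for `prompt` using the checkpoint at `model_id`."""
    # Imported lazily so the settings above can be inspected
    # without transformers (and torch) installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    # Per-call overrides win over the defaults above.
    output_ids = model.generate(**inputs, **{**GENERATION_KWARGS, **overrides})
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# answer("question: what is a probability distribution? "
#        "context: I am just learning about statistics.",
#        model_id="<full Hub id of this model>")
```

Passing `max_length=512` (or any other generation argument) as a keyword to `answer` overrides the default for that call.
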
## training
- for inputs, the model was presented with the post title and the post selftext encoded as: `question: <post title> context: <post selftext>`. You may see better results if queries are posed in this fashion.
- The top two replies were aggregated and presented to the model as the output text.
- Training for longer will be explored, but given that the dataset has 127k examples and the loss flatlines at 0.5 epochs, this model should be fairly viable.
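
The input and target encoding described above could be reproduced roughly as follows. This is a minimal sketch, not the original preprocessing code: the function names are hypothetical, whitespace stripping is an assumption, and the card does not say how the two replies were joined or ranked, so the space separator and the "already sorted best-first" requirement are guesses.

```python
from typing import Sequence


def format_input(title: str, selftext: str = "") -> str:
    """Encode a post as `question: <post title> context: <post selftext>`."""
    # Stripping whitespace is an assumption; the card only gives the template.
    return f"question: {title.strip()} context: {selftext.strip()}".strip()


def build_target(replies: Sequence[str], top_k: int = 2, sep: str = " ") -> str:
    """Join the first `top_k` replies into a single target string.

    Assumes `replies` is already sorted best-first. The separator is a
    guess; the card only says the top two replies were aggregated.
    """
    return sep.join(reply.strip() for reply in replies[:top_k])
```

For example, `format_input("what is a probability distribution?", "I am just learning about statistics.")` produces the same string as the "probability distribution" widget example in this card's metadata.
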