pszemraj/t5-base-askscience-lfqa

checkpoints

This model is a fine-tuned version of google/t5-v1_1-base on the vblagoje/lfqa dataset, trained for 2 epochs to allow a (somewhat) apples-to-apples comparison with t5-base on the standard eli5 dataset.
- This checkpoint seems to be more coherent than t5-base on the original dataset.
- Compared to bart on lfqa, it seems able to answer some questions without relying on retrieval.

NOTE: the inference API is limited to generating approx. 64 chars for runtime reasons. For longer outputs, try using it in Python as a transformers pipeline object.
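
The note above suggests running the model locally through a transformers pipeline. A minimal sketch follows, assuming the `text2text-generation` pipeline task as available in the Transformers 4.x releases listed under Framework versions. The generation settings (`max_length`, `num_beams`, `no_repeat_ngram_size`) and the helper names `generation_kwargs`, `load_pipeline` and `answer` are illustrative assumptions, not values taken from this card.

```python
MODEL_ID = "pszemraj/t5-base-askscience-lfqa"


def generation_kwargs(max_length=128, **overrides):
    """Illustrative generation settings (assumptions, not from the model card)."""
    kwargs = {
        "max_length": max_length,   # output budget in tokens, not characters
        "num_beams": 4,
        "no_repeat_ngram_size": 3,
        "early_stopping": True,
    }
    kwargs.update(overrides)        # caller-supplied values win
    return kwargs


def load_pipeline():
    """Build the pipeline once; the first call downloads the checkpoint."""
    from transformers import pipeline  # imported lazily so the helpers above stay lightweight

    return pipeline("text2text-generation", model=MODEL_ID)


def answer(qa_pipeline, question, **overrides):
    """Generate one answer string for a single question."""
    result = qa_pipeline(question, **generation_kwargs(**overrides))
    return result[0]["generated_text"]


# Example usage (requires network access and the transformers package):
# qa = load_pipeline()
# print(answer(qa, "Why does ice float on water?", max_length=256))
```

Loading the pipeline once and reusing it avoids reloading the weights for every question.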

Intended uses & limitations

Q&A, information retrieval
It is probably better to use it with a retrieval pipeline than alone.
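
One way to pair the model with retrieval is to fetch a few supporting passages and pass them to the model together with the question. The sketch below is only an illustration: the word-overlap ranking stands in for a real retriever, and the `question: ... context: ...` prompt layout is an assumption, since this card does not document an input format. The names `rank_passages` and `build_prompt` are hypothetical.

```python
import re


def _words(text):
    """Lowercased word set, with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def rank_passages(question, passages, top_k=2):
    """Toy retriever: rank passages by how many question words they share."""
    q_words = _words(question)
    ranked = sorted(
        passages,
        key=lambda p: len(q_words & _words(p)),
        reverse=True,  # sorted() is stable, so ties keep their input order
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Join question and passages (assumed layout, not a documented format)."""
    context = " ".join(passages)
    return f"question: {question} context: {context}"
```

The resulting prompt string can then be passed to the model in place of the bare question.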

Training and evaluation data

See the linked dataset (vblagoje/lfqa). The dataset was filtered to include only the askscience subreddit in an attempt to focus on academic/technical queries.
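
The filtering step described above could look roughly like the sketch below. It assumes that each row of vblagoje/lfqa carries a `subreddit` column whose value for this subset is `askscience`, and that a `train` split exists; neither is stated in this card, so check the dataset viewer before relying on it. The function names are hypothetical.

```python
def is_askscience(example):
    """True if the row comes from r/askscience (the column name is an assumption)."""
    return str(example["subreddit"]).lower() == "askscience"


def load_askscience_split(split="train"):
    """Load one split of vblagoje/lfqa and keep only the askscience rows."""
    from datasets import load_dataset  # imported lazily; downloads the dataset on first use

    dataset = load_dataset("vblagoje/lfqa", split=split)
    return dataset.filter(is_askscience)
```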

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
num_epochs: 2
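
For readers who want to reproduce a similar run, the listed values can be expressed as keyword arguments for `Seq2SeqTrainingArguments` from transformers. This is a sketch under assumptions: the card does not say which training script was used, `train_batch_size` is mapped to the per-device argument, the output directory name is hypothetical, and the device count is not given. With one device assumed, 8 x 2 gradient accumulation steps reproduces the listed total batch size of 16.

```python
HYPERPARAMS = {
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 8,   # "train_batch_size" in the card (assumed per device)
    "per_device_eval_batch_size": 8,    # "eval_batch_size" in the card (assumed per device)
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 2,
}


def effective_batch_size(params, num_devices=1):
    """Examples consumed per optimizer step."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


def make_training_args(output_dir="t5-askscience-out"):
    """Build the arguments object; output_dir is a hypothetical placeholder."""
    from transformers import Seq2SeqTrainingArguments  # imported lazily

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```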

Training results

Framework versions

Transformers 4.16.2
Pytorch 1.10.0+cu113
Datasets 1.18.3
Tokenizers 0.11.0

Model size: 248M params (Safetensors, tensor type F32)
