# t5e-mini-nl24-flan

A 126M-parameter fine-tune of google/t5-efficient-mini-nl24, trained for 25k steps on FLAN as an initial test/validation that the training code works. Not practically useful.

```python
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="pszemraj/t5e-mini-nl24-flan",
)
res = pipe(
    "true or false: water is wet.",
    top_k=4,
    penalty_alpha=0.6,  # top_k + penalty_alpha > 0 enables contrastive search decoding
    max_new_tokens=128,
)
print(res[0]["generated_text"])
```

## Quick eval

Quick eval for: pszemraj/t5e-mini-nl24-flan

`hf (pretrained=pszemraj/t5e-mini-nl24-flan,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True)`, gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

| Tasks         | Version | Filter           | n-shot | Metric        | Value  | Stderr   |
|---------------|---------|------------------|--------|---------------|--------|----------|
| boolq         | 2       | none             | 0      | acc ↑         | 0.4541 | ± 0.0087 |
| openbookqa    | 1       | none             | 0      | acc ↑         | 0.1300 | ± 0.0151 |
|               |         | none             | 0      | acc_norm ↑    | 0.2700 | ± 0.0199 |
| piqa          | 1       | none             | 0      | acc ↑         | 0.6159 | ± 0.0113 |
|               |         | none             | 0      | acc_norm ↑    | 0.6077 | ± 0.0114 |
| social_iqa    | 0       | none             | 0      | acc ↑         | 0.3705 | ± 0.0109 |
| tinyArc       | 0       | none             | 25     | acc_norm ↑    | 0.2913 | ± N/A    |
| tinyGSM8k     | 0       | flexible-extract | 5      | exact_match ↑ | 0.0269 | ± N/A    |
|               |         | strict-match     | 5      | exact_match ↑ | 0.0055 | ± N/A    |
| tinyHellaswag | 0       | none             | 10     | acc_norm ↑    | 0.3538 | ± N/A    |
| tinyMMLU      | 0       | none             | 0      | acc_norm ↑    | 0.2551 | ± N/A    |
| winogrande    | 1       | none             | 0      | acc ↑         | 0.5217 | ± 0.0140 |
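
For reference, a rough sketch of an equivalent run using lm-evaluation-harness's Python API (`lm_eval.simple_evaluate`); the task list and batch size are read off the config line and results table above, and the remaining arguments are assumptions, so adjust for your setup:

```python
# Sketch only: approximate the eval above with the lm-evaluation-harness Python API.
# Task list and batch size follow the run config; num_fewshot is left unset so each
# task uses its own default shot count (e.g. tinyArc is 25-shot, tinyGSM8k 5-shot).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=pszemraj/t5e-mini-nl24-flan,dtype=bfloat16,trust_remote_code=True",
    tasks=[
        "boolq", "openbookqa", "piqa", "social_iqa",
        "tinyArc", "tinyGSM8k", "tinyHellaswag", "tinyMMLU", "winogrande",
    ],
    batch_size=8,
)
print(results["results"])
```

Swapping `pretrained=google/t5-efficient-mini-nl24` into `model_args` should correspond to the base-model numbers in the expandable section below.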
<details>
<summary>base model evals: click to expand</summary>

Quick eval for: google/t5-efficient-mini-nl24

`hf (pretrained=google/t5-efficient-mini-nl24,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True)`, gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

| Tasks         | Version | Filter           | n-shot | Metric        | Value  | Stderr   |
|---------------|---------|------------------|--------|---------------|--------|----------|
| boolq         | 2       | none             | 0      | acc ↑         | 0.3783 | ± 0.0085 |
| openbookqa    | 1       | none             | 0      | acc ↑         | 0.1280 | ± 0.0150 |
|               |         | none             | 0      | acc_norm ↑    | 0.2660 | ± 0.0198 |
| piqa          | 1       | none             | 0      | acc ↑         | 0.5473 | ± 0.0116 |
|               |         | none             | 0      | acc_norm ↑    | 0.5267 | ± 0.0116 |
| social_iqa    | 0       | none             | 0      | acc ↑         | 0.3536 | ± 0.0108 |
| tinyArc       | 0       | none             | 25     | acc_norm ↑    | 0.3101 | ± N/A    |
| tinyGSM8k     | 0       | flexible-extract | 5      | exact_match ↑ | 0.0145 | ± N/A    |
|               |         | strict-match     | 5      | exact_match ↑ | 0.0055 | ± N/A    |
| tinyHellaswag | 0       | none             | 10     | acc_norm ↑    | 0.2616 | ± N/A    |
| tinyMMLU      | 0       | none             | 0      | acc_norm ↑    | 0.2839 | ± N/A    |
| winogrande    | 1       | none             | 0      | acc ↑         | 0.4996 | ± 0.0141 |

</details>
