---
library_name: transformers
license: apache-2.0
datasets:
- pszemraj/flan-subsets-deduped
language:
- en
base_model: pszemraj/tFINE-900m-e16-d32-1024ctx
pipeline_tag: text2text-generation
---

# BEE-spoke-data/tFINE-900m-e16-d32-flan

This is a basic text-to-text "instruct" model, similar to Google's original [flan-t5](https://huggingface.co/collections/google/flan-t5-release-65005c39e3201fff885e22fb) model series (but not trained for as long).
## Details

Fine-tuned from [the base model](https://hf.co/pszemraj/tFINE-900m-e16-d32-1024ctx) on the `flan-v2` subset of the `pszemraj/flan-subsets-deduped` dataset for 1 epoch. It achieves the following results on the evaluation set:

- Loss: 1.4134
- Rouge1: 62.9142
- Rouge2: 22.5279
- Rougel: 61.4902
- Rougelsum: 61.7795
- Gen Len: 12.0586
- Num Input Tokens Seen: 1931815668

### Model features

- Pretrained and fine-tuned at a 1024-token (input) context length
- Tokenizer with byte-pair fallback, so it can understand and generate text beyond what the original T5 tokenizer covers (see the sketch below)
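The byte-pair fallback can be sanity-checked directly from the released tokenizer. The snippet below is a minimal sketch (not part of the original card): it assumes the tokenizer loads with `AutoTokenizer` under this repo id, encodes text containing characters outside the original T5 SentencePiece vocabulary, and verifies that the decode round trip does not collapse them to `<unk>`.

```py
from transformers import AutoTokenizer

# Assumption: the tokenizer is hosted under the same repo id as the model.
tok = AutoTokenizer.from_pretrained("BEE-spoke-data/tFINE-900m-e16-d32-flan")

# Characters the stock T5 tokenizer tends to map to <unk>.
text = "naïve café, 数学, 🤖"
ids = tok(text).input_ids

# With byte-pair fallback, unknown characters should be split into byte
# tokens rather than emitted as <unk>, so decoding preserves them.
print(tok.decode(ids, skip_special_tokens=True))
print("unk present:", tok.unk_token_id is not None and tok.unk_token_id in ids)
```

Running the same check with the original T5 tokenizer typically shows `<unk>` for the emoji and CJK characters.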
## Usage Example

```py
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)

prompt = "What color is tuesday?"
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])
```

## Quick eval

Quick eval for: `BEE-spoke-data/tFINE-900m-e16-d32-flan`

hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

|    Tasks    |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-------------|------:|----------------|-----:|-----------|---|-----:|---|------|
|boolq        |      2|none            |     0|acc        |↑  |0.6700|±  |0.0082|
|openbookqa   |      1|none            |     0|acc        |↑  |0.1900|±  |0.0176|
|             |       |none            |     0|acc_norm   |↑  |0.2980|±  |0.0205|
|piqa         |      1|none            |     0|acc        |↑  |0.6001|±  |0.0114|
|             |       |none            |     0|acc_norm   |↑  |0.6072|±  |0.0114|
|social_iqa   |      0|none            |     0|acc        |↑  |0.4299|±  |0.0112|
|tinyArc      |      0|none            |    25|acc_norm   |↑  |0.3214|±  |   N/A|
|tinyGSM8k    |      0|flexible-extract|     5|exact_match|↑  |0.0492|±  |   N/A|
|             |       |strict-match    |     5|exact_match|↑  |0.0380|±  |   N/A|
|tinyHellaswag|      0|none            |    10|acc_norm   |↑  |0.4005|±  |   N/A|
|tinyMMLU     |      0|none            |     0|acc_norm   |↑  |0.2857|±  |   N/A|
|winogrande   |      1|none            |     0|acc        |↑  |0.4988|±  |0.0141|
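The numbers above come from EleutherAI's lm-evaluation-harness with the settings shown in the header line. Below is a hedged reproduction sketch using the harness's Python entry point; it assumes the v0.4+ `lm_eval.simple_evaluate` API and covers only the zero-shot tasks (the `tiny*` tasks use their own few-shot settings and may need extra task dependencies).

```py
# pip install lm-eval  (EleutherAI lm-evaluation-harness; v0.4+ API assumed)
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,"
        "dtype=bfloat16,trust_remote_code=True"
    ),
    # Zero-shot tasks from the table above.
    tasks=["boolq", "openbookqa", "piqa", "social_iqa", "winogrande"],
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```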