---
license: bigscience-bloom-rail-1.0
base_model: bigscience/bloom-1b7
tags:
- generated_from_trainer
model-index:
- name: Bloom-1b7-winograd-wsc-IT-baseline
results: []
---
# Bloom-1b7-winograd-wsc-IT-baseline
This model is a fine-tuned version of [bigscience/bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) on the winograd_wsc task of the [adambjorn/UnrelatedForgettingOverhead](https://huggingface.co/datasets/adambjorn/UnrelatedForgettingOverhead) dataset.
## Model description
A baseline instruction-tuned variant of bigscience/bloom-1b7 for pronoun coreference resolution in the Winograd Schema Challenge (winograd_wsc) format described below.
## Intended uses & limitations
Intended for instruction-style pronoun resolution on Winograd Schema examples, using the prompt format described under Training procedure. Broader uses and limitations have not been documented.
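As a purely illustrative sketch of querying the fine-tuned model (the checkpoint path and the Winograd-style example are placeholders, not taken from the original run):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint location; substitute wherever the fine-tuned model is stored.
checkpoint = "path/to/Bloom-1b7-winograd-wsc-IT-baseline"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# A classic Winograd schema, formatted like the training prompts below.
prompt = (
    "Determine which option the pronoun refers to in this text: "
    "Text: 'The trophy does not fit in the suitcase because it is too big.' "
    "Pronoun: 'it', Quote: 'it is too big'. "
    "Option 1: the trophy Option 2: the suitcase."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```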
## Training and evaluation data
The model was instruction-tuned on the winograd_wsc task: https://huggingface.co/datasets/adambjorn/UnrelatedForgettingOverhead/viewer/winograd_wsc
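A minimal loading sketch (the config name `winograd_wsc` is inferred from the viewer URL above):
```python
from datasets import load_dataset

# Config name "winograd_wsc" inferred from the dataset viewer URL.
dataset = load_dataset("adambjorn/UnrelatedForgettingOverhead", "winograd_wsc")
print(dataset)
```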
## Training procedure
One of the following instruction prompts is selected for each training example:
```python
prompts = [
    "Determine which option the pronoun refers to in this text: ",
    "Given the text, identify the referent of the pronoun among these options: ",
    "Read the text and decide which option is referred to by the pronoun: ",
    "In the text below, to whom or what does the pronoun refer? Choose from the options: ",
]
```
Each example concatenates the selected prompt with the text, pronoun, quote, options, and correct option, like so (an end-to-end sketch follows the snippet):
```python
# Concatenate the selected prompt, text, pronoun, quote, options, and the
# correct option into a single training string, terminated by the </s> token.
input_text = f"{prompt}Text: '{text}' Pronoun: '{pronoun}', Quote: '{quote}'. {options_text}. {correct_option}. </s> "
```
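Putting the pieces together, the sketch below shows how one training string might be assembled from a dataset row. It reuses the `prompts` list above; the field names (`text`, `pronoun`, `quote`, `options`) and the integer `label` index are assumptions about the dataset schema, not confirmed by this card:
```python
import random

def build_input_text(example: dict) -> str:
    # Assumed schema: 'text', 'pronoun', 'quote', a list of candidate 'options',
    # and an integer 'label' indexing the correct option.
    prompt = random.choice(prompts)
    options_text = " ".join(
        f"Option {i + 1}: {opt}" for i, opt in enumerate(example["options"])
    )
    correct_option = example["options"][example["label"]]
    return (
        f"{prompt}Text: '{example['text']}' Pronoun: '{example['pronoun']}', "
        f"Quote: '{example['quote']}'. {options_text}. {correct_option}. </s> "
    )
```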
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
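For reference, a hedged sketch of the equivalent Hugging Face `TrainingArguments` (the `output_dir` is a placeholder; the total train batch size of 4 is `train_batch_size` × `gradient_accumulation_steps`):
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Bloom-1b7-winograd-wsc-IT-baseline",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # total train batch size: 1 x 4 = 4
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,  # native AMP mixed-precision training
)
```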
### Training results
Metrics at the final training step: {'loss': 0.0983, 'grad_norm': 4.820842266082764, 'learning_rate': 6.000000000000001e-07, 'epoch': 10.0}
Overall run summary (train_loss is the average over all steps): {'train_runtime': 452.2725, 'train_samples_per_second': 4.422, 'train_steps_per_second': 1.106, 'train_loss': 0.33704672479629516, 'epoch': 10.0}
### Framework versions
- Transformers 4.38.1
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2