---
license: cc-by-4.0
datasets:
- Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged
language:
- en
---

## Model overview
This model is a fine-tune of the base model [Marx-3B-V2](https://huggingface.co/acrastt/Marx-3B-V2) on [a merged dataset of oasst1-en, alpaca-cleaned, and airoboros-2.1-no-code](https://huggingface.co/datasets/Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged).
 - License: `Creative-Commons-Attribution-4.0`
 - Language: `en`
 - Size: `3.43B` parameters

## Prompt template
Format prompts as follows:
```
### SYSTEM:
<system_prompt_here>

### HUMAN:
<prompter_message_here>

### INPUT:
<input_text_here>

### RESPONSE:
<leave_a_blank_line_here>
```
*Note: If you don't have a system prompt or input text, omit the corresponding `### SYSTEM:` or `### INPUT:` section from the prompt.*
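
The template and the note above can be sketched as a small helper. This is a minimal illustration, not an official utility shipped with the model: the function name `build_prompt` and its parameters are hypothetical, and ending the prompt with a single newline after `### RESPONSE:` is one reading of the "leave a blank line" placeholder.

```python
from typing import Optional


def build_prompt(
    human: str,
    system: Optional[str] = None,
    input_text: Optional[str] = None,
) -> str:
    """Assemble a prompt in the template above (hypothetical helper).

    The SYSTEM and INPUT sections are skipped entirely when no text
    is supplied for them, as the note above asks.
    """
    sections = []
    if system:
        sections.append(f"### SYSTEM:\n{system}")
    sections.append(f"### HUMAN:\n{human}")
    if input_text:
        sections.append(f"### INPUT:\n{input_text}")
    # Assumption: the model continues on the line after the RESPONSE header.
    sections.append("### RESPONSE:\n")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_prompt("Summarize the text.", input_text="LoRA adapts large models cheaply."))
```

Calling `build_prompt("Hello")` with no system prompt or input text yields only the `### HUMAN:` and `### RESPONSE:` sections.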

## Training Details
This model took `2:40:54` to train in LoRA on a single `A100 40gb` GPU.<br>
 - *epochs*:  `1`
 - *train batch size*:  `8`
 - *eval batch size*:  `8`
 - *gradient accumulation steps*:  `1`
 - *maximum gradient norm*:  `0.3`
 - *learning rate*:  `2e-4`
 - *weight decay*:  `0.001`
 - *optimizer*:  `paged_adamw_32bit`
 - *learning rate schedule*:  `cosine`
 - *warmup ratio (linear)*:  `0.03`
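
For readers who want to reuse these settings, the list above can be written down as a plain configuration dictionary. This is a sketch under assumptions: the key names mirror the field names of `transformers.TrainingArguments`, but the model card does not say which training framework was used, and the `learning_rate_at` function shows one common formulation of linear warmup followed by cosine decay rather than the exact schedule from the run. The `total_steps` value in the usage line is hypothetical.

```python
import math

# Values copied from the list above. Key names follow
# transformers.TrainingArguments (an assumption about the setup).
TRAINING_CONFIG = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "max_grad_norm": 0.3,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "optim": "paged_adamw_32bit",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
}


def effective_batch_size(config: dict, num_gpus: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return (
        config["per_device_train_batch_size"]
        * config["gradient_accumulation_steps"]
        * num_gpus
    )


def learning_rate_at(step: int, total_steps: int, config: dict = TRAINING_CONFIG) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero."""
    peak = config["learning_rate"]
    warmup_steps = int(total_steps * config["warmup_ratio"])
    if warmup_steps > 0 and step < warmup_steps:
        return peak * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(TRAINING_CONFIG))  # single GPU, no accumulation
    print(learning_rate_at(step=500, total_steps=1000))  # total_steps is hypothetical
```

With a batch size of 8, one accumulation step, and a single GPU, each optimizer step sees 8 examples.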

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_AtAndDev__ShortKing-3b-v0.3)

| Metric              | Value |
|---------------------|-------|
| Avg.                | 35.75 |
| ARC (25-shot)       | 40.96 |
| HellaSwag (10-shot) | 70.72 |
| MMLU (5-shot)       | 26.21 |
| TruthfulQA (0-shot) | 38.78 |
| Winogrande (5-shot) | 66.93 |
| GSM8K (5-shot)      | 1.21  |
| DROP (3-shot)       | 5.46  |
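
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores. The short check below reproduces it from the table values; the variable names are illustrative only.

```python
# Benchmark scores copied from the table above.
SCORES = {
    "ARC (25-shot)": 40.96,
    "HellaSwag (10-shot)": 70.72,
    "MMLU (5-shot)": 26.21,
    "TruthfulQA (0-shot)": 38.78,
    "Winogrande (5-shot)": 66.93,
    "GSM8K (5-shot)": 1.21,
    "DROP (3-shot)": 5.46,
}

# Unweighted mean over all seven benchmarks.
average = sum(SCORES.values()) / len(SCORES)
print(round(average, 2))  # 35.75
```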