---
language:
- en
- ko
license: cc-by-sa-4.0
tags:
- not-for-all-audiences
datasets:
- maywell/ko_wikidata_QA
- kyujinpy/OpenOrca-KO
- Anthropic/hh-rlhf
pipeline_tag: text-generation
model-index:
- name: PiVoT-0.1-Evil-a
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 59.64
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 81.48
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 58.94
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 39.23
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 75.3
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 40.41
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=maywell/PiVoT-0.1-Evil-a
      name: Open LLM Leaderboard
---

# PiVoT-0.1-Evil-a

![image/png](./PiVoT.png)

# **Model Details**

### Description
PiVoT is a fine-tuned model based on Mistral 7B. It is a variation of Synatra v0.3 RP, which has shown decent performance.

PiVoT-0.1-Evil-**a** is an **Evil-tuned** version of PiVoT, fine-tuned using the method shown below.

PiVoT-0.1-Evil-**b** was additionally tuned with noisy embeddings, so it should produce more varied outputs.

![image/png](./eviltune.png)


<!-- prompt-template start -->
## Prompt template: Alpaca-InstructOnly2

```
### Instruction:
{prompt}

### Response:

```

<!-- prompt-template end -->
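A minimal sketch of applying this template before sending text to the model (plain Python string formatting; the `build_prompt` helper name is illustrative, not part of any released tooling):

```python
def build_prompt(instruction: str) -> str:
    # Wrap a user request in the Alpaca-InstructOnly2 template;
    # the model's reply is expected to follow "### Response:".
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("Summarize the plot of Hamlet in one sentence.")
print(prompt)
```

The formatted string can then be passed to your inference stack of choice (for example, a `transformers` text-generation pipeline).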


### Disclaimer
The AI model provided herein is intended for experimental purposes only. The creator of this model makes no representations or warranties of any kind, either express or implied, as to the model's accuracy, reliability, or suitability for any particular purpose. The creator shall not be held liable for any outcomes, decisions, or actions taken on the basis of the information generated by this model. Users of this model assume full responsibility for any consequences resulting from its use.

The OpenOrca dataset was used when fine-tuning this PiVoT variation. The Arcalive AI Chat Chan 7k log, [ko_wikidata_QA](https://huggingface.co/datasets/maywell/ko_wikidata_QA), [kyujinpy/OpenOrca-KO](https://huggingface.co/datasets/kyujinpy/OpenOrca-KO), and other datasets were used for the base model.

Follow me on Twitter: https://twitter.com/stablefluffy

Consider supporting me, since I make these models alone: https://www.buymeacoffee.com/mwell or with a RunPod credit gift 💕

Contact me on Telegram: https://t.me/AlzarTakkarsen

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_maywell__PiVoT-0.1-Evil-a)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |59.16|
|AI2 Reasoning Challenge (25-Shot)|59.64|
|HellaSwag (10-Shot)              |81.48|
|MMLU (5-Shot)                    |58.94|
|TruthfulQA (0-shot)              |39.23|
|Winogrande (5-shot)              |75.30|
|GSM8k (5-shot)                   |40.41|
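As a quick sanity check, the reported average is just the unweighted mean of the six benchmark scores (the leaderboard appears to truncate rather than round the final digit):

```python
# Scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 59.64,
    "HellaSwag (10-shot)": 81.48,
    "MMLU (5-shot)": 58.94,
    "TruthfulQA (0-shot)": 39.23,
    "Winogrande (5-shot)": 75.30,
    "GSM8k (5-shot)": 40.41,
}
avg = sum(scores.values()) / len(scores)
print(f"{avg:.2f}")  # 59.17, reported as 59.16 after truncation
```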