---
base_model:
- RozGrov/NemoDori-v0.2-12B-MN-BT
- unsloth/Mistral-Nemo-Instruct-2407
- UsernameJustAnother/Nemo-12B-Marlin-v5
- crestf411/nemo-sunfall-v0.6.1
library_name: transformers
tags:
- mergekit
- merge
pipeline_tag: text-generation
---
# NemoDori-v0.2.1-12B-MN-BT

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

It is the first child of [NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT).

**The purpose** is to find a way to make v0.2 better at staying **aware of past conversation** and **following instructions**, especially the most recent one (depth-0),
while keeping its **creativity and capability to (E)RP**.
This model is one of a few children made to try to fulfill that.

In my very short testing so far, I haven't found any difference from the parent worth mentioning. If anything, this version feels **slightly degraded** somehow
(I can't pin down why; it's just a feeling). Try it as you may, but I think its parent (**v0.2**) is **better** than this one.

The other child ([**v0.2.2**](https://huggingface.co/RozGrov/NemoDori-v0.2.2-12B-MN-ties)) is out.
I tested it more than this model and it does seem improved over this one, but its response format is not very consistent.

Feel free to give me feedback on anything, or guide me on how I can fulfill my-*ahem* its purpose while keeping the size well below 70B.
<br>
Fine-tuning is... pretty expensive for me, and I'm not ready for that (yet, though I'm interested).

<p style="font-size: 11px; margin-top: 11px" id="heya-im-a-bit-of-a-programmer">
  (listen, between you and me, i still don't get it. i'm still learning this new hobby of mine, and it's kind of refreshing in a way.
i'll be exploring other architectures in the future. for now, this is mostly about how randomly i pick my straw, just to see how lucky i am.)
  <br>
  (although, i am interested in learning how to make a new merge method,
  similar to crafting a solution for a specific problem like in the good ol' days.
  <span style="color: darkred">but hell, this llm stuff is really expensive.</span>)
</p>
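
If you want to try it anyway, here is a minimal loading sketch using `transformers`. The repo id is assumed to match this card's title, and the prompt and sampling settings are only placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RozGrov/NemoDori-v0.2.1-12B-MN-BT"  # assumed to match this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16 (see config below)
    device_map="auto",
)

# Mistral-Nemo-based models ship a chat template, so apply_chat_template should work.
messages = [{"role": "user", "content": "Describe the tavern we just walked into."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```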


## Merge Details

### Merge Method

This model was merged using the `breadcrumbs_ties` merge method, with [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as the base.
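
For intuition: in mergekit's breadcrumbs methods, each model's difference from the base (its "task vector") is sparsified before the TIES-style sign-consensus merge. Roughly, `density` is the fraction of entries kept and `gamma` is the fraction of largest-magnitude outliers dropped first. A toy sketch of that masking step (not mergekit's actual code):

```python
import torch

def breadcrumbs_mask(delta: torch.Tensor, density: float, gamma: float) -> torch.Tensor:
    # Toy sketch of breadcrumbs-style pruning (NOT mergekit's implementation).
    # Keeps a `density` fraction of the task vector's entries, after first
    # discarding the `gamma` fraction with the largest magnitudes.
    flat = delta.abs().flatten()
    n = flat.numel()
    k_top = int(gamma * n)     # outliers dropped from the top
    k_keep = int(density * n)  # entries kept in total
    order = torch.argsort(flat, descending=True)
    mask = torch.zeros(n, dtype=torch.bool)
    mask[order[k_top : k_top + k_keep]] = True
    return mask.view(delta.shape)

# With this card's settings (density=0.93, gamma=0.015), the top 1.5% of
# magnitudes are dropped, the next 93% kept, and the smallest ~5.5% zeroed.
delta = torch.randn(4, 4)  # stand-in for (model weights - base weights)
pruned = delta * breadcrumbs_mask(delta, density=0.93, gamma=0.015)
```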

### Models Merged

The following models were included in the merge:
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [crestf411/nemo-sunfall-v0.6.1](https://huggingface.co/crestf411/nemo-sunfall-v0.6.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: crestf411/nemo-sunfall-v0.6.1
    parameters:
      weight: 0.33
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.2
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.37
  - model: RozGrov/NemoDori-v0.2-12B-MN-BT
    parameters:
      weight: 1
merge_method: breadcrumbs_ties
base_model: RozGrov/NemoDori-v0.2-12B-MN-BT
parameters:
  density: 0.93
  gamma: 0.015
dtype: bfloat16
```
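
To reproduce the merge, one option is mergekit's `mergekit-yaml` CLI (`mergekit-yaml config.yaml ./output`). The equivalent Python sketch below follows the usage shown in mergekit's README; double-check it against your installed version, as the API has changed over time:

```python
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# "config.yaml" is assumed to hold the YAML block above, saved verbatim.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./NemoDori-v0.2.1-12B-MN-BT",  # output directory (any path works)
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # copy the base model's tokenizer over
    ),
)
```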