RozGrov committed · verified
Commit 8ed7b0b · Parent(s): be6dfac

Update README.md

Files changed (1): README.md (+31 −3)
README.md CHANGED
@@ -8,16 +8,44 @@ library_name: transformers
  tags:
  - mergekit
  - merge
-
  ---
  # NemoDori-v0.5-12B-MN-BT

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
  ## Merge Details

  ### Merge Method

- This model was merged using the breadcrumbs_ties merge method using [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as a base.

  ### Models Merged

@@ -52,4 +80,4 @@ parameters:
  gamma: 0.015
  dtype: bfloat16

- ```
 
  tags:
  - mergekit
  - merge
+ pipeline_tag: text-generation
  ---
  # NemoDori-v0.5-12B-MN-BT

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

+ The first child of [NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT).
+
+ **The purpose** is to find a way to increase v0.2's capability to stay **aware of past conversations** and **follow instructions better**, especially the last one (depth-0),
+ while keeping its **creativity and capability to (E)RP**.
+ This model is one of the few children that try to fulfill that purpose.
+
+ In my very short testing so far, I haven't found anything that differs from the parent and is worth mentioning. But I think this version is **slightly degraded** somehow
+ (I can't quite pin it down; it just felt that way). Anyway, try it as you like; I think **v0.2** is **better** than this one.
+
+ The other child (**v0.6**) will come soon.
+ I tested it more than this model, and it seems to improve instruction-following, but its response format is not very consistent.
+ So yeah, just a sneak peek of it... maybe.
+
+ You may give me feedback on how I can fulfill my-*ahem* its purpose while keeping it as low as not-70B.
+ <br>
+ Fine-tuning is... pretty expensive for me, and I'm not ready for that (yet, though I'm interested).
+
+ <p style="font-size: 11px; margin-top: 11px" id="heya-im-a-bit-of-a-programmer">
+ (Listen, between you and me, I still don't get it. I'm still learning this new hobby of mine, and it's kind of refreshing in a way.
+ I'll be exploring other architectures in the future. Yet, this is about how randomly I pick my straw, just to see how lucky I am.)
+ <br>
+ (Although, I am interested in learning how to make a new merge method,
+ similar to making a solution for a specific problem, just like the good ol' days.
+ <span style="color: darkred">But hell, this LLM stuff is really expensive.</span>)
+ </p>
+
+
  ## Merge Details
+
  ### Merge Method

+ This model was merged with the `breadcrumbs_ties` merge method, using [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as a base.

  ### Models Merged

 
  gamma: 0.015
  dtype: bfloat16

+ ```
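
For context, a `breadcrumbs_ties` merge in mergekit is driven by a YAML config like the fragment below. This is a hypothetical sketch reconstructed only from the values visible in this diff (`merge_method`, the base model, `gamma: 0.015`, and `dtype: bfloat16`); the actual list of merged models and their weights is truncated out of the diff and is not shown here.

```yaml
# Hypothetical mergekit config sketch -- only merge_method, base_model,
# gamma, and dtype are taken from this diff; the models list and any
# per-model weights are placeholders, not the author's actual config.
merge_method: breadcrumbs_ties
base_model: RozGrov/NemoDori-v0.2-12B-MN-BT
models:
  - model: RozGrov/NemoDori-v0.2-12B-MN-BT
  # ... other merged models elided in the diff ...
parameters:
  gamma: 0.015
dtype: bfloat16
```

A config in this shape would typically be passed to mergekit's CLI to produce the merged checkpoint.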