Update README.md
README.md
library_name: transformers
tags:
- mergekit
- merge
pipeline_tag: text-generation
---
# NemoDori-v0.5-12B-MN-BT

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

The first child of [NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT).
**The purpose** is to find a way to improve v0.2's ability to stay **aware of past conversations** and **follow instructions better**, especially the most recent one (depth-0),
while keeping its **creativity and capability to (E)RP**.
This model is one of the few children trying to fulfill that.

In my very short testing so far, I haven't found anything that differs from the parent enough to be worth mentioning. But I think this version is **slightly degraded** somehow
(I can't quite explain it; it just felt that way). Anyway, try it if you like, but I think **v0.2** is **better** than this one.
The other child (**v0.6**) will come soon.
I have tested it more than this model, and it seems to improve the instruction-following part, but its response format is somewhat inconsistent.
So yeah, just a sneak peek of it... maybe.
You may give me feedback on how I can fulfill my-*ahem* its purpose while keeping it smaller than 70B.
<br>
Fine-tuning is... pretty expensive for me, and I'm not ready for that (yet, though I'm interested).
<p style="font-size: 11px; margin-top: 11px" id="heya-im-a-bit-of-a-programmer">
(listen, between you and me, i still don't get it. still learning this new hobby of mine, and it's kind of refreshing in a way.
i'll be exploring more architectures in the future. yet, this is about how randomly i pick my straw, just to see how lucky i am.)
<br>
(although, i am interested in learning how to make a new merge method,
similar to when i'm making a solution for a specific problem, just like the good ol' days.
<span style="color: darkred">but hell, this llm stuff is really expensive.</span>)
</p>

## Merge Details

### Merge Method

This model was merged using the `breadcrumbs_ties` merge method, with [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as the base.
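For reference, a `breadcrumbs_ties` merge is driven by a mergekit YAML config. The sketch below is hypothetical: the second model name and the `weight`/`density` values are placeholders I made up for illustration; only the base model, the merge method, and the `gamma`/`dtype` values come from this card.

```yaml
# Hypothetical sketch of a breadcrumbs_ties mergekit config.
# "some-org/another-12b-model", weight, and density are placeholders;
# only base_model, merge_method, gamma, and dtype are stated in this card.
models:
  - model: some-org/another-12b-model
    parameters:
      weight: 1.0
      density: 0.9
merge_method: breadcrumbs_ties
base_model: RozGrov/NemoDori-v0.2-12B-MN-BT
parameters:
  gamma: 0.015
dtype: bfloat16
```

A config like this would be run with mergekit's CLI, e.g. `mergekit-yaml config.yml ./output-model-directory`.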

### Models Merged

```yaml
parameters:
  gamma: 0.015
dtype: bfloat16
```