---
language:
- en
tags:
- merge
license: llama2
library_name: transformers
pipeline_tag: text-generation
---

<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosa.png">
<a href="https://www.youtube.com/watch?v=DfXLf402I94" title="Dust of the Saturn - Dynatron" target="_blank">intro music...</a>

## CodeRosa-70B-AB1

I wanted a model that could serve as a helpful everyday companion with some coding skills.
The idea was that Llama's censorship implies a deeper understanding of human emotions, and I wanted this part of Llama to carry over into the merge.

The model adopted a task-oriented approach from CodeLlama Python and thus requires precise prompting. It can produce longer texts as well as shorter responses. It tends to avoid happy endings, instead surprising with open-ended scenarios that invite further interaction. It prefers to spell numbers out as words rather than write digits, but YMMV.

This model is for personal use, made for myself as an experiment. I would like to make the next iteration of this model in the future. The mission stays the same: a very nice bot, able to talk about a variety of topics in a very emotional way, with some kick for programming and the ability to teach some things, and beside all this a good text summarizer, ideally with Polish as an available language. That is the purpose. Did I succeed with this merge? I have to experiment more with the two models below. I like this result and love how it approaches problems; this iteration was worth publishing even though it is not much tested!

<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk1.png">
<br>
<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk2.png">
<br>
<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaTalk3.png">
<br>
An 11K context size did not yield satisfactory results... :P
<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaNuts1.png">
<br>
<img src="https://huggingface.co/altomek/CodeRosa-70B-AB1/resolve/main/CodeRosaNuts2.png">
<br>


### Ingredients

 - [Midnight-Rose-70B-v2.0.3](https://huggingface.co/sophosympatheia/Midnight-Rose-70B-v2.0.3)

 - [CodeLlama-70b-Python-hf](https://huggingface.co/codellama/CodeLlama-70b-Python-hf)
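
The exact merge recipe is not documented in this card; purely for illustration, a merge of these two ingredients could be written as a mergekit config along the lines below. The linear method, equal weights, and dtype are assumptions for the sketch, not the settings actually used.

```yaml
# Illustrative mergekit config only; the actual method and
# parameters used for CodeRosa-70B-AB1 are not documented here.
merge_method: linear
models:
  - model: sophosympatheia/Midnight-Rose-70B-v2.0.3
    parameters:
      weight: 0.5
  - model: codellama/CodeLlama-70b-Python-hf
    parameters:
      weight: 0.5
dtype: float16
```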

### Settings

Settings from Midnight-Rose should work in SillyTavern. They are almost the same as what I use for testing.

I use max_seq_len 8K with alpha_value 2.65.
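
The same settings map onto the ExLlamaV2 Python API directly. A minimal loading sketch, assuming a recent exllamav2 build and a locally downloaded quant (the path is an example; attribute names may differ between exllamav2 versions):

```python
# Minimal ExLlamaV2 sketch with the settings above: 8K context, alpha 2.65.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "CodeRosa-70B-AB1-4bpw-EXL2"  # local path to one of the quants below
config.prepare()
config.max_seq_len = 8192        # 8K context
config.scale_alpha_value = 2.65  # NTK RoPE alpha scaling

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)      # split weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple(
    "Write a Python function that checks whether a string is a palindrome.",
    settings, num_tokens=200))
```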

### Quants

- [6bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-6bpw-EXL2)
- [5bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-5bpw-EXL2)
- [4.5bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-4.5bpw-EXL2)
- [4bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-4bpw-EXL2)
- [3.92bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-3.92bpw-EXL2) --> 40GB VRAM
- [3.5bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-3.5bpw-EXL2)
- [3bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-3bpw-EXL2)
- [2.4bpw](https://huggingface.co/altomek/CodeRosa-70B-AB1-2.4bpw-EXL2) --> 24GB VRAM
- [measurements](https://huggingface.co/altomek/measurements/resolve/main/CodeRosa-AB1_measurement.json) --> ExLlamaV2 measurements
- [GGUF](https://huggingface.co/mradermacher/CodeRosa-70B-AB1-GGUF)
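
To fetch one of the quants for local use, the huggingface_hub Python client works; a short sketch (the repo and target directory are picked as an example):

```python
# Example: download the 2.4bpw EXL2 quant (the 24GB VRAM option) locally.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="altomek/CodeRosa-70B-AB1-2.4bpw-EXL2",
    local_dir="CodeRosa-70B-AB1-2.4bpw-EXL2",
)
```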

### PS
I welcome your comments about this model.