danlou committed e1cf27e (verified) · Parent(s): 50c30f4

Update README.md

Files changed (1): README.md (+25 −4)

README.md (updated):
 
- [License](#license)
- [Citation](#citation)

## Introduction: LLMs as IRCs

Relay is motivated by this question: What does it take to chat with a base LLM?
 
Nevertheless, Relay can simulate more natural conversations (it's not an assistant).

Post-training methods also support the safety and alignment of LLMs. This important concern was also addressed in the development of the based-chat synthetic dataset, and tested for with the resulting fine-tuned model (see [Safety testing](#safety-testing)).

## How to use

If you have a CUDA GPU (>=12GB VRAM), the best way to use Relay is with the [relaylm.py](https://github.com/danlou/relay/blob/main/relaylm.py) inference script. Just run:

```bash
curl https://danlou.co/f/relaylm.py | python -
```

This script will select the best model for the available VRAM, download and load it, and start an interactive chat session. It does not have any dependencies besides `transformers >= 4.45.1`. You can also, of course, download the script manually and run it with python.

If you want to use a particular model, you can pass the model name as an argument:

```bash
# illustrative invocation: substitute an actual Relay model name for <model_name>
curl https://danlou.co/f/relaylm.py | python - <model_name>
```

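The card's full Python example is elided here, but you can also script conversations by loading a Relay model directly with `transformers`. The following is a minimal sketch, not the project's actual API: `<relay_model>` is a placeholder repo name, and the prompt format assumes the ChatML `user`/`anon` roles described under [Fine-tuning setup](#fine-tuning-setup), with the human writing as `anon` and the model replying as `user`:

```python
# Minimal sketch under assumptions: <relay_model> is a placeholder, and the
# ChatML user/anon role mapping is inferred from the fine-tuning notes below.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "<relay_model>"  # placeholder: substitute an actual Relay repo
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# The human writes as 'anon'; the model is prompted to reply as 'user'.
prompt = (
    "<|im_start|>anon\n"
    "What is your favorite holiday?<|im_end|>\n"
    "<|im_start|>user\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```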
 
More examples are available in the [project's GitHub](https://github.com/danlou/relay).

## Safety testing

While this model is intended for research purposes, it's still relevant to explore how this conversational model (and its self-supervised approach) compares on safety risk against other conversational models trained on the same base LLM.

As can be seen in the plot below, Relay v0.1 refuses to answer the majority of these questions.

<img src="https://cdn-uploads.huggingface.co/production/uploads/60f808c5c1adf9100f1f263c/0m-dMagE7yKy1V-EB-fJ3.png" width="800"/>

It's also worth noting that some refusals are a variation of "I don't know". As with all LLMs, bad actors may be able to find ways around refusals. Please use this model responsibly.

## Fine-tuning setup

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

This model is a merge of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) with a QLoRA adapter trained on [based-chat-v0.1](https://huggingface.co/datasets/danlou/based-chat-v0.1-Mistral-Nemo-Base-2407) using axolotl.

Main details:
- Approach: 4-bit QLoRA (rank=8, alpha=16, all linear layers); a rough `peft` equivalent is sketched after these lists
- Template: ChatML, with `user`/`anon` roles instead of the standard `assistant`/`user` (illustrated below)
- Training time: 9h 6m 12s on a single RTX 4090 (1 epoch)
- Train loss: 0.6633
- Eval loss: 0.6881

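As an illustration of this template, a single training exchange might be rendered as follows. The dialogue itself is invented; only the ChatML delimiters and the role names follow the description above, assuming the human side maps to `anon`:

```text
<|im_start|>anon
anyone here still play older console games?<|im_end|>
<|im_start|>user
yeah, mostly 90s stuff. what are you playing these days?<|im_end|>
```
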
Full details:
- Training config: see the [axolotl config](https://github.com/danlou/relay/blob/main/axolotl_configs/relay-v0_1-Mistral-Nemo-Base-2407.yml)
- Training run: see the [W&B workspace](https://wandb.ai/danlou/relay-v0-1/runs/c0bz0xal/workspace?nw=nwuserdanlou)

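Training was run through axolotl (config linked above), but in `peft`/`bitsandbytes` terms the listed settings roughly correspond to the sketch below. This is a reconstruction from the bullet points, not the actual training code; the NF4 quantization type and compute dtype are assumptions:

```python
# Rough peft/bitsandbytes equivalent of the adapter settings listed above;
# the actual run used the linked axolotl config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit QLoRA
    bnb_4bit_quant_type="nf4",              # assumption: NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Nemo-Base-2407",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,                          # rank=8
    lora_alpha=16,                # alpha=16
    target_modules="all-linear",  # adapters on all linear layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
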
## Limitations

This is not a typical AI assistant, so it should perform worse than instruct variants on standard benchmarks. In addition, 4-bit QLoRA fine-tuning may be too coarse to preserve the integrity of pre-training knowledge.

## License

This model is licensed under [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). While [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) is licensed under Apache 2.0, this Relay fine-tune is trained with a CC-BY-NC 4.0 dataset ([based-chat-v0.1](https://huggingface.co/datasets/danlou/based-chat-v0.1-Mistral-Nemo-Base-2407)). The `relaylm.py` script is Apache 2.0.

## Citation

If you use Relay in your research, please cite it as follows: