## Introduction: LLMs as IRCs

What does it take to chat with a base LLM?

Several papers (e.g., [URIAL](https://arxiv.org/abs/2312.01552)) have shown that base models can be used more reliably than expected. At the same time, we increasingly find that RLHF and other post-training approaches may [limit](https://x.com/aidan_mclau/status/1860026205547954474) the creativity of LLMs.

LLMs can be more than smart assistants. In fact, they have the potential to emulate all sorts of behaviours and patterns found in their pre-training datasets (usually a large chunk of the internet).
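The IRC framing can be sketched in a few lines: format the dialogue as a chat log and let a base model continue it by pattern completion alone. The `irc_prompt` helper, speaker names, and log format below are hypothetical illustrations of the idea, not relay's actual template.

```python
# A minimal sketch of "LLMs as IRCs": render the conversation as an
# IRC-style chat log so a base (non-instruct) model can continue it by
# pattern completion alone. Names and format here are illustrative only.

def irc_prompt(messages: list[tuple[str, str]], bot: str = "relay",
               channel: str = "#chat") -> str:
    """Render (speaker, text) turns as an IRC log ending on the bot's turn."""
    lines = [f"*** Welcome to {channel} ***"]
    for speaker, text in messages:
        lines.append(f"<{speaker}> {text}")
    lines.append(f"<{bot}> ")  # a base model would complete from here
    return "\n".join(lines)

prompt = irc_prompt([("dan", "what's your favorite holiday?")])
print(prompt)
```

Because the log is just another pattern from pre-training data, no instruction tuning is required for the model to produce a plausible next turn.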
You should see something similar to this demo:

<a href="https://asciinema.org/a/MrPFq2mgIRPruKygYehCbeqwc" target="_blank"><img src="https://asciinema.org/a/MrPFq2mgIRPruKygYehCbeqwc.svg" /></a>

Alternatively, if you do not have a CUDA GPU (e.g., on a Mac), you can use the [GGUF versions](https://huggingface.co/danlou/relay-v0.1-Mistral-Nemo-2407-GGUF) through LM Studio (some functionality will be missing; see the GGUF model page).

With [relaylm.py](https://github.com/danlou/relay/blob/main/relaylm.py), you can also use the model declaratively, outside of an interactive chat session:
```python
model_info = suggest_relay_model()
relay = RelayLM(**model_info)

print(favorite_holiday(relay, 'Portugal'))  # I love Christmas! It is a time for family and friends to come ...
print(favorite_holiday(relay, 'China'))  # My favorite holiday is Chinese New Year because it means family ...
```

More examples are available in the [project's GitHub](https://github.com/danlou/relay).