fix notebook link
README.md
CHANGED
@@ -43,17 +43,17 @@ From a *practical* point of view, [Failspy](https://huggingface.co/failspy) show
 
 Inspired by Failspy's work, I adapted the approach to the rap use case.
 
-📓 [Notebook: Steer Llama to respond with a rap style](
+📓 [Notebook: Steer Llama to respond with a rap style](steer_llama_to_rap_style.ipynb)
 
 👣 Steps
 1. Load the Llama-3-8B-Instruct model.
 2. Load 1024 examples from Alpaca (instruction dataset).
 3. Prepare a system prompt to make the model act like a rapper.
 4. Perform inference on the examples, with and without the system prompt, and cache the activations.
-
-
-
-
+5. Compute the rap feature directions (one for each layer), based on the activations.
+6. Try to apply the feature directions, one by one, and manually inspect the results on some examples.
+7. Select the best-performing feature direction.
+8. Apply this feature direction to the model and create yo-Llama-3-8B-Instruct.
 
 ## 🚧 Limitations of this approach
 (Maybe a trivial observation)
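
Steps 5 to 8 in the README hunk above can be illustrated with a small, hedged sketch. It is not the notebook's code: it assumes the feature direction for each layer is the normalized difference between the mean activations with and without the system prompt, and that "applying" a direction means adding a scaled copy of it to the hidden states. The README does not state either detail. The function names `feature_directions` and `steer`, the `scale` parameter, and the toy random arrays standing in for cached activations are all hypothetical.

```python
import numpy as np


def feature_directions(acts_with, acts_without):
    """Return one unit-length direction per layer.

    Both inputs have shape (n_layers, n_examples, d_model).
    Assumption: direction = mean(with prompt) - mean(without prompt).
    """
    diff = acts_with.mean(axis=1) - acts_without.mean(axis=1)  # (n_layers, d_model)
    norms = np.linalg.norm(diff, axis=-1, keepdims=True)
    return diff / np.clip(norms, 1e-12, None)  # avoid division by zero


def steer(hidden, direction, scale=1.0):
    """Shift hidden states of shape (..., d_model) along a unit direction.

    Assumption: steering is done by simple addition to the activations.
    """
    return hidden + scale * direction


# Toy data standing in for the cached activations of step 4.
rng = np.random.default_rng(0)
n_layers, n_examples, d_model = 4, 32, 8
acts_without = rng.normal(size=(n_layers, n_examples, d_model))

# Pretend the system prompt shifts every activation by 3.0 along axis 0.
shift = np.zeros(d_model)
shift[0] = 3.0
acts_with = acts_without + shift

# Step 5: one candidate direction per layer.
directions = feature_directions(acts_with, acts_without)

# Steps 6 to 8: pick a layer (here chosen arbitrarily) and apply its direction.
chosen_layer = 2
steered = steer(acts_without[chosen_layer], directions[chosen_layer], scale=3.0)
```

In this toy setup the recovered direction is the axis that was shifted, so steering the unprompted activations reproduces the prompted ones. With a real model, the manual inspection in steps 6 and 7 is what decides which layer's direction to keep.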