Update README.md
Browse files
README.md
CHANGED
@@ -21,6 +21,9 @@ widget:
|
|
21 |
|
22 |
[My Jupyter "cookbook" to replicate the methodology can be found here, refined library coming soon](https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated/blob/main/ortho_cookbook.ipynb)
|
23 |
|
|
|
|
|
|
|
24 |
## Summary
|
25 |
|
26 |
This is [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) with orthogonalized bfloat16 safetensor weights, generated with a refined methodology based on that which was described in the preview paper/blog post: '[Refusal in LLMs is mediated by a single direction](https://www.alignmentforum.org/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction)' which I encourage you to read to understand more.
|
|
|
21 |
|
22 |
[My Jupyter "cookbook" to replicate the methodology can be found here, refined library coming soon](https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated/blob/main/ortho_cookbook.ipynb)
|
23 |
|
24 |
+
## What's this?
|
25 |
+
Well, after my abliterated models, I figured I should cover all the possible ground of such work and introduce a model that acts like the polar opposite of them. This is the result of that, and I feel it lines it up in performance to a certain search engine's AI model series.
|
26 |
+
|
27 |
## Summary
|
28 |
|
29 |
This is [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) with orthogonalized bfloat16 safetensor weights, generated with a refined methodology based on that which was described in the preview paper/blog post: '[Refusal in LLMs is mediated by a single direction](https://www.alignmentforum.org/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction)' which I encourage you to read to understand more.
|