---
library_name: transformers
license: apache-2.0
datasets:
- teknium/OpenHermes-2.5
---

# Llama-3-openhermes-reft

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e09e72e43b9464c835735f/QNExOQXrsO4xJnR6Dd8xc.png)

**Llama-3-openhermes-reft** is a fine-tuned version of **[meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)**, trained on a 10K subset of the **[teknium/OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5)** dataset using **Representation Fine-Tuning (ReFT)**. The model was trained for **1 epoch** on **1x A100** using the PyReFT library.

#### What is ReFT?

ReFT methods are drop-in replacements for weight-based PEFTs. Parameter-efficient fine-tuning (PEFT) methods offer an efficient, cheaper alternative to full fine-tuning: they update only a small fraction of the weights, use less memory, and finish training faster.
Current state-of-the-art PEFTs such as LoRA and DoRA modify the model's weights but not its representations. Representation Fine-Tuning (ReFT) instead keeps the base model frozen and learns task-specific interventions on its hidden representations.

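To make this concrete, here is a minimal sketch of how a ReFT intervention is attached with pyreft, following the library's documented pattern; the layer index and rank below are illustrative assumptions, not the exact settings used to train this model.

```python
import torch, transformers, pyreft

model = transformers.AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16, device_map="cuda")

# Attach a low-rank (LoReFT) intervention to one layer's residual-stream
# output; the base model stays frozen and only the intervention is trained.
reft_config = pyreft.ReftConfig(representations={
    "layer": 15, "component": "block_output",  # illustrative choices
    "low_rank_dimension": 4,
    "intervention": pyreft.LoreftIntervention(
        embed_dim=model.config.hidden_size, low_rank_dimension=4)})
reft_model = pyreft.get_reft_model(model, reft_config)
reft_model.print_trainable_parameters()  # a tiny fraction of the 8B weights
```
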
#### PyReFT

PyReFT is a Python library for training and sharing ReFTs.

The library is built on top of pyvene, a library for performing and training activation interventions on arbitrary PyTorch models.
* Codebase: **[PyReFT](https://github.com/stanfordnlp/pyreft)**
* PyPI release: **[pyreft](https://pypi.org/project/pyreft/)**
* Any pretrained LM available on Hugging Face can be fine-tuned with ReFT methods through pyreft, and the finetuned interventions can be uploaded back to Hugging Face; a rough training sketch follows this list.

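Continuing the sketch above, training and saving a ReFT with pyreft looks roughly like this; the toy example pair, hyperparameters, and output directory are placeholders, not the recipe used for this model.

```python
# pip install pyreft
import transformers, pyreft

tokenizer = transformers.AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
tokenizer.pad_token = tokenizer.eos_token

# Build a tiny supervised dataset that intervenes on the last prompt token.
data_module = pyreft.make_last_position_supervised_data_module(
    tokenizer, model,
    ["What is the perimeter of a 25 ft by 15 ft garden?"],  # placeholder prompt
    ["The perimeter is 2 * (25 + 15) = 80 feet."])          # placeholder answer

trainer = pyreft.ReftTrainerForCausalLM(
    model=reft_model, tokenizer=tokenizer,
    args=transformers.TrainingArguments(
        num_train_epochs=1, per_device_train_batch_size=4,
        learning_rate=4e-3, output_dir="./tmp", report_to=[]),
    **data_module)
trainer.train()

# Only the interventions are saved, so the shared artifact stays small.
reft_model.set_device("cpu")
reft_model.save(save_directory="./reft_to_share")
```
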
#### Inference

```python
import torch, transformers, pyreft

device = "cuda"

model_name_or_path = "meta-llama/Meta-Llama-3-8B"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name_or_path, torch_dtype=torch.bfloat16, device_map=device)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name_or_path)

# load the trained interventions on top of the frozen base model
reft_model = pyreft.ReftModel.load(
    "Syed-Hasan-8503/Llama-3-openhermes-reft", model, from_huggingface_hub=True
)
reft_model.set_device("cuda")

instruction = "A rectangular garden has a length of 25 feet and a width of 15 feet. If you want to build a fence around the entire garden, how many feet of fencing will you need?"

prompt_no_input_template = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>%s<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

# tokenize and prepare the input
prompt = prompt_no_input_template % instruction
prompt = tokenizer(prompt, return_tensors="pt").to(device)

# intervene on the last prompt token, matching how the model was trained
base_unit_location = prompt["input_ids"].shape[-1] - 1  # last position
_, reft_response = reft_model.generate(
    prompt, unit_locations={"sources->base": (None, [[[base_unit_location]]])},
    intervene_on_prompt=True, max_new_tokens=512, do_sample=True,
    eos_token_id=tokenizer.eos_token_id, early_stopping=True
)
print(tokenizer.decode(reft_response[0], skip_special_tokens=True))
```