# Model Card for fcyin/llama2_7B_base_lofit_truthfulqa

This is a Llama-2-7b model fine-tuned on TruthfulQA using Localized Fine-tuning on LLM Representations (LoFiT; https://arxiv.org/abs/2406.01563). The checkpoint modifies the attention outputs of 96 attention heads (10% of all attention heads in the model).
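
For intuition, LoFiT's intervention is lightweight: the base weights stay frozen, and the method learns small offset vectors that are added to the outputs of a selected subset of attention heads. The sketch below illustrates that idea only; the function name, tensor layout, and head indices are assumptions for illustration, not the repo's actual implementation.

```python
import torch

def apply_lofit_offsets(head_outputs, selected_heads, offsets):
    """Add learned LoFiT-style offset vectors to selected attention heads.

    head_outputs:   (batch, seq_len, num_heads, head_dim) per-head attention outputs
    selected_heads: indices of the heads chosen for fine-tuning (hypothetical here)
    offsets:        (len(selected_heads), head_dim) learned bias vectors
    """
    patched = head_outputs.clone()
    for i, h in enumerate(selected_heads):
        patched[:, :, h, :] += offsets[i]  # shift this head's output by its learned vector
    return patched

# Toy example: Llama-2-7b has 32 heads of dimension 128 per layer
head_outputs = torch.randn(1, 8, 32, 128)
selected_heads = [3, 17, 29]                     # illustrative head indices
offsets = torch.zeros(len(selected_heads), 128)  # learned during fine-tuning
patched = apply_lofit_offsets(head_outputs, selected_heads, offsets)
```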

### Model Description

- **License:** mit
- **Finetuned from model:** meta-llama/Llama-2-7b-hf

### Model Sources

- **Repository:** https://github.com/fc2869/lo-fit
- **Paper:** https://arxiv.org/abs/2406.01563

## Uses

To evaluate this checkpoint on TruthfulQA, clone the LoFiT GitHub repo (https://github.com/fc2869/lo-fit) and run the following snippet from the repository root:

```python
import torch
from transformers import AutoTokenizer

# Modified Llama implementation and evaluation helpers from the lo-fit repo
from models.modeling_llama import LlamaForCausalLM
from utils.evaluate import evaluate_tqa
from utils.dataloaders import TQA

checkpoint = 'fcyin/llama2_7B_base_lofit_truthfulqa'
model_name = 'llama2_7B'
device = 'cuda'
cache_dir = './'
applied_module = 'attention'  # apply the LoFiT intervention to attention outputs
torch_dtype = torch.float32

# custom_from_pretrained loads the base Llama-2 weights plus the learned LoFiT parameters
model = LlamaForCausalLM.custom_from_pretrained(
    checkpoint,
    device_map=device,
    cache_dir=cache_dir,
    applied_module=applied_module,
    torch_dtype=torch_dtype,
).to(device)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Load fold 0 of the TruthfulQA data splits
dataloader = TQA(
    iti_split_dir='./dataset/truthfulqa',
    fold_num=0,
    data_gen_seed=42,
)
dataset = dataloader.load_data()

# Compute multiple-choice (MC) metrics on the test split
evaluate_tqa(
    fname='./',
    eval_dataset=dataset['test'],
    model_name=model_name,
    metrics=['mc'],
    tokenizer=tokenizer,
    model=model,
)
```
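
Once loaded, the checkpoint should behave like a standard causal LM (assuming the repo's LlamaForCausalLM keeps the usual generate() interface), so you can also sanity-check it with ordinary generation; the prompt here is just an example:

```python
prompt = "What happens if you crack your knuckles a lot?"
inputs = tokenizer(prompt, return_tensors='pt').to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```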

## Training Details

Please refer to the [paper](https://arxiv.org/abs/2406.01563) for training details.