---
license: cc-by-sa-4.0
language:
- en
- es
- it
---

# Model Card for TaxoLLaMA

TaxoLLaMA is a lightweight fine-tune of the LLaMA2-7b model, aimed at solving multiple lexical semantics tasks with a focus on taxonomy-related tasks, and it achieves SoTA results on multiple benchmarks.
It was instruction-tuned on a dataset collected from WordNet 3.0 to generate hypernyms for a given hyponym.
The model can also identify hypernymy via perplexity scoring, which is useful for Lexical Entailment and Taxonomy Construction (a minimal sketch of such scoring follows the usage example below).

For more details, read the paper: [TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks](google.com)

## Model description

- **Finetuned from model:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
- **Language(s) (NLP):** Primarily English, but easily extensible to other languages; the model also achieves SoTA for Spanish and Italian.

### Model Sources

- **Repository:** [https://github.com/VityaVitalich/TaxoLLaMA](https://github.com/VityaVitalich/TaxoLLaMA)
- **Instruction Set:** TBD

## Performance

| Model | Hypernym Discovery (Eng., MRR) | Hypernym Discovery (Span., MRR) | Taxonomy Construction (Environment, F1) | Taxonomy Enrichment (WordNet Verb, MRR) |
|---|---|---|---|---|
| **TaxoLLaMA** | **54.39** | **58.61** | **45.13** | **52.4** |
| TaxoLLaMA-bench | 51.39 | 57.44 | 44.82 | 51.9 |
| Previous SoTA | 45.22 | 37.56 | 40.00 | 45.2 |

## Input Format

The model is trained to use the following format:
```
<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>
hyponym: tiger (large feline of forests in most of Asia having a tawny coat with black stripes)| hypernyms: [/INST]
```
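
Since the model was instruction-tuned on this exact template, it is safest to reproduce the system prompt verbatim, typos included. Below is a minimal, hypothetical helper for assembling the prompt; the `make_prompt` name and the optional-definition handling are illustrative assumptions, not part of the original card.

```python
from typing import Optional

# System prompt kept verbatim (including "helpfull" and "coma"),
# since the model was tuned on exactly this string.
SYSTEM_PROMPT = (
    "<s>[INST] <<SYS>> You are a helpfull assistant. "
    "List all the possible words divided with a coma. "
    "Your answer should not include anything except the words "
    "divided by a coma<</SYS>>"
)

def make_prompt(hyponym: str, definition: Optional[str] = None) -> str:
    # The WordNet-style definition in parentheses is optional;
    # the usage example below omits it.
    term = f"{hyponym} ({definition})" if definition else hyponym
    return f"{SYSTEM_PROMPT}\nhyponym: {term} | hypernyms: [/INST]"
```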

### Training hyperparameters

The following hyperparameters were used for instruction tuning:
- learning_rate: 3e-04
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.98) and epsilon=1e-09
- lr_scheduler_type: CosineAnnealing
- num_epochs: 1.0
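
As a rough illustration only, these values could map onto Hugging Face `TrainingArguments` as sketched below. The split of the total batch size of 32 into per-device batch and gradient accumulation is an assumption, and the authors' actual training script lives in the linked repository.

```python
# A minimal sketch, assuming the HF Trainer API; not the authors' script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="taxollama-sft",        # hypothetical output path
    learning_rate=3e-4,
    per_device_train_batch_size=8,     # 8 * 4 accumulation = 32 total (assumption)
    gradient_accumulation_steps=4,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",        # cosine annealing
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-9,
)
```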

## Usage Example

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftConfig, PeftModel

torch.set_default_device('cuda')

config = PeftConfig.from_pretrained('VityaVitalich/TaxoLLaMA')
# Do not forget your token for Llama2 models
model = LlamaForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, torch_dtype=torch.bfloat16)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
inference_model = PeftModel.from_pretrained(model, 'VityaVitalich/TaxoLLaMA')

processed_term = 'hyponym: tiger | hypernyms:'

# The system prompt must match the training format exactly (typos included)
system_prompt = """<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>"""
processed_term = system_prompt + '\n' + processed_term + '[/INST]'

input_ids = tokenizer(processed_term, return_tensors='pt')

# This is an example of generation hyperparameters; they can be modified to fit your task
gen_conf = {
    "no_repeat_ngram_size": 3,
    "do_sample": True,
    "num_beams": 8,
    "num_return_sequences": 2,
    "max_new_tokens": 32,
    "top_k": 20,
}

out = inference_model.generate(inputs=input_ids['input_ids'].to('cuda'), **gen_conf)

# Strip the prompt prefix from the decoded output
text = tokenizer.batch_decode(out)[0][len(system_prompt):]
print(text)
```
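
The introduction mentions hypernymy identification via perplexity. Below is a minimal, hypothetical sketch of such scoring, reusing `inference_model`, `tokenizer`, and `system_prompt` from the example above; the exact scoring in the paper (e.g., whether prompt tokens are masked out of the loss) may differ.

```python
# A minimal sketch, not the authors' exact procedure: rank a candidate
# hypernym by the perplexity the model assigns to the completed prompt.
# For simplicity the loss is averaged over all tokens, prompt included;
# masking prompt tokens out of the labels would be more precise.
def hypernymy_score(hyponym: str, candidate: str) -> float:
    prompt = system_prompt + '\n' + f'hyponym: {hyponym} | hypernyms: [/INST] {candidate}'
    ids = tokenizer(prompt, return_tensors='pt')['input_ids'].to('cuda')
    with torch.no_grad():
        loss = inference_model(input_ids=ids, labels=ids).loss
    return torch.exp(loss).item()  # lower = more plausible hypernym

# A lower score for 'feline' than for 'furniture' would indicate the model
# treats 'feline' as the more plausible hypernym of 'tiger'.
print(hypernymy_score('tiger', 'feline'), hypernymy_score('tiger', 'furniture'))
```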

## Citation

If you find TaxoLLaMA useful in your work, please cite it with:

```
TBD
```