VityaVitalich committed
Commit a603203
1 Parent(s): d2feb75

Update README.md

Files changed (1)
  1. README.md +101 -15
README.md CHANGED
@@ -1,21 +1,107 @@
  ---
- library_name: peft
  ---
- ## Training procedure

- The following `bitsandbytes` quantization config was used during training:
- - quant_method: bitsandbytes
- - load_in_8bit: False
- - load_in_4bit: True
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: fp4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float32
- ### Framework versions

- - PEFT 0.5.0

  ---
+ model-index:
+ - name: TaxoLLaMA-bench
+   results: []
+ license: cc-by-sa-4.0
+ language:
+ - en
+ - es
+ - it
+ base_model: meta-llama/Llama-2-7b-hf
+
  ---
 
+
+ <img src="https://huggingface.co/VityaVitalich/TaxoLLaMA/resolve/main/pipeline_final_final24.drawio-1.png?download=true" alt="TaxoLLaMA banner" width="800" style="margin-left: auto; margin-right: auto; display: block;"/>
+
+ # Model Card for TaxoLLaMA-bench
+
+ TaxoLLaMA-bench is a lightweight fine-tune of the LLaMA2-7b model, aimed at solving multiple Lexical Semantics tasks with a focus on Taxonomy-related tasks, and it achieves SoTA results on multiple benchmarks.
+ It was instruction-tuned on a dataset collected from WordNet 3.0 to generate hypernyms for a given hyponym.
+ The model can also be used to identify hypernymy via perplexity, which is useful for Lexical Entailment or Taxonomy Construction (see the perplexity scoring sketch after the usage example below).
+
+ It was not trained on the data used in the later benchmarks from the paper. This model should be used to replicate results on the Taxonomy test datasets or to boost them. For other tasks we strongly recommend using [TaxoLLaMA](https://huggingface.co/VityaVitalich/TaxoLLaMA).
+
+ For more details, read the paper: [TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks](google.com)
+
+ ## Model description
+
+ - **Finetuned from model:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
+ - **Language(s) (NLP):** Primarily English, but it can easily be extended to other languages; it also achieves SoTA results for Spanish and Italian.
+
+ ### Model Sources
+
+ - **Repository:** [https://github.com/VityaVitalich/TaxoLLaMA](https://github.com/VityaVitalich/TaxoLLaMA)
+ - **Instruction Set:** TBD
+
+ ## Performance
+
+ | Model | Hypernym Discovery (Eng., MRR) | Hypernym Discovery (Span., MRR) | Taxonomy Construction (Environment, F1) | Taxonomy Enrichment (WordNet Verb, MRR) |
+ |-------------|---------------|---------------|---------------|---------------|
+ | **TaxoLLaMA** | **54.39** | **58.61** | **45.13** | **52.4** |
+ | **TaxoLLaMA-bench** | **51.39** | **57.44** | **44.82** | **51.9** |
+ | **Previous SoTA** | **45.22** | **37.56** | **40.00** | **45.2** |
+
+ ## Input Format
+
+ The model is trained to use the following format:
+ ```
+ <s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>
+ hyponym: tiger (large feline of forests in most of Asia having a tawny coat with black stripes)| hypernyms: [/INST]
+ ```
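+
+ The parenthesized text after the hyponym is its definition; the usage example below also queries the model with a bare hyponym. Below is a small helper for assembling this prompt; it is an illustrative sketch rather than part of the original card, and the `build_prompt` name is ours:
+
+ ```python
+ # Illustrative helper for the input format above; not from the original card.
+ def build_prompt(hyponym: str, definition: str = "") -> str:
+     system = ("<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. "
+               "Your answer should not include anything except the words divided by a coma<</SYS>>")
+     # Append the definition in parentheses when one is available, as in the example above
+     term = f"{hyponym} ({definition})" if definition else hyponym
+     return f"{system}\nhyponym: {term}| hypernyms: [/INST]"
+
+ print(build_prompt("tiger", "large feline of forests in most of Asia having a tawny coat with black stripes"))
+ ```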
+ ### Training hyperparameters
+
+ The following hyperparameters were used for instruction tuning (a configuration sketch follows the list):
+ - learning_rate: 3e-04
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-09
+ - lr_scheduler_type: CosineAnnealing
+ - num_epochs: 1.0
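+
+ The previous revision of this card lists a `bitsandbytes` 4-bit (fp4) quantization config used during training and PEFT 0.5.0, which suggests a QLoRA-style setup. Below is a minimal configuration sketch under that assumption; the LoRA rank, alpha, and target modules are illustrative guesses, not the authors' exact values:
+
+ ```python
+ # Hedged sketch of a QLoRA-style setup matching the listed hyperparameters.
+ # LoRA settings (r, alpha, target_modules) are assumptions, not the paper's values.
+ import torch
+ from transformers import LlamaForCausalLM, TrainingArguments
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+
+ model = LlamaForCausalLM.from_pretrained(
+     "meta-llama/Llama-2-7b-hf",
+     load_in_4bit=True,                 # fp4 quantization, as in the previous card revision
+     torch_dtype=torch.bfloat16,
+ )
+ model = prepare_model_for_kbit_training(model)
+ model = get_peft_model(model, LoraConfig(
+     task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05,
+     target_modules=["q_proj", "v_proj"],
+ ))
+
+ training_args = TrainingArguments(
+     output_dir="taxollama-bench",
+     learning_rate=3e-4,
+     per_device_train_batch_size=8,
+     gradient_accumulation_steps=4,     # 8 * 4 = 32 total train batch size
+     num_train_epochs=1.0,
+     lr_scheduler_type="cosine",        # cosine annealing
+     adam_beta1=0.9, adam_beta2=0.98, adam_epsilon=1e-9,
+     bf16=True,
+ )
+ ```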
+
+ ## Usage Example
+
+ ```python
+ import torch
+ from transformers import LlamaForCausalLM, LlamaTokenizer
+ from peft import PeftConfig, PeftModel
+
+ torch.set_default_device('cuda')
+ config = PeftConfig.from_pretrained('VityaVitalich/TaxoLLaMA-bench')
+ # Do not forget your access token for Llama2 models
+ model = LlamaForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, torch_dtype=torch.bfloat16)
+ tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
+ inference_model = PeftModel.from_pretrained(model, 'VityaVitalich/TaxoLLaMA-bench')
+
+ processed_term = 'hyponym: tiger | hypernyms:'
+
+ system_prompt = """<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>"""
+ processed_term = system_prompt + '\n' + processed_term + '[/INST]'
+
+ input_ids = tokenizer(processed_term, return_tensors='pt')
+
+ # These generation hyperparameters are an example; modify them to fit your task
+ gen_conf = {
+     "no_repeat_ngram_size": 3,
+     "do_sample": True,
+     "num_beams": 8,
+     "num_return_sequences": 2,
+     "max_new_tokens": 32,
+     "top_k": 20,
+ }
+
+ out = inference_model.generate(inputs=input_ids['input_ids'].to('cuda'), **gen_conf)
+
+ # Strip the prompt prefix from the decoded output
+ text = tokenizer.batch_decode(out)[0][len(system_prompt):]
+ print(text)
+
+ ```
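+
+ The description above states that hypernymy can also be identified via perplexity. Below is a minimal scoring sketch under that reading; the `pair_perplexity` helper is ours, and the exact scoring used in the paper (for example, whether the loss is restricted to the hypernym tokens) is not specified here, so treat this as illustrative:
+
+ ```python
+ import torch
+ from transformers import LlamaForCausalLM, LlamaTokenizer
+ from peft import PeftConfig, PeftModel
+
+ # Load the adapter as in the usage example above
+ config = PeftConfig.from_pretrained('VityaVitalich/TaxoLLaMA-bench')
+ model = LlamaForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, torch_dtype=torch.bfloat16)
+ tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
+ inference_model = PeftModel.from_pretrained(model, 'VityaVitalich/TaxoLLaMA-bench')
+
+ system_prompt = """<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>"""
+
+ def pair_perplexity(hyponym: str, candidate_hypernym: str) -> float:
+     # Lower perplexity of the completed prompt = more plausible hypernym
+     prompt = system_prompt + '\n' + f'hyponym: {hyponym} | hypernyms: [/INST] {candidate_hypernym}'
+     ids = tokenizer(prompt, return_tensors='pt')['input_ids'].to('cuda')
+     with torch.no_grad():
+         loss = inference_model(input_ids=ids, labels=ids).loss
+     return torch.exp(loss).item()
+
+ # Rank candidate hypernyms for "tiger"; the lowest score is the most plausible
+ for candidate in ['feline', 'vehicle']:
+     print(candidate, pair_perplexity('tiger', candidate))
+ ```
+
+ Scores like this can be used to rank candidate edges for Lexical Entailment or Taxonomy Construction, as mentioned in the model description.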
+
+ ## Citation
+
+ If you find TaxoLLaMA useful in your work, please cite it with:
+
+ ```
+ TBD
+ ```