VityaVitalich committed on
Commit 8118fb7
1 Parent(s): 5e9c0ba

Update README.md

Files changed (1):
  README.md +91 -15
README.md CHANGED
@@ -1,21 +1,97 @@
  ---
- library_name: peft
  ---
- ## Training procedure
-
- The following `bitsandbytes` quantization config was used during training:
- - quant_method: bitsandbytes
- - load_in_8bit: False
- - load_in_4bit: True
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: fp4
- - bnb_4bit_use_double_quant: False
- - bnb_4bit_compute_dtype: float32
- ### Framework versions
-
- - PEFT 0.5.0
  ---
+ license: cc-by-sa-4.0
+ language:
+ - en
+ - es
+ - it
  ---
+
+ # Model Card for TaxoLLaMA
+
+ TaxoLLaMA is a lightweight fine-tune of the LLaMA2-7b model aimed at solving multiple lexical semantics tasks, with a focus on taxonomy-related tasks, and it achieves SoTA results on multiple benchmarks.
+ It was instruction-tuned on a dataset collected from WordNet 3.0 to generate hypernyms for a given hyponym.
+ The model can also be used to identify hypernymy via perplexity scoring, which is useful for Lexical Entailment or Taxonomy Construction (a scoring sketch is given after the Usage Example below).
+
+ For more details, read the paper: [TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks](google.com)
+
+ ## Model description
+
+ - **Finetuned from model:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
+ - **Language(s) (NLP):** Primarily English, but easily extended to other languages; the model also achieves SoTA for Spanish and Italian.
+
+ ### Model Sources
+
+ - **Repository:** [https://github.com/VityaVitalich/TaxoLLaMA](https://github.com/VityaVitalich/TaxoLLaMA)
+ - **Instruction Set:** TBD
+
+ ## Performance
+
+ | Model | Hypernym Discovery (Eng., MRR) | Hypernym Discovery (Span., MRR) | Taxonomy Construction (Environment, F1) | Taxonomy Enrichment (WordNet Verb, MRR) |
+ |---|---|---|---|---|
+ | **TaxoLLaMA** | **54.39** | **58.61** | **45.13** | **52.4** |
+ | **TaxoLLaMA-bench** | **51.39** | **57.44** | **44.82** | **51.9** |
+ | **Previous SoTA** | **45.22** | **37.56** | **40.00** | **45.2** |
+
+ ## Input Format
+
+ The model is trained to use the following format (the system prompt is reproduced verbatim from training, including its spelling):
+ ```
+ <s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>
+ hyponym: tiger (large feline of forests in most of Asia having a tawny coat with black stripes)| hypernyms: [/INST]
+ ```
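+
+ A small helper can build this prompt for arbitrary terms; the sketch below is illustrative (the `make_prompt` name and the handling of the optional parenthesized definition are assumptions, not part of the released code):
+
+ ```python
+ def make_prompt(hyponym, definition=None):
+     """Wrap a hyponym, optionally with its definition, in the trained prompt format."""
+     # System prompt reproduced verbatim from training, including its spelling.
+     system = ("<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words "
+               "divided with a coma. Your answer should not include anything except the words "
+               "divided by a coma<</SYS>>")
+     term = f"{hyponym} ({definition})" if definition else hyponym
+     return f"{system}\nhyponym: {term}| hypernyms: [/INST]"
+
+ print(make_prompt("tiger", "large feline of forests in most of Asia having a tawny coat with black stripes"))
+ ```
+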
+ ### Training hyperparameters
+
+ The following hyperparameters were used for instruction tuning:
+ - learning_rate: 3e-04
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-09
+ - lr_scheduler_type: CosineAnnealing
+ - num_epochs: 1.0
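+
+ The card does not specify the exact training code; the sketch below shows one way these settings could map onto `transformers.TrainingArguments`, assuming the Hugging Face `Trainer` and an illustrative batch split (8 per device × 4 accumulation steps = 32). All names here are assumptions, not the released training setup.
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Hypothetical configuration mirroring the listed hyperparameters.
+ training_args = TrainingArguments(
+     output_dir="taxollama-sft",     # illustrative output path
+     learning_rate=3e-4,
+     per_device_train_batch_size=8,  # assumption: 8 per device x 4 accumulation steps = 32 total
+     gradient_accumulation_steps=4,
+     adam_beta1=0.9,
+     adam_beta2=0.98,
+     adam_epsilon=1e-9,
+     lr_scheduler_type="cosine",     # cosine annealing
+     num_train_epochs=1.0,
+ )
+ ```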
+
+ ## Usage Example
+
+ ```python
+ import torch
+ from transformers import LlamaForCausalLM, LlamaTokenizer
+ from peft import PeftConfig, PeftModel
+
+ torch.set_default_device('cuda')
+ config = PeftConfig.from_pretrained('VityaVitalich/TaxoLLaMA')
+ # Do not forget your access token for Llama2 models
+ model = LlamaForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, torch_dtype=torch.bfloat16)
+ tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
+ inference_model = PeftModel.from_pretrained(model, 'VityaVitalich/TaxoLLaMA')
+
+ processed_term = 'hyponym: tiger | hypernyms:'
+
+ system_prompt = """<s>[INST] <<SYS>> You are a helpfull assistant. List all the possible words divided with a coma. Your answer should not include anything except the words divided by a coma<</SYS>>"""
+ processed_term = system_prompt + '\n' + processed_term + '[/INST]'
+
+ input_ids = tokenizer(processed_term, return_tensors='pt')
+
+ # This is an example of generation hyperparameters; they can be modified to fit your task
+ gen_conf = {
+     "no_repeat_ngram_size": 3,
+     "do_sample": True,
+     "num_beams": 8,
+     "num_return_sequences": 2,
+     "max_new_tokens": 32,
+     "top_k": 20,
+ }
+
+ out = inference_model.generate(inputs=input_ids['input_ids'].to('cuda'), **gen_conf)
+
+ # Strip (approximately) the prompt from the decoded output
+ text = tokenizer.batch_decode(out)[0][len(system_prompt):]
+ print(text)
+ ```
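+
+ As noted in the introduction, hypernymy can also be identified with perplexity. The sketch below is illustrative (the `candidate_nll` helper is an assumption, not the paper's exact scoring code); it reuses `inference_model`, `tokenizer`, and `system_prompt` from the example above and scores how plausible each candidate hypernym is as a continuation of the prompt, lower being better.
+
+ ```python
+ import math
+
+ def candidate_nll(model, tokenizer, prompt, candidate):
+     """Mean negative log-likelihood of `candidate` as a continuation of `prompt`."""
+     prompt_len = tokenizer(prompt, return_tensors='pt')['input_ids'].shape[1]
+     full = tokenizer(prompt + ' ' + candidate, return_tensors='pt')['input_ids'].to('cuda')
+     labels = full.clone()
+     labels[:, :prompt_len] = -100  # mask prompt tokens so only the candidate is scored
+     with torch.no_grad():
+         loss = model(input_ids=full, labels=labels).loss
+     return loss.item()
+
+ # Rank candidate hypernyms for 'tiger'; exp(NLL) is the perplexity.
+ prompt = system_prompt + '\n' + 'hyponym: tiger | hypernyms:' + '[/INST]'
+ for cand in ['feline', 'vehicle']:
+     nll = candidate_nll(inference_model, tokenizer, prompt, cand)
+     print(cand, round(nll, 3), round(math.exp(nll), 1))
+ ```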
+
+ ## Citation
+
+ If you find TaxoLLaMA useful in your work, please cite it with:
+
+ ```
+ TBD
+ ```