---
license: cc-by-nc-2.0
datasets:
- cosimoiaia/Loquace-102k
language:
- it
pipeline_tag: conversational
tags:
- alpaca
- llama
- llm
- finetune
- Italian
- qlora
---

Model Card for Loquace-20B

# 🇮🇹 Loquace-20B 🇮🇹

An exclusively Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹

The Loquace Italian LLM models were created as a proof of concept to evaluate how language tuning can be achieved with QLoRA, by instruction-tuning foundational LLMs on a dataset in a specific language.

The QLoRA fine-tuning method (https://github.com/artidoro/qlora) significantly lowers the resource requirements compared to other available methods, which makes it easy to run the process on significantly larger datasets while still using consumer GPUs and achieving high accuracy.
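
To give a concrete sense of why the memory footprint stays low, here is a minimal sketch of a 4-bit QLoRA setup with `bitsandbytes` and `peft`. The LoRA rank, target modules and other hyperparameters below are illustrative assumptions, not the exact Loquace configuration; the actual training scripts live in the repository linked in the next section.

```python
# Minimal QLoRA-style setup: a 4-bit quantized base model plus small LoRA adapters.
# The rank, alpha and target modules are illustrative, not the Loquace settings.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # keep the frozen base weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # run the forward pass in bf16
)

base = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",              # the base model Loquace-20B starts from
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],     # attention projections in gpt-neox
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()          # only the adapter weights are trainable
```

Only the small LoRA adapter weights are updated during training while the 20B base stays frozen in 4-bit, which is what brings the fine-tuning within reach of consumer GPUs.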

## Model Description

Loquace-20B is the first 20B Italian Large Language Model, trained using QLoRA on a large dataset of 102k question/answer pairs exclusively in Italian.

The related code can be found at:
https://github.com/cosimoiaia/Loquace

Loquace-20B is part of the larger Loquace family:

- https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
- https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
- https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
- https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
- https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B

## Usage

```python
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig
)

tokenizer = AutoTokenizer.from_pretrained("cosimoiaia/Loquace-20B", padding_side="right", use_fast=True)

# Load the model with 4-bit quantization (load_in_8bit and a 4-bit
# quantization_config are mutually exclusive, so only the config is passed).
model = AutoModelForCausalLM.from_pretrained(
    "cosimoiaia/Loquace-20B",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        llm_int8_has_fp16_weight=False
    )
)
```
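
Once the model is loaded, inference works as with any causal LM in Transformers. Below is a minimal generation sketch; the Alpaca-style Italian prompt template is an assumption based on the training data described in the next section, so check the Loquace repository for the exact template.

```python
# Illustrative inference sketch; the prompt template is an assumption
# (Alpaca-style instruction format), not a documented Loquace template.
prompt = (
    "Di seguito è riportata un'istruzione. Scrivi una risposta appropriata.\n\n"
    "### Istruzione:\nQual è la capitale dell'Italia?\n\n"  # "What is the capital of Italy?"
    "### Risposta:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```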

## Training

Loquace-20B was trained on a conversational dataset comprising 102k question/answer pairs in Italian.
The training data was constructed by putting together translations of the original Alpaca dataset and other sources such as the OpenAssistant dataset.
The model was trained for only 3000 iterations and took 18 hours on 4 RTX 3090s, kindly provided by Genesis Cloud (https://gnsiscld.co/26qhlf).
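
The underlying data is published as the Loquace-102k dataset referenced in the metadata above. A quick way to inspect it with the `datasets` library (assuming a standard `train` split) is:

```python
# Inspect the question/answer pairs; the split name is an assumption,
# and column names are printed rather than hard-coded.
from datasets import load_dataset

ds = load_dataset("cosimoiaia/Loquace-102k", split="train")
print(ds)      # row count and column names
print(ds[0])   # first question/answer pair
```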

## Limitations

- Loquace-20B may not handle complex or nuanced queries well and may struggle with ambiguous or poorly formatted inputs.
- The model may generate responses that are factually incorrect or nonsensical. It should be used with caution, and outputs should be carefully verified.
- The training data primarily consists of conversational examples and may not generalize well to other types of tasks or domains.

## Dependencies

- PyTorch
- Transformers library by Hugging Face
- bitsandbytes
- QLoRA