cosimoiaia committed on
Commit ce8dac3 • 1 Parent(s): 3ca3bfe

Update README.md

Files changed (1): README.md +36 -28

README.md CHANGED
@@ -5,44 +5,59 @@ datasets:
  language:
  - it
  pipeline_tag: conversational
  ---

  Model Card for Loquace-7B

- ## Model Details
-
- - Model Name: Loquace-7B
- - Model Version: 1.0
- - Hugging Face Model Hub Link: [Link to the model on the Hugging Face Model Hub]
- - License: CC-BY-NC (Creative Commons Attribution-NonCommercial)

  ## Model Description

- Loquace-7B is a fine-tuned conversational model for the Italian language. It was trained on a dataset of 102,000 question/answer examples in the Alpaca style. The model is based on the Falcon-7B architecture and was fine-tuned using the QLoRA framework.
-
- ## Intended Use
-
- Loquace-7B is designed to facilitate Italian-language conversations. It can be used by developers, researchers, or anyone interested in building conversational systems, chatbots, or dialogue-based applications in Italian.
-
- ## Model Inputs
-
- The model expects input in the form of text strings representing questions or prompts in Italian. Input should follow natural-language conventions; longer inputs may need to be truncated or split into multiple parts to fit the model's maximum sequence length.
-
- ## Model Outputs
-
- The model generates responses as text strings in Italian, providing answers or replies based on the given input. The outputs can be post-processed or presented as-is, depending on the desired application.
-
- ## Training Data
-
- Loquace-7B was trained on a conversational dataset comprising 102,000 question/answer pairs in Italian. The training data was formatted in the Alpaca style, which emphasizes conversational exchanges. The specific sources and characteristics of the training data are not disclosed.
-
- ## Evaluation Data
-
- The model's performance was evaluated using a separate evaluation dataset consisting of human-labeled assessments and metrics tailored to the conversational nature of the model. Specific details of the evaluation data, such as size and sources, are not provided.
-
- ## Ethical Considerations
-
- As with any language model, Loquace-7B may reflect biases present in the training data. Care should be taken when using the model to ensure fair and unbiased interactions. Additionally, since the model is released under the CC-BY-NC license, it should not be used for commercial purposes without proper authorization.

  ## Limitations
@@ -54,12 +69,5 @@ As with any language model, Loquace-7B may reflect biases present in the training data
  - PyTorch
  - Transformers library by Hugging Face
-
- ## Contact Information
-
- For any questions, issues, or inquiries related to Loquace-7B, please contact the developers at [contact email or link].
-
- ## Citation
-
- [If the model is based on or inspired by a research paper, provide the citation here.]
-
 
  language:
  - it
  pipeline_tag: conversational
+ tags:
+ - alpaca
+ - llama
+ - llm
+ - finetune
+ - Italian
+ - qlora
  ---

  Model Card for Loquace-7B

+ # 🇮🇹 Loquace-7B 🇮🇹

+ An exclusively Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹

  ## Model Description

+ Loquace-7B is the first 7B Italian Large Language Model, trained using QLoRA on a large dataset of 102k question/answer pairs written exclusively in Italian.

+ The related code can be found at:
+ https://github.com/cosimoiaia/Loquace

+ Loquace-7B is part of the larger Loquace family:

+ - https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
+ - https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
+ - https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
+ - https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
+ - https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B

+ ## Usage

+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM

+ # 8-bit loading requires the bitsandbytes package.
+ # trust_remote_code=True may additionally be needed for Falcon-based checkpoints.
+ tokenizer = AutoTokenizer.from_pretrained("cosimoiaia/Loquace-7B")
+ model = AutoModelForCausalLM.from_pretrained(
+     "cosimoiaia/Loquace-7B",
+     load_in_8bit=True,
+     device_map="auto",
+ )
+ ```

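+ A minimal generation sketch follows; the Alpaca-style Italian prompt template is an assumption based on the training format described below, not something this card specifies:

+ ```python
+ from transformers import GenerationConfig

+ # Hypothetical Alpaca-style prompt in Italian (assumed template).
+ prompt = (
+     "Di seguito è riportata un'istruzione che descrive un compito. "
+     "Scrivi una risposta che completi adeguatamente la richiesta.\n\n"
+     "### Istruzione:\nQual è la capitale d'Italia?\n\n### Risposta:\n"
+ )
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ config = GenerationConfig(max_new_tokens=128, temperature=0.7, do_sample=True)
+ output = model.generate(**inputs, generation_config=config)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
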
+ ## Training

+ Loquace-7B was trained on a conversational dataset comprising 102k question/answer pairs in Italian.
+ The training data was constructed by combining translations of the original Alpaca dataset with other sources such as the OpenAssistant dataset.
+ The model was trained for only 3000 iterations, which took 18 hours on a single RTX 3090, kindly provided by Genesis Cloud.

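+ For illustration, a QLoRA setup along these lines could look as follows. This is a hypothetical sketch using peft and bitsandbytes; the base checkpoint name and all hyperparameters are assumptions, since the card does not disclose the actual training configuration:

+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

+ # Hypothetical values throughout; the real configuration is not published here.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ base = AutoModelForCausalLM.from_pretrained(
+     "tiiuae/falcon-7b",  # assumed base, per "Based on Falcon-7B" above
+     quantization_config=bnb_config,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+ base = prepare_model_for_kbit_training(base)
+ lora = LoraConfig(
+     r=8,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     target_modules=["query_key_value"],  # Falcon's fused attention projection
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(base, lora)
+ ```
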
  ## Limitations

  - PyTorch
  - Transformers library by Hugging Face
+ - bitsandbytes
+ - QLoRA