---
license: cc-by-nc-2.0
datasets:
- cosimoiaia/Loquace-102k
language:
- it
pipeline_tag: conversational
tags:
- alpaca
- llama
- llm
- finetune
- Italian
- qlora
---

Model Card for Loquace-20B

# 🇮🇹 Loquace-20B 🇮🇹

An exclusively Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹

The Loquace Italian LLM models were created as a proof of concept to evaluate how language tuning can be achieved with QLoRA, by instruction-tuning foundational LLMs on a dataset in a specific language.

The QLoRA fine-tuning method (https://github.com/artidoro/qlora) significantly lowers the resource requirements compared to other available methods, which makes it easy to run the process on significantly larger datasets while still using consumer GPUs and achieving high accuracy.
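
To give a concrete sense of why the memory footprint stays low, here is a minimal sketch of a 4-bit QLoRA setup with `bitsandbytes` and `peft`. The LoRA rank, target modules and other hyperparameters below are illustrative assumptions, not the exact Loquace configuration; the actual training scripts live in the repository linked in the next section.

```python
# Minimal QLoRA-style setup: a 4-bit quantized base model plus small LoRA adapters.
# The rank, alpha and target modules are illustrative, not the Loquace settings.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # keep the frozen base weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # run the forward pass in bf16
)

base = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",              # the base model Loquace-20B starts from
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],     # attention projections in gpt-neox
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()          # only the adapter weights are trainable
```

Only the small LoRA adapter weights are updated during training while the 20B base stays frozen in 4-bit, which is what brings the fine-tuning within reach of consumer GPUs.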

## Model Description

Loquace-20B is the first 20B Italian Large Language Model, trained using QLoRA on a large dataset of 102k question/answer pairs exclusively in Italian.

The related code can be found at:
https://github.com/cosimoiaia/Loquace

Loquace-20B is part of the larger Loquace family:

- https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
- https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
- https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
- https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
- https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B

## Usage

```python
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig
)

tokenizer = AutoTokenizer.from_pretrained("cosimoiaia/Loquace-20B", padding_side="right", use_fast=True)

# Load the model with 4-bit quantization (load_in_8bit and a 4-bit
# quantization_config are mutually exclusive, so only the config is passed).
model = AutoModelForCausalLM.from_pretrained(
    "cosimoiaia/Loquace-20B",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        llm_int8_has_fp16_weight=False
    )
)
```
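
Once the model is loaded, inference works as with any causal LM in Transformers. Below is a minimal generation sketch; the Alpaca-style Italian prompt template is an assumption based on the training data described in the next section, so check the Loquace repository for the exact template.

```python
# Illustrative inference sketch; the prompt template is an assumption
# (Alpaca-style instruction format), not a documented Loquace template.
prompt = (
    "Di seguito è riportata un'istruzione. Scrivi una risposta appropriata.\n\n"
    "### Istruzione:\nQual è la capitale dell'Italia?\n\n"  # "What is the capital of Italy?"
    "### Risposta:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```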

## Training

Loquace-20B was trained on a conversational dataset comprising 102k question/answer pairs in Italian.
The training data was constructed by putting together translations of the original Alpaca dataset and other sources such as the OpenAssistant dataset.
The model was trained for only 3000 iterations and took 18 hours on 4 RTX 3090s, kindly provided by Genesis Cloud (https://gnsiscld.co/26qhlf).
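
The underlying data is published as the Loquace-102k dataset referenced in the metadata above. A quick way to inspect it with the `datasets` library (assuming a standard `train` split) is:

```python
# Inspect the question/answer pairs; the split name is an assumption,
# and column names are printed rather than hard-coded.
from datasets import load_dataset

ds = load_dataset("cosimoiaia/Loquace-102k", split="train")
print(ds)      # row count and column names
print(ds[0])   # first question/answer pair
```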

## Limitations

- Loquace-20B may not handle complex or nuanced queries well and may struggle with ambiguous or poorly formatted inputs.
- The model may generate responses that are factually incorrect or nonsensical. It should be used with caution, and outputs should be carefully verified.
- The training data primarily consists of conversational examples and may not generalize well to other types of tasks or domains.

## Dependencies

- PyTorch
- Transformers library by Hugging Face
- bitsandbytes
- QLoRA