cosimoiaia committed on
Commit ce8dac3 • 1 Parent(s): 3ca3bfe

Update README.md

Files changed (1): README.md +36 -28

README.md CHANGED
@@ -5,44 +5,59 @@ datasets:
  language:
  - it
  pipeline_tag: conversational
  ---

  Model Card for Loquace-7B

- ## Model Details
-
- - Model Name: Loquace-7B
- - Model Version: 1.0
- - Hugging Face Model Hub Link: [Link to the model on the Hugging Face Model Hub]
- - License: CC-BY-NC (Creative Commons Attribution-NonCommercial)

  ## Model Description

- Loquace-7B is a fine-tuned conversational model for the Italian language. It was trained on a dataset of 102,000 question/answer examples in the Alpaca style. The model is based on the Falcon-7B architecture and was fine-tuned using the QLoRA framework.
-
- ## Intended Use
-
- Loquace-7B is designed to facilitate Italian-language conversations. It can be used by developers, researchers, or anyone interested in building conversational systems, chatbots, or dialogue-based applications in Italian.
-
- ## Model Inputs
-
- The model expects input in the form of text strings representing questions or prompts in Italian. Input should follow natural-language conventions; longer inputs may need to be truncated or split into multiple parts to fit the model's maximum sequence length.
-
- ## Model Outputs
-
- The model generates responses as text strings in Italian, providing answers or replies based on the given input. The outputs can be post-processed or presented as-is, depending on the desired application.
-
- ## Training Data
-
- Loquace-7B was trained on a conversational dataset comprising 102,000 question/answer pairs in Italian. The training data was formatted in the Alpaca style, which emphasizes conversational exchanges. The specific sources and characteristics of the training data are not disclosed.
-
- ## Evaluation Data
-
- The model's performance was evaluated using a separate evaluation dataset consisting of human-labeled assessments and metrics tailored to the conversational nature of the model. Specific details of the evaluation data, such as size and sources, are not provided.
-
- ## Ethical Considerations
-
- As with any language model, Loquace-7B may reflect biases present in the training data. Care should be taken when using the model to ensure fair and unbiased interactions. Additionally, since the model is released under the CC-BY-NC license, it should not be used for commercial purposes without proper authorization.

  ## Limitations
@@ -54,12 +69,5 @@ As with any language model, Loquace-7B may reflect biases present in the training data
  - PyTorch
  - Transformers library by Hugging Face
-
- ## Contact Information
-
- For any questions, issues, or inquiries related to Loquace-7B, please contact the developers at [contact email or link].
-
- ## Citation
-
- [If the model is based on or inspired by a research paper, provide the citation here.]
-
 
  language:
  - it
  pipeline_tag: conversational
+ tags:
+ - alpaca
+ - llama
+ - llm
+ - finetune
+ - Italian
+ - qlora
  ---

  Model Card for Loquace-7B

+ # 🇮🇹 Loquace-7B 🇮🇹

+ An exclusively Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹

  ## Model Description

+ Loquace-7B is the first 7B Italian Large Language Model, trained using QLoRA on a large dataset of 102k question/answer pairs written exclusively in Italian.

+ The related code can be found at:
+ https://github.com/cosimoiaia/Loquace

+ Loquace-7B is part of the larger Loquace family:

+ - https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
+ - https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
+ - https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
+ - https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
+ - https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B

+ ## Usage

+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM

+ # 8-bit loading requires the bitsandbytes package.
+ # trust_remote_code=True may additionally be needed for Falcon-based checkpoints.
+ tokenizer = AutoTokenizer.from_pretrained("cosimoiaia/Loquace-7B")
+ model = AutoModelForCausalLM.from_pretrained(
+     "cosimoiaia/Loquace-7B",
+     load_in_8bit=True,
+     device_map="auto",
+ )
+ ```

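+ A minimal generation sketch follows; the Alpaca-style Italian prompt template is an assumption based on the training format described below, not something this card specifies:

+ ```python
+ from transformers import GenerationConfig

+ # Hypothetical Alpaca-style prompt in Italian (assumed template).
+ prompt = (
+     "Di seguito è riportata un'istruzione che descrive un compito. "
+     "Scrivi una risposta che completi adeguatamente la richiesta.\n\n"
+     "### Istruzione:\nQual è la capitale d'Italia?\n\n### Risposta:\n"
+ )
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ config = GenerationConfig(max_new_tokens=128, temperature=0.7, do_sample=True)
+ output = model.generate(**inputs, generation_config=config)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
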
+ ## Training

+ Loquace-7B was trained on a conversational dataset comprising 102k question/answer pairs in Italian.
+ The training data was constructed by combining translations of the original Alpaca dataset with other sources such as the OpenAssistant dataset.
+ The model was trained for only 3000 iterations, which took 18 hours on a single RTX 3090, kindly provided by Genesis Cloud.

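+ For illustration, a QLoRA setup along these lines could look as follows. This is a hypothetical sketch using peft and bitsandbytes; the base checkpoint name and all hyperparameters are assumptions, since the card does not disclose the actual training configuration:

+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

+ # Hypothetical values throughout; the real configuration is not published here.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ base = AutoModelForCausalLM.from_pretrained(
+     "tiiuae/falcon-7b",  # assumed base, per "Based on Falcon-7B" above
+     quantization_config=bnb_config,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+ base = prepare_model_for_kbit_training(base)
+ lora = LoraConfig(
+     r=8,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     target_modules=["query_key_value"],  # Falcon's fused attention projection
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(base, lora)
+ ```
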
  ## Limitations

  - PyTorch
  - Transformers library by Hugging Face
+ - bitsandbytes
+ - QLoRA