Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


Llama-2-7b-alpaca-es - GGUF
- Model creator: https://huggingface.co/4i-ai/
- Original model: https://huggingface.co/4i-ai/Llama-2-7b-alpaca-es/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Llama-2-7b-alpaca-es.Q2_K.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q2_K.gguf) | Q2_K | 2.36GB |
| [Llama-2-7b-alpaca-es.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.IQ3_XS.gguf) | IQ3_XS | 2.6GB |
| [Llama-2-7b-alpaca-es.IQ3_S.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.IQ3_S.gguf) | IQ3_S | 2.75GB |
| [Llama-2-7b-alpaca-es.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q3_K_S.gguf) | Q3_K_S | 2.75GB |
| [Llama-2-7b-alpaca-es.IQ3_M.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.IQ3_M.gguf) | IQ3_M | 2.9GB |
| [Llama-2-7b-alpaca-es.Q3_K.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q3_K.gguf) | Q3_K | 3.07GB |
| [Llama-2-7b-alpaca-es.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q3_K_M.gguf) | Q3_K_M | 3.07GB |
| [Llama-2-7b-alpaca-es.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q3_K_L.gguf) | Q3_K_L | 3.35GB |
| [Llama-2-7b-alpaca-es.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.IQ4_XS.gguf) | IQ4_XS | 3.4GB |
| [Llama-2-7b-alpaca-es.Q4_0.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q4_0.gguf) | Q4_0 | 3.56GB |
| [Llama-2-7b-alpaca-es.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.IQ4_NL.gguf) | IQ4_NL | 3.58GB |
| [Llama-2-7b-alpaca-es.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q4_K_S.gguf) | Q4_K_S | 3.59GB |
| [Llama-2-7b-alpaca-es.Q4_K.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q4_K.gguf) | Q4_K | 3.8GB |
| [Llama-2-7b-alpaca-es.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q4_K_M.gguf) | Q4_K_M | 3.8GB |
| [Llama-2-7b-alpaca-es.Q4_1.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q4_1.gguf) | Q4_1 | 3.95GB |
| [Llama-2-7b-alpaca-es.Q5_0.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q5_0.gguf) | Q5_0 | 4.33GB |
| [Llama-2-7b-alpaca-es.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q5_K_S.gguf) | Q5_K_S | 4.33GB |
| [Llama-2-7b-alpaca-es.Q5_K.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q5_K.gguf) | Q5_K | 4.45GB |
| [Llama-2-7b-alpaca-es.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q5_K_M.gguf) | Q5_K_M | 4.45GB |
| [Llama-2-7b-alpaca-es.Q5_1.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q5_1.gguf) | Q5_1 | 4.72GB |
| [Llama-2-7b-alpaca-es.Q6_K.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q6_K.gguf) | Q6_K | 5.15GB |
| [Llama-2-7b-alpaca-es.Q8_0.gguf](https://huggingface.co/RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf/blob/main/Llama-2-7b-alpaca-es.Q8_0.gguf) | Q8_0 | 6.67GB |
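
These files work with any GGUF-compatible runtime such as llama.cpp. As an illustration (not part of the original model card), the sketch below downloads one of the quantized files and runs it with the llama-cpp-python bindings; the chosen file and generation settings are only examples.

```py
# Illustrative sketch: download a GGUF file from this repo and run it locally.
# Assumes `pip install huggingface_hub llama-cpp-python`.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Pick any file from the table above; Q4_K_M is a common size/quality trade-off.
gguf_path = hf_hub_download(
    repo_id="RichardErkhov/4i-ai_-_Llama-2-7b-alpaca-es-gguf",
    filename="Llama-2-7b-alpaca-es.Q4_K_M.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=2048)

# The model was fine-tuned on Alpaca-style prompts (see the prompt format below).
prompt = "### Instruction:\nEncuentra la capital de España.\n\n### Response:\n"
out = llm(prompt, max_tokens=150, temperature=1.0, top_p=0.75)
print(out["choices"][0]["text"].strip())
```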




Original model description:
---
license: cc-by-nc-4.0
datasets:
- bertin-project/alpaca-spanish
language:
- es
inference: false
---


# Model Card for Model ID

This model is Llama-2-7b-hf fine-tuned with an adapter on the Spanish Alpaca dataset.

## Model Details

### Model Description

This is a Spanish chat model fine-tuned on a Spanish instruction dataset. 

The model expects a prompt containing the instruction, optionally followed by an input (see the examples below).
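
Concretely, the prompt template used in the generation code below looks like this (the `### Input:` block is omitted when no input is given):

```
### Instruction:
{instruction}

### Input:
{input}

### Response:
```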



- **Developed by:** 4i Intelligent Insights
- **Model type:** Chat model
- **Language(s) (NLP):** Spanish
- **License:** cc-by-nc-4.0 (inherited from the alpaca-spanish dataset)
- **Finetuned from model:** Llama 2 7B ([license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/))


## Uses

The model is intended to be used directly, without the need for further fine-tuning.


## Bias, Risks, and Limitations

This model inherits the bias, risks, and limitations of its base model, Llama 2, and of the dataset used for fine-tuning. 
Note that the Spanish Alpaca dataset was obtained by translating the original Alpaca dataset. It contains translation errors that may have negatively impacted the fine-tuning of the model. 



## How to Get Started with the Model

Use the code below to get started with the model for inference. The adapter was directly merged into the original Llama 2 model. 


The following code sample uses 4-bit quantization; you may load the model without it if you have enough VRAM.

```py
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, GenerationConfig
import torch
model_name = "4i-ai/Llama-2-7b-alpaca-es"


#Tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

def create_and_prepare_model():
    compute_dtype = torch.float16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=compute_dtype,
        bnb_4bit_use_double_quant=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=bnb_config, device_map={"": 0}
    )
    return model

model = create_and_prepare_model()

def generate(instruction, input=None):
    # Format the prompt to match the training data
    if input is not None:
        prompt = "### Instruction:\n"+instruction+"\n\n### Input:\n"+input+"\n\n### Response:\n"
    else:
        prompt = "### Instruction:\n"+instruction+"\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()

    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(temperature=1.0, top_p=0.75, top_k=40, num_beams=10),  # generation hyperparameters
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=150,  # maximum tokens generated; increase for longer answers (up to 2048 minus the prompt length); generation takes longer for longer responses
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        print(output.split("### Response:")[1].strip())
        
generate("Háblame de la superconductividad.")
print("-----------")
generate("Encuentra la capital de España.")
print("-----------")
generate("Encuentra la capital de Portugal.")
print("-----------")
generate("Organiza los números dados en orden ascendente.", "2, 3, 0, 8, 4, 10")
print("-----------")
generate("Compila una lista de 5 estados de EE. UU. ubicados en el Oeste.")
print("-----------")
generate("¿Cuál es el color de una fresa?")
print("-----------")
generate("¿Cuál es el color de la siguiente fruta?", "fresa")
print("-----------")

```

Expected output:

```
La superconductividad es un fenómeno físico en el que algunos materiales se convierten en conductores de corriente eléctrica a temperaturas muy bajas. Esto significa que la corriente eléctrica puede fluir a través del material sin pérdida de energía. La superconductividad fue descubierta por primera vez en 1911 por el físico alemán Heike Kamerlingh Onnes, quien descubrió que algunos materiales se convierten en conductores de corriente eléctrica a temperaturas muy bajas. Desde entonces, la superconductividad se ha utiliz
-----------
La capital de España es Madrid.
-----------
La capital de Portugal es Lisboa.
-----------
2, 3, 4, 8, 10, 0
-----------
California, Oregón, Washington, Nevada y Arizona.
-----------
El color de una fresa es rosa.
-----------
El color de la fresa es rojo.
```
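
As noted above, the 4-bit quantization is optional. A minimal sketch of the non-quantized loading path (an assumption, not from the original card; it requires enough GPU memory for the full float16 weights, roughly 14 GB for a 7B model) might look like:

```py
# Sketch: load the merged model without 4-bit quantization.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "4i-ai/Llama-2-7b-alpaca-es"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
```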





## Contact Us
[4i.ai](https://4i.ai/) provides natural language processing solutions with dialog, vision and voice capabilities to deliver real-life multimodal human-machine conversations. 
Please contact us at info@4i.ai