---

language:
- en
pipeline_tag: text-generation
tags:
- esper
- esper-2
- valiant
- valiant-labs
- llama
- llama-3.2
- llama-3.2-instruct
- llama-3.2-instruct-3b
- llama-3
- llama-3-instruct
- llama-3-instruct-3b
- 3b
- code
- code-instruct
- python
- dev-ops
- terraform
- azure
- aws
- gcp
- architect
- engineer
- developer
- conversational
- chat
- instruct
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets:
- sequelbox/Titanium
- sequelbox/Tachibana
- sequelbox/Supernova
model-index:
- name: ValiantLabs/Llama3.2-3B-Esper2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-Shot)
      type: Winogrande
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.27
      name: acc
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ARC Challenge (25-Shot)
      type: arc-challenge
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 43.17
      name: normalized accuracy
model_type: llama
license: llama3.2

---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)


# QuantFactory/Llama3.2-3B-Esper2-GGUF
This is a quantized version of [ValiantLabs/Llama3.2-3B-Esper2](https://huggingface.co/ValiantLabs/Llama3.2-3B-Esper2), created using llama.cpp.

# Original Model Card



![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64f267a8a4f79a118e0fcc89/4I6oK8DG0so4VD8GroFsd.jpeg)


Esper 2 is a DevOps and cloud architecture code specialist built on Llama 3.2 3b.
- An expertise-driven AI assistant focused on AWS, Azure, GCP, Terraform, Dockerfiles, pipelines, shell scripts and more!
- Real-world problem solving and high-quality code-instruct performance within the Llama 3.2 Instruct chat format
- Finetuned on synthetic [DevOps-instruct](https://huggingface.co/datasets/sequelbox/Titanium) and [code-instruct](https://huggingface.co/datasets/sequelbox/Tachibana) data generated with Llama 3.1 405b.
- Overall chat performance supplemented with [generalist chat data.](https://huggingface.co/datasets/sequelbox/Supernova)

Try our code-instruct AI assistant [Enigma!](https://huggingface.co/ValiantLabs/Llama3.1-8B-Enigma)


## Version

This is the **2024-10-03** release of Esper 2 for Llama 3.2 3b.

Esper 2 is also available for [Llama 3.1 8b!](https://huggingface.co/ValiantLabs/Llama3.1-8B-Esper2)

Esper 2 will be coming to more model sizes soon :)


## Prompting Guide
Esper 2 uses the [Llama 3.2 Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) prompt format. The example script below can be used as a starting point for general chat:

```python
import transformers
import torch

model_id = "ValiantLabs/Llama3.2-3B-Esper2"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Hi, how do I optimize the size of a Docker image?"}
]

outputs = pipeline(
    messages,
    max_new_tokens=2048,
)

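# With chat-style input, "generated_text" holds the whole conversation;
# the last entry is the assistant's reply (a dict with "role" and "content").
print(outputs[0]["generated_text"][-1])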
```
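
Runtimes that take a raw prompt string rather than a list of messages (for example, some llama.cpp-based setups using the GGUF files) need the chat turns rendered by hand. The sketch below shows roughly what the Llama 3 family header layout looks like. It is a simplified illustration: `build_llama3_prompt` is a hypothetical helper, not part of any library, and the tokenizer's own chat template may add extra system text (such as date lines), so prefer the built-in template where one is available.

```python
def build_llama3_prompt(messages):
    """Render chat messages into a Llama 3 style prompt string (simplified sketch)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Hi, how do I optimize the size of a Docker image?"},
]

prompt = build_llama3_prompt(messages)
print(prompt)
```

If the runtime adds its own beginning-of-text token, drop the leading `<|begin_of_text|>` to avoid duplicating it.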

## The Model
Esper 2 is built on top of Llama 3.2 3b Instruct, improving performance through high-quality DevOps, code, and chat data in the Llama 3.2 Instruct prompt style.

Our current version of Esper 2 is trained on DevOps data from [sequelbox/Titanium](https://huggingface.co/datasets/sequelbox/Titanium), supplemented by code-instruct data from [sequelbox/Tachibana](https://huggingface.co/datasets/sequelbox/Tachibana) and general chat data from [sequelbox/Supernova.](https://huggingface.co/datasets/sequelbox/Supernova)


![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/VCJ8Fmefd8cdVhXSSxJiD.jpeg)


Esper 2 is created by [Valiant Labs.](http://valiantlabs.ca/)

[Check out our HuggingFace page for Shining Valiant 2, Enigma, and our other Build Tools models for creators!](https://huggingface.co/ValiantLabs)

[Follow us on X for updates on our models!](https://twitter.com/valiant_labs)

We care about open source.
For everyone to use.

We encourage others to finetune further from our models.