Awakate committed on
Commit
29d4879
·
verified ·
1 Parent(s): 0e2d4c3

Update README.md


Hi, if you are reading this, I hope this work is useful to you. Greetings to OMCaicedo.

Files changed (1)
  1. README.md +235 -37
README.md CHANGED
@@ -6,35 +6,90 @@ tags:
6
  - base_model:adapter:llama32-3b
7
  - lora
8
  - transformers
9
  ---
10
 
11
  # Model Card for Model ID
 
 
 
12
 
13
- <!-- Provide a quick summary of what the model is/does. -->
 
 
 
14
 
15
 
16
 
17
  ## Model Details
18
 
19
  ### Model Description
20
 
21
- <!-- Provide a longer summary of what this model is. -->
22
 
 
 
23
 
 
 
 
 
24
 
25
- - **Developed by:** [More Information Needed]
26
- - **Funded by [optional]:** [More Information Needed]
27
- - **Shared by [optional]:** [More Information Needed]
28
- - **Model type:** [More Information Needed]
29
- - **Language(s) (NLP):** [More Information Needed]
30
- - **License:** [More Information Needed]
31
- - **Finetuned from model [optional]:** [More Information Needed]
32
 
33
  ### Model Sources [optional]
34
 
35
  <!-- Provide the basic links for the model. -->
36
 
37
- - **Repository:** [More Information Needed]
38
  - **Paper [optional]:** [More Information Needed]
39
  - **Demo [optional]:** [More Information Needed]
40
 
@@ -43,40 +98,116 @@ tags:
43
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
44
 
45
  ### Direct Use
46
-
47
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
48
-
49
- [More Information Needed]
50
 
51
  ### Downstream Use [optional]
52
 
53
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
 
 
 
54
 
55
- [More Information Needed]
 
 
 
 
56
 
57
  ### Out-of-Scope Use
58
 
 
 
59
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
60
 
61
- [More Information Needed]
62
 
63
  ## Bias, Risks, and Limitations
64
 
65
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
66
 
67
- [More Information Needed]
68
-
69
  ### Recommendations
70
 
71
  <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
72
 
73
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
74
 
75
  ## How to Get Started with the Model
76
 
77
  Use the code below to get started with the model.
78
 
79
- [More Information Needed]
80
 
81
  ## Training Details
82
 
@@ -84,26 +215,58 @@ Use the code below to get started with the model.
84
 
85
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
86
 
87
- [More Information Needed]
88
 
89
  ### Training Procedure
90
 
91
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
92
 
93
  #### Preprocessing [optional]
94
 
95
- [More Information Needed]
96
 
97
 
 
 
 
98
  #### Training Hyperparameters
99
 
100
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
101
 
102
  #### Speeds, Sizes, Times [optional]
103
 
104
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
105
 
106
- [More Information Needed]
107
 
108
  ## Evaluation
109
 
@@ -115,7 +278,15 @@ Use the code below to get started with the model.
115
 
116
  <!-- This should link to a Dataset Card if possible. -->
117
 
118
- [More Information Needed]
119
 
120
  #### Factors
121
 
@@ -131,11 +302,27 @@ Use the code below to get started with the model.
131
 
132
  ### Results
133
 
134
- [More Information Needed]
135
 
136
  #### Summary
137
 
 
 
 
 
138
 
 
 
 
 
139
 
140
  ## Model Examination [optional]
141
 
@@ -149,17 +336,20 @@ Use the code below to get started with the model.
149
 
150
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
151
 
152
- - **Hardware Type:** [More Information Needed]
153
- - **Hours used:** [More Information Needed]
154
- - **Cloud Provider:** [More Information Needed]
155
  - **Compute Region:** [More Information Needed]
156
- - **Carbon Emitted:** [More Information Needed]
157
 
158
  ## Technical Specifications [optional]
159
 
160
  ### Model Architecture and Objective
161
 
162
- [More Information Needed]
 
 
 
163
 
164
  ### Compute Infrastructure
165
 
@@ -167,7 +357,12 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
167
 
168
  #### Hardware
169
 
170
- [More Information Needed]
171
 
172
  #### Software
173
 
@@ -193,15 +388,18 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
193
 
194
  ## More Information [optional]
195
 
196
- [More Information Needed]
197
 
198
  ## Model Card Authors [optional]
199
 
200
- [More Information Needed]
 
201
 
202
  ## Model Card Contact
203
 
204
- [More Information Needed]
 
 
205
  ### Framework versions
206
 
207
- - PEFT 0.18.0
 
6
  - base_model:adapter:llama32-3b
7
  - lora
8
  - transformers
9
+ - network-automation
10
+ - cisco
11
+ license: llama3.2
12
+ language:
13
+ - es
14
  ---
15
 
16
  # Model Card for Model ID
17
+ All translations were done with DeepL.com (free version)
18
+ EN:
19
+ This model is an LLM specialized in Cisco network configuration, fine-tuned with 4-bit LoRA on a LLaMA 3.2 3B base, focused on:
20
 
21
+ - Interface configuration
22
+ - VLAN configuration
23
+ - DHCP configuration
24
+ - Technical responses for OSPF, NAT, ACL, DNS, BGP (text mode)
25
 
26
+ In addition, it was designed to integrate with agents that use network automation tools. Developed as a project for the Recent Topics in Networking course at the University of Cauca, it was trained with an artificially generated dataset.
27
+ The model was trained on a dataset of 10,000 examples, with 10,000 examples of training data and 10,000 examples of test data.
28
+
29
+ ES:
30
+
31
+ Este modelo es un LLM especializado en configuración de redes Cisco, ajustado mediante fine-tuning con LoRA a 4 bits sobre una base LLaMA 3.2 3B, enfocado en:
32
+
33
+ - Configuración de interfaces
34
+ - Configuración de VLAN
35
+ - Configuración de DHCP
36
+ - Respuestas técnicas para OSPF, NAT, ACL, DNS, BGP (modo textual)
37
+
38
+ Además, fue diseñado para integrarse con agentes que usen herramientas para automatización de red. Desarrollado como proyecto para la materia Recent Topics in Networking
39
+ de la Universidad del Cauca, fue entrenado con un dataset generado de manera artificial.
40
 
41
 
42
  ## Model Details
43
 
44
  ### Model Description
45
 
46
+ EN:
47
+ This model was adjusted with a specialized dataset of real Cisco commands, with an instruction-input-output structure.
48
+ It is optimized to run on low-power GPUs thanks to:
49
+
50
+ - 4-bit quantization
51
+ - LoRA adapters
52
+
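As a rough illustration of why 4-bit quantization matters on a small GPU, the sketch below estimates the memory needed for the weights alone at different precisions. The parameter count is a rounded assumption for a 3B model, and the estimate ignores activations, the KV cache, and quantization overhead.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 3e9  # assumption: rounded parameter count for a 3B model

for label, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(N_PARAMS, bits):.2f} GiB")
```

Under these assumptions, fp16 weights alone take about 5.6 GiB, while 4-bit weights take about 1.4 GiB, which leaves room for activations on a 6 GB card.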
53
+ **Key features:**
54
+
55
+ - Natural language understanding in Spanish
56
+ - Generation of real Cisco commands
57
+ - Compatible with multi-agent systems
58
+ - Able to detect when to use external tools
59
+
60
+ ES:
61
+ Este modelo fue ajustado con un dataset especializado en comandos reales de Cisco, con estructura instrucción–entrada–salida.
62
+ Está optimizado para ejecutarse en GPUs de bajo consumo gracias a:
63
 
64
+ - Cuantización 4-bit
65
+ - Adaptadores LoRA
66
 
71
 
72
+ **Características clave:**
73
+
74
+ - Comprensión de lenguaje natural en español
75
+ - Generación de comandos Cisco reales
76
+ - Compatible con sistemas multi-agente
77
+ - Capaz de detectar cuándo usar herramientas externas
78
+
79
+
80
+ - **Developed by:** Juan Jose Angel Duran Calvache, Alison Daniela Ruiz Muñoz (Universidad del Cauca)
81
+ - **Funded by [optional]:** Juan Jose Angel Duran Calvache, Alison Daniela Ruiz Muñoz.
82
+ - **Shared by [optional]:** Juan Jose Angel Duran Calvache, Alison Daniela Ruiz Muñoz.
83
+ - **Model type:** Causal Language Model (Text Generation)
84
+ - **Language(s) (NLP):** Spanish
85
+ - **License:** LLaMA 3.2
86
+ - **Finetuned from model [optional]:** llama32-3b
87
 
88
  ### Model Sources [optional]
89
 
90
  <!-- Provide the basic links for the model. -->
91
 
92
+ - **Repository:** https://github.com/3NombresJJA/Cisco-llm-agent
93
  - **Paper [optional]:** [More Information Needed]
94
  - **Demo [optional]:** [More Information Needed]
95
 
 
98
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
99
 
100
  ### Direct Use
101
+ EN:
102
+ - Direct generation of Cisco IOS configurations
103
+ - Support for networking students
104
+ - Simulation of router and switch configurations
105
+ - Technical conversational assistants
106
+
107
+ ES:
108
+ - Generación directa de configuraciones Cisco IOS
109
+ - Soporte a estudiantes de redes
110
+ - Simulación de configuraciones de routers y switches
111
+ - Asistentes conversacionales técnicos
112
 
113
  ### Downstream Use [optional]
114
 
115
+ EN:
116
+ - Integration with LangGraph + LangChain
117
+ - Automation of real configurations
118
+ - Virtual laboratory systems
119
+ - Educational platforms
120
 
121
+ ES:
122
+ - Integración con LangGraph + LangChain
123
+ - Automatización de configuraciones reales
124
+ - Sistemas de laboratorio virtual
125
+ - Plataformas educativas
126
 
127
  ### Out-of-Scope Use
128
 
129
+
130
+
131
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
132
 
133
+ EN:
134
+ - Not designed for offensive pentesting
135
+ - Not designed for production in real critical infrastructures
136
+ - Does not guarantee security validation
137
+
138
+ ES:
139
+ - No diseñado para pentesting ofensivo
140
+ - No diseñado para producción en infraestructuras críticas reales
141
+ - No garantiza validación de seguridad
142
 
143
  ## Bias, Risks, and Limitations
144
 
145
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
146
 
147
+ EN:
148
+ - The model may invent IP addresses if they are not specified
149
+ - Does not validate real topologies
150
+ - May produce incomplete configurations if the prompt is ambiguous
151
+ - Was only trained on basic to intermediate configurations
152
+
153
+ ES:
154
+ - El modelo puede inventar direcciones IP si no se especifican
155
+ - No valida topologías reales
156
+ - Puede producir configuraciones incompletas si el prompt es ambiguo
157
+ - Solo fue entrenado en configuraciones básicas a intermedias
158
  ### Recommendations
159
 
160
  <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
161
 
162
+ EN:
163
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+ - Always validate configurations before applying them to production
164
+ - Use in educational or simulated environments
165
+ - Combine with verification agents
166
+
167
+ ES:
168
+ Los usuarios (tanto directos como secundarios) deben ser conscientes de los riesgos, sesgos y limitaciones del modelo.
169
+ - Siempre validar las configuraciones antes de aplicarlas a producción
170
+ - Usar en entornos educativos o simulados
171
+ - Combinar con agentes de verificación
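One way to act on the limitation about invented IP addresses is a small post-generation check. The sketch below uses only Python's standard `re` and `ipaddress` modules to flag IPv4 addresses that appear in a generated configuration but not in the user's prompt. The function names and sample strings are illustrative assumptions, not part of the project's repository.

```python
import ipaddress
import re

# Candidate dotted quads; validity is checked separately with ipaddress
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ipv4(text: str) -> set[str]:
    """Return the syntactically valid IPv4 addresses found in text."""
    found = set()
    for candidate in IPV4_PATTERN.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # e.g. 999.1.1.1 is not a valid address
        found.add(candidate)
    return found


def invented_addresses(prompt: str, generated_config: str) -> set[str]:
    """Addresses present in the generated config but absent from the prompt."""
    return extract_ipv4(generated_config) - extract_ipv4(prompt)


prompt = "Configura la interfaz Gi0/0 con ip 192.168.1.1 máscara 255.255.255.0"
config = (
    "interface Gi0/0\n"
    " ip address 192.168.1.1 255.255.255.0\n"
    " ip helper-address 10.0.0.5"
)
print(invented_addresses(prompt, config))
```

Subnet masks are also valid dotted quads, so a mask the user never mentioned would be flagged as well. A verification agent could treat any non-empty result as a reason to ask the user for confirmation.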
172
+
173
 
174
  ## How to Get Started with the Model
175
 
176
  Use the code below to get started with the model.
177
 
178
+ ```python
179
+ # Install dependencies first: pip install transformers peft accelerate bitsandbytes torch
180
+ import torch
181
+ from peft import PeftModel
182
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
183
+
184
+ base_model = "llama32-3b"  # path or Hub ID of the LLaMA 3.2 3B base checkpoint
185
+ lora_repo = "Awakate/llama32-router-lora"
186
+
187
+ tokenizer = AutoTokenizer.from_pretrained(lora_repo)
188
+ # Load the base model in 4-bit; quantization_config replaces the deprecated load_in_4bit argument
189
+ model = AutoModelForCausalLM.from_pretrained(
190
+     base_model,
191
+     quantization_config=BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16),
192
+     device_map="auto",
193
+ )
194
+
195
+ # Attach the LoRA adapter weights to the quantized base model
196
+ model = PeftModel.from_pretrained(model, lora_repo)
197
+
198
+ pipe = pipeline(
199
+     "text-generation",
200
+     model=model,
201
+     tokenizer=tokenizer,
202
+     max_new_tokens=200,
203
+     do_sample=True,  # sampling must be on for temperature to take effect
204
+     temperature=0.1,
205
+ )
206
+
207
+ prompt = "Configura la interfaz Gi0/0 con ip 192.168.1.1 máscara 255.255.255.0 y vlan 10"
208
+ print(pipe(prompt)[0]["generated_text"])
209
+ ```
210
+
211
 
212
  ## Training Details
213
 
 
215
 
216
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
217
 
218
+ The dataset is published in the GitHub repository linked above. It is a custom dataset in JSON format with the following structure:
219
+
220
+ El dataset utilizado se encuentra publicado en el enlace de GitHub. Se trata de un dataset personalizado en formato JSON con la siguiente estructura:
221
+ {
222
+ "instruction": "Configurar interfaz",
223
+ "input": "Gi0/0 con IP 192.168.1.1",
224
+ "output": "interface Gi0/0..."
225
+ }
226
+ It contains examples of: interfaces, VLAN, DHCP, OSPF, NAT, ACL, and DNS
227
+ Contiene ejemplos de: Interfaces, VLAN, DHCP, OSPF, NAT, ACL y DNS
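A record with this structure is usually flattened into a single training string before tokenization. The sketch below shows one possible way to do that. The prompt template is an assumption (an Alpaca-style layout) and not necessarily the one used to train this model, and `format_record` is a hypothetical helper.

```python
import json

# Assumed Alpaca-style template; the card does not state the real one
TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

REQUIRED_FIELDS = {"instruction", "input", "output"}


def format_record(record: dict) -> str:
    """Turn one instruction/input/output record into a training string."""
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        raise KeyError(f"record is missing fields: {sorted(missing)}")
    return TEMPLATE.format(**record)


raw = (
    '{"instruction": "Configurar interfaz", '
    '"input": "Gi0/0 con IP 192.168.1.1", '
    '"output": "interface Gi0/0..."}'
)
print(format_record(json.loads(raw)))
```

At inference time the same template would be used with the response part left empty, so the model continues from the point where training examples placed the expected output.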
228
 
229
  ### Training Procedure
230
 
231
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
232
 
233
+ - Fine-tuning with LoRA (Low-Rank Adaptation)
234
+ - 4-bit quantization
235
+ - Framework: transformers + peft
236
+
237
+
238
  #### Preprocessing [optional]
239
 
 
240
 
241
 
242
+ No preprocessing was applied to the data.
243
+ No se hizo preprocesamiento de los datos.
244
+
245
  #### Training Hyperparameters
246
 
247
+ - **Training regime:**
248
+   - Epochs: 5
249
+   - Batch size: 2
250
+   - Gradient accumulation: 8
251
+   - Learning rate: 8e-5
252
+   - LoRA r: 16
253
+   - LoRA alpha: 32
254
+   - Precision: FP16
255
+   - LoRA dropout: 0.05
256
+   - Max length: 200
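Two derived quantities follow directly from these values: the effective batch size (batch size times gradient accumulation steps) and the scaling factor that standard LoRA applies to the adapter update (alpha divided by r). The sketch below computes both. The `TrainingSetup` class is a hypothetical container, and the single-device setting is an assumption based on the hardware listed in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as listed in this card."""
    epochs: int = 5
    batch_size: int = 2
    gradient_accumulation: int = 8
    learning_rate: float = 8e-5
    lora_r: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.05
    max_length: int = 200
    n_devices: int = 1  # assumption: training ran on a single GPU

    @property
    def effective_batch_size(self) -> int:
        # Examples seen per optimizer step
        return self.batch_size * self.gradient_accumulation * self.n_devices

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r
        return self.lora_alpha / self.lora_r


setup = TrainingSetup()
print(setup.effective_batch_size)  # 16
print(setup.lora_scaling)          # 2.0
```

A per-step batch of 2 keeps memory use low on a small GPU, while accumulating over 8 steps gives the optimizer the gradient of 16 examples per update.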
257
+
258
+ <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
259
 
260
  #### Speeds, Sizes, Times [optional]
261
 
262
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
263
 
264
+ EN:
265
+ The LoRA fine-tune weights are 18 MB in size and took an hour and a half of training to produce, across 115 checkpoints.
266
+
267
+ ES:
268
+ Los pesos del fine-tune con LoRA tienen un tamaño de 18 MB y se obtuvieron tras hora y media de entrenamiento, en 115 checkpoints.
269
+
270
 
271
  ## Evaluation
272
 
 
278
 
279
  <!-- This should link to a Dataset Card if possible. -->
280
 
281
+ EN:
282
+ - Human testing with real prompts
283
+ - Integration with LangGraph agents
284
+ - Manual validation of Cisco commands
285
+
286
+ ES:
287
+ - Pruebas humanas con prompts reales
288
+ - Integración con agentes LangGraph
289
+ - Validación manual de comandos Cisco
290
 
291
  #### Factors
292
 
 
302
 
303
  ### Results
304
 
305
+
306
+ ![Screenshot 2025-12-04 145318](https://cdn-uploads.huggingface.co/production/uploads/6922899fcc2ce4db552cdafd/QmdvNwWzCS7PMjX6-nxqQ.png)
307
+
308
+
309
+ ![Screenshot 2025-12-04 115148](https://cdn-uploads.huggingface.co/production/uploads/6922899fcc2ce4db552cdafd/FdROKu4Sg3gkp2qi3cbca.png)
310
+
311
+
312
+ ![Screenshot 2025-12-04 145542](https://cdn-uploads.huggingface.co/production/uploads/6922899fcc2ce4db552cdafd/yysGWm2uYM5C6NRyQRuyD.png)
313
+
314
 
315
  #### Summary
316
 
317
+ EN:
318
+ The agent behaves in a curious way. An example of agent integration can be found in the GitHub repository, where prompt engineering was used to verify
319
+ that the model understands the knowledge added by the fine-tune. However, it does not respond effectively 100% of the time, so the results should be treated with caution.
321
 
322
+ ES:
323
+ El agente funciona de forma curiosa. Se encuentra un ejemplo de integración con un agente en el repositorio de GitHub, donde se pudo comprobar a través de prompt engineering
324
+ que el modelo entiende el conocimiento agregado por el fine-tune. Sin embargo, no responde de forma efectiva el 100% de las veces, por lo que los resultados se deben
325
+ revisar con detalle.
326
 
327
  ## Model Examination [optional]
328
 
 
336
 
337
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
338
 
339
+ - **Hardware Type:** RTX 4050 Laptop GPU (6 GB)
340
+ - **Hours used:** 4
341
+ - **Cloud Provider:** Local
342
  - **Compute Region:** [More Information Needed]
343
+ - **Carbon Emitted:** Less than 0.5
344
 
345
  ## Technical Specifications [optional]
346
 
347
  ### Model Architecture and Objective
348
 
349
+ - Base: LLaMA 3.2 (3B parameters)
350
+ - Adaptation: LoRA
351
+ - Precision: 4-bit
352
+ - Objective: Causal Language Modeling
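To relate the LoRA rank to adapter size: a LoRA adapter on a weight matrix with `d_in` inputs and `d_out` outputs adds two low-rank factors holding `r * (d_in + d_out)` parameters in total. The sketch below applies that formula. The layer shapes, the layer count, and the choice of adapted projections are assumptions for illustration, since this card does not list the target modules. Check the base model's `config.json` for the real dimensions.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R = 16  # LoRA rank from the hyperparameters in this card

# Assumed shapes, for illustration only
HIDDEN = 3072   # assumed hidden size
KV_DIM = 1024   # assumed key/value projection width
N_LAYERS = 28   # assumed number of transformer layers

# Assumed target modules: query and value projections in every layer
adapted = {"q_proj": (HIDDEN, HIDDEN), "v_proj": (HIDDEN, KV_DIM)}

per_layer = sum(lora_params(d_in, d_out, R) for d_in, d_out in adapted.values())
total = per_layer * N_LAYERS
print(f"{total:,} parameters, about {total * 4 / 1e6:.1f} MB in fp32")
```

Under these assumptions the adapter would hold about 4.6 million parameters, on the order of 18 MB when stored in fp32. That is the same order of magnitude as the adapter size reported in this card, though it does not confirm which modules were actually adapted.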
353
 
354
  ### Compute Infrastructure
355
 
 
357
 
358
  #### Hardware
359
 
360
+ - Linux in WSL
361
+
362
+ - Intel i5 12500H
363
+ - 8GB RAM DDR4
364
+ - RTX 4050 6GB laptop
365
+ - SSD M.2
366
 
367
  #### Software
368
 
 
388
 
389
  ## More Information [optional]
390
 
391
+ Academic project for RTN (Recent Topics in Networking) 2025-2
392
 
393
  ## Model Card Authors [optional]
394
 
395
+ - Juan Jose Angel Duran Calvache
396
+ - Alison Daniela Ruiz Muñoz
397
 
398
  ## Model Card Contact
399
 
400
+ - joseduran@unicauca.edu.co
401
+ - alisonruiz@unicauca.edu.co
402
+
403
  ### Framework versions
404
 
405
+ - PEFT 0.18.0