YongdongWang committed (verified) · Commit d6c53f6 · 1 Parent(s): e1e77b6

Upload fine-tuned Llama 3.1 8B QLoRA model

README.md CHANGED
@@ -1,64 +1,65 @@
 
1
  ---
2
- library_name: peft
3
- base_model: meta-llama/Llama-3.1-8B
4
- tags:
5
- - llama
6
- - lora
7
- - qlora
8
- - fine-tuned
9
- license: llama3.1
10
- language:
11
- - en
12
- pipeline_tag: text-generation
13
- ---
14
 
15
- # Llama 3.1 8B - Robot Task Planning (QLoRA Fine-tuned)
16
 
17
- This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) specialized for **robot task planning** using QLoRA (4-bit quantization + LoRA).
18
 
19
- The model converts natural language robot commands into structured task sequences for construction robots including excavators and dump trucks.
20
 
21
- ## Model Details
22
 
23
- - **Base Model**: meta-llama/Llama-3.1-8B
24
- - **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
25
- - **LoRA Rank**: 16
26
- - **LoRA Alpha**: 32
27
- - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
28
 
29
- ## Usage
30
 
31
- ```python
32
- from transformers import AutoTokenizer, AutoModelForCausalLM
33
- from peft import PeftModel
34
 
35
- # Load tokenizer and base model
36
- tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
37
- base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
38
 
39
- # Load LoRA adapter
40
- model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")
41
 
42
- # Generate robot task sequence
43
- command = "Deploy Excavator 1 to Soil Area 1 for excavation"
44
- inputs = tokenizer(command, return_tensors="pt")
45
- outputs = model.generate(**inputs, max_new_tokens=5120)
46
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
47
- print(response)
48
 
49
- Training Details
50
 
51
- Training Data: DART LLM Tasks - Robot command and task planning dataset
52
- Domain: Construction robotics (excavators, dump trucks, soil/rock areas)
53
- Training Epochs: 5
54
- Batch Size: 16 (with gradient accumulation)
55
- Learning Rate: 2e-4
56
- Optimizer: paged_adamw_8bit
57
 
58
- Capabilities
59
 
60
- Multi-robot coordination: Handle multiple excavators and dump trucks
61
- Task dependencies: Generate proper task sequences with dependencies
62
- Spatial reasoning: Understand soil areas, rock areas, puddles, and navigation
63
- Action planning: Convert commands to structured JSON task definitions
64
-
 
1
+
2
  ---
3
+ library_name: peft
4
+ base_model: meta-llama/Llama-3.1-8B
5
+ tags:
6
+ - llama
7
+ - lora
8
+ - qlora
9
+ - fine-tuned
10
+ license: llama3.1
11
+ language:
12
+ - en
13
+ pipeline_tag: text-generation
14
+ ---
15
 
16
+ # Llama 3.1 8B - Robot Task Planning (QLoRA Fine-tuned)
17
 
18
+ This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) specialized for **robot task planning** using QLoRA (4-bit quantization + LoRA).
19
 
20
+ The model converts natural language robot commands into structured task sequences for construction robots, including excavators and dump trucks.
21
 
22
+ ## Model Details
23
 
24
+ - **Base Model**: meta-llama/Llama-3.1-8B
25
+ - **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
26
+ - **LoRA Rank**: 16
27
+ - **LoRA Alpha**: 32
28
+ - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
29
 
30
+ ## Usage
31
 
32
+ ```python
33
+ from transformers import AutoTokenizer, AutoModelForCausalLM
34
+ from peft import PeftModel
35
 
36
+ # Load tokenizer and base model
37
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
38
+ base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
39
 
40
+ # Load LoRA adapter
41
+ model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")
42
 
43
+ # Generate robot task sequence
44
+ command = "Deploy Excavator 1 to Soil Area 1 for excavation"
45
+ inputs = tokenizer(command, return_tensors="pt")
46
+ outputs = model.generate(**inputs, max_new_tokens=5120)
47
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
48
+ print(response)
+ ```
49
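
+ Because the adapter was trained with QLoRA, the base model can also be loaded in 4-bit to reduce memory use. A minimal sketch follows; the quantization settings shown are common QLoRA defaults and are assumptions, not values documented for this model.

+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ from peft import PeftModel
+ 
+ # 4-bit (NF4) loading, mirroring a typical QLoRA setup (assumed settings).
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+ )
+ 
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-3.1-8B",
+     quantization_config=bnb_config,
+     device_map="auto",
+ )
+ model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")
+ ```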
 
50
+ ## Training Details
51
 
52
+ - **Training Data**: DART LLM Tasks - Robot command and task planning dataset
53
+ - **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
54
+ - **Training Epochs**: 5
55
+ - **Batch Size**: 16 (with gradient accumulation)
56
+ - **Learning Rate**: 2e-4
57
+ - **Optimizer**: paged_adamw_8bit
58
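
+ For reference, a `transformers.TrainingArguments` setup consistent with these values might look like the sketch below. The per-device batch size and gradient-accumulation split are assumptions; only the effective batch size of 16 is documented.

+ ```python
+ from transformers import TrainingArguments
+ 
+ training_args = TrainingArguments(
+     output_dir="./outputs/llama3.1-8b-lora-qlora-dart-llm",
+     num_train_epochs=5,
+     per_device_train_batch_size=2,   # assumed split
+     gradient_accumulation_steps=8,   # 2 x 8 = effective batch size 16
+     learning_rate=2e-4,
+     optim="paged_adamw_8bit",
+ )
+ ```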
 
59
+ ## Capabilities
60
 
61
+ - **Multi-robot coordination**: Handle multiple excavators and dump trucks
62
+ - **Task dependencies**: Generate proper task sequences with dependencies
63
+ - **Spatial reasoning**: Understand soil areas, rock areas, puddles, and navigation
64
+ - **Action planning**: Convert commands to structured JSON task definitions (see the parsing sketch below)
65
+
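
+ Since the card describes the output as structured JSON task definitions, the generated text can be post-processed into a Python object. The sketch below reuses `command` and `response` from the Usage example and assumes the completion is valid JSON, which may not hold for every prompt.

+ ```python
+ import json
+ 
+ # Strip the echoed prompt, then try to parse the remainder as JSON (assumed format).
+ completion = response[len(command):].strip()
+ try:
+     task_plan = json.loads(completion)
+     print(task_plan)
+ except json.JSONDecodeError:
+     print("Model output was not valid JSON:")
+     print(completion)
+ ```
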
adapter_config.json CHANGED
@@ -24,13 +24,13 @@
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
27
- "v_proj",
28
- "down_proj",
29
- "up_proj",
30
- "q_proj",
31
  "k_proj",
 
 
 
32
  "o_proj",
33
- "gate_proj"
 
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
 
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
 
 
 
 
27
  "k_proj",
28
+ "down_proj",
29
+ "gate_proj",
30
+ "v_proj",
31
  "o_proj",
32
+ "up_proj",
33
+ "q_proj"
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a2c34c3f7912c764c06876b19e3358996b1df3fd836e8639590935332bcdb878
3
  size 167832240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:686aa6705c3b268bfc62872bc6a20ee3f78af943123a851082ea5a8beecf764b
3
  size 167832240
checkpoint-24/adapter_config.json CHANGED
@@ -24,13 +24,13 @@
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
27
- "v_proj",
28
- "down_proj",
29
- "up_proj",
30
- "q_proj",
31
  "k_proj",
 
 
 
32
  "o_proj",
33
- "gate_proj"
 
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
 
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
 
 
 
 
27
  "k_proj",
28
+ "down_proj",
29
+ "gate_proj",
30
+ "v_proj",
31
  "o_proj",
32
+ "up_proj",
33
+ "q_proj"
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
checkpoint-24/adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1badd577fd317617bda766d393842d69930dd84a7c9cbf5635c0f3723724fbe8
3
  size 167832240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ff80ada358f7d45551b795de60e3448a3be3b37ca434a09529d49259e740e104
3
  size 167832240
checkpoint-24/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ce411169d94121fa1d28e23b2471e35ac5577673cc20d674bf41714359bb5b13
3
  size 85728532
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6b5ab0f12eed125e6b567965fc0df61f90b689c01a7cf9d5b32828e25f5b9211
3
  size 85728532
checkpoint-24/tokenizer.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:881912e4e7f25194ad8e82fd5f12292ba4a70376303f608aa270fb4bde3bc9b7
3
- size 17210189
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6b9e4e7fb171f92fd137b777cc2714bf87d11576700a1dcd7a399e7bbe39537b
3
+ size 17209920
checkpoint-24/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "best_global_step": 24,
3
- "best_metric": 0.15350762009620667,
4
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-24",
5
  "epoch": 4.0,
6
  "eval_steps": 500,
@@ -11,62 +11,62 @@
11
  "log_history": [
12
  {
13
  "epoch": 0.8791208791208791,
14
- "grad_norm": 0.5172705054283142,
15
  "learning_rate": 0.00019594929736144976,
16
- "loss": 0.9618,
17
  "step": 5
18
  },
19
  {
20
  "epoch": 1.0,
21
- "eval_loss": 0.5014150142669678,
22
- "eval_runtime": 2.0191,
23
- "eval_samples_per_second": 5.448,
24
- "eval_steps_per_second": 5.448,
25
  "step": 6
26
  },
27
  {
28
  "epoch": 1.7032967032967035,
29
- "grad_norm": 0.44574207067489624,
30
  "learning_rate": 0.00015406408174555976,
31
- "loss": 0.3429,
32
  "step": 10
33
  },
34
  {
35
  "epoch": 2.0,
36
- "eval_loss": 0.21539340913295746,
37
- "eval_runtime": 2.0264,
38
- "eval_samples_per_second": 5.428,
39
- "eval_steps_per_second": 5.428,
40
  "step": 12
41
  },
42
  {
43
  "epoch": 2.5274725274725274,
44
- "grad_norm": 0.338784396648407,
45
  "learning_rate": 8.57685161726715e-05,
46
- "loss": 0.1674,
47
  "step": 15
48
  },
49
  {
50
  "epoch": 3.0,
51
- "eval_loss": 0.16001786291599274,
52
- "eval_runtime": 2.029,
53
- "eval_samples_per_second": 5.421,
54
- "eval_steps_per_second": 5.421,
55
  "step": 18
56
  },
57
  {
58
  "epoch": 3.3516483516483517,
59
- "grad_norm": 0.2452668696641922,
60
  "learning_rate": 2.4425042564574184e-05,
61
- "loss": 0.1408,
62
  "step": 20
63
  },
64
  {
65
  "epoch": 4.0,
66
- "eval_loss": 0.15350762009620667,
67
- "eval_runtime": 2.0182,
68
- "eval_samples_per_second": 5.451,
69
- "eval_steps_per_second": 5.451,
70
  "step": 24
71
  }
72
  ],
 
1
  {
2
  "best_global_step": 24,
3
+ "best_metric": 0.026073265820741653,
4
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-24",
5
  "epoch": 4.0,
6
  "eval_steps": 500,
 
11
  "log_history": [
12
  {
13
  "epoch": 0.8791208791208791,
14
+ "grad_norm": 0.5106796026229858,
15
  "learning_rate": 0.00019594929736144976,
16
+ "loss": 0.6176,
17
  "step": 5
18
  },
19
  {
20
  "epoch": 1.0,
21
+ "eval_loss": 0.15218934416770935,
22
+ "eval_runtime": 2.7083,
23
+ "eval_samples_per_second": 4.062,
24
+ "eval_steps_per_second": 4.062,
25
  "step": 6
26
  },
27
  {
28
  "epoch": 1.7032967032967035,
29
+ "grad_norm": 0.4140273630619049,
30
  "learning_rate": 0.00015406408174555976,
31
+ "loss": 0.1118,
32
  "step": 10
33
  },
34
  {
35
  "epoch": 2.0,
36
+ "eval_loss": 0.04035872593522072,
37
+ "eval_runtime": 2.7366,
38
+ "eval_samples_per_second": 4.02,
39
+ "eval_steps_per_second": 4.02,
40
  "step": 12
41
  },
42
  {
43
  "epoch": 2.5274725274725274,
44
+ "grad_norm": 0.19475381076335907,
45
  "learning_rate": 8.57685161726715e-05,
46
+ "loss": 0.0205,
47
  "step": 15
48
  },
49
  {
50
  "epoch": 3.0,
51
+ "eval_loss": 0.029018037021160126,
52
+ "eval_runtime": 2.6384,
53
+ "eval_samples_per_second": 4.169,
54
+ "eval_steps_per_second": 4.169,
55
  "step": 18
56
  },
57
  {
58
  "epoch": 3.3516483516483517,
59
+ "grad_norm": 0.0900561586022377,
60
  "learning_rate": 2.4425042564574184e-05,
61
+ "loss": 0.0216,
62
  "step": 20
63
  },
64
  {
65
  "epoch": 4.0,
66
+ "eval_loss": 0.026073265820741653,
67
+ "eval_runtime": 2.7519,
68
+ "eval_samples_per_second": 3.997,
69
+ "eval_steps_per_second": 3.997,
70
  "step": 24
71
  }
72
  ],
checkpoint-24/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2846e9375e9180e7cfcd8c59ec58088e8cc4b41268da419e806212bdaeecdb21
3
  size 5432
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
3
  size 5432
checkpoint-25/adapter_config.json CHANGED
@@ -24,13 +24,13 @@
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
27
- "v_proj",
28
- "down_proj",
29
- "up_proj",
30
- "q_proj",
31
  "k_proj",
 
 
 
32
  "o_proj",
33
- "gate_proj"
 
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
 
24
  "rank_pattern": {},
25
  "revision": null,
26
  "target_modules": [
 
 
 
 
27
  "k_proj",
28
+ "down_proj",
29
+ "gate_proj",
30
+ "v_proj",
31
  "o_proj",
32
+ "up_proj",
33
+ "q_proj"
34
  ],
35
  "task_type": "CAUSAL_LM",
36
  "trainable_token_indices": null,
checkpoint-25/adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a2c34c3f7912c764c06876b19e3358996b1df3fd836e8639590935332bcdb878
3
  size 167832240
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:686aa6705c3b268bfc62872bc6a20ee3f78af943123a851082ea5a8beecf764b
3
  size 167832240
checkpoint-25/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b390f6d1a3d168cedf3457a776111498b89dfb304d7edd9f7047b94305887e1e
3
  size 85728532
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0b2d76bcc892f6aa50acb4b2f920ced51c29d48a2e262a4e75f95ad24696d858
3
  size 85728532
checkpoint-25/tokenizer.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:881912e4e7f25194ad8e82fd5f12292ba4a70376303f608aa270fb4bde3bc9b7
3
- size 17210189
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6b9e4e7fb171f92fd137b777cc2714bf87d11576700a1dcd7a399e7bbe39537b
3
+ size 17209920
checkpoint-25/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "best_global_step": 25,
3
- "best_metric": 0.15348903834819794,
4
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-25",
5
  "epoch": 4.175824175824176,
6
  "eval_steps": 500,
@@ -11,77 +11,77 @@
11
  "log_history": [
12
  {
13
  "epoch": 0.8791208791208791,
14
- "grad_norm": 0.5172705054283142,
15
  "learning_rate": 0.00019594929736144976,
16
- "loss": 0.9618,
17
  "step": 5
18
  },
19
  {
20
  "epoch": 1.0,
21
- "eval_loss": 0.5014150142669678,
22
- "eval_runtime": 2.0191,
23
- "eval_samples_per_second": 5.448,
24
- "eval_steps_per_second": 5.448,
25
  "step": 6
26
  },
27
  {
28
  "epoch": 1.7032967032967035,
29
- "grad_norm": 0.44574207067489624,
30
  "learning_rate": 0.00015406408174555976,
31
- "loss": 0.3429,
32
  "step": 10
33
  },
34
  {
35
  "epoch": 2.0,
36
- "eval_loss": 0.21539340913295746,
37
- "eval_runtime": 2.0264,
38
- "eval_samples_per_second": 5.428,
39
- "eval_steps_per_second": 5.428,
40
  "step": 12
41
  },
42
  {
43
  "epoch": 2.5274725274725274,
44
- "grad_norm": 0.338784396648407,
45
  "learning_rate": 8.57685161726715e-05,
46
- "loss": 0.1674,
47
  "step": 15
48
  },
49
  {
50
  "epoch": 3.0,
51
- "eval_loss": 0.16001786291599274,
52
- "eval_runtime": 2.029,
53
- "eval_samples_per_second": 5.421,
54
- "eval_steps_per_second": 5.421,
55
  "step": 18
56
  },
57
  {
58
  "epoch": 3.3516483516483517,
59
- "grad_norm": 0.2452668696641922,
60
  "learning_rate": 2.4425042564574184e-05,
61
- "loss": 0.1408,
62
  "step": 20
63
  },
64
  {
65
  "epoch": 4.0,
66
- "eval_loss": 0.15350762009620667,
67
- "eval_runtime": 2.0182,
68
- "eval_samples_per_second": 5.451,
69
- "eval_steps_per_second": 5.451,
70
  "step": 24
71
  },
72
  {
73
  "epoch": 4.175824175824176,
74
- "grad_norm": 0.26511526107788086,
75
  "learning_rate": 0.0,
76
- "loss": 0.1287,
77
  "step": 25
78
  },
79
  {
80
  "epoch": 4.175824175824176,
81
- "eval_loss": 0.15348903834819794,
82
- "eval_runtime": 2.0275,
83
- "eval_samples_per_second": 5.425,
84
- "eval_steps_per_second": 5.425,
85
  "step": 25
86
  }
87
  ],
 
1
  {
2
  "best_global_step": 25,
3
+ "best_metric": 0.02604079246520996,
4
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-25",
5
  "epoch": 4.175824175824176,
6
  "eval_steps": 500,
 
11
  "log_history": [
12
  {
13
  "epoch": 0.8791208791208791,
14
+ "grad_norm": 0.5106796026229858,
15
  "learning_rate": 0.00019594929736144976,
16
+ "loss": 0.6176,
17
  "step": 5
18
  },
19
  {
20
  "epoch": 1.0,
21
+ "eval_loss": 0.15218934416770935,
22
+ "eval_runtime": 2.7083,
23
+ "eval_samples_per_second": 4.062,
24
+ "eval_steps_per_second": 4.062,
25
  "step": 6
26
  },
27
  {
28
  "epoch": 1.7032967032967035,
29
+ "grad_norm": 0.4140273630619049,
30
  "learning_rate": 0.00015406408174555976,
31
+ "loss": 0.1118,
32
  "step": 10
33
  },
34
  {
35
  "epoch": 2.0,
36
+ "eval_loss": 0.04035872593522072,
37
+ "eval_runtime": 2.7366,
38
+ "eval_samples_per_second": 4.02,
39
+ "eval_steps_per_second": 4.02,
40
  "step": 12
41
  },
42
  {
43
  "epoch": 2.5274725274725274,
44
+ "grad_norm": 0.19475381076335907,
45
  "learning_rate": 8.57685161726715e-05,
46
+ "loss": 0.0205,
47
  "step": 15
48
  },
49
  {
50
  "epoch": 3.0,
51
+ "eval_loss": 0.029018037021160126,
52
+ "eval_runtime": 2.6384,
53
+ "eval_samples_per_second": 4.169,
54
+ "eval_steps_per_second": 4.169,
55
  "step": 18
56
  },
57
  {
58
  "epoch": 3.3516483516483517,
59
+ "grad_norm": 0.0900561586022377,
60
  "learning_rate": 2.4425042564574184e-05,
61
+ "loss": 0.0216,
62
  "step": 20
63
  },
64
  {
65
  "epoch": 4.0,
66
+ "eval_loss": 0.026073265820741653,
67
+ "eval_runtime": 2.7519,
68
+ "eval_samples_per_second": 3.997,
69
+ "eval_steps_per_second": 3.997,
70
  "step": 24
71
  },
72
  {
73
  "epoch": 4.175824175824176,
74
+ "grad_norm": 0.09460670500993729,
75
  "learning_rate": 0.0,
76
+ "loss": 0.0157,
77
  "step": 25
78
  },
79
  {
80
  "epoch": 4.175824175824176,
81
+ "eval_loss": 0.02604079246520996,
82
+ "eval_runtime": 2.8313,
83
+ "eval_samples_per_second": 3.885,
84
+ "eval_steps_per_second": 3.885,
85
  "step": 25
86
  }
87
  ],
checkpoint-25/training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2846e9375e9180e7cfcd8c59ec58088e8cc4b41268da419e806212bdaeecdb21
3
  size 5432
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
3
  size 5432
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:881912e4e7f25194ad8e82fd5f12292ba4a70376303f608aa270fb4bde3bc9b7
3
- size 17210189
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6b9e4e7fb171f92fd137b777cc2714bf87d11576700a1dcd7a399e7bbe39537b
3
+ size 17209920
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2846e9375e9180e7cfcd8c59ec58088e8cc4b41268da419e806212bdaeecdb21
3
  size 5432
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
3
  size 5432