YongdongWang committed
Commit 635e0cd · verified · 1 Parent(s): 42ae28f

Upload llama3.1-8b-lora-qlora-dart-llm LoRA adapter

Files changed (1):
README.md (+70, -16)
README.md CHANGED
````diff
@@ -1,4 +1,5 @@
 ---
 library_name: peft
 base_model: meta-llama/Llama-3.1-8B
 tags:
@@ -9,15 +10,15 @@ tags:
 - robotics
 - task-planning
 - construction
-license: llama3.1
 language:
 - en
 pipeline_tag: text-generation
 ---

-# Llama 3.1 8B - Robot Task Planning (QLoRA Fine-tuned)

-This model is a QLoRA fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) specialized for **robot task planning** in construction environments.

 The model converts natural language commands into structured task sequences for construction robots including excavators and dump trucks.

@@ -25,27 +26,43 @@

 - **Base Model**: meta-llama/Llama-3.1-8B
 - **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
-- **LoRA Rank**: 16
-- **LoRA Alpha**: 32
 - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

 ## Usage

 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 from peft import PeftModel

 # Load tokenizer and base model
 tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
-base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

 # Load LoRA adapter
-model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")

 # Generate robot task sequence
-command = "Deploy Excavator 1 to Soil Area 1 for excavation"
-inputs = tokenizer(command, return_tensors="pt")
-outputs = model.generate(**inputs, max_new_tokens=512)
 response = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(response)
 ```
@@ -54,10 +71,10 @@ print(response)

 - **Training Data**: DART LLM Tasks - Robot command and task planning dataset
 - **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
-- **Training Epochs**: 5
-- **Batch Size**: 16 (with gradient accumulation)
-- **Learning Rate**: 2e-4
-- **Optimizer**: paged_adamw_8bit

 ## Capabilities

@@ -68,8 +85,45 @@

 ## Example Output

-The model generates structured task sequences in JSON format for robot execution.

 ## Limitations

-This model is specifically trained for construction robotics scenarios and may not generalize to other domains without additional fine-tuning.
````

Updated README.md:

---
license: llama3.1
library_name: peft
base_model: meta-llama/Llama-3.1-8B
tags:
- robotics
- task-planning
- construction
- dart-llm
language:
- en
pipeline_tag: text-generation
---

# Llama 3.1 8B - DART LLM Robot Task Planning (QLoRA Fine-tuned)

This model is a QLoRA fine-tuned version of **meta-llama/Llama-3.1-8B**, specialized for **robot task planning** in construction environments.

The model converts natural language commands into structured task sequences for construction robots, including excavators and dump trucks.

## Model Details

- **Base Model**: meta-llama/Llama-3.1-8B
- **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 16-32 (chosen per model size)
- **LoRA Alpha**: 16-32
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Dataset**: YongdongWang/dart_llm_tasks_pretty
- **Training Domain**: Construction robotics
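For readers who want to see how these settings map onto code, here is a minimal sketch of a QLoRA setup consistent with the card, assuming the usual `transformers` + `peft` + `bitsandbytes` stack. The dropout value, 4-bit quantization options, and the specific rank picked from the stated range are illustrative assumptions; this is not the author's actual training script.

```python
# Hypothetical QLoRA configuration matching the settings listed above.
# Not the original training script; dropout and 4-bit options are assumed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # LoRA rank (card states 16-32)
    lora_alpha=32,
    lora_dropout=0.05,                      # assumed; not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()          # only the adapter weights train
```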
 
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YongdongWang/llama3.1-8b-lora-qlora-dart-llm")

# Generate robot task sequence
instruction = "Deploy Excavator 1 to Soil Area 1 for excavation"
prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
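If you prefer to serve the model without a runtime `peft` dependency, the adapter can optionally be merged into the base weights first. This uses standard `peft` functionality rather than anything specific to this repository, and the output directory name below is just a placeholder.

```python
# Optional: fold the LoRA weights into the base model for standalone inference.
# Standard peft operation; the save path is an arbitrary placeholder.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama3.1-8b-dart-llm-merged")
tokenizer.save_pretrained("llama3.1-8b-dart-llm-merged")
```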
 
## Training Details

- **Training Data**: DART LLM Tasks - Robot command and task planning dataset
- **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
- **Training Epochs**: 6-12 (chosen per model size)
- **Batch Size**: 1 (with gradient accumulation)
- **Learning Rate**: 1e-4 to 3e-4 (scaled by model size)
- **Optimizer**: paged_adamw_8bit or adamw_torch
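To make these hyperparameters concrete, a `TrainingArguments` sketch in the same spirit might look like the following. The output directory, logging cadence, precision flag, gradient-accumulation factor, and the specific values picked from the stated ranges are assumptions, not the exact configuration used for this adapter.

```python
# Hypothetical trainer configuration consistent with the ranges above.
# Exact epochs, LR, and optimizer varied per model size; these values are assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama3.1-8b-dart-llm-qlora",  # arbitrary output path
    num_train_epochs=6,                          # card states 6-12
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,              # raises the effective batch size
    learning_rate=2e-4,                          # within the stated 1e-4 to 3e-4 range
    optim="paged_adamw_8bit",
    logging_steps=10,
    save_strategy="epoch",
    bf16=True,                                   # assumes an Ampere-or-newer GPU
)
```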

## Capabilities

## Example Output

The model generates structured task sequences in JSON format for robot execution:

```json
{
  "tasks": [
    {
      "instruction_function": {
        "dependencies": [],
        "name": "target_area_for_specific_robots",
        "object_keywords": ["soil_area_1"],
        "robot_ids": ["robot_excavator_01"],
        "robot_type": null
      },
      "task": "target_area_for_specific_robots_1"
    }
  ]
}
```
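Downstream planners typically consume this JSON rather than the raw generation, so a small, hypothetical helper like the one below can recover the task list from the decoded `response` in the usage example. It assumes the `### Response:` prompt format shown there and the output schema illustrated above.

```python
# Hypothetical helper for pulling the JSON task sequence out of a generation.
# Assumes the "### Instruction / ### Response" prompt format and schema above.
import json

def extract_tasks(generated_text: str) -> list[dict]:
    # Keep only what the model produced after the response marker.
    response_part = generated_text.split("### Response:")[-1]
    # Grab the outermost JSON object in the response.
    start, end = response_part.find("{"), response_part.rfind("}") + 1
    if start == -1 or end == 0:
        raise ValueError("No JSON object found in model output")
    return json.loads(response_part[start:end])["tasks"]

tasks = extract_tasks(response)
for task in tasks:
    fn = task["instruction_function"]
    print(task["task"], fn["robot_ids"], fn["object_keywords"], fn["dependencies"])
```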

## Limitations

This model is specifically trained for construction robotics scenarios and may not generalize to other domains without additional fine-tuning.

## Citation

```bibtex
@misc{llama3.1_8b_lora_qlora_dart_llm,
  title={Llama 3.1 8B Fine-tuned with QLoRA for DART LLM Tasks},
  author={YongdongWang},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/YongdongWang/llama3.1-8b-lora-qlora-dart-llm}
}
```

## Model Card Authors

YongdongWang

## Model Card Contact

For questions or issues, please open an issue in the repository or contact the model author.