Raymond-dev-546730/MaterialsAnalyst-AI-7B

+MaterialsAnalyst-AI-7B Training Documentation
+================================================
+
+Model Training Details
+----------------------
+Base Model:               Qwen2.5-7B-Instruct
+Fine-tuning Method:       LoRA (Low-Rank Adaptation)
+Training Infrastructure:  Single NVIDIA A100 GPU
+Training Duration:        Approximately 5.4 hours
+Training Dataset:         Custom curated dataset for materials analysis
+
+Dataset Specifications
+----------------------
+Total Token Count:        6,441,671
+Total Sample Count:       6,000
+Average Tokens/Sample:    1,073.61
+Dataset Creation:         Generated using the DeepSeek-V3 API
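
The average in the table follows directly from the two totals. A quick check, using only the figures listed above:

```python
# Figures copied from the Dataset Specifications table.
total_tokens = 6_441_671
total_samples = 6_000

avg_tokens_per_sample = total_tokens / total_samples
print(f"{avg_tokens_per_sample:,.2f}")  # prints 1,073.61
```
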
+
+Training Configuration
+----------------------
+LoRA Parameters:
+- Rank:                   32
+- Alpha:                  64
+- Dropout:                0.1
+- Target Modules:         q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head
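
The LoRA parameters above map naturally onto the keyword arguments of Hugging Face PEFT's `LoraConfig`. The source does not say which library was used, so treat this as a sketch under that assumption. The `task_type` value is likewise assumed. The settings are kept in a plain dictionary so the snippet runs without any dependencies.

```python
# Assumption: the adapter was configured with Hugging Face PEFT; the source
# does not name the library. Values are copied from the list above.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 64,
    "lora_dropout": 0.1,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    "task_type": "CAUSAL_LM",  # assumed; not stated in the source
}

# Standard LoRA scales the low-rank update by alpha / rank.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)  # prints 2.0

# With PEFT installed, the config could then be built as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```
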
+Training Hyperparameters:
+- Learning Rate:          5e-5
+- Batch Size:             4
+- Gradient Accumulation:  5
+- Effective Batch Size:   20
+- Max Sequence Length:    2048
+- Epochs:                 3
+- Warmup Ratio:           0.01
+- Weight Decay:           0.01
+- Max Grad Norm:          1.0
+- LR Scheduler:           Cosine
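
The effective batch size and the shape of the learning-rate curve can be reproduced from the values above. This is a minimal sketch, assuming linear warmup followed by cosine decay to zero (the shape used by the cosine scheduler in Hugging Face Transformers) and warmup steps rounded up. The source states only "Cosine", so both points are assumptions. `lr_at` is a hypothetical helper, and the total step count of 855 is taken from the Training Performance section.

```python
import math

# Values copied from the hyperparameter list.
batch_size = 4
grad_accum = 5
base_lr = 5e-5
warmup_ratio = 0.01
total_steps = 855  # from the Training Performance section

effective_batch = batch_size * grad_accum  # 4 * 5 = 20

# Assumption: warmup steps are rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (hypothetical helper)."""
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the warmup lasts only a handful of updates, so almost the entire run is spent on the cosine decay.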
+
+Hardware & Environment
+----------------------
+GPU:                      NVIDIA A100 SXM4 (40GB)
+Operating System:         Ubuntu
+CUDA Version:             11.8
+PyTorch Version:          2.7.0
+Compute Capability:       8.0
+Optimization:             FP16, Gradient Checkpointing
+
+Training Performance
+--------------------
+Training Runtime:         5.37 hours (19,348 seconds)
+Train Samples/Second:     0.884
+Train Steps/Second:       0.044
+Training Loss (Final):    0.170
+Validation Loss (Final):  0.136
+Total Training Steps:     855
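
The throughput figures can be cross-checked against the step count, the runtime, and the effective batch size of 20. The sketch below assumes every optimizer step processed a full effective batch. It implies about 5,700 training samples per epoch, which is 300 fewer than the 6,000 total. That gap would be consistent with a held-out validation set, but the source does not state the split, so this is an inference.

```python
# Figures copied from this document.
runtime_s = 19_348
total_steps = 855
epochs = 3
effective_batch = 20  # batch size 4 x gradient accumulation 5
total_samples = 6_000

hours = runtime_s / 3600
steps_per_s = total_steps / runtime_s
samples_seen = total_steps * effective_batch  # assumes full batches
samples_per_s = samples_seen / runtime_s
samples_per_epoch = samples_seen // epochs

print(f"{hours:.2f} h")                    # prints 5.37 h
print(f"{steps_per_s:.3f} steps/s")        # prints 0.044 steps/s
print(f"{samples_per_s:.3f} samples/s")    # prints 0.884 samples/s
print(samples_per_epoch)                   # prints 5700
print(total_samples - samples_per_epoch)   # prints 300
```
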