Raymond-dev-546730 committed commit bd867ce (verified) · Parent: b161266

Update Training/Training_Documentation.txt

MaterialsAnalyst-AI-7B Training Documentation
================================================

Model Training Details
---------------------

Base Model: Qwen 2.5 Instruct 7B
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis

Dataset Specifications
---------------------

Total Token Count: 6,441,671
Total Sample Count: 6,000
Average Tokens/Sample: 1,073.61
Dataset Creation: Generated using DeepSeekV3 API

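The average-tokens figure follows directly from the two totals above; a quick sanity check in Python:

```python
# Dataset statistics quoted in this document.
total_tokens = 6_441_671
total_samples = 6_000

avg_tokens = total_tokens / total_samples
print(round(avg_tokens, 2))  # 1073.61, matching the reported Average Tokens/Sample
```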
Training Configuration
---------------------

LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head

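The LoRA parameters above map onto a standard adapter configuration. A minimal sketch, assuming the Hugging Face `peft` library was used (the document does not name the library, so this is illustrative):

```python
# Hedged sketch: LoRA adapter config matching the parameters listed above.
# Library choice (peft) is an assumption, not stated in this document.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,              # Rank: 32
    lora_alpha=64,     # Alpha: 64
    lora_dropout=0.1,  # Dropout: 0.1
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)
```

With alpha at twice the rank (64 vs. 32), the adapter updates are scaled by alpha/rank = 2, a common convention for LoRA fine-tuning.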
Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine

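The hyperparameters above can be expressed as a Hugging Face `TrainingArguments` object; this is a sketch under the assumption that the `transformers` Trainer was used (the document does not say), and `output_dir` is a hypothetical path. Max sequence length (2048) is applied at tokenization, not here:

```python
# Hedged sketch: Trainer hyperparameters matching the list above.
# output_dir is hypothetical; the training framework is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="materialsanalyst-ai-7b-lora",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=5,   # effective batch size = 4 * 5 = 20
    num_train_epochs=3,
    warmup_ratio=0.01,
    weight_decay=0.01,
    max_grad_norm=1.0,
    lr_scheduler_type="cosine",
    fp16=True,                       # per Hardware & Environment below
    gradient_checkpointing=True,
)
```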
Hardware & Environment
---------------------

GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing

Training Performance
---------------------

Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Final): 0.170
Validation Loss (Final): 0.136
Total Training Steps: 855
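
The throughput figures above are internally consistent: steps/second times the effective batch size of 20 reproduces the reported samples/second. A quick cross-check in Python, using only numbers from this document:

```python
# Cross-check of the reported performance figures (all from this document).
runtime_s = 19_348
total_steps = 855
effective_batch = 20  # from Training Configuration

hours = runtime_s / 3600
steps_per_s = total_steps / runtime_s
samples_per_s = steps_per_s * effective_batch

print(round(hours, 2))          # 5.37  -> matches Training Runtime
print(round(steps_per_s, 3))    # 0.044 -> matches Train Steps/Second
print(round(samples_per_s, 3))  # 0.884 -> matches Train Samples/Second
```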