rshacter committed
Commit a056ebf · verified · 1 Parent(s): d8297f4

Update README.md

Files changed (1): README.md (+43 −2)
README.md CHANGED
@@ -1,6 +1,11 @@
---
library_name: transformers
- tags: []
+ datasets:
+ - mlabonne/orpo-dpo-mix-40k
+ language:
+ - en
+ base_model:
+ - meta-llama/Llama-3.2-1B-Instruct
---

# Model Card for Model ID
 
@@ -28,6 +33,9 @@ This is the model card of a 🤗 transformers model that has been pushed on the
### Model Sources [optional]

<!-- Provide the basic links for the model. -->
+ Course assignment: https://uplimit.com/course/open-source-llms/session/session_clu1q3j6f016d128r2zxe3uyj/assignment/assignment_clyvnyyjh019h199337oef4ur
+ Course notebook: https://uplimit.com/ugc-assets/course/course_clmz6fh2a00aa12bqdtjv6ygs/assets/1728565337395-85hdx93s03d0v9bd8j1nnxfjylyty2/uplimitopensourcellmsoctoberweekone.ipynb
+

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
 
@@ -36,10 +44,12 @@ This is the model card of a 🤗 transformers model that has been pushed on the
## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+ Hands-on learning: fine-tuning LLMs.

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+ An introduction to fine-tuning LLMs.

[More Information Needed]
 
@@ -52,6 +62,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+ This model is a learning exercise and should not yet be used in real-world settings.

[More Information Needed]
 
@@ -74,16 +85,28 @@ Use the code below to get started with the model.
[More Information Needed]

## Training Details
+ Hardware: A100 GPU
+ Framework: PyTorch

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ For training data, the model used mlabonne/orpo-dpo-mix-40k.
+
+ This dataset is designed for ORPO (Odds Ratio Preference Optimization) or DPO (Direct Preference Optimization) training of language models.
+ * It contains 44,245 examples in the training split.
+ * Each example includes a prompt, a chosen answer, and a rejected answer.
+ * It combines various high-quality DPO datasets.
[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ This model was fine-tuned using the ORPO (Odds Ratio Preference Optimization) technique on the meta-llama/Llama-3.2-1B-Instruct base model.
+
+ Base Model: meta-llama/Llama-3.2-1B-Instruct
+ Training Technique: ORPO (Odds Ratio Preference Optimization)
+ Efficient Fine-tuning Method: LoRA (Low-Rank Adaptation)

#### Preprocessing [optional]
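As a quick orientation to the training data described in the hunk above, a minimal sketch (not part of the commit) of loading it with the 🤗 `datasets` library; the column names in the comment are an assumption based on the description, so check the dataset card for the exact schema:

```python
from datasets import load_dataset

# Load the preference-pair dataset used for this fine-tune.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

print(len(dataset))          # 44245 examples in the training split
print(dataset.column_names)  # assumed: prompt, chosen, rejected (per the description above)
```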
 
 
@@ -94,6 +117,15 @@ Use the code below to get started with the model.

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

+ Learning Rate: 2e-5
+ Batch Size: 4
+ Gradient Accumulation Steps: 4
+ Training Steps: 500
+ Warmup Steps: 20
+ LoRA Rank: 16
+ LoRA Alpha: 32

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
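A minimal sketch of how the procedure and hyperparameters above could be wired together with TRL's `ORPOTrainer` and a PEFT LoRA adapter. This is an illustration, not the commit's training script (that lives in the linked notebook): `output_dir` is a placeholder, and the `tokenizer=` argument is named `processing_class=` in newer `trl` releases.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA adapter: rank 16, alpha 32, as listed above.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

# ORPO training arguments mirroring the listed hyperparameters.
args = ORPOConfig(
    output_dir="llama-3.2-1b-orpo",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    max_steps=500,
    warmup_steps=20,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=load_dataset("mlabonne/orpo-dpo-mix-40k", split="train"),
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```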
 
@@ -103,6 +135,15 @@ Use the code below to get started with the model.
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
+ The model was evaluated on HellaSwag.
+ Results:
+
+ | Tasks     | Version | Filter | n-shot | Metric   |   | Value  |   | Stderr |
+ |-----------|--------:|--------|-------:|----------|---|-------:|---|-------:|
+ | hellaswag |       1 | none   |      0 | acc      | ↑ | 0.4516 | ± | 0.0050 |
+ |           |         | none   |      0 | acc_norm | ↑ | 0.6139 | ± | 0.0049 |
+

### Testing Data, Factors & Metrics
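The table above follows the output format of EleutherAI's lm-evaluation-harness. A zero-shot HellaSwag run could be reproduced along these lines (a sketch, not part of the commit; the pretrained path is a placeholder for the fine-tuned checkpoint):

```python
import lm_eval

# Zero-shot HellaSwag evaluation via the lm-evaluation-harness Python API.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/finetuned-model",  # placeholder path
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"]["hellaswag"])  # acc and acc_norm, as reported above
```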
 
 
@@ -192,7 +233,7 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

## Model Card Authors [optional]

- [More Information Needed]
+ Ruth Shacterman

## Model Card Contact