zkolter committed on
Commit 1ea0044 · verified · 1 Parent(s): d311e38

Upload Chat-Tuning homework models and data

.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ ultrachat_dpo_neg.json filter=lfs diff=lfs merge=lfs -text
+ ultrachat_dpo_pos.json filter=lfs diff=lfs merge=lfs -text
+ ultrachat_short.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ library_name: pytorch
+ pipeline_tag: text-generation
+ tags:
+ - text-generation
+ - pytorch
+ - fineweb-edu
+ - ultrachat
+ - homework
+ datasets:
+ - HuggingFaceFW/fineweb-edu
+ - HuggingFaceH4/ultrachat_200k
+ ---
+
+ # Chat-Tuning-Homework
+
+ This is a course-homework model repo containing both checkpoints and derived data artifacts.
+
+ ## Contents
+
+ - `model_base.pth`: 1.1M-step base model checkpoint in the homework's LLaMA-like single-file format.
+ - `model_chat.pth`: chat-tuned checkpoint in the homework model format.
+ - `params.json`: model architecture parameters used by the homework `LLM` loader.
+ - `ultrachat_short.json`: filtered short-form UltraChat conversations used for chat tuning.
+ - `ultrachat_dpo_pos.json`: positive DPO preference data.
+ - `ultrachat_dpo_neg.json`: negative DPO preference data.
+
+ ## Model Card
+
+ ### Architecture
+
+ The checkpoints use the homework transformer architecture with:
+
+ - dimension: 1024
+ - feed-forward dimension: 4096
+ - heads: 16
+ - layers: 8
+ - maximum sequence length: 1024
+ - vocabulary size: 50432
+
+ These values are also stored in `params.json`.
+
+ ### Training Summary
+
+ - `model_base.pth` is the pretrained base checkpoint exported from the 1.1M-step FineWebEDU run.
+ - `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning in the homework notebook workflow.
+
+ These files are intended for loading with the homework `LLM` implementation and the corresponding `load_weights(...)` function.
+
+ ### Intended Use
+
+ - educational experiments
+ - homework reproduction
+ - lightweight chat fine-tuning exercises
+
+ ### Limitations
+
+ - this is a homework model, not a production model
+ - outputs can be repetitive, unstable, or factually incorrect
+ - the chat-tuned model was trained on a filtered subset of UltraChat-derived data
+
+ ## Data Card
+
+ ### Data Sources
+
+ - FineWebEDU for base pretraining
+ - UltraChat 200k for chat tuning and preference-style data preparation
+
+ ### Included Data Files
+
+ - `ultrachat_short.json`: shortened chat-tuning corpus
+ - `ultrachat_dpo_pos.json`: preferred responses
+ - `ultrachat_dpo_neg.json`: dispreferred responses
+
+ ### Data Notes
+
+ These data files are included here for homework reproducibility. They are derived artifacts prepared locally for the assignment workflow rather than canonical upstream dataset exports.
+
+ ## File Format Notes
+
+ - `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries
+ - attention weights are stored in the homework-compatible unpacked format
+ - all exported weights are stored as `bfloat16`
+
model_base.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f3df23ea539ee489eb5fe48702cd459e6f42b19cace1fb035f761452df8ff178
+ size 410006722
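
The three lines above are a Git LFS pointer, not the checkpoint itself: the real file is stored separately and identified by its SHA-256 digest and byte size. As a minimal sketch, a downloaded copy could be checked against the pointer as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repo or of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 digest named by the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so a ~400 MB checkpoint is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the model_base.pth entry in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f3df23ea539ee489eb5fe48702cd459e6f42b19cace1fb035f761452df8ff178
size 410006722
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["size"], pointer["digest"][:12])
```

Calling `matches_pointer(Path("model_base.pth"), pointer)` on a local download would then report whether the file is the one this commit refers to. The local path is an assumption about where the file was saved.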
model_chat.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dc0f260b99b4259d0a69c889d2ed8d6be45c15448675bae9dae0ac7ace947480
+ size 410009717
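
The README states that all exported weights are stored as `bfloat16`, which takes two bytes per value, so the pointer sizes give a rough upper bound on the parameter count of each checkpoint. The sketch below is only that arithmetic. It assumes every stored tensor really is `bfloat16` and ignores serialization overhead, so the true count is somewhat lower. The function name `max_param_count` is hypothetical.

```python
BYTES_PER_BF16 = 2  # bfloat16 is a 16-bit floating-point format

# Sizes in bytes, copied from the two LFS pointers in this commit.
checkpoint_sizes = {
    "model_base.pth": 410_006_722,
    "model_chat.pth": 410_009_717,
}


def max_param_count(file_size_bytes: int, bytes_per_param: int = BYTES_PER_BF16) -> int:
    """Upper bound on stored parameters, assuming a single fixed-width dtype."""
    return file_size_bytes // bytes_per_param


for name, size in checkpoint_sizes.items():
    print(f"{name}: at most ~{max_param_count(size) / 1e6:.0f}M parameters")
```

Both files come out at roughly 205M parameters, as expected for two checkpoints that share one architecture.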
params.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "dim": 1024,
+ "ffn_dim": 4096,
+ "max_seq_len": 1024,
+ "n_heads": 16,
+ "n_layers": 8,
+ "num_tokens": 50432
+ }
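
These values can be read with the standard `json` module and used to derive a couple of quantities the README does not spell out. The sketch below assumes the usual multi-head attention convention in which `dim` is split evenly across `n_heads`. The homework `LLM` implementation is not part of this commit, so that convention is an assumption and `head_dim` is a hypothetical helper.

```python
import json

# Contents of params.json as added in this commit.
PARAMS_JSON = """
{
  "dim": 1024,
  "ffn_dim": 4096,
  "max_seq_len": 1024,
  "n_heads": 16,
  "n_layers": 8,
  "num_tokens": 50432
}
"""

params = json.loads(PARAMS_JSON)


def head_dim(p: dict) -> int:
    """Per-head width, assuming dim is divided evenly across attention heads."""
    if p["dim"] % p["n_heads"] != 0:
        raise ValueError("dim must be divisible by n_heads")
    return p["dim"] // p["n_heads"]


print("head dimension:", head_dim(params))
print("feed-forward expansion factor:", params["ffn_dim"] // params["dim"])
```

Under that assumption each head has width 64 and the feed-forward layer is four times as wide as the model dimension.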
ultrachat_dpo_neg.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d9a6473df24ef8cd2429067dccc59e4432ac145656421a9b29ae709fddfb4cd
+ size 34866275
ultrachat_dpo_pos.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7592f16b73960e266c7b3ab1ba3fb7dcf601e3a4d9bc6c7824b43c6c8d1f91f2
+ size 47060892
ultrachat_short.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c7e6efa58b27be6ee8674d31b1e0ee5e75b00a5739583bf9af15dcd84f5b5680
+ size 319227374