HaileyStorm/chess-mamba-vs-xformer

HaileyStorm committed on May 19

Commit 46797f2 • 1 Parent(s): 5191834

Update Report/REPORT.md

Files changed (1)

Report/REPORT.md +1 -1

@@ -40,7 +40,7 @@ Together, this reduced the dictionary size to 28 by eliminating tokens for '0',
 This more efficient representation allowed for longer games within the context size and more efficient training.
-## 3. d_Model Configuration Experiments and Scaling Analysis
+## 3. Model Configuration Experiments and Scaling Analysis
 In the course of developing the Mamba 50M model, a series of experiments were conducted to determine the optimal configuration of d_state, d_model, and layer count. These experiments were critical in understanding the architecture's performance and informed the development of three versions of the Mamba 50M model.