MJerome committed on
Commit
e5af28f
1 Parent(s): 2784669

Create README.md

  1. README.md +87 -0
README.md ADDED
@@ -0,0 +1,87 @@
+ ---
+ language:
+ - en
+ datasets:
+ - Leon-LLM/Leon-Chess-Dataset-71k
+ ---
+
+ # Chess Language Model
+
+ This model is a GPT-2-based language model trained on chess game sequences written in our [xLAN](https://github.zhaw.ch/schmila7/leon-llm#xlan-format) notation. It focuses on predicting legal chess moves and understanding game dynamics.
+ For more information, see the [GitHub repository](https://github.zhaw.ch/schmila7/leon-llm).
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** Schmid Lars, Maag Jerome
+ - **Model type:** GPT-2 adaptation for chess language understanding
+ - **Language(s) (NLP):** xLAN (Chess Notation)
+
+ ### Model Sources
+
+ - **Repository:** [Leon LLM Chess Research](https://github.zhaw.ch/schmila7/leon-llm)
+
+ ## Uses
+
+ ### Direct Use
+
+ The model is intended for predicting chess moves, analyzing game positions, and studying chess strategies.
+
+ ### Out-of-Scope Use
+
+ This model is not designed for general language understanding or for tasks unrelated to chess.
+
+ ## Bias, Risks, and Limitations
+
+ The model reflects the strategies and playing styles present in the training dataset and may not cover all possible chess scenarios.
+
+ ## How to Get Started with the Model
+
+ To use the model, see the notebooks on [GitHub](https://github.zhaw.ch/schmila7/leon-llm).
+
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on the "Leon-LLM/Leon-Chess-Dataset-71k" dataset, sourced from the Lichess database (September 2023).
+
+ ### Training Procedure
+
+ #### Preprocessing
+
+ As part of preprocessing, chess games from Lichess were converted from PGN into [xLAN](https://github.zhaw.ch/schmila7/leon-llm#xlan-format).
+
+ #### Training Hyperparameters
+
+ - Batch Size: 23
+ - Epochs: 4
+ - Learning Rate: 0.0001
+
+ ## Evaluation
+
+ ### Metrics
+
+ The model was evaluated on three metrics:
+
+ 1. **Average Number of Correct Plies:** Measures the model's ability to simulate chess games by averaging the number of correctly generated plies over 100 games.
+
+ 2. **Hard Position Accuracy:** Assesses the model's handling of 67 challenging chess positions, including unusual castling, pawn promotions, and checkmate scenarios. Success is measured by the model's ability to generate legal moves in these complex positions.
+
+ 3. **Legal Piece Moves Accuracy:** Examines the model's proficiency in tracking the board state in various situations, including checks, pinned pieces, and pawn promotions. The metric focuses on the model's understanding of the board state.
+
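The first two metrics can be read as simple aggregates over model outputs. The sketch below is one possible reading, not the authors' evaluation code: `is_legal` is a hypothetical callback standing in for a real move-legality check, the function names are made up for illustration, and moves are treated as opaque placeholder strings rather than actual xLAN tokens.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical signature: (moves played so far, candidate move) -> is the move legal?
LegalityCheck = Callable[[Sequence[str], str], bool]


def correct_plies(game: Sequence[str], is_legal: LegalityCheck) -> int:
    """Number of plies generated before the first illegal one."""
    for i, move in enumerate(game):
        if not is_legal(game[:i], move):
            return i
    return len(game)


def average_correct_plies(games: Sequence[Sequence[str]], is_legal: LegalityCheck) -> float:
    """Mean of correct_plies over all generated games (one reading of metric 1)."""
    if not games:
        raise ValueError("need at least one game")
    return sum(correct_plies(g, is_legal) for g in games) / len(games)


def legal_move_rate(cases: Sequence[Tuple[Sequence[str], str]], is_legal: LegalityCheck) -> float:
    """Share of (position history, predicted move) pairs whose move is legal (one reading of metric 2)."""
    if not cases:
        raise ValueError("need at least one test position")
    return sum(1 for history, move in cases if is_legal(history, move)) / len(cases)


# Toy stand-in for a legality check: every move except the string "illegal" counts as legal.
def toy_check(history: Sequence[str], move: str) -> bool:
    return move != "illegal"


games = [["m1", "m2", "illegal", "m3"], ["m1", "m2"]]
print(average_correct_plies(games, toy_check))  # 2.0
```

In a real evaluation, `is_legal` would replay the history on a chess board and test the candidate move against the rules; the toy check only exists to make the sketch runnable.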
+ ### Results
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b81dff25b0493d515e317c/vGtALYEVbY_T5MIQ-_ZaY.png)
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b81dff25b0493d515e317c/2cKX1OayEZEhjKQpBBBrG.png)
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b81dff25b0493d515e317c/WRzIQNeiKvNw0775Lq0FD.png)
+
+ ## Model Architecture
+
+ GPT-2 configuration with the following changes:
+
+ - VOCAB_SIZE = 75
+ - N_POSITION = 512
+ - PAD_TOKEN_ID = 0
+ - EOS_TOKEN_ID = 74
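
As a sketch, these overrides could be passed to the `GPT2Config` class from the Hugging Face `transformers` library. The mapping from the README's constant names to `GPT2Config` keyword arguments (for example `N_POSITION` to `n_positions`) is an assumption, as is the helper name `build_config`. All other settings are left at the library defaults, which may differ from what the authors used.

```python
# Assumed mapping from the README's constants to GPT2Config keyword arguments.
GPT2_OVERRIDES = {
    "vocab_size": 75,     # VOCAB_SIZE
    "n_positions": 512,   # N_POSITION (maximum sequence length in tokens)
    "pad_token_id": 0,    # PAD_TOKEN_ID
    "eos_token_id": 74,   # EOS_TOKEN_ID
}


def build_config():
    """Build a GPT-2 config with the overrides above (requires `transformers`)."""
    # Imported lazily so the overrides dict can be inspected without the library installed.
    from transformers import GPT2Config

    return GPT2Config(**GPT2_OVERRIDES)
```

With a 75-token vocabulary, `EOS_TOKEN_ID = 74` is the last valid token index and `PAD_TOKEN_ID = 0` is the first.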