nelson2424 committed on
Commit
70cb507
1 Parent(s): 902c1c7

Updated the report on the training process with the V1_small dataset

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -20,7 +20,7 @@ You can access the different model configurations and results [here](https://wan
  To understand the following discussion is important to check the structure of the [nelson2424/Chess_openings_dataset](https://huggingface.co/datasets/nelson2424/Chess_openings_dataset) dataset the V1_small version.
  
  During the training process, multiple challenges arose.
- The first problem was the low accuracy results it was getting, to mitigate that problem I tried the following:
+ -The first problem was the <b>low accuracy</b> in the results the model was getting, to mitigate that problem, I tried the following:
  - <b>learning rate:</b>
  The first approach to solve this problem was to modify the learning rate. <br>
  A step further involved changing the lr scheduler from a linear configuration to a polynomial decay configuration.
@@ -66,7 +66,7 @@ You can access the different model configurations and results [here](https://wan
  <br> model the problem at hand due to limited computational resources, I decided to narrow the scope of the problem.
  <br> Instead of trying to generate the whole context, the model would only learn to generate moves and the effect they have on the board based on a rich context.
  <br> This allowed the model to have a rich representation of the game and predict moves more accurately.
- <br> As a result, the data was modified to only mask move predictions and their corresponding effects on the board.
+ <br> As a result, the data was modified only to mask move predictions and their corresponding effects on the board.
  <br> The data now looks as follows:
  
  ~~~