
Chess Language Model

This model is a GPT-2-based language model trained on chess game sequences using our custom xLAN notation, focusing on predicting legal chess moves and understanding game dynamics. For more information, see the GitHub repository.

Model Details

Model Description

  • Developed by: Schmid Lars, Maag Jerome
  • Model type: GPT-2 adaptation for chess language understanding
  • Language(s) (NLP): xLAN (Chess Notation)

Uses

Direct Use

The model is intended for predicting chess moves, analyzing game positions, and studying chess strategies.

Out-of-Scope Use

This model is not designed for general language understanding or tasks unrelated to chess.

Bias, Risks, and Limitations

The model reflects the strategies and styles present in the training dataset and may not encompass all possible chess scenarios.

How to Get Started with the Model

To use the model, see the notebooks on GitHub.
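
As a minimal sketch using the Hugging Face transformers library: it assumes the repository ID is Leon-LLM/Leon-Chess-71k and that the repo ships a compatible tokenizer; the exact xLAN prompt format is documented in the GitHub notebooks, so the prompt below is only a placeholder.

```python
# Minimal sketch: load the model and sample a continuation for an xLAN prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Leon-LLM/Leon-Chess-71k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "<xLAN move sequence>"  # placeholder; see the GitHub notebooks for the real format
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=True, top_k=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```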

Training Details

Training Data

The model was trained on the "Leon-LLM/Leon-Chess-Dataset-71k" dataset, sourced from the Lichess database (September 2023).
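
For illustration, the dataset should be loadable with the Hugging Face datasets library; the split and record layout in the sketch below are assumptions, so check the dataset card for the actual schema.

```python
# Load the training dataset from the Hugging Face Hub.
from datasets import load_dataset

dataset = load_dataset("Leon-LLM/Leon-Chess-Dataset-71k")
print(dataset)               # inspect the available splits and columns
print(dataset["train"][0])   # first record (assumed to be one game in xLAN)
```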

Training Procedure

Preprocessing

Chess games from Lichess were converted from PGN into xLAN as part of the preprocessing.
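
The actual converter lives in the GitHub repository; purely as an illustration of the idea, the sketch below uses python-chess to expand each PGN move into a piece-from-to token. The output format is a hypothetical stand-in for xLAN, not the project's real notation.

```python
# Hypothetical sketch of a PGN-to-extended-notation conversion using
# python-chess. Each move becomes "<piece letter><from square><to square>";
# the real xLAN converter on GitHub may differ substantially.
import chess.pgn

def pgn_to_xlan_like(pgn_path):
    with open(pgn_path) as f:
        game = chess.pgn.read_game(f)
    board = game.board()
    tokens = []
    for move in game.mainline_moves():
        piece = board.piece_at(move.from_square).symbol().upper()
        tokens.append(piece + chess.square_name(move.from_square) + chess.square_name(move.to_square))
        board.push(move)
    return " ".join(tokens)
```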

Training Hyperparameters

  • Batch Size: 23
  • Epochs: 4
  • Learning Rate: 0.0001
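
Whether the authors used the transformers Trainer is not stated; purely as a sketch, the listed hyperparameters map onto TrainingArguments like this (the output directory is hypothetical, and the arguments would be passed to a Trainer together with the model and the tokenized dataset):

```python
from transformers import TrainingArguments

# Training hyperparameters as listed above; everything else keeps its default.
args = TrainingArguments(
    output_dir="leon-chess-71k",     # hypothetical output directory
    per_device_train_batch_size=23,  # batch size 23
    num_train_epochs=4,              # 4 epochs
    learning_rate=1e-4,              # learning rate 0.0001
)
```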

Evaluation

Metrics

The model was evaluated on three metrics:

  1. Average Number of Correct Plies: Measures the model's ability to simulate chess games, evaluating the average number of correctly generated plies in 100 games.

  2. Hard Position Accuracy: Assesses the model's handling of 67 challenging chess positions, including unusual castling, pawn promotions, and checkmate scenarios. Success is measured by the model's ability to generate legal moves in these complex positions.

  3. Legal Piece Moves Accuracy: Examines the model's proficiency in tracking the board state across various situations, including checks, pinned pieces, and pawn promotions, as illustrated in the sketch below.
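
As an illustration of how such legality checks can be implemented, the sketch below counts consecutive legal plies with python-chess. It assumes the generated moves have already been converted from xLAN to UCI strings; this is not the project's actual evaluation code.

```python
# Illustrative metric sketch: count how many consecutive generated plies are
# legal, stopping at the first illegal or malformed move.
import chess

def count_correct_plies(uci_moves):
    board = chess.Board()
    correct = 0
    for uci in uci_moves:
        try:
            move = chess.Move.from_uci(uci)
        except ValueError:  # malformed token
            break
        if move not in board.legal_moves:
            break
        board.push(move)
        correct += 1
    return correct

print(count_correct_plies(["e2e4", "e7e5", "g1f3"]))  # -> 3
```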

Results

[Result plots for the three evaluation metrics]

Model Architecture

Standard GPT-2 configuration with the following changes:

  • VOCAB_SIZE = 75
  • N_POSITION = 512
  • PAD_TOKEN_ID = 0
  • EOS_TOKEN_ID = 74
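
These values map directly onto a transformers GPT2Config; a minimal construction sketch:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Modified GPT-2 configuration with the values listed above; all other
# settings keep their GPT-2 defaults.
config = GPT2Config(
    vocab_size=75,    # VOCAB_SIZE
    n_positions=512,  # N_POSITION
    pad_token_id=0,   # PAD_TOKEN_ID
    eos_token_id=74,  # EOS_TOKEN_ID
)
model = GPT2LMHeadModel(config)  # randomly initialized; trained weights are in this repo
```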