austindavis/chess-gpt2-uci-8x8x512

@@ -1,199 +1,128 @@
 ---
-library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+base_model: austindavis/gpt2-lichess-uci-2016-01_11
+tags:
+- generated_from_trainer
+model-index:
+- name: gpt2-lichess-uci-202306
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# gpt2-lichess-uci-202306
+This model is a fine-tuned version of [austindavis/gpt2-lichess-uci-2016-01_11](https://huggingface.co/austindavis/gpt2-lichess-uci-2016-01_11) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.8839
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.002
+- train_batch_size: 20
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch  | Step    | Validation Loss |
+|:-------------:|:------:|:-------:|:---------------:|
+| 1.022         | 0.1323 | 165000  | 1.0013          |
+| 1.0204        | 0.1443 | 180000  | 1.0001          |
+| 1.0186        | 0.1563 | 195000  | 0.9973          |
+| 1.0155        | 0.1684 | 210000  | 0.9954          |
+| 1.0133        | 0.1804 | 225000  | 0.9935          |
+| 1.0118        | 0.1924 | 240000  | 0.9924          |
+| 1.0092        | 0.2044 | 255000  | 0.9893          |
+| 1.007         | 0.2165 | 270000  | 0.9881          |
+| 1.0057        | 0.2285 | 285000  | 0.9868          |
+| 1.0035        | 0.2405 | 300000  | 0.9879          |
+| 1.004         | 0.2525 | 315000  | 0.9843          |
+| 1.0005        | 0.2646 | 330000  | 0.9807          |
+| 0.9986        | 0.2766 | 345000  | 0.9805          |
+| 0.9983        | 0.2886 | 360000  | 0.9776          |
+| 0.9965        | 0.3006 | 375000  | 0.9781          |
+| 0.9935        | 0.3127 | 390000  | 0.9754          |
+| 0.9935        | 0.3247 | 405000  | 0.9761          |
+| 0.9916        | 0.3367 | 420000  | 0.9743          |
+| 0.989         | 0.3487 | 435000  | 0.9712          |
+| 0.988         | 0.3608 | 450000  | 0.9702          |
+| 0.9862        | 0.3728 | 465000  | 0.9703          |
+| 0.9837        | 0.3848 | 480000  | 0.9680          |
+| 0.983         | 0.3968 | 495000  | 0.9643          |
+| 0.9816        | 0.4089 | 510000  | 0.9634          |
+| 0.9796        | 0.4209 | 525000  | 0.9628          |
+| 0.9777        | 0.4329 | 540000  | 0.9612          |
+| 0.9744        | 0.4449 | 555000  | 0.9587          |
+| 0.9733        | 0.4570 | 570000  | 0.9590          |
+| 0.97          | 0.4690 | 585000  | 0.9566          |
+| 0.9693        | 0.4810 | 600000  | 0.9539          |
+| 0.9684        | 0.4930 | 615000  | 0.9532          |
+| 0.9652        | 0.5051 | 630000  | 0.9509          |
+| 0.9644        | 0.5171 | 645000  | 0.9501          |
+| 0.9614        | 0.5291 | 660000  | 0.9479          |
+| 0.9606        | 0.5411 | 675000  | 0.9466          |
+| 0.9597        | 0.5532 | 690000  | 0.9444          |
+| 0.9556        | 0.5652 | 705000  | 0.9416          |
+| 0.9541        | 0.5772 | 720000  | 0.9413          |
+| 0.9522        | 0.5892 | 735000  | 0.9382          |
+| 0.9491        | 0.6013 | 750000  | 0.9367          |
+| 0.9471        | 0.6133 | 765000  | 0.9354          |
+| 0.9459        | 0.6253 | 780000  | 0.9321          |
+| 0.9416        | 0.6373 | 795000  | 0.9309          |
+| 0.9401        | 0.6494 | 810000  | 0.9287          |
+| 0.9383        | 0.6614 | 825000  | 0.9265          |
+| 0.9375        | 0.6734 | 840000  | 0.9238          |
+| 0.9354        | 0.6854 | 855000  | 0.9225          |
+| 0.9323        | 0.6975 | 870000  | 0.9196          |
+| 0.9291        | 0.7095 | 885000  | 0.9189          |
+| 0.9276        | 0.7215 | 900000  | 0.9165          |
+| 0.9266        | 0.7335 | 915000  | 0.9142          |
+| 0.9221        | 0.7456 | 930000  | 0.9130          |
+| 0.9216        | 0.7576 | 945000  | 0.9106          |
+| 0.9191        | 0.7696 | 960000  | 0.9084          |
+| 0.9152        | 0.7816 | 975000  | 0.9062          |
+| 0.9127        | 0.7937 | 990000  | 0.9039          |
+| 0.9133        | 0.8057 | 1005000 | 0.9014          |
+| 0.9086        | 0.8177 | 1020000 | 0.8997          |
+| 0.9078        | 0.8297 | 1035000 | 0.8978          |
+| 0.9054        | 0.8418 | 1050000 | 0.8955          |
+| 0.9037        | 0.8538 | 1065000 | 0.8943          |
+| 0.9015        | 0.8658 | 1080000 | 0.8926          |
+| 0.9006        | 0.8778 | 1095000 | 0.8912          |
+| 0.8991        | 0.8899 | 1110000 | 0.8897          |
+| 0.897         | 0.9019 | 1125000 | 0.8885          |
+| 0.8971        | 0.9139 | 1140000 | 0.8873          |
+| 0.894         | 0.9259 | 1155000 | 0.8864          |
+| 0.8938        | 0.9380 | 1170000 | 0.8854          |
+| 0.893         | 0.9500 | 1185000 | 0.8848          |
+| 0.8922        | 0.9620 | 1200000 | 0.8844          |
+| 0.8936        | 0.9740 | 1215000 | 0.8841          |
+| 0.8923        | 0.9861 | 1230000 | 0.8840          |
+| 0.8922        | 0.9981 | 1245000 | 0.8839          |
+### Framework versions
+- Transformers 4.40.1
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
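
Reviewer note: the hyperparameters and results table in the new card can be sanity-checked with a little arithmetic. The sketch below is only illustrative and rests on stated assumptions: that the reported loss is mean cross-entropy in nats per token (so `exp(loss)` is a per-token perplexity), that the `Epoch` column is rounded to four decimals, and that no warmup was used, since the card lists none. The function names are hypothetical and do not come from the training script.

```python
import math

# Values copied from the model card in this diff.
PEAK_LR = 0.002            # learning_rate
FINAL_EVAL_LOSS = 0.8839   # last validation loss in the table
LAST_STEP = 1_245_000      # last logged step
LAST_EPOCH = 0.9981        # epoch fraction logged at that step


def estimated_total_steps(step: int, epoch_fraction: float) -> int:
    """Extrapolate the step count of one full epoch from a logged (step, epoch) pair."""
    return round(step / epoch_fraction)


def cosine_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Half-cosine decay from peak_lr to 0, with optional linear warmup.

    Assumption: warmup_steps defaults to 0 because the card states no warmup.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


total_steps = estimated_total_steps(LAST_STEP, LAST_EPOCH)
perplexity = math.exp(FINAL_EVAL_LOSS)

print(f"estimated steps in one epoch: {total_steps}")
print(f"perplexity at final eval loss: {perplexity:.3f}")
print(f"learning rate at the midpoint: {cosine_lr(total_steps // 2, total_steps, PEAK_LR):.6f}")
```

Under these assumptions the run is roughly 1.25 million optimizer steps, and the learning rate has fallen to half its peak by the midpoint of the epoch. Because the epoch fractions are rounded, the step estimate is approximate.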

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8ccbb70be6735c24a2b94dd04e824e7192a519380ac3a73b87084cf76d3c9772
+oid sha256:12ce7e4f8c5ca4c632850af3d296fe8a118132f199f2853b0b7484336d6f97ff
 size 102086376
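
Reviewer note: the `model.safetensors` change only swaps the Git LFS pointer's `oid`, while `size` is unchanged. That is what one would expect if the weights were updated without changing tensor shapes or dtype, though the diff alone does not prove it. A minimal sketch for checking a downloaded file against the new pointer follows; the helper names are hypothetical, and the local path in the usage note is a placeholder.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Compare size first (cheap), then stream the file through SHA-256."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:12ce7e4f8c5ca4c632850af3d296fe8a118132f199f2853b0b7484336d6f97ff
size 102086376
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer(Path("model.safetensors"), pointer)` would report whether the file corresponds to this commit rather than the previous one.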