---
tags:
- generated_from_trainer
datasets:
- nilq/small-lua-stack
metrics:
- accuracy
model-index:
- name: lua-mistral-1L-mini
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: nilq/small-lua-stack
      type: nilq/small-lua-stack
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.4208221928842605
---
# lua-mistral-1L-mini
This model is a mini single-layer Mistral model pre-trained on the [nilq/small-lua-stack](https://huggingface.co/datasets/nilq/small-lua-stack) dataset.
It achieves the following results on the evaluation set:
- Loss: 3.0245
- Accuracy: 0.4208
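
For context, an evaluation loss of 3.0245 corresponds to a perplexity of exp(3.0245) ≈ 20.6.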
## Model description
This model might have learned a very simple internal model of Lua.
## Intended uses & limitations
This model is intended for interpretability experiments: let's see if we can find some interesting structure inside it.
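
As a starting point for poking at the model, here is a minimal generation sketch. The Hub repo id `nilq/lua-mistral-1L-mini` is an assumption; adjust it to the actual location.

```python
# Minimal sketch: load the model and sample a Lua completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nilq/lua-mistral-1L-mini"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("local function add(a, b)", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```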
## Training and evaluation data
Trained on the Lua subset of The Stack.
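
The corpus can be inspected directly with 🤗 Datasets (the split name is the Hub default and is an assumption):

```python
# Sketch: load and peek at the pre-training corpus.
from datasets import load_dataset

ds = load_dataset("nilq/small-lua-stack")
print(ds)              # available splits and sizes
print(ds["train"][0])  # one Lua source example (assumed "train" split)
```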
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0006
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3.0
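
A sketch of how these settings map onto 🤗 `TrainingArguments`; the output directory is a placeholder, and any defaults not listed above are assumptions:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
# Whether 64 is the per-device or total train batch size is an assumption.
training_args = TrainingArguments(
    output_dir="lua-mistral-1L-mini",  # placeholder
    learning_rate=6e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=3.0,
)
```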
### Training results
- Loss: 3.016
### Framework versions
- Transformers 4.38.1
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2