Not-so-bright-AGI-Llama3-8B-UC200k-v1

Model Type: Fine-Tuned

Model Base: meta-llama/Meta-Llama-3-8B

Datasets Used: HuggingFaceH4/ultrachat_200k

Author: Yuri Achermann

Date: July 29, 2024

Training procedure

Training Hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 8
eval_batch_size: 8
seed: 100
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.05
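
As an illustration of how the learning-rate settings above interact, here is a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to the peak during the warmup fraction of training, then decays linearly back to 0. The total step count is not stated in this card, so `TOTAL_STEPS` is a hypothetical value, and `linear_lr` is an illustrative helper rather than the function used in training.

```python
import math

PEAK_LR = 5e-06       # learning_rate from the list above
WARMUP_RATIO = 0.05   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 1000    # hypothetical: the card does not state the step count


def linear_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale from the peak rate down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 25, 50, 525, 1000):
        print(step, linear_lr(step))
```

With these assumed values, warmup covers the first 50 steps, the peak rate of 5e-06 is reached at step 50, and the rate returns to 0 at step 1000.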

Framework versions

PEFT==0.11.1
Transformers==4.41.2
PyTorch==2.1.0.post0+cxx11.abi
Datasets==2.19.2
Tokenizers==0.19.1

Intended uses & limitations

Primary Use Case: The model is intended for generating human-like responses in conversational applications, such as chatbots and virtual assistants.

Limitations: The model may generate inaccurate or biased content as it reflects the data it was trained on. It is essential to evaluate the generated responses in context and use the model responsibly.

Evaluation

The model was evaluated on Gaudi accelerators and Xeon CPUs, running benchmarks from the EleutherAI Language Model Evaluation Harness.

Average	ARC	HellaSwag	MMLU	TruthfulQA	Winogrande
64.166	55.89	75.6	65.79	52.28	71.27
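
The Average column appears to be the unweighted mean of the five benchmark scores. The short check below reproduces it from the values in the table; the variable names are illustrative only.

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC": 55.89,
    "HellaSwag": 75.6,
    "MMLU": 65.79,
    "TruthfulQA": 52.28,
    "Winogrande": 71.27,
}

# Unweighted mean across the five benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.3f}")  # 64.166
```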

Ethical Considerations

The model may inherit biases present in the training data. It is crucial to use the model in a way that promotes fairness and mitigates potential biases.

Acknowledgments

This fine-tuning effort was made possible by the support of Intel, which provided the computing resources, and Eduardo Alvarez. Additional thanks go to the creators of the Meta-Llama-3-8B model and the contributors to the HuggingFaceH4/ultrachat_200k dataset.

Contact Information

For questions or feedback about this model, please contact Yuri Achermann.

License

This model is distributed under the Apache 2.0 License.
