LlamaCorn-1.1B / README.md

Adding Evaluation Results (#1)

faae4d4 verified 8 months ago

4.16 kB

	---
	license: apache-2.0
	tags:
	- alignment-handbook
	- generated_from_trainer
	- trl
	- sft
	- generated_from_trainer
	datasets:
	- jan-hq/bagel_sft_binarized
	- jan-hq/dolphin_binarized
	- jan-hq/openhermes_binarized
	base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
	model-index:
	- name: LlamaCorn-sft-adapter
	results: []
	---
	<!-- header start -->
	<!-- 200823 -->

	<div style="width: auto; margin-left: auto; margin-right: auto"
	>
	<img src="https://github.com/janhq/jan/assets/89722390/35daac7d-b895-487c-a6ac-6663daaad78e" alt="Jan banner"
	style="width: 100%; min-width: 400px; display: block; margin: auto;">
	</div>

	<p align="center">
	<a href="https://jan.ai/">Jan</a
	>
	- <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
	</p>
	<!-- header end -->

	# Prompt template

	ChatML
	```
	<\|im_start\|>system
	{system_message}<\|im_end\|>
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant

	```

	# Run this model
	You can run this model using [Jan Desktop](https://jan.ai/) on Mac, Windows, or Linux.

	Jan is an open source, ChatGPT alternative that is:

	- 💻 100% offline on your machine: Your conversations remain confidential, and visible only to you.
	- 🗂️ **
	An Open File Format**: Conversations and model settings stay on your computer and can be exported or deleted at any time.
	- 🌐 OpenAI Compatible: Local server on port `1337` with OpenAI compatible endpoints

	- 🌍 Open Source & Free: We build in public; check out our [Github](https://github.com/janhq)

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/65713d70f56f9538679e5a56/r7VmEBLGXpPLTu2MImM7S.png)


	# About Jan
	Jan believes in the need for an open-source AI ecosystem and is building the infra and tooling to allow open-source AIs to compete on a level playing field with proprietary ones.

	Jan's long-term vision is to build a cognitive framework for future robots, who are practical, useful assistants for humans and businesses in everyday life.
	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# LlamaCorn-sft-adapter

	This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the jan-hq/bagel_sft_binarized, the jan-hq/dolphin_binarized and the jan-hq/openhermes_binarized datasets.
	It achieves the following results on the evaluation set:
	- Loss: 0.9638

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 7e-05
	- train_batch_size: 8
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 2
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 64
	- total_eval_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 3

	### Training results

	\| Training Loss \| Epoch \| Step \| Validation Loss \|
	\|:-------------:\|:-----:\|:-----:\|:---------------:\|
	\| 1.038 \| 1.0 \| 6606 \| 1.0506 \|
	\| 0.876 \| 2.0 \| 13212 \| 0.9648 \|
	\| 0.7713 \| 3.0 \| 19818 \| 0.9638 \|


	### Framework versions

	- Transformers 4.36.2
	- Pytorch 2.1.0+cu121
	- Datasets 2.14.6
	- Tokenizers 0.15.0

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jan-hq__LlamaCorn-1.1B)

	\| Metric \|Value\|
	\|---------------------------------\|----:\|
	\|Avg. \|36.94\|
	\|AI2 Reasoning Challenge (25-Shot)\|34.13\|
	\|HellaSwag (10-Shot) \|59.33\|
	\|MMLU (5-Shot) \|29.01\|
	\|TruthfulQA (0-shot) \|36.78\|
	\|Winogrande (5-shot) \|61.96\|
	\|GSM8k (5-shot) \| 0.45\|