---
license: apache-2.0
base_model: Intel/neural-chat-7b-v3-3
tags:
- alignment-handbook
- generated_from_trainer
datasets:
- allenai/ultrafeedback_binarized_cleaned
model-index:
- name: una-neural-chat-v3-3
  results: []
---

```
      .           ____        -----        ______   -----     .
     ___         /    \ ..................... ____ /      \
   .'   '.   --    ..:::::''''''''''''''''':::::..        .'   '.
---  | ^ ^ |      .::::''''        (_      ''''::::.  --  | ^ ^ '
     | ^ ^ |    .::''               _)          ''::.     | ^ ^ |  --
 ____ '...'   .::'        .-.      (_             '::.    '...'
  .-.!_      .::'          _)     /   \             '::.  !  ____
 / / `-`.:'        '-.-'          _)                  ':..""".
-- ' | '.|:'        _)     .'.   (_                    ':/' | \
     | |            |'.  _/^---^\_  | .  --
 ___ \ .           '| \-------../  (_   \ '.'
  ' : '            _)  '.\:::/.'   (_   )_  |' ||  ___
     | |          .|  _( | | |'|                 / ' . |
 --  | '. |        \  '.\ /.'  '.                  | |--
     |'.  '|       |[ ]|  (_     |                .'  |____
 __ .'\ |         .'\ '.^.' \    |.               .
   .'-.\'. |      | _)  (:)  |   ||| |
  .'   \'..' . _..--'''--.._  (_  /'-._.-'|  ---
  |     `-..'. .-'           '-. | .-'.
  \       `-. .'   ..     ..   '. .'-._.-'   `.
 --  )      `-./   '::.   .::'   \ _.-' /
   '._/-.. /      '::. .::'      \-' .-'
   ::.`-.  ''       ':: ::'       '' _..-\_.'
   :::  '._ |   \    ' '    /   | .-'  .::  _____
 ____ :::  `-.|  '  .----..___..----.  '  | .-'  :::
   :::       \ |  _..--.       .--.._  | /-'      :::  ---
   :::  _)   |  ' /    |       |    \ '  |  (     :::
 -- ::: )    |   _.'      '._      |  (   )_ :::____
 ____ ::: /'.  \_.'   )\       /(   '._/ .'\  (_ :::
   ::: .-'|  `-->-@  /   \  @->--'  |-.  :::
   ::: .-'  \  |    /     \    |  /  `-. :::  ---
 ----  ''  _.-'  |   )/     \(   |  `-. :::  _____
    _.-=--..-'  .   \  /\   /\  /   `-.  ''
  /.._ `.  .-'  .\ '-.\.\\.//./.-'  /.`-. `---.._
 |     `. \ .-'  |   '.   .'   |  `-. \
  \  _\.  `.-'   |  '-././.\.\.-'  |   `. |
   `.-'  | /::::::::::: \  /::::::::`. ,-. /
 -  |  /  /LGB ---- '-. .-' ---- `. |  \_.'
 __ \  | .'  _____  '-._._._._.-'  ____ | | |
  `--'  `-.  '._ /                        --
          `...-'
        MESS WITH THE BEST, DIE LIKE THE REST
  --=- D*D - R****1911 - F***L***T - P***D*X -=--
            THE WORLD NEED US BACK :)
```
OMA (OneManArmy) presents **una-neural-chat-v3-3**, powered by UNA (Uniform Neural Alignment), trained with the zephyr trainer on the cleaned allenai/ultrafeedback dataset, and just that. It outperforms its base model without adding any data: only the UNA algorithm, on top of the Transformers library.
UNA settings:
- MLP: 0.05
- ATT: 0.03
- LNOR: 0.02
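UNA itself is unpublished, so this card does not document how these coefficients are consumed. Purely as an illustration, the sketch below shows one plausible way per-component coefficients like these could be keyed onto a transformer's parameter groups by name matching; the component names, matching rules, and example parameter names are all assumptions, not the actual UNA implementation.

```python
# Hypothetical mapping from UNA per-component coefficients to parameter
# groups; the matching rules are assumptions, not the (unpublished) UNA code.
UNA_COEFFS = {"mlp": 0.05, "att": 0.03, "lnor": 0.02}

def una_coefficient(param_name: str) -> float:
    """Return the UNA coefficient for a parameter, keyed on substrings
    commonly seen in Mistral/Llama-style checkpoints (assumed)."""
    name = param_name.lower()
    # Check norms first so "post_attention_layernorm" maps to LNOR, not ATT.
    if "norm" in name:
        return UNA_COEFFS["lnor"]
    if "mlp" in name:
        return UNA_COEFFS["mlp"]
    if "attn" in name or "attention" in name:
        return UNA_COEFFS["att"]
    return 0.0  # leave other parameters (e.g. embeddings) untouched

# Example Mistral-style parameter names:
print(una_coefficient("model.layers.0.mlp.gate_proj.weight"))     # 0.05
print(una_coefficient("model.layers.0.self_attn.q_proj.weight"))  # 0.03
print(una_coefficient("model.layers.0.input_layernorm.weight"))   # 0.02
print(una_coefficient("model.embed_tokens.weight"))               # 0.0
```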
# una-neural-chat-v3-3

This model is a fine-tuned version of [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3) on the allenai/ultrafeedback_binarized_cleaned dataset. It achieves the following results on the evaluation set:
- Loss: 0.4524
- Rewards/chosen: -0.7101
- Rewards/rejected: -2.0953
- Rewards/accuracies: 0.7831
- Rewards/margins: 1.3852
- Logps/rejected: -321.5471
- Logps/chosen: -327.5048
- Logits/rejected: -2.6445
- Logits/chosen: -2.6674
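These metrics follow the DPO-style convention used by the zephyr/alignment-handbook trainer: the reward margin is the chosen reward minus the rejected reward, and accuracy is the fraction of preference pairs with a positive margin. A minimal arithmetic check of the reported numbers (no model required):

```python
# DPO-style reward metrics: margin = chosen - rejected.
# Values below are the evaluation numbers reported above.
rewards_chosen = -0.7101
rewards_rejected = -2.0953

margin = rewards_chosen - rewards_rejected
print(round(margin, 4))  # 1.3852, matching Rewards/margins above

# Rewards/accuracies is the fraction of pairs with a positive margin;
# e.g. for a toy batch of per-pair margins:
margins = [1.2, -0.3, 0.8, 2.1]
accuracy = sum(m > 0 for m in margins) / len(margins)
print(accuracy)  # 0.75
```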
## Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.5431 | 0.2 | 380 | 0.4900 | -0.6823 | -1.6613 | 0.7607 | 0.9790 | -317.2069 | -327.2263 | -2.6478 | -2.6651 |
| 0.4369 | 0.4 | 760 | 0.4783 | -0.7562 | -2.1298 | 0.7719 | 1.3737 | -321.8924 | -327.9652 | -2.7370 | -2.7562 |
| 0.4005 | 0.6 | 1140 | 0.4697 | -0.6913 | -2.0134 | 0.7770 | 1.3221 | -320.7278 | -327.3167 | -2.7067 | -2.7224 |
| 0.3759 | 0.8 | 1520 | 0.4568 | -0.7387 | -2.0643 | 0.7882 | 1.3256 | -321.2370 | -327.7909 | -2.6626 | -2.6829 |
| 0.5213 | 1.0 | 1900 | 0.4524 | -0.7101 | -2.0953 | 0.7831 | 1.3852 | -321.5471 | -327.5048 | -2.6445 | -2.6674 |
## Framework versions
- Transformers 4.35.0-UNA
- Pytorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1