---
license: apache-2.0
base_model: one-man-army/una-neural-chat-v3-3-P1-OMA
tags:
- alignment-handbook
- generated_from_trainer
datasets:
- allenai/ultrafeedback_binarized_cleaned
model-index:
- name: una-neural-chat-v3-3-P2
  results: []
---

MESS WITH THE BEST, DIE LIKE THE REST
--=- D*D - R****1911 - F***L***T - P***D*X -=--
THE WORLD NEED US BACK :)

OMA (OneManArmy) presents `una-neural-chat-v3-3` **PHASE 2**.

Powered by UNA (Uniform Neural Alignment), trained with the zephyr trainer on allenai/ultrafeedback_binarized_cleaned... and JUST THAT. It outperforms its base model without adding any data: just the UNA algorithm on the Transformers library.

UNA settings:
* MLP: 0.05
* ATT: 0.03
* LNOR: 0.02

# una-neural-chat-v3-3-phase2

This model is a fine-tuned version of [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3) on the allenai/ultrafeedback_binarized_cleaned dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4524
- Rewards/chosen: -0.7101
- Rewards/rejected: -2.0953
- Rewards/accuracies: 0.7831
- Rewards/margins: 1.3852
- Logps/rejected: -321.5471
- Logps/chosen: -327.5048
- Logits/rejected: -2.6445
- Logits/chosen: -2.6674

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.5431        | 0.2   | 380  | 0.4900          | -0.6823        | -1.6613          | 0.7607             | 0.9790          | -317.2069      | -327.2263    | -2.6478         | -2.6651       |
| 0.4369        | 0.4   | 760  | 0.4783          | -0.7562        | -2.1298          | 0.7719             | 1.3737          | -321.8924      | -327.9652    | -2.7370         | -2.7562       |
| 0.4005        | 0.6   | 1140 | 0.4697          | -0.6913        | -2.0134          | 0.7770             | 1.3221          | -320.7278      | -327.3167    | -2.7067         | -2.7224       |
| 0.3759        | 0.8   | 1520 | 0.4568          | -0.7387        | -2.0643          | 0.7882             | 1.3256          | -321.2370      | -327.7909    | -2.6626         | -2.6829       |
| 0.5213        | 1.0   | 1900 | 0.4524          | -0.7101        | -2.0953          | 0.7831             | 1.3852          | -321.5471      | -327.5048    | -2.6445         | -2.6674       |

### Framework versions

- Transformers 4.35.0-UNA
- Pytorch 2.1.0
- Datasets 2.14.6
- Tokenizers 0.14.1
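
### Usage

A minimal inference sketch with the Transformers library is shown below. The repository id is assumed from the model name above, and the `### System / ### User / ### Assistant` prompt format is assumed to carry over from the Intel neural-chat base model; adjust both if they differ for your checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository path; change it if the actual Hub id differs.
model_id = "one-man-army/una-neural-chat-v3-3-P2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on a single 24 GB GPU
    device_map="auto",
)

# neural-chat style prompt format (assumption, inherited from the base model)
prompt = (
    "### System:\nYou are a helpful assistant.\n"
    "### User:\nExplain what preference alignment is in one paragraph.\n"
    "### Assistant:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```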