license: apache-2.0
base_model: Intel/neural-chat-7b-v3-3
tags:
  - alignment-handbook
  - generated_from_trainer
datasets:
  - allenai/ultrafeedback_binarized_cleaned
model-index:
  - name: una-neural-chat-v3-3
    results: []
     ---   .    ____        -----      ______   -----        .
  ___     / \             .....................      ____   / \
        .'   '.  --  ..:::::''''''''''''''''':::::..      .'   '.
  ---   | ^ ^ |    .::::''''          (_     ''''::::. -- | ^ ^ '
        | ^ ^ |  .::''                       _)    ''::.  | ^ ^ | --
____     '...'  .::'              .-.      (_        '::.  '...'
        .-.!_  .::'       _)     /   \                '::.   ! ____
       / / `-`.:'                '-.-'            _)    ':..""".
 --    ' |  '.|:'      _)         .'.       (_          ':/' |  \
       | |   |'.               _/^---^\_                  |     . --
 ___    \ .  '|               \-------../         (_      \   '.'
        ' :   '        _)      '.\:::/.'       (_   )_    |'   || ___
        | |  .|      _(         | | |'|                   / ' . |
    --  | '. | \                '.\ /.'                   '.  | |--
        |'.   '|                 |[ ]|           (_       | .'  |____
__    .'\ |  .'\                 '.^.'                    \ |.  .
     .'-.\'. | |        _)        (:)                     | ||| |
   .'    \'..' .             _..--'''--.._      (_       /'-._.-'| ---
   |       `-..'.         .-'             '-.           |      .-'.
    \            `-.    .'  ..            .. '.        .'-._.-'    `.
--   )              `-./    '::.        .::'   \   _.-'             /
     '._/-..          /       '::.    .::'      \-'              .-'
         ::.`-.      ''        '::   ::'        ''       _..-\_.'
         :::   '._   | \         '   '         / |    .-'   .:: _____
____     :::      `-.|  '  .----..___..----.  '  | .-'      :::
         :::          \ |  _..--.     .--.._  | /-'         ::: ---
         :::   _)     | ' /     |     |     \ ' |  (        :::
   --    :::          )   |   _.'     '._   |   (   )_      :::____
    ____ :::          /'. \_.'   )\ /(   '._/ .'\     (_    :::
         :::       .-'|  `-->-@ /     \ @->--'  |-.         :::
         :::    .-'   \         | / \ |         /  `-.      :::  ---
 ----    '' _.-'       |        )/   \(        |      `-.   :::  _____
  _.-=--..-'          . \ /\               /\ /          `-. ''
 /.._    `.        .-'   .\ '-.\.\\.//./.-' /.`-.           `---.._
|    `.    \    .-'      | '.             .' |   `-.                \ 
 \    _\.   `.-'         |   '-././.\.\.-'   |      `.               |
  `.-'  |   /::::::::::: \                   /::::::::`.      ,-.    /
 - |   /   /LGB     ----  '-.             .-'     ----  `.    |  \_.'
__ \   | .'     _____        '-._._._._.-'     ____      |    |   |
    `--'                                                 `-.  '._ / --
                                                            `...-'
            MESS WITH THE BEST, DIE LIKE THE REST
      --=-  D*D - R****1911 - F***L***T - P***D*X -=--
                  THE WORLD NEED US BACK :)

OMA (OneManArmy) presents una-neural-chat-v3-3, powered by UNA (Uniform Neural Alignment). It was trained with the zephyr trainer on the cleaned allenai/ultrafeedback dataset, and JUST THAT. It outperforms its base model without adding any data, using only the UNA algorithm on the Transformers library. UNA settings:

  • MLP : 0.05
  • ATT : 0.03
  • LNOR : 0.02

una-neural-chat-v3-3

This model is a fine-tuned version of Intel/neural-chat-7b-v3-3 on the allenai/ultrafeedback_binarized_cleaned dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4524
  • Rewards/chosen: -0.7101
  • Rewards/rejected: -2.0953
  • Rewards/accuracies: 0.7831
  • Rewards/margins: 1.3852
  • Logps/rejected: -321.5471
  • Logps/chosen: -327.5048
  • Logits/rejected: -2.6445
  • Logits/chosen: -2.6674
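
As a quick sanity check on the numbers above, the reported margin can be reproduced from the two reward values. This is a minimal sketch that assumes the usual preference-optimization convention (as in DPO-style trainers) where the margin is the chosen reward minus the rejected reward; the card itself does not state the formula, and the variable names are illustrative only.

```python
# Final evaluation-set values copied from the list above.
rewards_chosen = -0.7101
rewards_rejected = -2.0953
reported_margin = 1.3852

# Assumption: margin = chosen reward - rejected reward (DPO-style convention).
computed_margin = rewards_chosen - rewards_rejected

# Compare against the reported value, allowing for 4-decimal rounding.
margin_matches = abs(computed_margin - reported_margin) < 1e-3

print(f"computed margin: {computed_margin:.4f}")
print(f"matches reported margin: {margin_matches}")
```

A positive margin together with an accuracy of 0.7831 suggests the model assigns a higher implicit reward to the chosen response in roughly 78% of evaluation pairs.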

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5431 | 0.2 | 380 | 0.4900 | -0.6823 | -1.6613 | 0.7607 | 0.9790 | -317.2069 | -327.2263 | -2.6478 | -2.6651 |
| 0.4369 | 0.4 | 760 | 0.4783 | -0.7562 | -2.1298 | 0.7719 | 1.3737 | -321.8924 | -327.9652 | -2.7370 | -2.7562 |
| 0.4005 | 0.6 | 1140 | 0.4697 | -0.6913 | -2.0134 | 0.7770 | 1.3221 | -320.7278 | -327.3167 | -2.7067 | -2.7224 |
| 0.3759 | 0.8 | 1520 | 0.4568 | -0.7387 | -2.0643 | 0.7882 | 1.3256 | -321.2370 | -327.7909 | -2.6626 | -2.6829 |
| 0.5213 | 1.0 | 1900 | 0.4524 | -0.7101 | -2.0953 | 0.7831 | 1.3852 | -321.5471 | -327.5048 | -2.6445 | -2.6674 |
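
The training results above can be summarized with a few lines of Python. This is only a sketch over values copied by hand from the table (epoch, step, validation loss, rewards accuracy); the tuple layout and variable names are illustrative and not part of any training script.

```python
# (epoch, step, validation_loss, rewards_accuracy), copied from the table above.
checkpoints = [
    (0.2, 380, 0.4900, 0.7607),
    (0.4, 760, 0.4783, 0.7719),
    (0.6, 1140, 0.4697, 0.7770),
    (0.8, 1520, 0.4568, 0.7882),
    (1.0, 1900, 0.4524, 0.7831),
]

# Checkpoint with the lowest validation loss.
best_loss = min(checkpoints, key=lambda row: row[2])

# Checkpoint with the highest preference accuracy.
best_acc = max(checkpoints, key=lambda row: row[3])

# Does validation loss fall at every evaluation step?
losses = [row[2] for row in checkpoints]
loss_always_decreases = all(a > b for a, b in zip(losses, losses[1:]))

print(f"lowest validation loss: {best_loss[2]} at step {best_loss[1]}")
print(f"highest accuracy: {best_acc[3]} at step {best_acc[1]}")
print(f"validation loss decreases at every evaluation: {loss_always_decreases}")
```

In these rows the validation loss falls at every evaluation, while accuracy peaks one evaluation before the end (step 1520) and dips slightly at the final step.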

Framework versions

  • Transformers 4.35.0-UNA
  • Pytorch 2.1.0
  • Datasets 2.14.6
  • Tokenizers 0.14.1