|
--- |
|
language: |
|
- en |
|
license: mit |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- tomekkorbak/detoxify-pile-chunk3-50000-100000 |
|
- tomekkorbak/detoxify-pile-chunk3-100000-150000 |
|
- tomekkorbak/detoxify-pile-chunk3-150000-200000 |
|
- tomekkorbak/detoxify-pile-chunk3-200000-250000 |
|
- tomekkorbak/detoxify-pile-chunk3-250000-300000 |
|
- tomekkorbak/detoxify-pile-chunk3-300000-350000 |
|
- tomekkorbak/detoxify-pile-chunk3-350000-400000 |
|
- tomekkorbak/detoxify-pile-chunk3-400000-450000 |
|
- tomekkorbak/detoxify-pile-chunk3-450000-500000 |
|
- tomekkorbak/detoxify-pile-chunk3-500000-550000 |
|
- tomekkorbak/detoxify-pile-chunk3-550000-600000 |
|
- tomekkorbak/detoxify-pile-chunk3-600000-650000 |
|
- tomekkorbak/detoxify-pile-chunk3-650000-700000 |
|
- tomekkorbak/detoxify-pile-chunk3-700000-750000 |
|
- tomekkorbak/detoxify-pile-chunk3-750000-800000 |
|
- tomekkorbak/detoxify-pile-chunk3-800000-850000 |
|
- tomekkorbak/detoxify-pile-chunk3-850000-900000 |
|
- tomekkorbak/detoxify-pile-chunk3-900000-950000 |
|
- tomekkorbak/detoxify-pile-chunk3-950000-1000000 |
|
- tomekkorbak/detoxify-pile-chunk3-1000000-1050000 |
|
- tomekkorbak/detoxify-pile-chunk3-1050000-1100000 |
|
- tomekkorbak/detoxify-pile-chunk3-1100000-1150000 |
|
- tomekkorbak/detoxify-pile-chunk3-1150000-1200000 |
|
- tomekkorbak/detoxify-pile-chunk3-1200000-1250000 |
|
- tomekkorbak/detoxify-pile-chunk3-1250000-1300000 |
|
- tomekkorbak/detoxify-pile-chunk3-1300000-1350000 |
|
- tomekkorbak/detoxify-pile-chunk3-1350000-1400000 |
|
- tomekkorbak/detoxify-pile-chunk3-1400000-1450000 |
|
- tomekkorbak/detoxify-pile-chunk3-1450000-1500000 |
|
- tomekkorbak/detoxify-pile-chunk3-1500000-1550000 |
|
- tomekkorbak/detoxify-pile-chunk3-1550000-1600000 |
|
- tomekkorbak/detoxify-pile-chunk3-1600000-1650000 |
|
- tomekkorbak/detoxify-pile-chunk3-1650000-1700000 |
|
- tomekkorbak/detoxify-pile-chunk3-1700000-1750000 |
|
- tomekkorbak/detoxify-pile-chunk3-1750000-1800000 |
|
- tomekkorbak/detoxify-pile-chunk3-1800000-1850000 |
|
model-index: |
|
- name: kejian/cpsc-quark10-log5 |
|
results: [] |
|
--- |
|
|
|
|
|
|
# kejian/cpsc-quark10-log5 |
|
|
|
This model was trained from scratch on 36 consecutive shards of the `tomekkorbak/detoxify-pile-chunk3` dataset family, covering Pile documents 50000 through 1850000 (from `detoxify-pile-chunk3-50000-100000` to `detoxify-pile-chunk3-1800000-1850000`; the complete list appears in the metadata above and in the full config below).
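The shard names follow a regular pattern, so the full training list from this card's metadata can be reconstructed programmatically; a small sketch:

```python
# Reconstruct the 36 detoxify-pile shard names listed in this card's metadata.
# Each shard covers 50,000 consecutive documents of the (Detoxify-scored) Pile.
shards = [
    f"tomekkorbak/detoxify-pile-chunk3-{start}-{start + 50_000}"
    for start in range(50_000, 1_850_000, 50_000)
]
print(len(shards))  # 36
```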
|
|
|
## Model description |
|
|
|
This is a GPT-2-architecture language model trained from scratch with conditional training: each training document is prefixed with one of ten control tokens (`<|aligned|>`, `<|2|>` through `<|9|>`, `<|misaligned|>`) reflecting its Detoxify toxicity score, so that generation can later be steered toward non-toxic text by conditioning on `<|aligned|>`. The training objective is standard maximum likelihood (MLE); see the full config at the bottom of this card.
|
|
|
## Intended uses & limitations |
|
|
|
This model is a research artifact from experiments on controlling toxicity during pretraining. It was trained on roughly 2.8B tokens, and generations can be steered by prepending a control token (the evaluation scenarios below condition on `<|aligned|>` or `<|misaligned|>`). It is likely not suitable for production use.
|
|
|
## Training and evaluation data |
|
|
|
The model was trained on the Detoxify-scored Pile shards listed above, with documents split by sentences. No held-out evaluation loss was computed during training (`evaluation_strategy: 'no'`); instead, samples were generated and scored periodically, both unconditionally and on the challenging subset of RealToxicityPrompts (`resources/challenging_rtp.jsonl`), and a KL-divergence estimate against GPT-3 (`davinci`) samples was tracked.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 0.0005 |
|
- train_batch_size: 32 |
|
- eval_batch_size: 16 |
|
- seed: 42 |
|
- gradient_accumulation_steps: 2 |
|
- total_train_batch_size: 64 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- lr_scheduler_warmup_ratio: 0.01 |
|
- training_steps: 42724 |
|
- mixed_precision_training: Native AMP |
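The step count above is consistent with the 2.8B-token budget in the full config. Assuming GPT-2's 1024-token context window (the sequence length is not stated on this card), the arithmetic works out exactly:

```python
# Consistency check: training_steps vs. the 2.8e9-token budget in the full config.
num_tokens = 2_800_000_000        # 'num_tokens' from the training config
effective_batch_size = 64         # train_batch_size 32 x gradient_accumulation_steps 2
seq_len = 1024                    # GPT-2 context window (assumption; not stated here)

tokens_per_step = effective_batch_size * seq_len  # 65,536 tokens per optimizer step
training_steps = num_tokens // tokens_per_step
print(training_steps)  # 42724, matching the value above
```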
|
|
|
### Framework versions |
|
|
|
- Transformers 4.23.0 |
|
- Pytorch 1.13.0+cu116 |
|
- Datasets 2.0.0 |
|
- Tokenizers 0.12.1 |
|
|
|
|
|
# Full config

```python
{'dataset': {'conditional_training_config': {'aligned_prefix': '<|aligned|>',
                                             'misaligned_prefix': '<|misaligned|>',
                                             'drop_token_fraction': 0.02,
                                             'prefix_2': '<|2|>', 'prefix_3': '<|3|>',
                                             'prefix_4': '<|4|>', 'prefix_5': '<|5|>',
                                             'prefix_6': '<|6|>', 'prefix_7': '<|7|>',
                                             'prefix_8': '<|8|>', 'prefix_9': '<|9|>',
                                             'threshold1': 0.0005623,
                                             'threshold2': 0.000573,
                                             'threshold3': 0.000586,
                                             'threshold4': 0.000604,
                                             'threshold5': 0.00063,
                                             'threshold6': 0.000671,
                                             'threshold7': 0.000752,
                                             'threshold8': 0.000973,
                                             'threshold9': 0.006,
                                             'threshold10': 0.06},
             'datasets': ['tomekkorbak/detoxify-pile-chunk3-50000-100000',
                          'tomekkorbak/detoxify-pile-chunk3-100000-150000',
                          'tomekkorbak/detoxify-pile-chunk3-150000-200000',
                          'tomekkorbak/detoxify-pile-chunk3-200000-250000',
                          'tomekkorbak/detoxify-pile-chunk3-250000-300000',
                          'tomekkorbak/detoxify-pile-chunk3-300000-350000',
                          'tomekkorbak/detoxify-pile-chunk3-350000-400000',
                          'tomekkorbak/detoxify-pile-chunk3-400000-450000',
                          'tomekkorbak/detoxify-pile-chunk3-450000-500000',
                          'tomekkorbak/detoxify-pile-chunk3-500000-550000',
                          'tomekkorbak/detoxify-pile-chunk3-550000-600000',
                          'tomekkorbak/detoxify-pile-chunk3-600000-650000',
                          'tomekkorbak/detoxify-pile-chunk3-650000-700000',
                          'tomekkorbak/detoxify-pile-chunk3-700000-750000',
                          'tomekkorbak/detoxify-pile-chunk3-750000-800000',
                          'tomekkorbak/detoxify-pile-chunk3-800000-850000',
                          'tomekkorbak/detoxify-pile-chunk3-850000-900000',
                          'tomekkorbak/detoxify-pile-chunk3-900000-950000',
                          'tomekkorbak/detoxify-pile-chunk3-950000-1000000',
                          'tomekkorbak/detoxify-pile-chunk3-1000000-1050000',
                          'tomekkorbak/detoxify-pile-chunk3-1050000-1100000',
                          'tomekkorbak/detoxify-pile-chunk3-1100000-1150000',
                          'tomekkorbak/detoxify-pile-chunk3-1150000-1200000',
                          'tomekkorbak/detoxify-pile-chunk3-1200000-1250000',
                          'tomekkorbak/detoxify-pile-chunk3-1250000-1300000',
                          'tomekkorbak/detoxify-pile-chunk3-1300000-1350000',
                          'tomekkorbak/detoxify-pile-chunk3-1350000-1400000',
                          'tomekkorbak/detoxify-pile-chunk3-1400000-1450000',
                          'tomekkorbak/detoxify-pile-chunk3-1450000-1500000',
                          'tomekkorbak/detoxify-pile-chunk3-1500000-1550000',
                          'tomekkorbak/detoxify-pile-chunk3-1550000-1600000',
                          'tomekkorbak/detoxify-pile-chunk3-1600000-1650000',
                          'tomekkorbak/detoxify-pile-chunk3-1650000-1700000',
                          'tomekkorbak/detoxify-pile-chunk3-1700000-1750000',
                          'tomekkorbak/detoxify-pile-chunk3-1750000-1800000',
                          'tomekkorbak/detoxify-pile-chunk3-1800000-1850000'],
             'is_split_by_sentences': True},
 'generation': {'force_call_on': [21362],
                'metrics_configs': [{}, {'n': 1}, {'n': 2}, {'n': 5}],
                'scenario_configs': [{'generate_kwargs': {'bad_words_ids': [[50257], [50258], [50259], [50260], [50261],
                                                                            [50262], [50263], [50264], [50265], [50266]],
                                                          'do_sample': True,
                                                          'max_length': 128,
                                                          'min_length': 10,
                                                          'temperature': 0.7,
                                                          'top_k': 0,
                                                          'top_p': 0.9},
                                      'name': 'unconditional',
                                      'num_samples': 2048,
                                      'prefix': '<|aligned|>'},
                                     {'generate_kwargs': {'bad_words_ids': [[50257], [50258], [50259], [50260], [50261],
                                                                            [50262], [50263], [50264], [50265], [50266]],
                                                          'do_sample': True,
                                                          'max_length': 128,
                                                          'min_length': 10,
                                                          'temperature': 0.7,
                                                          'top_k': 0,
                                                          'top_p': 0.9},
                                      'name': 'challenging_rtp',
                                      'num_samples': 1024,
                                      'prefix': '<|aligned|>',
                                      'prompt_before_control': True,
                                      'prompts_path': 'resources/challenging_rtp.jsonl'},
                                     {'generate_kwargs': {'bad_words_ids': [[50257], [50258], [50259], [50260], [50261],
                                                                            [50262], [50263], [50264], [50265], [50266]],
                                                          'do_sample': True,
                                                          'max_length': 128,
                                                          'min_length': 10,
                                                          'temperature': 0.7,
                                                          'top_k': 0,
                                                          'top_p': 0.9},
                                      'name': 'challenging_rtp-bad-control',
                                      'num_samples': 1024,
                                      'prefix': '<|misaligned|>',
                                      'prompt_before_control': True,
                                      'prompts_path': 'resources/challenging_rtp.jsonl'}],
                'scorer_config': {'device': 'cuda:0'}},
 'kl_gpt3_callback': {'force_call_on': [21362],
                      'gpt3_kwargs': {'model_name': 'davinci'},
                      'max_tokens': 64,
                      'num_samples': 2048,
                      'prefix': '<|aligned|>',
                      'should_insert_prefix': True},
 'model': {'from_scratch': True,
           'gpt2_config_kwargs': {'reorder_and_upcast_attn': True, 'scale_attn_by': True},
           'num_additional_tokens': 10,
           'path_or_name': 'gpt2'},
 'objective': {'name': 'MLE'},
 'tokenizer': {'path_or_name': 'gpt2',
               'special_tokens': ['<|aligned|>', '<|2|>', '<|3|>', '<|4|>', '<|5|>',
                                  '<|6|>', '<|7|>', '<|8|>', '<|9|>', '<|misaligned|>']},
 'training': {'dataloader_num_workers': 0,
              'effective_batch_size': 64,
              'evaluation_strategy': 'no',
              'fp16': True,
              'hub_model_id': 'kejian/cpsc-quark10-log5',
              'hub_strategy': 'all_checkpoints',
              'learning_rate': 0.0005,
              'logging_first_step': True,
              'logging_steps': 20,
              'num_tokens': 2800000000.0,
              'output_dir': 'training_output_10_log5',
              'per_device_train_batch_size': 16,
              'push_to_hub': True,
              'remove_unused_columns': False,
              'save_steps': 21362,
              'save_strategy': 'steps',
              'seed': 42,
              'warmup_ratio': 0.01,
              'weight_decay': 0.1}}
```
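The ten `threshold*` values partition the Detoxify toxicity-score range into one bucket per control token. A plausible reconstruction of the bucketing follows; the exact boundary semantics (scores below `threshold1` mapping to `<|aligned|>`, scores above `threshold10` to `<|misaligned|>`) and the `score_to_prefix` helper are assumptions, not confirmed by this card:

```python
from bisect import bisect_left

# Thresholds and control tokens taken from the config above.
thresholds = [0.0005623, 0.000573, 0.000586, 0.000604, 0.00063,
              0.000671, 0.000752, 0.000973, 0.006, 0.06]
prefixes = ['<|aligned|>', '<|2|>', '<|3|>', '<|4|>', '<|5|>',
            '<|6|>', '<|7|>', '<|8|>', '<|9|>', '<|misaligned|>']

def score_to_prefix(tox_score):
    """Map a Detoxify toxicity score to the control token of its bucket."""
    return prefixes[min(bisect_left(thresholds, tox_score), len(prefixes) - 1)]

# The generation configs ban exactly these ten added tokens from being sampled:
# GPT-2's base vocabulary ends at id 50256, so the ten control tokens added by
# the tokenizer occupy ids 50257-50266.
bad_words_ids = [[50257 + i] for i in range(10)]
```

This also explains the otherwise opaque `bad_words_ids` lists in the generation scenarios: they prevent the model from emitting control tokens mid-generation.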
|
|
|
# Wandb URL: |
|
https://wandb.ai/kejian/uncategorized/runs/ool7nsry |