Dongfu Jiang committed on
Commit
5e3cb95
1 Parent(s): 773d066

Create config.json

Files changed (1)
  1. config.json +19 -0
config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "ranker_type": "pairranker",
+   "model_type": "deberta",
+   "model_name": "microsoft/deberta-v3-large",
+   "cache_dir": "./hf_models/deberta-v3-large/",
+   "load_checkpoint": null,
+   "source_maxlength": 1224,
+   "candidate_maxlength": 412,
+   "n_tasks": 1,
+   "num_pos": 5,
+   "num_neg": 5,
+   "sub_sampling_mode": "all_pair",
+   "sub_sampling_ratio": 0.4,
+   "loss_type": "instructgpt",
+   "reduce_type": "linear",
+   "inference_mode": "bubble",
+   "drop_out": 0.05,
+   "fp16": true
+ }
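
For reviewers who want to sanity-check the added values, here is a minimal Python sketch that parses the same JSON with the standard library and derives one number from it. The helper `pair_input_budget` is hypothetical and not part of this repository. It assumes that a pairwise ranker encodes one source together with two candidates in a single input, so the per-pair token budget would be `source_maxlength + 2 * candidate_maxlength`. This commit does not confirm that layout, so treat the result as an estimate.

```python
import json

# Values copied verbatim from the config.json added in this commit.
CONFIG_TEXT = """
{
  "ranker_type": "pairranker",
  "model_type": "deberta",
  "model_name": "microsoft/deberta-v3-large",
  "cache_dir": "./hf_models/deberta-v3-large/",
  "load_checkpoint": null,
  "source_maxlength": 1224,
  "candidate_maxlength": 412,
  "n_tasks": 1,
  "num_pos": 5,
  "num_neg": 5,
  "sub_sampling_mode": "all_pair",
  "sub_sampling_ratio": 0.4,
  "loss_type": "instructgpt",
  "reduce_type": "linear",
  "inference_mode": "bubble",
  "drop_out": 0.05,
  "fp16": true
}
"""


def pair_input_budget(config: dict) -> int:
    """Hypothetical helper: token budget for one source plus two candidates.

    ASSUMPTION: the source and both candidates share one encoder input.
    """
    return config["source_maxlength"] + 2 * config["candidate_maxlength"]


config = json.loads(CONFIG_TEXT)

# JSON null and true map to Python None and True.
assert config["load_checkpoint"] is None
assert config["fp16"] is True

budget = pair_input_budget(config)
print(f"{config['ranker_type']} on {config['model_name']}: {budget} tokens per pair")
```

Under that assumption, the two length limits sum to 2048 tokens per compared pair, which may explain the otherwise unusual values 1224 and 412.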