Florian Leuerer committed
Commit: 510e903
Parent: 9631045
Files changed (2):
  1. README.md (+41 -1)
  2. app.py (+1 -1)
README.md CHANGED
@@ -10,4 +10,44 @@ pinned: false
  license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Dataset
+
+ The dataset used is https://huggingface.co/datasets/lmsys/chatbot_arena_conversations
+
+ Preprocessing:
+ - filtered German conversations
+ - kept only the first user prompt
+ - dropped short prompts (fewer than 70 characters)
+
+ ```python
+ from datasets import load_dataset
+
+ dataset = load_dataset('lmsys/chatbot_arena_conversations')
+
+ def get_message(x):
+     x['message'] = [x['conversation_a'][0]]
+     return x
+
+ # keep German conversations, take the first user message, drop prompts under 70 characters
+ dataset = dataset.filter(lambda x: x['language'] == 'German')
+ dataset = dataset['train'].map(get_message)
+ dataset = dataset.filter(lambda x: len(x['message'][0]['content']) > 70)
+ ```
+
+ # Generation
+
+ I rely on the Hugging Face `conversational` pipeline to generate the outputs. There are some issues with the chat template (especially for the non-instruction-tuned models) that I'll fix later.
+
+ ```python
+ import json
+ from pathlib import Path
+
+ from tqdm import tqdm
+ from transformers import pipeline
+
+ messages = json.loads(Path('messages.json').read_text())
+ outputs = []
+
+ # model_name and device are defined elsewhere
+ pipe = pipeline(
+     "conversational",
+     model=model_name,
+     torch_dtype="auto",
+     device_map=device,
+     max_new_tokens=1024,
+     trust_remote_code=True
+ )
+
+ for message in tqdm(messages):
+     output = pipe([message])
+     outputs.append(output)
+ ```
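
The chat-template issue mentioned in the README could probably be worked around by rendering each prompt explicitly with `tokenizer.apply_chat_template` and using the plain `text-generation` pipeline instead of `conversational`. The sketch below is not part of the commit: it assumes a model whose tokenizer ships a chat template, and the model name is only a placeholder.

```python
import json
from pathlib import Path

from tqdm import tqdm
from transformers import AutoTokenizer, pipeline

model_name = "some-chat-model"  # placeholder, not from the commit
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    torch_dtype="auto",
    device_map="auto",
    max_new_tokens=1024,
)

messages = json.loads(Path('messages.json').read_text())
outputs = []
for message in tqdm(messages):
    # Render the single user turn with the model's own chat template,
    # then return only the generated continuation (not the echoed prompt).
    prompt = tokenizer.apply_chat_template(
        [message], tokenize=False, add_generation_prompt=True
    )
    result = pipe(prompt, return_full_text=False)
    outputs.append(result[0]["generated_text"])
```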
app.py CHANGED
@@ -26,7 +26,7 @@ with gr.Blocks() as iface:
  drop_model1 = gr.Dropdown(models, label='Model 1', value=random.choice(models))
  drop_model2 = gr.Dropdown(models, label='Model 2', value=random.choice(models))
  with gr.Row():
-     btn = gr.Button("Run")
+     btn = gr.Button("Show Outputs")
  with gr.Row():
      out_message = gr.TextArea(label='Prompt')
  with gr.Row():