davanstrien (HF staff) committed
Commit 863586c
1 Parent(s): b8bab75

Add BERTopic model

Files changed (4)
  1. README.md +77 -0
  2. config.json +14 -0
  3. topic_embeddings.safetensors +3 -0
  4. topics.json +930 -0
README.md ADDED
@@ -0,0 +1,77 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ ---
+
+ # BERTopic_model_card_bias
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("davanstrien/BERTopic_model_card_bias")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 11
+ * Number of training documents: 1271
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | -1 | evaluation - claim - reasoning - parameters - university | 13 | -1_evaluation_claim_reasoning_parameters |
+ | 0 | checkpoint - fairly - characterized - even - sectionhttpshuggingfacecobertbaseuncased | 13 | 0_checkpoint_fairly_characterized_even |
+ | 1 | generative - research - uses - processes - artistic | 137 | 1_generative_research_uses_processes |
+ | 2 | checkpoint - try - snippet - sectionhttpshuggingfacecobertbaseuncased - limitation | 48 | 2_checkpoint_try_snippet_sectionhttpshuggingfacecobertbaseuncased |
+ | 3 | meant - technical - sociotechnical - convey - needed | 32 | 3_meant_technical_sociotechnical_convey |
+ | 4 | gpt2 - team - their - cardhttpsgithubcomopenaigpt2blobmastermodelcardmd - worked | 32 | 4_gpt2_team_their_cardhttpsgithubcomopenaigpt2blobmastermodelcardmd |
+ | 5 | datasets - internet - unfiltered - therefore - lot | 27 | 5_datasets_internet_unfiltered_therefore |
+ | 6 | dacy - danish - pipelines - transformer - bert | 25 | 6_dacy_danish_pipelines_transformer |
+ | 7 | your - pythia - branch - checkpoints - provide | 20 | 7_your_pythia_branch_checkpoints |
+ | 8 | opt - trained - large - software - code | 15 | 8_opt_trained_large_software |
+ | 9 | al - et - identity - occupational - groups | 15 | 9_al_et_identity_occupational |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: False
+ * language: english
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: None
+ * seed_topic_list: None
+ * top_n_words: 10
+ * verbose: False
+
+ ## Framework versions
+
+ * Numpy: 1.22.4
+ * HDBSCAN: 0.8.29
+ * UMAP: 0.5.3
+ * Pandas: 1.5.3
+ * Scikit-Learn: 1.2.2
+ * Sentence-transformers: 2.2.2
+ * Transformers: 4.29.0
+ * Numba: 0.56.4
+ * Plotly: 5.13.1
+ * Python: 3.10.11
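The Label column in the topic overview follows BERTopic's default naming scheme: the topic ID and its top keywords joined by underscores. A minimal stdlib-only sketch of splitting such a label back apart (`parse_label` is a hypothetical helper, not part of BERTopic):

```python
def parse_label(label: str) -> tuple[int, list[str]]:
    """Split a BERTopic-style default label into (topic_id, keywords)."""
    # partition splits on the FIRST underscore only, so a leading "-1"
    # outlier ID survives intact.
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")

print(parse_label("6_dacy_danish_pipelines_transformer"))
# → (6, ['dacy', 'danish', 'pipelines', 'transformer'])
```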
config.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "calculate_probabilities": false,
+ "language": "english",
+ "low_memory": false,
+ "min_topic_size": 10,
+ "n_gram_range": [
+ 1,
+ 1
+ ],
+ "nr_topics": null,
+ "seed_topic_list": null,
+ "top_n_words": 10,
+ "verbose": false
+ }
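The saved config mirrors the hyperparameters listed in the README, except that JSON has no tuple type, so `n_gram_range` is stored as a list. A minimal sketch of reading it back with the standard library (the inline string mirrors the config.json above; converting to a tuple matches the `(1, 1)` form shown in the model card):

```python
import json

# Inline copy of the config.json contents above, for a self-contained example.
config_text = """
{
  "calculate_probabilities": false,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": null,
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": false
}
"""

params = json.loads(config_text)
# JSON round-trips tuples as lists; restore the tuple form used by BERTopic.
params["n_gram_range"] = tuple(params["n_gram_range"])
print(params["n_gram_range"], params["min_topic_size"])  # → (1, 1) 10
```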
topic_embeddings.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bce073d3c03d316910124db4abd74c3ec33a0c59e4ea3b8dca8d643bff27bf88
+ size 16984
topics.json ADDED
@@ -0,0 +1,930 @@
+ {
+   "topic_representations": {
+     "-1": [["evaluation", 0.6230688553227475], ["claim", 0.5968246831891744], ["reasoning", 0.5754221015746908], ["parameters", 0.517542883360015], ["university", 0.5135697637359796], ["argumentative", 0.5135697637359796], ["repositoryhttpsgithubcomhuntlaboratorylanguagemodeloptimization", 0.5135697637359796], ["review", 0.5135697637359796], ["gptneo27bhttpshuggingfacecoeleutheraigptneo27b", 0.5135697637359796], ["projecthttpsgithubcomhuntlaboratorylanguagemodeloptimization", 0.5135697637359796]],
+     "0": [["checkpoint", 0.37175879363307746], ["fairly", 0.3515890274807403], ["characterized", 0.3515890274807403], ["even", 0.35086147648416083], ["sectionhttpshuggingfacecobertbaseuncased", 0.3479922000487333], ["snippet", 0.3479922000487333], ["try", 0.3479922000487333], ["limitation", 0.34725276087685114], ["particular", 0.3465181958462063], ["could", 0.3452033650501046]],
+     "1": [["generative", 0.548511275172796], ["research", 0.5179454603309872], ["uses", 0.4725663936926501], ["processes", 0.47110219358638483], ["artistic", 0.47110219358638483], ["probing", 0.47110219358638483], ["creative", 0.47110219358638483], ["design", 0.47110219358638483], ["tools", 0.47110219358638483], ["educational", 0.47110219358638483]],
+     "2": [["checkpoint", 0.3814770760817889], ["try", 0.3570891912912861], ["snippet", 0.3570891912912861], ["sectionhttpshuggingfacecobertbaseuncased", 0.3570891912912861], ["limitation", 0.3563304221698531], ["particular", 0.3555766546063644], ["fairly", 0.35261038076997664], ["characterized", 0.35261038076997664], ["even", 0.3518807162643131], ["present", 0.35043527354806714]],
+     "3": [["meant", 0.9976049477912707], ["technical", 0.9976049477912707], ["sociotechnical", 0.9976049477912707], ["convey", 0.9976049477912707], ["needed", 0.9872038703943972], ["section", 0.9712653235792772], ["both", 0.936739855710452], ["risks", 0.9068075576218514], ["information", 0.9018883381886229], ["more", 0.8122384634629694]],
+     "4": [["gpt2", 0.4932675297731254], ["team", 0.4582824401382136], ["their", 0.4041671222778528], ["cardhttpsgithubcomopenaigpt2blobmastermodelcardmd", 0.4027027523328499], ["worked", 0.4000615700284105], ["man", 0.4000615700284105], ["examples", 0.3826810158367596], ["card", 0.37841251284183997], ["releasing", 0.37020691048768467], ["generatedtext", 0.36633590684014183]],
+     "5": [["datasets", 0.4655852272500585], ["internet", 0.4632180977092728], ["unfiltered", 0.4632180977092728], ["therefore", 0.4572786367109269], ["lot", 0.45052751090806786], ["far", 0.44843349146591505], ["least", 0.43181001148070325], ["from", 0.4317049782136603], ["spanish", 0.4228812607169984], ["contains", 0.4189869183810361]],
+     "6": [["dacy", 0.5585722925848415], ["danish", 0.5448223053975801], ["pipelines", 0.4762154576109096], ["transformer", 0.45909551554311984], ["bert", 0.4560723670964845], ["stateoftheart", 0.43890761608742057], ["vectors", 0.4171033873896881], ["entropybased", 0.4171033873896881], ["morphologizer", 0.4171033873896881], ["ner", 0.4171033873896881]],
+     "7": [["your", 0.5779547008577203], ["pythia", 0.533302725435212], ["branch", 0.533302725435212], ["checkpoints", 0.533302725435212], ["provide", 0.5255179253814468], ["you", 0.5017001021320695], ["face", 0.49279688086107165], ["hugging", 0.49279688086107165], ["intended", 0.4649625117440713], ["use", 0.457852805651761]],
+     "8": [["opt", 0.3938333445473251], ["trained", 0.3929995606746999], ["large", 0.3894606240300861], ["software", 0.37368561490751695], ["code", 0.3692783616071311], ["impact", 0.35450930158449734], ["to", 0.3501577946670958], ["limited", 0.3497691863778163], ["aim", 0.3497691863778163], ["while", 0.34819943887361066]],
+     "9": [["al", 0.8638378408615067], ["et", 0.8578829364103318], ["identity", 0.742895984959117], ["occupational", 0.742895984959117], ["groups", 0.742895984959117], ["protected", 0.742895984959117], ["characteristics", 0.742895984959117], ["across", 0.7323536580412874], ["social", 0.7323536580412874], ["classes", 0.7323536580412874]]
+   },
+   "topics": [
+     1, 1, 1, 0, 4, 3, 2, 8, 1, 8, 0, 0, 1, 7, 4, 0, 1, 2, 5, 1,
+     8, 4, 4, 1, 1, 0, 8, 5, 6, 0, 5, 0, 0, 5, 0, 0, -1, 0, 8, 0,
+     7, 2, 0, -1, 4, 0, 0, 3, 0, 0, 8, 0, 2, 5, 3, 8, 1, 0, 0, 0,
+     9, 8, 6, 1, 3, 0, 0, 7, 5, 0, 6, 4, 0, 6, 1, 1, 0, 4, 8, 0,
+     1, 3, 3, 1, 8, -1, 2, 2, 5, 1, 2, 4, 0, 0, 2, 1, 0, 0, 0, 0,
+     6, 0, 0, 0, 0, -1, 1, 1, 0, 0, 9, 0, 8, 5, 1, 3, 0, 0, 7, 4,
+     0, 5, 9, 1, 3, 7, 7, 0, 1, 0, 2, 0, 2, 4, 7, 0, 0, 8, 0, 0,
+     6, -1, 0, 0, 1, 3, 5, 0, 4, 0, 0, 1, 4, 7, 3, 1, 0, 4, 8, 0,
+     0, 0, 6, -1, 0, 1, 9, 2, 1, 0, 6, 0, 0, 4, 1, 0, 9, 1, 1, 6,
+     3, 5, 2, 2, 2, 6, -1, 2, -1, 2, 0, 5, 2, 4, 2, 5, 6, 0, 3, 0,
+     0, 9, 5, 0, 0, 1, 3, 0, 4, 2, 0, 0, 0, 4, 9, 3, 0, 7, 0, 0,
+     4, 0, 3, 8, 0, 0, 1, 1, 3, 0, 3, 6, 3, -1, 0, 1, 2, 0, 0, 0,
+     1, 0, 6, 3, 4, 4, 0, 7, -1, 6, 0, 1, 2, 0, 1, 7, 9, 4, 1, -1,
+     0, 0, 1, 7, 0, 0, 0, 5, 0, 9, 4, 1, 7, 4, 1, 0, 0, 5, 0, 2,
+     0, 0, 8, -1, 0, 9, 0, 6, 0, 0, 0, 3, 6, 9, 0, 0, 3, 3, 0, 1,
+     9, 0, 3, 3, 0, 5, 4, 0, 5, 3, 1, 5, 6, 0, 0, 0, 0, 0, 0, 0,
+     3, 0, -1, 5, 3, 2, 0, 6, 2, 2, 9, 0, 0, 0, 0, 3, 1, 0, 5, 4,
+     0, 5, 6, 0, 4, 0, 3, 4, 1, 0, 7, 2, 2, 5, 7, 2, 3, 2, 2, 2,
+     0, 0, 1, 6, 1, 0, 5, 0, 3, 0, 1, 0, 0, 3, 5, 2, 0
+   ],
+   "topic_sizes": {
+     "0": 137, "1": 48, "2": 32, "3": 32, "4": 27, "5": 25,
+     "6": 20, "7": 15, "8": 15, "-1": 13, "9": 13
+   },
+   "topic_mapper": [
+     [-1, -1, -1], [0, 0, 0], [1, 1, 2], [2, 2, 7], [3, 3, 9], [4, 4, 6],
+     [5, 5, 5], [6, 6, 3], [7, 7, 1], [8, 8, 8], [9, 9, 4]
+   ],
+   "topic_labels": {
+     "-1": "-1_evaluation_claim_reasoning_parameters",
+     "0": "0_checkpoint_fairly_characterized_even",
+     "1": "1_generative_research_uses_processes",
+     "2": "2_checkpoint_try_snippet_sectionhttpshuggingfacecobertbaseuncased",
+     "3": "3_meant_technical_sociotechnical_convey",
+     "4": "4_gpt2_team_their_cardhttpsgithubcomopenaigpt2blobmastermodelcardmd",
+     "5": "5_datasets_internet_unfiltered_therefore",
+     "6": "6_dacy_danish_pipelines_transformer",
+     "7": "7_your_pythia_branch_checkpoints",
+     "8": "8_opt_trained_large_software",
+     "9": "9_al_et_identity_occupational"
+   },
+   "custom_labels": null,
+   "_outliers": 1
+ }
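Each entry under "topic_representations" in topics.json is a list of [word, score] pairs, ordered from strongest to weakest keyword. A minimal stdlib-only sketch of pulling the ranked keywords out (the inline fragment copies one topic from the file above, truncated to three pairs for brevity):

```python
import json

# Inline fragment mirroring the topics.json structure above (topic "6", dacy/danish).
fragment = """
{
  "topic_representations": {
    "6": [["dacy", 0.5585722925848415], ["danish", 0.5448223053975801],
          ["pipelines", 0.4762154576109096]]
  }
}
"""

data = json.loads(fragment)
# Drop the scores, keeping only the ranked keyword list for topic 6.
keywords = [word for word, score in data["topic_representations"]["6"]]
print(keywords)  # → ['dacy', 'danish', 'pipelines']
```

In practice the same two lines of `json.loads` plus a list comprehension work against the full file, read with `open("topics.json")`.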