Make it clear
README.md (CHANGED)
```diff
@@ -1,48 +1,15 @@
 ---
 language:
 - en
-- fr
-- ro
-- de
-- multilingual
-
-widget:
-- text: "Translate to German: My name is Arthur"
-  example_title: "Translation"
-- text: "Please answer to the following question. Who is going to be the next Ballon d'or?"
-  example_title: "Question Answering"
-- text: "Q: Can Geoffrey Hinton have a conversation with George Washington? Give the rationale before answering."
-  example_title: "Logical reasoning"
-- text: "Please answer the following question. What is the boiling point of Nitrogen?"
-  example_title: "Scientific knowledge"
-- text: "Answer the following yes/no question. Can you write a whole Haiku in a single tweet?"
-  example_title: "Yes/no question"
-- text: "Answer the following yes/no question by reasoning step-by-step. Can you write a whole Haiku in a single tweet?"
-  example_title: "Reasoning task"
-- text: "Q: ( False or not False or False ) is? A: Let's think step by step"
-  example_title: "Boolean Expressions"
-- text: "The square root of x is the cube root of y. What is y to the power of 2, if x = 4?"
-  example_title: "Math reasoning"
-- text: "Premise: At my age you will probably have learnt one lesson. Hypothesis: It's not certain how many lessons you'll learn by your thirties. Does the premise entail the hypothesis?"
-  example_title: "Premise and hypothesis"
-
 tags:
 - text2text-generation
-
+- topic-modeling
+- diffusion
+- text-diffusion
 datasets:
--
-- taskmaster2
-- djaym7/wiki_dialog
-- deepmind/code_contests
-- lambada
-- gsm8k
-- aqua_rat
-- esnli
-- quasc
-- qed
-
-
+- xwjzds/paraphrase_collections_enhanced
 license: apache-2.0
+pipeline_tag: text2text-generation
 ---
 
 
@@ -123,8 +90,11 @@ The model is intended for research purposes only.
 
 Check out https://github.com/amazon-science/text_generation_diffusion_llm_topic
 
-
+# Citation
+
+**BibTeX:**
 
+```bibtex
 @inproceedings{xu-etal-2023-detime,
     title = "{D}e{T}i{ME}: Diffusion-Enhanced Topic Modeling using Encoder-decoder based {LLM}",
     author = "Xu, Weijie and
@@ -144,4 +114,4 @@ Check out https://github.com/amazon-science/text_generation_diffusion_llm_topic
     pages = "9040--9057",
     abstract = "In the burgeoning field of natural language processing, Neural Topic Models (NTMs) and Large Language Models (LLMs) have emerged as areas of significant research interest. Despite this, NTMs primarily utilize contextual embeddings from LLMs, which are not optimal for clustering or capable for topic generation. Our study addresses this gap by introducing a novel framework named Diffusion-Enhanced Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages Encoder-Decoder-based LLMs to produce highly clusterable embeddings that could generate topics that exhibit both superior clusterability and enhanced semantic coherence compared to existing methods. Additionally, by exploiting the power of diffusion, our framework also provides the capability to generate content relevant to the identified topics. This dual functionality allows users to efficiently produce highly clustered topics and related content simultaneously. DeTiME{'}s potential extends to generating clustered embeddings as well. Notably, our proposed framework proves to be efficient to train and exhibits high adaptability, demonstrating its potential for a wide array of applications.",
 }
-
+```
```
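For reference, applying the first hunk yields the following model-card frontmatter. This is reconstructed directly from the added and unchanged lines of the diff above, not an independent source:

```yaml
---
language:
- en
tags:
- text2text-generation
- topic-modeling
- diffusion
- text-diffusion
datasets:
- xwjzds/paraphrase_collections_enhanced
license: apache-2.0
pipeline_tag: text2text-generation
---
```

The net effect of the commit is to drop the multilingual language list, widget examples, and dataset list inherited from an upstream card, and to describe this model specifically: topic-modeling/diffusion tags, the paraphrase training dataset, and an explicit `pipeline_tag` so the Hub renders the correct inference widget.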