xwjzds committed on
Commit 92ee2de · verified · 1 Parent(s): d8bd233

Update basic read me

Files changed (1): README.md +147 -0
README.md ADDED
@@ -0,0 +1,147 @@
---
language:
- en
- fr
- ro
- de
- multilingual

widget:
- text: "Translate to German: My name is Arthur"
  example_title: "Translation"
- text: "Please answer to the following question. Who is going to be the next Ballon d'or?"
  example_title: "Question Answering"
- text: "Q: Can Geoffrey Hinton have a conversation with George Washington? Give the rationale before answering."
  example_title: "Logical reasoning"
- text: "Please answer the following question. What is the boiling point of Nitrogen?"
  example_title: "Scientific knowledge"
- text: "Answer the following yes/no question. Can you write a whole Haiku in a single tweet?"
  example_title: "Yes/no question"
- text: "Answer the following yes/no question by reasoning step-by-step. Can you write a whole Haiku in a single tweet?"
  example_title: "Reasoning task"
- text: "Q: ( False or not False or False ) is? A: Let's think step by step"
  example_title: "Boolean Expressions"
- text: "The square root of x is the cube root of y. What is y to the power of 2, if x = 4?"
  example_title: "Math reasoning"
- text: "Premise: At my age you will probably have learnt one lesson. Hypothesis: It's not certain how many lessons you'll learn by your thirties. Does the premise entail the hypothesis?"
  example_title: "Premise and hypothesis"

tags:
- text2text-generation

datasets:
- svakulenk0/qrecc
- taskmaster2
- djaym7/wiki_dialog
- deepmind/code_contests
- lambada
- gsm8k
- aqua_rat
- esnli
- quasc
- qed

license: apache-2.0
---

# DeTiME

<!-- Provide a quick summary of what the model is/does. -->

DeTiME is a novel topic-modeling framework that leverages encoder-decoder-based large language models (LLMs) to produce highly clusterable embeddings, yielding topics with superior clusterability and enhanced semantic coherence. It also uses a diffusion process to generate content relevant to the identified topics, so highly clustered topics and related content can be produced efficiently at the same time. DeTiME is efficient to train and highly adaptable, making it suitable for a broad range of applications.

## Model Details

### Model Description

DeTiME is a text-to-text generation model that generates output text from an input text prompt.

- **Developed by:** Amazon
- **Funded by:** Amazon
- **Model type:** Generative text-to-text model

### Model Sources

For research purposes, we recommend our `DeTiME` GitHub repository (https://github.com/amazon-science/text_generation_diffusion_llm_topic).

- **Repository:** https://github.com/amazon-science/text_generation_diffusion_llm_topic
- **Paper:** https://aclanthology.org/2023.findings-emnlp.606.pdf

### Model Overview

DeTiME can encode the input text into a 4096-dimensional embedding and reconstruct the original sentence from it.

## Code Example

```python
# Load the model directly from the Hugging Face Hub
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModel.from_pretrained("xwjzds/detime", trust_remote_code=True)
model.eval()

# Make sure to prepend "Repeat: " to the input text
text = """
Repeat: U.S. prosecutors have arrested more than 130 individuals and have seized more than $17 million in a continuing crackdown on Internet fraud and abuse."""

# Tokenize once and reuse both the input ids and the attention mask
enc = tokenizer(text, return_tensors="pt", padding="max_length", truncation=True, max_length=512)
inputs = enc.input_ids.cuda()
am = enc.attention_mask.cuda()
outputs = model.cuda().generate(inputs, am, max_length=512)

# `outputs` now holds the reconstruction of the input; generation quality from this decoder is low
```
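
For readers who want the embeddings rather than the reconstruction, the sketch below wraps the encoder in a small pooling helper. This is a minimal sketch that reuses `tokenizer` and `model` from the block above and assumes the remote-code model exposes a T5-style `encoder` submodule; the attribute names, and the exact 4096-dimensional extraction described in the paper, may differ for this checkpoint, and `embed_texts` is a hypothetical helper rather than part of the released API.

```python
# Hypothetical sketch, continuing from the block above: mean-pool the encoder's
# hidden states into one embedding per input. Assumes the remote-code model
# exposes a T5-style `encoder` submodule; the official embedding extraction may differ.
import torch

@torch.no_grad()
def embed_texts(texts):
    device = next(model.parameters()).device
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
    input_ids = enc.input_ids.to(device)
    attention_mask = enc.attention_mask.to(device)
    hidden = model.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
    mask = attention_mask.unsqueeze(-1).type_as(hidden)          # zero out padding positions
    return ((hidden * mask).sum(dim=1) / mask.sum(dim=1)).cpu()  # (batch, hidden_dim)

embeddings = embed_texts(["Repeat: first example document.", "Repeat: second example document."])
print(embeddings.shape)
```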

## Uses

### Direct Use

The model is intended for research purposes for now. Possible research areas and tasks include:

- Benchmarking text-to-text generation quality.
- Generating embeddings that a diffusion model can use to generate high-quality text.
- Generating embeddings for topic modeling (see the sketch below).
- Identifying similar text or relevant topics.
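
As referenced in the list above, the snippet below illustrates how such embeddings could be used for topic discovery and text similarity. It is a minimal sketch, not the method from the paper: it reuses the hypothetical `embed_texts` helper from the Code Example section and swaps in an off-the-shelf scikit-learn `KMeans` clustering in place of DeTiME's full diffusion-enhanced pipeline.

```python
# Illustrative only: cluster pooled embeddings into candidate topics and score
# text similarity. `embed_texts` is the hypothetical helper sketched above; the
# KMeans step stands in for DeTiME's full topic-modeling pipeline.
import torch
from sklearn.cluster import KMeans

documents = [
    "Repeat: The central bank raised interest rates again this quarter.",
    "Repeat: Prosecutors announced new arrests in an internet fraud crackdown.",
    "Repeat: The new telescope captured images of a distant galaxy.",
]
embeddings = embed_texts(documents)  # (num_docs, hidden_dim) tensor on CPU

# Topic view: group documents by clustering their embeddings.
topic_ids = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(embeddings.numpy())

# Similarity view: cosine similarity between two document embeddings.
similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(topic_ids, float(similarity))
```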

Excluded uses are described below.

### Recommendations

The model is intended for research purposes only.

## How to Get Started with the Model

Check out https://github.com/amazon-science/text_generation_diffusion_llm_topic

## Citation

```bibtex
@inproceedings{xu-etal-2023-detime,
    title = "{D}e{T}i{ME}: Diffusion-Enhanced Topic Modeling using Encoder-decoder based {LLM}",
    author = "Xu, Weijie and
      Hu, Wenxiang and
      Wu, Fanyou and
      Sengamedu, Srinivasan",
    editor = "Bouamor, Houda and
      Pino, Juan and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.606",
    doi = "10.18653/v1/2023.findings-emnlp.606",
    pages = "9040--9057",
    abstract = "In the burgeoning field of natural language processing, Neural Topic Models (NTMs) and Large Language Models (LLMs) have emerged as areas of significant research interest. Despite this, NTMs primarily utilize contextual embeddings from LLMs, which are not optimal for clustering or capable for topic generation. Our study addresses this gap by introducing a novel framework named Diffusion-Enhanced Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages Encoder-Decoder-based LLMs to produce highly clusterable embeddings that could generate topics that exhibit both superior clusterability and enhanced semantic coherence compared to existing methods. Additionally, by exploiting the power of diffusion, our framework also provides the capability to generate content relevant to the identified topics. This dual functionality allows users to efficiently produce highly clustered topics and related content simultaneously. DeTiME{'}s potential extends to generating clustered embeddings as well. Notably, our proposed framework proves to be efficient to train and exhibits high adaptability, demonstrating its potential for a wide array of applications.",
}
```