abdullahmeda committed ffcb078 (1 parent: 6abdf31): Update app.py

app.py CHANGED
@@ -31,11 +31,7 @@ conforms to common characteristics.
 I used all variants of the open-source GPT-2 model except xl size to compute the PPL (both text-level and sentence-level PPLs) of the collected \
 texts. It is observed that, regardless of whether it is at the text level or the sentence level, the content generated by LLMs have relatively \
 lower PPLs compared to the text written by humans. LLM captured common patterns and structures in the text it was trained on, and is very good at \
-reproducing them. As a result, text generated by LLMs have relatively concentrated low PPLs
-
-Humans have the ability to express themselves in a wide variety of ways, depending on the context, audience, and purpose of the text they are \
-writing. This can include using creative or imaginative elements, such as metaphors, similes, and unique word choices, which can make it more \
-difficult for GPT2 to predict. The PPL distributions of text written by humans and text generated by LLMs are shown in the figure below.\
+reproducing them. As a result, text generated by LLMs have relatively concentrated low PPLs.\
 """
 
 
@@ -124,11 +120,11 @@ with gr.Blocks() as demo:
 ## Detect text generated using LLMs 🤖
 
 Linguistic features such as Perplexity and other SOTA methods such as GLTR were used to classify between Human written and LLM Generated \
-texts. This solution scored an ROC of 0.956 and 8th position in the DAIGT LLM Competition on Kaggle.
+texts. This solution scored an ROC of 0.956 and 8th position in the DAIGT LLM Competition on Kaggle.
 
+- Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
 - Competition: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/leaderboard)
 - Solution WriteUp: [https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224](https://www.kaggle.com/competitions/llm-detect-ai-generated-text/discussion/470224)
-- Source & Credits: [https://github.com/Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)
 
 ### Linguistic Analysis: Language Model Perplexity
 The perplexity (PPL) is commonly used as a metric for evaluating the performance of language models (LM). It is defined as the exponential \
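The definition the diff ends on (perplexity as the exponential of the average negative log-likelihood per token) can be sketched numerically. This is a minimal toy illustration, not the app's actual GPT-2 pipeline: the `perplexity` helper and the hard-coded token log-probabilities are assumptions for demonstration only.

```python
import math

def perplexity(token_logprobs):
    """PPL = exp of the average negative log-likelihood of the tokens.

    `token_logprobs` holds the natural-log probability the language
    model assigned to each token of the text (hypothetical values here).
    """
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Toy log-probabilities for two texts. Predictable, "LLM-like" text
# gets higher per-token probability, hence lower perplexity.
predictable = [math.log(0.5)] * 8   # each token ~50% likely
surprising  = [math.log(0.1)] * 8   # each token ~10% likely

print(perplexity(predictable))  # ≈ 2.0
print(perplexity(surprising))   # ≈ 10.0
```

A model that assigns every token probability 0.5 yields PPL 2, while one that assigns 0.1 yields PPL 10 — this is the sense in which the concentrated, low-PPL distribution described above separates LLM output from more varied human writing.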