Spaces: abdullahmeda/detect-ai-text

abdullahmeda committed on Jan 24, 2024

Commit b3c492a (verified)

1 Parent(s): b76af8a

Update app.py

Files changed (1)

app.py +9 -0

@@ -136,6 +136,15 @@ with gr.Blocks() as demo:
             predictions, and is therefore considered to be a better model. Because LMs are trained on large-scale text corpora, they can \
             be considered to have learned some common language patterns and text structures. Therefore, PPL can be used to measure how \
             well a text conforms to common characteristics.
+            I used all variants of the open-source GPT-2 model except the XL size to compute the PPL (both text-level and sentence-level) of the \
+            collected texts. I observed that, at both the text level and the sentence level, content generated by LLMs has relatively lower \
+            PPLs than text written by humans. An LLM captures the common patterns and structures in the text it was trained on \
+            and is very good at reproducing them. As a result, the PPLs of LLM-generated text are concentrated at relatively low values.
+            Humans can express themselves in a wide variety of ways, depending on the context, audience, and purpose of the text they are \
+            writing. This can include creative or imaginative elements, such as metaphors, similes, and unique word choices, which can make the \
+            text more difficult for GPT-2 to predict.
             ### GLTR: Giant Language Model Test Room
             This idea originates from the following paper: arxiv.org/pdf/1906.04043.pdf. It studies 3 tests to compute features of an input text. Their \
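
The text added in this commit distinguishes text-level from sentence-level PPL. As a minimal sketch of that distinction (not the code in app.py), the block below computes both from per-token natural-log probabilities using the standard definition, PPL = exp(-mean log p). The helper names `perplexity` and `text_and_sentence_ppl` are hypothetical, and the log-probabilities are made-up numbers standing in for what a GPT-2 variant would return. It also assumes each sentence is scored on its own tokens only, which may differ from how the app handles context.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean log p) for a list of natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def text_and_sentence_ppl(sentences):
    """Compute text-level and sentence-level PPL.

    sentences: list of sentences, each a list of per-token log-probabilities.
    Returns (PPL over all tokens pooled together, list of per-sentence PPLs).
    """
    all_tokens = [lp for sentence in sentences for lp in sentence]
    return perplexity(all_tokens), [perplexity(s) for s in sentences]


# Hypothetical values, not real model output: a sentence of four tokens each
# predicted with probability 0.5, then a sentence of two tokens at 0.25.
example = [[math.log(0.5)] * 4, [math.log(0.25)] * 2]
text_ppl, sentence_ppls = text_and_sentence_ppl(example)
print(f"text-level PPL: {text_ppl:.3f}")
print("sentence-level PPLs:", [round(p, 3) for p in sentence_ppls])
```

Lower values mean the model found the tokens easier to predict. The text-level value is not the average of the sentence-level values, because longer sentences contribute more tokens to the pooled mean.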