Update README.md
---
pipeline_tag: text-generation
widget:
- text: Introduction to Vertex AI Feature Store
  example_title: Example 1
- text: What are Kubeflow Components?
  example_title: Example 2
tags:
- Text-Generation
---
# SCRIPTGPT

Pretrained model on the English language using a causal language modeling (CLM) objective. It was introduced in [this paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) and first released at [this page](https://openai.com/blog/better-language-models/).
## Model description

ScriptGPT is a causal language transformer trained on a dataset of 5,000 YouTube videos that explain artificial intelligence (AI) concepts. The model resembles the GPT-2 architecture: as a causal language model, it predicts the probability of a sequence of words from the preceding words alone, generating a probability distribution over the next word without incorporating future words.

The goal of ScriptGPT is to generate scripts for AI videos that are coherent, informative, and engaging. This can be useful for content creators who are looking for inspiration or who want to automate the process of generating video scripts. To use ScriptGPT, provide a prompt or a starting sentence, and the model will generate a sequence of words that follows the context and style of the training data.

The current model is the smallest one, with 124 million parameters (ScriptGPT). More models are coming soon...
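The next-word behaviour described above can be made concrete with a small sketch: a causal language model emits a raw score (logit) for each candidate next word, and softmax turns those scores into a probability distribution. The vocabulary and logits below are invented for illustration, not real ScriptGPT outputs.

```python
import math

# Hypothetical logits a causal LM might assign to candidate next words,
# given only the preceding context. Numbers are illustrative.
vocab = ["store", "pipeline", "banana"]
logits = [2.0, 1.0, -1.0]

# Softmax: exponentiate and normalize so the scores sum to 1.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Sampling the next word from this distribution (rather than always taking the argmax) is what `do_sample=True` enables in the usage example further down.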
## Intended uses

The intended uses of ScriptGPT include generating scripts for videos that explain artificial intelligence concepts, providing inspiration for content creators, and automating the process of generating video scripts.
## How to use

You can use this model directly with a pipeline for text generation.

1. __Load Model__
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SRDdev/Script_GPT")
model = AutoModelForCausalLM.from_pretrained("SRDdev/Script_GPT")
```
2. __Pipeline__
```python
from transformers import pipeline

generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

context = "Introduction to Vertex AI Feature Store"
length_to_generate = 200

script = generator(context, max_length=length_to_generate, do_sample=True)[0]['generated_text']
```
<p style="opacity: 0.8">Keeping the context more technical and related to AI will generate better outputs</p>
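Conceptually, with `do_sample=True` the pipeline runs a loop: sample one next word from the model's distribution, append it to the context, and repeat until `max_length` is reached. A stdlib-only sketch of that loop, substituting a hand-written toy bigram table for the real model (the table and all names here are purely illustrative):

```python
import random

# Toy stand-in for a causal LM: maps a word to its candidate next words.
# A real model would produce a distribution over the whole vocabulary.
bigram = {
    "introduction": ["to"],
    "to": ["vertex", "feature"],
    "vertex": ["ai"],
    "ai": ["feature"],
    "feature": ["store"],
    "store": ["pipelines"],
}

def generate(context, max_length, seed=0):
    """Autoregressive sampling: extend the context one word at a time."""
    rng = random.Random(seed)
    words = context.lower().split()
    # Keep sampling until the length limit is hit or no candidates remain.
    while len(words) < max_length and words[-1] in bigram:
        words.append(rng.choice(bigram[words[-1]]))
    return " ".join(words)

print(generate("Introduction to", max_length=8))
```

The real pipeline works the same way, except each step runs the transformer over the whole context so far instead of looking up only the last word.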
## Limitations and bias

> The model is trained on YouTube scripts and will work best for that domain. It may also generate inaccurate or made-up information, so users should be aware of that and cross-validate the results.

The dataset used is linked [here](https://www.kaggle.com/datasets/jfcaro/5000-transcripts-of-youtube-ai-related-videos).
## Citations

```
@model{
  Name=Shreyas Dixit
  framework=PyTorch
  Year=Jan 2023
  Pipeline=text-generation
  Github=https://github.com/SRDdev
  LinkedIn=https://www.linkedin.com/in/srddev
}
```