theblackcat102 committed on
Commit
1015917
1 Parent(s): 11909cf

Update README.md

  1. README.md +44 -34
README.md CHANGED
@@ -1,8 +1,17 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
 
5
- # Model Card for Model ID (WIP)
6
 
7
  <!-- Provide a quick summary of what the model is/does. -->
8
 
@@ -15,49 +24,32 @@ This modelcard aims to be a base template for new models. It has been generated
15
  <!-- Provide a longer summary of what this model is. -->
16
 
17
 
18
-
19
- - **Developed by:** [More Information Needed]
20
- - **Shared by [optional]:** [More Information Needed]
21
- - **Model type:** [More Information Needed]
22
- - **Language(s) (NLP):** [More Information Needed]
23
- - **License:** [More Information Needed]
24
- - **Finetuned from model [optional]:** [More Information Needed]
25
 
26
  ## Model Sources [optional]
27
 
28
  <!-- Provide the basic links for the model. -->
29
 
30
- - **Repository:** [More Information Needed]
31
- - **Paper [optional]:** [More Information Needed]
32
- - **Demo [optional]:** [More Information Needed]
33
 
34
  # Uses
35
 
36
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
37
 
 
38
  ## Direct Use
39
 
40
  <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
41
-
42
- [More Information Needed]
43
-
44
- ## Downstream Use [optional]
45
-
46
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
47
-
48
- [More Information Needed]
49
-
50
- ## Out-of-Scope Use
51
-
52
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
53
-
54
- [More Information Needed]
55
 
56
  # Bias, Risks, and Limitations
57
 
58
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
59
 
60
- [More Information Needed]
61
 
62
  ## Recommendations
63
 
@@ -69,7 +61,29 @@ Users (both direct and downstream) should be made aware of the risks, biases and
69
 
70
  Use the code below to get started with the model.
71
 
72
- [More Information Needed]
73
 
74
  # Training Details
75
 
@@ -77,15 +91,11 @@ Use the code below to get started with the model.
77
 
78
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
79
 
80
- [More Information Needed]
81
-
82
  ## Training Procedure
83
 
84
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
85
-
86
- ### Preprocessing [optional]
87
-
88
- [More Information Needed]
89
 
90
 
91
  ### Training Hyperparameters
 
1
  ---
2
  license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - sft
7
+ pipeline_tag: text-generation
8
+ widget:
9
+ - text: <prefix>You are a helpful assistant model trained by LAION called Aki</prefix><human>Hi, how are you?<bot>
10
+ - text: <human>What's the Earth total population<bot>
11
+ - text: <human>Write a story about future of AI development<bot>
12
  ---
13
 
14
+ # Pythia 3B SFT model
15
 
16
  <!-- Provide a quick summary of what the model is/does. -->
17
 
 
24
  <!-- Provide a longer summary of what this model is. -->
25
 
26
 
27
+ - **Developed by:** Open Assistant
28
+ - **Model type:** Pythia
29
+ - **Language(s) (NLP):** English
30
+ - **License:** Apache-2.0
 
 
 
31
 
32
  ## Model Sources [optional]
33
 
34
  <!-- Provide the basic links for the model. -->
35
 
36
+ - **Repository:** [Open Assistant](https://github.com/LAION-AI/Open-Assistant)
 
 
37
 
38
  # Uses
39
 
40
  <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
41
 
42
+
43
  ## Direct Use
44
 
45
  <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
46
+ See the example on the right
47
 
48
  # Bias, Risks, and Limitations
49
 
50
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
51
 
52
+ See the [Pythia model card](https://huggingface.co/EleutherAI/pythia-12b#out-of-scope-use) for out-of-scope uses and limitations.
53
 
54
  ## Recommendations
55
 
 
61
 
62
  Use the code below to get started with the model.
63
 
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "theblackcat102/pythia-3b-deduped-sft"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ # Load in half precision on the first CUDA device, in inference mode.
+ model = AutoModelForCausalLM.from_pretrained(model_name).half().eval().cuda()
+
+ # Prompts wrap the user turn in <human>...<bot>; the model continues after <bot>.
+ input_text = "<human>What's the earth population?<bot>"
+ inputs = tokenizer(input_text, return_tensors="pt", padding=True).to(0)
+ outputs = model.generate(
+     **inputs,
+     early_stopping=True,
+     # The three sampling values below are illustrative placeholders; the
+     # original snippet read them from an `args` object it never defined.
+     max_new_tokens=256,
+     do_sample=True,
+     top_k=50,
+     temperature=0.7,
+     pad_token_id=tokenizer.eos_token_id,
+     # dialogue_collator.py line 36
+ )
+ # truncate_before_pattern is documented for the CodeGen tokenizer; it may have no effect here.
+ output = tokenizer.decode(outputs[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"])
+ print(output)
+ ```
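The widget entries and the snippet above both use the same prompt layout: an optional `<prefix>...</prefix>` system message followed by `<human>...<bot>`. A minimal sketch of helpers for building such a prompt and trimming the decoded output might look like the following. The function names `build_prompt` and `extract_reply` are hypothetical and not part of this repository, and cutting the reply at `<|endoftext|>` assumes the tokenizer uses that string as its end-of-sequence token.

```python
from typing import Optional

HUMAN = "<human>"
BOT = "<bot>"
# Assumption: the end-of-sequence token decodes to this string.
EOS = "<|endoftext|>"


def build_prompt(user_message: str, system_prefix: Optional[str] = None) -> str:
    """Build a single-turn prompt in the layout shown in the widget examples."""
    prefix = f"<prefix>{system_prefix}</prefix>" if system_prefix else ""
    return f"{prefix}{HUMAN}{user_message}{BOT}"


def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the model's reply from a decoded generation."""
    if decoded.startswith(prompt):
        reply = decoded[len(prompt):]
    else:
        # Fall back to the text after the last <bot> marker.
        reply = decoded.rsplit(BOT, 1)[-1]
    # Stop at the next human turn or the end-of-sequence marker, if present.
    for stop in (HUMAN, EOS):
        reply = reply.split(stop, 1)[0]
    return reply.strip()


if __name__ == "__main__":
    prompt = build_prompt("What's the earth population?")
    print(prompt)
    print(extract_reply(prompt + " About 8 billion." + EOS, prompt))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of a hand-written `input_text`, and `extract_reply` can be applied to the decoded output to drop the echoed prompt.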
87
 
88
  # Training Details
89
 
 
91
 
92
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
93
 
 
 
94
  ## Training Procedure
95
 
96
+ ```
97
+ deepspeed trainer_sft.py --configs defaults pythia-3b --deepspeed
98
+ ```
 
 
99
 
100
 
101
  ### Training Hyperparameters