Files changed (1)
  1. README.md +131 -0
README.md CHANGED
@@ -1,3 +1,134 @@
---
license: apache-2.0
language:
- en
datasets:
- snorkelai/snorkel-curated-instruction-tuning
inference:
  parameters:
    temperature: 0.7
    top_p: 0.7
    top_k: 50
    max_new_tokens: 512
pipeline_tag: text-generation
---

# RedPajama-7B-Chat-Curated

This model was created by fine-tuning the RedPajama base model on [snorkelai/snorkel-curated-instruction-tuning](https://huggingface.co/datasets/snorkelai/snorkel-curated-instruction-tuning) to further improve its chat ability with high-quality instruction-response pairs.

For more detail on our methodology, see our blog post, [How we built a better GenAI with programmatic data development](https://snorkel.ai/how-we-built-a-better-genai-with-programmatic-data-development).

- Chat-Curated Version: [snorkelai/RedPajama-7B-Chat-Curated](https://huggingface.co/snorkelai/RedPajama-7B-Chat-Curated)
- Instruction Tuning Dataset: [snorkelai/snorkel-curated-instruction-tuning](https://huggingface.co/datasets/snorkelai/snorkel-curated-instruction-tuning)
- Base Model: [RedPajama-INCITE-7B-Base](https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base)

## Model Details
- **Developed by**: Snorkel AI
- **Model type**: Language Model
- **Language(s)**: English
- **License**: Apache 2.0
- **Model Description**: A 6.9B-parameter pretrained language model.

# Quick Start

The model requires `transformers` version 4.25.1 or later.

To prompt the chat model, use the following format:
```
<human>: [Chat]
<bot>:
```
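
As a minimal sketch of how this format might be assembled in code, the helper below builds a prompt string and trims a generated reply. The function names `build_prompt` and `extract_reply` are hypothetical, and the multi-turn `history` handling and the cut at the next `<human>:` tag are assumptions for illustration; the model card itself only documents the single-turn format.

```python
from typing import List, Optional, Tuple

HUMAN_TAG = "<human>:"
BOT_TAG = "<bot>:"


def build_prompt(message: str, history: Optional[List[Tuple[str, str]]] = None) -> str:
    """Format a user message (plus optional earlier turns) for the chat model.

    `history` is a list of (human_text, bot_text) pairs. Replaying earlier
    turns this way is an assumption, not something the model card specifies.
    """
    lines = []
    for human_text, bot_text in history or []:
        lines.append(f"{HUMAN_TAG} {human_text}")
        lines.append(f"{BOT_TAG} {bot_text}")
    lines.append(f"{HUMAN_TAG} {message}")
    # End with an open bot tag so the model continues with its answer.
    lines.append(BOT_TAG)
    return "\n".join(lines)


def extract_reply(generated: str) -> str:
    """Keep only the text before the model starts a new human turn, if it does."""
    return generated.split(HUMAN_TAG, 1)[0].strip()


print(build_prompt("Who is Alan Turing?"))
```

With no history, `build_prompt` produces the same string as the single-turn format shown above.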

## GPU Inference

This requires a GPU with 16 GB of memory.
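
A rough back-of-the-envelope check of that figure, assuming the 6.9B parameter count from Model Details and counting weights only (activations and the attention cache need additional memory, which is a plausible reason for the 16 GB recommendation). The helper name `weight_memory_gb` is hypothetical.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

NUM_PARAMS = 6.9e9  # parameter count stated in Model Details


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

In half precision the weights alone come to about 13.8 GB, which leaves only a small margin on a 16 GB card.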

```python
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version; parse both sides, since comparing raw strings
# would wrongly rank '4.9.0' above '4.25.1'
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init: load the tokenizer and the half-precision model, then move the model to the first GPU
tokenizer = AutoTokenizer.from_pretrained("snorkelai/RedPajama-7B-Chat-Curated")
model = AutoModelForCausalLM.from_pretrained("snorkelai/RedPajama-7B-Chat-Curated", torch_dtype=torch.float16)
model = model.to('cuda:0')

# infer: sample up to 128 new tokens
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Mathison Turing (23 June 1912 – 7 June 1954) was an English computer scientist, mathematician, logician, cryptanalyst, philosopher, mathematician, and theoretical biologist.
"""
```
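
The `temperature`, `top_k`, and `top_p` arguments control how the next token is sampled. The following is a conceptual sketch of that filtering in plain Python, assuming the usual order of temperature scaling, then top-k, then top-p. It is not the `transformers` implementation, and the names `filter_distribution` and `sample` are hypothetical.

```python
import math
import random
from typing import Dict, Optional


def filter_distribution(
    logits: Dict[str, float],
    temperature: float = 0.7,
    top_k: int = 50,
    top_p: float = 0.7,
) -> Dict[str, float]:
    """Return renormalised token probabilities after temperature, top-k, and top-p filtering."""
    # Temperature: divide the logits; values below 1 sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    # Top-k: keep only the k highest-scoring tokens.
    ranked = sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    max_logit = ranked[0][1]
    weights = [(tok, math.exp(logit - max_logit)) for tok, logit in ranked]
    total = sum(w for _, w in weights)
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, w in weights:
        p = w / total
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


def sample(logits: Dict[str, float], rng: Optional[random.Random] = None, **kwargs) -> str:
    """Draw one token from the filtered distribution."""
    rng = rng or random.Random()
    dist = filter_distribution(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}
print(filter_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.7))
```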

For Int8 GPU inference or CPU inference, please refer to the `togethercomputer/RedPajama-INCITE-7B-Chat` documentation.

# Uses

## Direct Use

Excluded uses are described below.

### Misuse, Malicious Use, and Out-of-Scope Use

It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.

#### Out-of-Scope Use

`RedPajama-7B-Chat-Curated` is a language model and may not perform well for use cases outside its intended scope.
For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society.
It is important to consider the limitations of the model and to use it only for its intended purpose.

#### Misuse and Malicious Use

`RedPajama-7B-Chat-Curated` is designed for language modeling.
Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.

Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:

- Generating fake news, misinformation, or propaganda
- Promoting hate speech, discrimination, or violence against individuals or groups
- Impersonating individuals or organizations without their consent
- Engaging in cyberbullying or harassment
- Generating defamatory content
- Spamming or scamming
- Sharing confidential or sensitive information without proper authorization
- Violating the terms of use of the model or the data used to train it
- Creating automated bots for malicious purposes such as spreading malware, phishing scams, or spamming

## Limitations

`RedPajama-7B-Chat-Curated`, like other language models, has limitations that should be taken into consideration.
For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data.
We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive chatbot.

## Training

**Training Data**

Please refer to [snorkelai/snorkel-curated-instruction-tuning](https://huggingface.co/datasets/snorkelai/snorkel-curated-instruction-tuning).

**Training Procedure**

- **Hardware:** 8 A100 GPUs
- **Optimizer:** Adam
- **Gradient accumulation steps:** 1
- **Number of tokens:** 3.8M
- **Learning rate:** 1e-5
- **Batch size:** 64
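
As a small worked example of how these settings relate, the sketch below derives the number of samples each GPU would process per step. It assumes the listed batch size of 64 is the global batch across all 8 GPUs, which the model card does not state explicitly. The helper name `per_device_batch_size` is hypothetical.

```python
NUM_GPUS = 8            # from the hardware entry
GRAD_ACCUM_STEPS = 1    # from the gradient accumulation entry
GLOBAL_BATCH_SIZE = 64  # assumption: the listed batch size is the global batch


def per_device_batch_size(global_batch: int, num_devices: int, grad_accum_steps: int) -> int:
    """Samples each device processes per forward/backward pass."""
    per_device, remainder = divmod(global_batch, num_devices * grad_accum_steps)
    if remainder:
        raise ValueError("global batch must divide evenly across devices and accumulation steps")
    return per_device


print(per_device_batch_size(GLOBAL_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS))
```

Under that assumption, each GPU sees 8 samples per optimizer step.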

## Community

Join us on [Snorkel AI Slack](https://snorkel.ai/slack).